Course Objectives
- Understand the basics of genomics and bioinformatics.
- Learn to use Python for genomic data analysis.
- Develop skills to handle and manipulate genomic datasets.
- Perform various genomic analyses, including sequence alignment, variant calling, and gene expression analysis.
- Interpret and visualize genomic data.
Course Structure
The course is divided into 12 weeks, with each week focusing on specific topics. Each session includes theoretical lectures and practical coding exercises.
Week 1: Introduction to Genomics and Python Programming
Lecture:
- Introduction to genomics and bioinformatics.
- Overview of biological sequence data types (DNA, RNA, protein).
- Basics of Python programming and libraries (NumPy, Pandas).
Practical:
- Setting up a Python environment for bioinformatics.
- Basic Python exercises for data manipulation.
Week 2: Biological Databases and Data Retrieval
Lecture:
- Introduction to biological databases (NCBI, Ensembl, UCSC Genome Browser).
- Data formats (FASTA, FASTQ, GFF, VCF).
Practical:
- Retrieving genomic data using Biopython.
- Parsing and processing different genomic data formats.
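Illustrative sketch for the Week 2 practical: the course itself uses Biopython for parsing, but the structure of a FASTA file (a ">" header line followed by one or more sequence lines) can be seen with nothing more than the standard library. The function names parse_fasta and gc_content and the toy records are assumptions made up for this example, not part of any library.

```python
from io import StringIO


def parse_fasta(handle):
    """Yield (header, sequence) tuples from a FASTA-formatted text handle."""
    header, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def gc_content(seq):
    """Fraction of G and C bases in a sequence (0.0 for an empty sequence)."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq) if seq else 0.0


# Toy input invented for illustration; real data would be read from a file.
example = StringIO(">seq1 toy record\nACGT\nGGCC\n>seq2\nATAT\n")
records = list(parse_fasta(example))
for name, seq in records:
    print(name, len(seq), round(gc_content(seq), 2))
```

Writing the parser as a generator keeps memory use flat for large files, since only one record is held at a time.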
Week 3: Sequence Alignment
Lecture:
- Concepts of sequence alignment (global vs. local alignment).
- Algorithms for sequence alignment (Needleman-Wunsch, Smith-Waterman).
Practical:
- Implementing sequence alignment using Biopython.
- Using BLAST for sequence alignment.
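Illustrative sketch for Week 3: the difference between global (Needleman-Wunsch) and local (Smith-Waterman) alignment comes down to two changes in the dynamic-programming table. The scoring values (match=1, mismatch=-1, linear gap=-2) are assumptions chosen for the example, and only the optimal score is computed, not the traceback.

```python
def needleman_wunsch_score(a, b, match=1, mismatch=-1, gap=-2):
    """Optimal global alignment score of a and b with a linear gap penalty."""
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    # Global alignment: leading gaps are penalised in the first row and column.
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap
    for i in range(1, rows):
        for j in range(1, cols):
            diag = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            score[i][j] = max(diag, score[i - 1][j] + gap, score[i][j - 1] + gap)
    return score[-1][-1]


def smith_waterman_score(a, b, match=1, mismatch=-1, gap=-2):
    """Optimal local alignment score of a and b with a linear gap penalty."""
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    best = 0
    for i in range(1, rows):
        for j in range(1, cols):
            diag = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # Local alignment: a cell never drops below zero, so an alignment
            # can start anywhere; the answer is the best cell in the table.
            score[i][j] = max(0, diag, score[i - 1][j] + gap, score[i][j - 1] + gap)
            best = max(best, score[i][j])
    return best


print(needleman_wunsch_score("ACGT", "AGT"))
print(smith_waterman_score("TTTACGTTTT", "GGACGGG"))
```

Both functions fill a table of size (len(a)+1) x (len(b)+1), so time and memory grow with the product of the sequence lengths; this is why heuristic tools such as BLAST are used for database-scale searches.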
Week 4: Genome Assembly and Annotation
Lecture:
- Genome assembly techniques (de novo vs. reference-based assembly).
- Annotation of genomic sequences.
Practical:
- Assembling short reads into contigs.
- Annotating genomic sequences using tools like Prokka.
Week 5: Variant Calling
Lecture:
- Understanding genetic variants (SNPs, indels, structural variants).
- Variant calling workflows.
Practical:
- Performing variant calling using tools like GATK.
- Annotating and filtering variants.
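Illustrative sketch for Week 5: once a caller has produced a VCF file, filtering often starts with the fixed columns CHROM, POS, ID, REF, ALT, and QUAL. The example below is a simplified standard-library reader, assuming one ALT allele per line and a numeric QUAL; the QUAL threshold of 30 and the toy records are made up for illustration and are not recommendations.

```python
def parse_vcf_line(line):
    """Parse the first six fixed VCF columns of one data line into a dict."""
    chrom, pos, _id, ref, alt, qual = line.rstrip("\n").split("\t")[:6]
    return {"chrom": chrom, "pos": int(pos), "ref": ref, "alt": alt, "qual": float(qual)}


def variant_type(ref, alt):
    """Classify a single-ALT variant as a SNP or an indel by allele length."""
    return "SNP" if len(ref) == 1 and len(alt) == 1 else "indel"


# Toy VCF content invented for this example (tab-separated).
vcf_text = """##fileformat=VCFv4.2
#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO
chr1\t100\t.\tA\tG\t50.0\tPASS\t.
chr1\t200\t.\tAT\tA\t12.5\tPASS\t.
chr2\t300\t.\tC\tCTT\t99.0\tPASS\t.
"""

variants = [
    parse_vcf_line(line)
    for line in vcf_text.splitlines()
    if line and not line.startswith("#")  # skip meta and header lines
]

# Keep only variants at or above the illustrative quality threshold.
passing = [v for v in variants if v["qual"] >= 30]
for v in passing:
    print(v["chrom"], v["pos"], variant_type(v["ref"], v["alt"]), v["qual"])
```

Real VCF files can contain multi-allelic sites (comma-separated ALT) and a missing QUAL ("."), which this sketch does not handle.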
Week 6: RNA-Seq Data Analysis
Lecture:
- Introduction to RNA-Seq technology and applications.
- RNA-Seq data preprocessing and quality control.
Practical:
- RNA-Seq data alignment and quantification using tools like HISAT2 and featureCounts.
- Differential gene expression analysis using DESeq2.
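Illustrative sketch for Week 6: before fitting a statistical model, it helps to see what depth normalization and a fold change mean on raw counts. The example below computes counts per million (CPM) and a log2 fold change with a pseudocount. This is a deliberately simple stand-in, not the DESeq2 method, which estimates size factors and fits a negative binomial model; the function names and toy counts are assumptions for the example.

```python
import math


def cpm(counts):
    """Scale a gene -> raw count mapping to counts per million."""
    total = sum(counts.values())
    return {gene: c * 1e6 / total for gene, c in counts.items()}


def log2_fold_change(treated, control, pseudocount=1.0):
    """Per-gene log2 ratio of treated to control; the pseudocount avoids log(0)."""
    return {
        gene: math.log2((treated[gene] + pseudocount) / (control[gene] + pseudocount))
        for gene in control
    }


# Toy counts invented for illustration.
control = {"geneA": 10, "geneB": 30, "geneC": 60}
treated = {"geneA": 40, "geneB": 30, "geneC": 30}

lfc = log2_fold_change(cpm(treated), cpm(control))
for gene, value in lfc.items():
    print(gene, round(value, 3))
```

A fold change alone says nothing about significance; replicate samples and a dispersion estimate are needed for that, which is the part a dedicated tool provides.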
Week 7: Epigenomics and ChIP-Seq Data Analysis
Lecture:
- Introduction to epigenomics and ChIP-Seq technology.
- Analyzing ChIP-Seq data for identifying protein-DNA interactions.
Practical:
- ChIP-Seq data preprocessing and peak calling using MACS2.
- Visualizing ChIP-Seq data using IGV.
Week 8: Metagenomics
Lecture:
- Introduction to metagenomics and microbial community analysis.
- Techniques for metagenomic data analysis.
Practical:
- Analyzing metagenomic data using tools like QIIME.
- Taxonomic and functional profiling of microbial communities.
Week 9: Population Genomics
Lecture:
- Concepts of population genomics and evolutionary genetics.
- Analyzing population structure and genetic diversity.
Practical:
- Performing population genomic analyses using tools like PLINK.
- Visualizing population structure using PCA and ADMIXTURE.
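Illustrative sketch for Week 9: many population-genomic summaries start from allele frequencies. Assuming genotypes are coded as the number of alternate alleles per individual (0, 1, or 2), the example below computes the alternate allele frequency and the genotype counts expected under Hardy-Weinberg equilibrium. The function names and toy genotypes are made up for this example.

```python
def allele_frequency(genotypes):
    """Alternate allele frequency from genotypes coded 0, 1, or 2."""
    return sum(genotypes) / (2 * len(genotypes))


def hwe_expected_counts(genotypes):
    """Expected (hom-ref, het, hom-alt) counts under Hardy-Weinberg equilibrium."""
    n = len(genotypes)
    q = allele_frequency(genotypes)  # alternate allele frequency
    p = 1.0 - q                      # reference allele frequency
    return (n * p * p, 2 * n * p * q, n * q * q)


# Toy genotypes for eight individuals at one site, invented for illustration.
genotypes = [0, 0, 1, 1, 2, 0, 1, 0]
observed = tuple(genotypes.count(g) for g in (0, 1, 2))
print("alt allele frequency:", allele_frequency(genotypes))
print("observed counts:", observed)
print("expected counts:", hwe_expected_counts(genotypes))
```

Comparing observed and expected counts (for example with a chi-square or exact test) is the usual next step; that test is left out here.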
Week 10: Structural Variants and Copy Number Variations
Lecture:
- Introduction to structural variants (SVs) and copy number variations (CNVs).
- Techniques for detecting and analyzing SVs and CNVs.
Practical:
- Detecting structural variants using tools like Manta.
- Analyzing copy number variations using CNVkit.
Week 11: Functional Genomics and Pathway Analysis
Lecture:
- Understanding functional genomics and pathway analysis.
- Techniques for integrating multi-omics data.
Practical:
- Performing gene set enrichment analysis using GSEA.
- Pathway analysis using databases like KEGG and Reactome.
Syllabus: Single-Cell Genomics Data Analysis Using Python
Course Overview
This course provides an in-depth exploration of single-cell genomics data analysis using Python. Students will learn the fundamental concepts, methodologies, and tools necessary for analyzing single-cell RNA sequencing (scRNA-seq) data. The course includes theoretical lectures, hands-on coding sessions, and project-based learning to ensure practical understanding and application of single-cell genomics.
Course Objectives
- Understand the basics of single-cell genomics and its applications.
- Learn to use Python for single-cell data analysis.
- Develop skills to preprocess, analyze, and interpret single-cell RNA sequencing data.
- Perform various analyses including clustering, differential expression, and trajectory analysis.
- Visualize and interpret single-cell data.
Course Structure
The course is divided into 12 weeks, with each week focusing on specific topics. Each session includes theoretical lectures and practical coding exercises.
Week 1: Introduction to Single-Cell Genomics
Lecture:
- Overview of single-cell genomics and its significance.
- Introduction to single-cell RNA sequencing (scRNA-seq) technology.
Practical:
- Setting up a Python environment for single-cell genomics.
- Basic Python exercises for data manipulation (Pandas, NumPy).
Week 2: scRNA-seq Data Generation and Preprocessing
Lecture:
- scRNA-seq experimental workflow and data generation.
- Introduction to preprocessing steps: quality control, normalization, and feature selection.
Practical:
- Downloading and preprocessing scRNA-seq data using Scanpy.
- Quality control and filtering of single-cell data.
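Illustrative sketch for Week 2: cell-level quality control typically looks at the total counts per cell, the number of detected genes, and the fraction of counts from mitochondrial genes. The example below computes these with plain dictionaries so the logic is visible; in the course this would be done with Scanpy on a full count matrix. The thresholds, the "MT-" prefix convention (human gene symbols), and the toy cells are assumptions for illustration only.

```python
def qc_metrics(cell_counts, mito_prefix="MT-"):
    """Return total counts, detected genes, and percent mitochondrial counts."""
    total = sum(cell_counts.values())
    n_genes = sum(1 for c in cell_counts.values() if c > 0)
    mito = sum(c for gene, c in cell_counts.items() if gene.startswith(mito_prefix))
    pct_mito = 100.0 * mito / total if total else 0.0
    return {"total_counts": total, "n_genes": n_genes, "pct_mito": pct_mito}


def filter_cells(cells, min_genes=2, max_pct_mito=20.0):
    """Keep cells with enough detected genes and a low mitochondrial fraction."""
    kept = {}
    for barcode, counts in cells.items():
        metrics = qc_metrics(counts)
        if metrics["n_genes"] >= min_genes and metrics["pct_mito"] <= max_pct_mito:
            kept[barcode] = counts
    return kept


# Toy cells invented for illustration: barcode -> {gene: raw count}.
cells = {
    "cell_a": {"GeneA": 5, "GeneB": 3, "MT-CO1": 1},  # passes both filters
    "cell_b": {"GeneA": 1, "GeneB": 0, "MT-CO1": 9},  # high mitochondrial fraction
    "cell_c": {"GeneA": 4, "GeneB": 0, "MT-CO1": 0},  # too few detected genes
}

kept = filter_cells(cells)
print(sorted(kept))
```

Appropriate thresholds depend on tissue and protocol, so in practice they are chosen after inspecting the distribution of each metric.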
Week 3: Normalization and Batch Correction
Lecture:
- Techniques for normalizing scRNA-seq data.
- Handling batch effects in single-cell data.
Practical:
- Normalizing scRNA-seq data using Scanpy and Seurat.
- Performing batch correction using methods like ComBat and Harmony.
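Illustrative sketch for Week 3: a common first normalization is to scale each cell to the same total count and then apply log(1 + x). The example below shows that arithmetic for a single cell using only the standard library; the target sum of 10,000 is a conventional choice assumed here, and the function name and toy counts are made up for the example. Batch correction is a separate step and is not shown.

```python
import math


def normalize_log1p(cell_counts, target_sum=1e4):
    """Scale one cell's counts to target_sum, then apply log(1 + x)."""
    total = sum(cell_counts.values())
    if total == 0:
        return {gene: 0.0 for gene in cell_counts}
    return {
        gene: math.log1p(count * target_sum / total)
        for gene, count in cell_counts.items()
    }


# Two toy cells with the same composition but different sequencing depth.
shallow = {"GeneA": 1, "GeneB": 3}
deep = {"GeneA": 10, "GeneB": 30}

print(normalize_log1p(shallow))
print(normalize_log1p(deep))
```

The two toy cells end up with identical normalized values, which is the point of the scaling step: differences in depth alone should not look like differences in expression.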
Week 4: Dimensionality Reduction
Lecture:
- Introduction to dimensionality reduction techniques: PCA, t-SNE, UMAP.
- Importance of dimensionality reduction in single-cell analysis.
Practical:
- Implementing PCA, t-SNE, and UMAP using Scanpy.
- Visualizing reduced-dimensional data.
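Illustrative sketch for Week 4: PCA finds the direction of greatest variance in centered data. The example below recovers the first principal component by power iteration on the covariance matrix, using only the standard library. This is a teaching sketch under simplifying assumptions (small dense input, a fixed number of iterations, a start vector that is not orthogonal to the leading component); library implementations use more robust decompositions. The function name and toy data are invented for the example.

```python
import math


def first_principal_component(rows, n_iter=100):
    """Return (direction, scores) for the first PC of a list of numeric rows."""
    n, d = len(rows), len(rows[0])
    # Center each feature (column) at zero.
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    # Sample covariance matrix (d x d).
    cov = [
        [sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
        for i in range(d)
    ]
    # Power iteration: repeatedly multiply by the covariance and renormalize.
    v = [1.0 / math.sqrt(d)] * d
    for _ in range(n_iter):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0:
            break  # no variance in the data
        v = [x / norm for x in w]
    # Project each centered row onto the component.
    scores = [sum(c[j] * v[j] for j in range(d)) for c in centered]
    return v, scores


# Toy data lying exactly on the line y = 2x, invented for illustration.
data = [[0.0, 0.0], [1.0, 2.0], [2.0, 4.0], [3.0, 6.0]]
direction, scores = first_principal_component(data)
print([round(x, 4) for x in direction])
print([round(s, 4) for s in scores])
```

Because the toy points lie on a single line, one component captures all of the variance; real expression data needs many components, and t-SNE or UMAP are then usually run on those components rather than on raw genes.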
Week 5: Clustering and Cell Type Identification
Lecture:
- Clustering techniques for single-cell data (k-means, hierarchical clustering, graph-based clustering).
- Identifying and annotating cell types.
Practical:
- Performing clustering using Scanpy and Seurat.
- Annotating cell types based on marker genes.
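Illustrative sketch for Week 5: once clusters exist, a simple way to propose cell-type labels is to score each cluster by the average expression of known marker genes and take the best-scoring type. The example below assumes per-cluster mean expression has already been computed; the expression values are invented, the marker lists are a small illustrative subset, and annotate_clusters is a hypothetical helper rather than a library function.

```python
def annotate_clusters(cluster_means, markers):
    """Assign each cluster the cell type whose markers have the highest mean expression."""
    labels = {}
    for cluster, expr in cluster_means.items():
        scores = {
            cell_type: sum(expr.get(gene, 0.0) for gene in genes) / len(genes)
            for cell_type, genes in markers.items()
        }
        labels[cluster] = max(scores, key=scores.get)
    return labels


# Small illustrative marker sets; real annotation uses curated, longer lists.
markers = {
    "T cell": ["CD3D", "CD3E"],
    "B cell": ["CD79A", "MS4A1"],
}

# Mean normalized expression per cluster; values made up for this example.
cluster_means = {
    "0": {"CD3D": 2.1, "CD3E": 1.8, "CD79A": 0.1, "MS4A1": 0.0},
    "1": {"CD3D": 0.2, "CD3E": 0.1, "CD79A": 2.5, "MS4A1": 1.9},
}

print(annotate_clusters(cluster_means, markers))
```

Labels produced this way are best treated as suggestions to check against the full marker-gene tables and visualizations.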
Week 6: Differential Expression Analysis
Lecture:
- Principles of differential expression analysis in single-cell data.
- Methods for identifying differentially expressed genes.
Practical:
- Conducting differential expression analysis using Scanpy and DESeq2.
- Visualizing differentially expressed genes.
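Illustrative sketch for Week 6: rank-based tests are a common choice for comparing expression between two groups of cells because they make few distributional assumptions. The example below computes the Mann-Whitney U statistic (the statistic behind the Wilcoxon rank-sum test) by direct pair counting and ranks genes by a normalized effect size. It omits the p-value, which needs the null distribution of U, and multiple-testing correction. The function names and toy data are assumptions for the example.

```python
def mann_whitney_u(x, y):
    """U statistic for x versus y: pairs where x wins count 1, ties count 0.5."""
    return sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in x for b in y)


def rank_genes(group_a, group_b):
    """Rank genes by how far U / (n_a * n_b) is from 0.5 (no difference).

    group_a and group_b map gene -> list of per-cell expression values.
    """
    results = []
    for gene in group_a:
        x, y = group_a[gene], group_b[gene]
        effect = mann_whitney_u(x, y) / (len(x) * len(y))
        results.append((gene, effect))
    return sorted(results, key=lambda item: abs(item[1] - 0.5), reverse=True)


# Toy per-cell expression values invented for illustration.
group_a = {"GeneA": [5, 6, 7], "GeneB": [1, 2, 3]}
group_b = {"GeneA": [1, 2, 3], "GeneB": [1, 2, 3]}

ranked = rank_genes(group_a, group_b)
for gene, effect in ranked:
    print(gene, effect)
```

An effect of 1.0 means every cell in the first group exceeds every cell in the second; 0.5 means the groups are indistinguishable by rank. Pair counting is quadratic in group size, so real implementations use ranks instead.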
Week 7: Trajectory and Pseudotime Analysis
Lecture:
- Introduction to trajectory analysis and pseudotime inference.
- Techniques for constructing single-cell trajectories.
Practical:
- Implementing trajectory analysis using tools like Monocle and Scanpy.
- Inferring pseudotime and visualizing cell state transitions.
Week 8: Integration of Multiple Datasets
Lecture:
- Techniques for integrating multiple single-cell datasets.
- Challenges and solutions in data integration.
Practical:
- Integrating datasets using tools like Seurat and Scanpy.
- Analyzing integrated single-cell data.
Week 9: Single-Cell Multi-Omics
Lecture:
- Introduction to single-cell multi-omics (scATAC-seq, CITE-seq).
- Applications and challenges of multi-omics data integration.
Practical:
- Analyzing scATAC-seq data using Python tools.
- Integrating scRNA-seq and scATAC-seq data.
Week 10: Advanced Topics in Single-Cell Genomics
Lecture:
- Advanced topics such as spatial transcriptomics and single-cell CRISPR screens.
- Future directions in single-cell genomics.
Practical:
- Exploring spatial transcriptomics data using Python tools.
- Analyzing single-cell CRISPR screen data.
Week 11: Visualization and Interpretation of Single-Cell Data
Lecture:
- Techniques for visualizing single-cell data.
- Interpreting complex single-cell datasets.
Practical:
- Creating visualizations using Scanpy and Seurat.
- Developing interactive single-cell data visualizations using Plotly and Dash.
Week 12: Project Work and Presentations
Lecture:
- Guidelines for single-cell data analysis projects.
- Ethical considerations in single-cell genomics.
Practical:
- Students work on individual or group projects to analyze a real single-cell dataset.
- Presentations and discussions of project results.
Assessment
Weekly Assignments:
- Practical coding exercises and mini-projects.
Midterm Exam:
- Written and practical exam covering Weeks 1-6.
Final Project:
- Comprehensive project involving the analysis of a single-cell dataset, including a written report and presentation.
Recommended Resources
Books:
- “Single-Cell RNA Sequencing” by Nilanjan Mukhopadhyay and Dipankar Kumar.
- “Bioinformatics with Python Cookbook” by Tiago Antao.
Online Resources:
- Tutorials and documentation from Scanpy and Seurat.
- Online courses and tutorials on single-cell genomics.
Software Tools:
- Python libraries: Scanpy, Pandas, NumPy, Matplotlib, Plotly, Dash.
- Specific single-cell tools: Seurat (R), Monocle (R), Harmony, ComBat.
This syllabus provides a structured approach to learning single-cell genomics data analysis using Python, blending theoretical knowledge with