Advanced Bioinformatics Scripting in Python, BioPython, R & BioConductor
About Course
Writing short scripts and programs and developing software for biological data analysis, such as Sequence Alignment and Analysis, Genome Analysis, Proteome Analysis, Phylogenetic Analysis, Biological Data Visualization, and MicroArray Gene Expression Analysis, requires a solid understanding of the programming languages used in biology and of how to apply them when writing scripts.
BioCode is offering the Advanced Bioinformatics Scripting in Python, BioPython, R & BioConductor course so that you’ll progress from the very basics of biological programming in Python, BioPython & R to an advanced-level understanding of bioinformatics scripting, even if you lack prior knowledge. You will learn how to write programs for MicroArray gene expression analysis, ggplot2 biological data visualization, and sequence retrieval, alignment, BLAST database searching & phylogenetic analysis in BioPython. You’ll also learn end-to-end Linux (BASH) for bioinformatics.
This course is for absolute beginners in bioinformatics scripting and you don’t require any prior knowledge of scripting or even bioinformatics to get started with this course. Everyday Bioinformatics analysis involves the extensive study and analysis of huge biological datasets. Linux and programming languages like Python and R have made it easy to perform analysis on huge biological data sets.
This course will include the following sections:
Section 1: Python
Description: This section will focus on making sure that students gain an understanding of scripting in the Python language and of the basic functions that can be used to manipulate biological data.
Learning Outcomes: Upon completion of this section, students will be able to:
- Learn the Importance of Python in Bioinformatics.
- Understand Python Programming Language.
- Install Python Language.
- Discuss Comments in Programming Language.
- Perform Basic Input and Output Functions.
- Perform Mathematical Operations.
- Explain Strings Data Structure.
- Explain Dictionaries.
- Discuss Lists in Python.
- Describe Tuples.
- Explain Sets.
- Execute If-Else Conditions in Scripts.
- Execute While Loop and Perform Biological Data Analysis.
- Explain CSV Files.
- Read Files.
- Write Files.
- Consolidate (merge) Multiple DNA and Protein Sequences into one FASTA File.
- Describe OS Module.
- Explain Functions in Python.
- Use the “With” Statement in Python.
- Perform Error Handling.
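As an illustration of how several of these outcomes fit together (the OS module, the “with” statement, reading and writing files, and error handling), here is a minimal sketch of merging several FASTA files into one. The function name merge_fasta, its parameters, and the list of file extensions are assumptions made for this example; they are not taken from the course material.

```python
import os


def merge_fasta(input_dir, output_path, extensions=(".fasta", ".fa")):
    """Concatenate every FASTA file in input_dir into output_path.

    Returns the number of sequence records (header lines) written.
    """
    if not os.path.isdir(input_dir):
        raise NotADirectoryError(f"{input_dir} is not a directory")

    records = 0
    out_abs = os.path.abspath(output_path)

    # "with" closes the output file even if an error occurs part-way through.
    with open(output_path, "w") as out_handle:
        for name in sorted(os.listdir(input_dir)):
            path = os.path.join(input_dir, name)
            # Skip non-FASTA files and the output file itself.
            if not name.lower().endswith(extensions):
                continue
            if os.path.abspath(path) == out_abs:
                continue
            try:
                with open(path) as in_handle:
                    for line in in_handle:
                        line = line.rstrip("\n")
                        if not line:
                            continue  # drop blank lines between records
                        if line.startswith(">"):
                            records += 1  # each ">" header starts a record
                        out_handle.write(line + "\n")
            except OSError as err:
                # Report unreadable files and carry on with the rest.
                print(f"Skipping {name}: {err}")
    return records
```

Calling merge_fasta("sequences", "merged.fasta") would, under these assumptions, write all records found in the hypothetical sequences directory to merged.fasta and return how many there were. The same function works for DNA and protein files, because it only looks at the ">" header lines.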
Section 2: BioPython
Description: This section will ensure that students learn about the functions in the BioPython package, a Python library for biological data analysis.
Learning Outcomes: Upon completion of this section, students will be able to:
- Understand the BioPython module.
- Install BioPython.
- Create a Sequence Object Using the Seq Class from Bio.Seq.
- Explain How a Sequence Object Behaves like a String.
- Perform Central Dogma in BioPython.
- Import UnknownSeq and MutableSeq Objects from the Bio.Seq Module.
- Understand the Alphabets of Biology Using the Bio.Alphabet Module.
- Explain IUPAC Module and Types of Sequence Representation.
- Concatenate Multiple Sequence Records Using Generic Alphabets.
- Create Sequence Records Using SeqRecord Module.
- Utilize the SeqRecord Module to Demonstrate the Representation of FASTA File Within BioPython.
- Utilize the SeqRecord Module to Demonstrate the Representation of GenBank File Within BioPython.
- Utilize the Formatting Feature of the SeqRecord Module.
- Compare and Read Multiple FASTA Files from Directory Using SeqRecord Module in BioPython.
- Read a Sequence File Using the SeqIO Module.
- Parse a Sequence File Using the SeqIO Module.
- Parse a Compressed Sequence File and Create a Dictionary of Sequences.
- Write Sequences and SeqRecords into Files.
- Extract Annotations and Perform Pattern-wise Sequence Data Extraction Using SeqIO module.
- Read and Parse a Multiple Sequence Alignment File using AlignIO Module.
- Write Alignment and Multiple Sequence Alignment Records using AlignIO Module.
- Convert Alignment Formats.
- Manipulate Alignments.
- Align Multiple Sequences Using the ClustalW Python Wrapper.
- Align Two Sequences Using the pairwise2 Module in BioPython.
- Read Multiple Sequence Alignment Files of a Particular Format and Map Information of Alignments.
- Format Alignments.
- Truncate the Specific Regions from the Entire Alignment (Slice Alignments).
- Query NCBI BLAST Through Python.
- Access ENTREZ Using Python.
- Parse the BLAST Results using the Bio.Blast module.
- Get the Summary of Accessions Using the ESummary Function of the Entrez Module in BioPython.
- Download Complete Records Using EFetch Function.
- Use EGQuery Function to do Global Queries for Search Count.
- Search for Database Links of Records Using ELink.
- Search the Entrez Database Using ESearch Function.
- Use ESpell Function to Get the Correct Spellings for your Search Terms.
- Download GenBank and Entrez Records.
- Search Taxonomy Database.
- Download PubMed Articles.
- Read a PDB (3D Structure) File Using Bio.PDB Module.
- Calculate the Distance Matrix Between Sequences for Phylogenetic Analysis.
- Convert Phylogenetic Tree Data Formats.
- Print Out the Phylogenetic Tree in ASCII.
- Read Phylogenetic Trees.
- Visualize and Manipulate Phylogenetic Trees.
- Create a Web Logo of Motifs.
- Perform MEME Analysis.
- Write Out Phylogenetic Data.
Section 3: R Language
Description: This section will focus on making sure that students learn about the R language, the biological functions that can be performed using R, and how R is used to visualize biological data with the ggplot2 package.
Learning Outcomes: Upon completion of this section, students will be able to:
- Discuss R Language.
- Install R.
- Explain Comments.
- Declare Variables and Objects.
- Use Built-in Functions and ARGS.
- Explain Samples and Replacement.
- Write their own Functions and Arguments.
- Create Customized Scripts.
- Discuss Packages in R.
- Install Bioinformatics Packages in R.
- Initialize Library to Perform R Functions.
- Get Help from Help Packages.
- Explain Atomic Vectors in R.
- Explain Integers, Doubles, Logicals, and Factors in R.
- Discuss Dim and Dimensions in R.
- Explain Attributes and Names.
- Describe Matrix and Matrices.
- Explain Arrays and Lists.
- Describe Coercion.
- Explain Data Frames.
- Load Biological Data.
- Save Biological Data.
- Perform R Notation and Select Values from Biological Datasets.
- Discuss Positive Integers for Subsetting Biological Datasets.
- Discuss Negative Integers for Subsetting Biological Datasets.
- Describe Zero Notation for Subsetting Biological Datasets.
- Explain Blank Spaces for Subsetting Biological Datasets.
- Explain Dollar Signs for Subsetting Biological Datasets.
- Modify Values in Existing Datasets.
- Explain NA (Not Available) Values in Biological Datasets.
- Figure out NA Values in Biological Datasets.
- Perform Logical Subsetting in Biological Datasets.
- Use If Else Statement in Code.
- Use For Loops and Perform Biological Data Binding.
- Use While Loops and Read Multiple Biological Datasets.
- Explain ggplot2 and its Use in Biological Data Representation.
- Describe Key Components in ggplot2.
- Visualize Human Mitochondrial Proteome.
- Facet the Human Chromosome Dataset.
- Smooth Out the Biological Data.
- Create Box Plot for Human Mitochondrial Proteome.
- Create Histograms for Human Mitochondrial Pattern Finding.
- Create Frequency Plots for Human Mitochondrial Information Frequency Mining.
- Create Bar Charts for Human Mitochondrial Knowledge Mining.
- Scale and Limit Data Visualization.
- Visualize Phylogenetic Tree.
- Save Visualizations in High Resolution.
Section 4: Linux
Description: This section will focus on making sure that students learn about the Linux operating system and the BASH command line as used in bioinformatics.
Learning Outcomes: Upon completion of this section, students will be able to:
- Discuss Linux Operating System.
- Print Working Directory in Linux.
- Make Directories in Linux.
- Change Directories in Linux.
- Move Files, Directories, and Data.
- Delete Files and Directories in Linux.
- Find the Programs Installed by the User.
- Find the Files Created by the User.
- List Files and Directories on Linux.
- Pipe and Redirect Data.
- Visualize and Inspect Text Data.
- Read the Specified Number of Lines from the Bottom of a File.
- Modify File Statistics and Create Files.
- See the Statistics of Files & Directories.
- Retrieve Genome Assemblies.
- Retrieve Bioinformatics Files.
- Create and Edit Text Files.
- Find Sequence Differences in Files.
- Compress and Archive Files Efficiently.
- Extract Compressed Content.
- Create Archives of Genome Data.
- Find Uncharacterized Proteins in the Human Genome.
- Subset Required Textual Data from Text Files.
- Sort Data.
- Find Unique Data Items.
- See the Statistics of Data Within the File.
- Copy Files and File Contents.
- Properly Visualize Delimited Datasets.
Course Content
Python
- Why Python in Bioinformatics (09:16)
- Introduction to Python and its Installation (08:25)
- Comments (05:43)
- Basic Input and Output (15:38)
- Mathematical Operations (07:20)
- Strings (21:51)
- Dictionaries (10:57)
- Lists (28:48)
- Lists (pt 2) and Tuples (10:38)
- Sets (07:36)
- If-Else (09:19)
- For Loop and Calculation of Molecular Weight of Proteins (10:56)
- While Loop and Biological Data Analysis (09:37)
- CSV (A special kind of file in Bioinformatics) (08:42)
- Reading Files (13:45)
- Writing Files (07:18)
- Consolidate (merge) multiple DNA and Protein Sequences into one FASTA file (09:25)
- OS Module (31:47)
- Function (26:41)
- With (08:50)
- Error Handling (15:31)
BioPython
R
Linux
Exercise
Earn a certificate
Add this certificate to your resume to demonstrate your skills & increase your chances of getting noticed.