Introduction

Overview

Teaching: 0 min
Exercises: 0 min
Questions
  • What is Bioconductor and why would I use it?

Objectives
  • Describe what kinds of resources Bioconductor provides and give an example of what they might be useful for.

  • Locate Bioconductor packages and learn their intended purpose.

  • Locate Bioconductor package reference manuals and vignettes.

  • Install Bioconductor packages using biocLite.

What is Bioconductor?

Bioconductor is a collection of open-source tools that use R. It contains over 1400 R packages that can be used to analyze many types of biological data. A major benefit of Bioconductor is that the packages are usually well-documented, and there is an active community of users and developers.

Bioconductor Packages

There are three types of Bioconductor packages:

Bioconductor packages are available for many different steps in data analysis pipelines. Often, you can choose between several different Bioconductor packages for the same step in the pipeline. This allows you to try different approaches and pick which one is the best fit for your project or data.

Pipeline schematic Image from https://www.bioconductor.org/help/course-materials/2017/OSU/B1_Bioconductor_intro.html

Exploring Bioconductor Packages

A browsable list of Bioconductor packages is available in biocViews on the Bioconductor website

Take a look at some of the packages available. Later in this lesson, we’ll be looking for differential expression of genes in RNA-Seq data. Find a Bioconductor package that could be useful for that.

Solution

By browsing to Software, then BiologicalQuestion, then DifferentialExpression you’ll see 241 packages available, from ABSSeq to xps.

Bioconductor Documentation

One of the best things about Bioconductor is the documentation available for its packages. Packages have (at least) both a Reference Manual and a Vignette. The Reference Manual describes the package in a standardized way, indicating its usage and listing the arguments available. The Vignette is more like a tutorial: it provides the information you need to get started, and takes you through some example workflows.

Finding Reference Manuals and Vignettes

For example, take a look at the biocViews page for the popular differential gene expression analysis package DESeq2.

Under “Documentation” you’ll see links to both the Vignette (called Analyzing RNA-Seq data with DESeq2) and the Reference Manual.

Install Bioconductor

As with any R package, in order to use a Bioconductor package you must first install it. There is a special method of installing Bioconductor packages, called biocLite, that is different from installing other R packages. To get started, first you need to get biocLite. To do so, run this command in RStudio:

source("https://bioconductor.org/biocLite.R")
biocLite()

Now you can use BiocLite to install Bioconductor packages. BiocLite helps to manage updates and make sure that Bioconductor packages can work together (almost) seamlessly.

Key Points

  • Bioconductor is one source for biological data analysis tools.

  • Bioconductor packages are documented with reference manuals and vignettes

  • Bioconductor packages are installed with biocLite