CloudBioLinux offers genome analysis resources for cloud computing platforms such as Amazon EC2. We develop freely available, community-maintained software images and data repositories for biological analysis.

Motivation

Many bioinformatics workflows involve large datasets that require high-performance computing. Cloud computing platforms such as Amazon EC2 and Eucalyptus let researchers run computations on a practically unlimited pool of virtual machines. CloudBioLinux uses these resources to provide instant access to biological software, programming libraries and data.

CloudBioLinux is a community project, and we welcome contributors and feedback. Software and data are built on top of an Ubuntu base image, using Fabric for fully automated installation and deployment. Packages are specified in simple configuration files, covering both Linux packages and programming language libraries. Please fork our code on GitHub and suggest improvements and additions.
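As a rough illustration of the configuration-driven approach described above, the following minimal sketch turns a package list into install commands. The file format, the section names (`linux`, `python`), the example package names and the function `build_install_commands` are all assumptions made for this example; they are not the project's actual configuration files or code.

```python
import configparser

# Hypothetical config format (an assumption, not CloudBioLinux's actual files):
# one section per package source, one package name per line.
EXAMPLE_CONFIG = """
[linux]
bwa
samtools

[python]
biopython
"""

# Assumed mapping from section name to the command that installs from it.
INSTALL_COMMANDS = {
    "linux": ["apt-get", "install", "-y"],
    "python": ["pip", "install"],
}


def build_install_commands(config_text):
    """Return one install command (as an argument list) per non-empty section."""
    parser = configparser.ConfigParser(allow_no_value=True)
    parser.optionxform = str  # keep package names case-sensitive
    parser.read_string(config_text)
    commands = []
    for section in parser.sections():
        if section not in INSTALL_COMMANDS:
            raise ValueError(f"unknown package source: {section}")
        packages = list(parser[section])
        if packages:
            commands.append(INSTALL_COMMANDS[section] + packages)
    return commands


if __name__ == "__main__":
    # Print the commands instead of running them; a deployment tool such as
    # Fabric could execute each one on the target machine.
    for command in build_install_commands(EXAMPLE_CONFIG):
        print(" ".join(command))
```

Keeping the package lists separate from the install logic means a contributor can add a tool by editing a text file rather than changing deployment code.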

We hope these resources will be useful for biologists as well as programmers. With the help of the NEBC Bio-Linux development team, the images include a FreeNX desktop environment designed to ease the transition to remote computational analysis.

Amazon EC2 Resources

  • ami-00c02a69 -- 64-bit image (21 August 2010)
  • ami-0af91263 -- 32-bit image (16 July 2010)
  • snap-84e771ef -- Data volume (21 August 2010)

Documentation