CloudMan

Revision as of 14:29, 5 January 2014 by Fedor Kolpakov


CloudMan is a cloud manager that orchestrates all of the steps required to provision a complete compute cluster environment on a cloud infrastructure.

It is primarily used in the context of Galaxy cloud and CloudBioLinux.

It lets users manage the cluster entirely through a web browser.

It makes it possible to customize each instance of CloudMan and, if desired, to preserve those customizations [1].

It supports instance sharing. Each CloudMan instance can be shared as a point-in-time configuration (in terms of tools, data, and configurations) with specific individuals or made public.

Implementation

Python standalone web application, MIT license.

Source code - https://bitbucket.org/galaxy/cloudman/ (82 files, ~800 kb)

CloudMan supports:

  • Amazon Web Services - EC2, EBS
  • OpenStack
  • OpenNebula


A recent extension of CloudMan adds support for data-intensive workloads by incorporating the Hadoop and HTCondor job managers, which complement the previously available Sun Grid Engine (SGE):
http://bib.irb.hr/datoteka/631016.CloudMan_for_Big_Data.pdf

Architecture

Each instance is self-contained: it keeps track of the configuration components that make up its deployment. As a result, it is possible to create custom versions of the default set of tools or indices. For example, in the figure below, instance A uses the default configuration (EBS snapshots colored in blue) while instance B has been customized (colored in yellow and orange). Once customizations are created, they can be shared with specific users (denoted with ‘S’) or made public (denoted with ‘P’). Instances are shared as point-in-time snapshots of data and configuration. In the example shown, instance B has been shared at two time points. Any derived instance (such as instance C) uses the shared instance’s configuration settings upon startup. Each instance has its own user data; derived instances use the shared instance’s data and build on top of it.


Figure (Cloudman architecture.png): A conceptual representation of CloudMan’s architectural components that facilitate customization and sharing of instances.


Instance customization

At the infrastructure level, each instance is composed of [2]:

  • machine image - the common denominator across all instances; it contains all of the core software and libraries;
  • configuration repository - each CloudMan instance has its own configuration repository (on AWS, a Simple Storage Service (S3) bucket). It includes:
    • source code for CloudMan itself
    • boot-time contextualization script
    • references to the persistent storage resources
    • any application-specific configuration files.
  • one or more persistent storage units - store both data and applications available to the instance. They are realized as a combination of data volumes and snapshots.
    • for AWS - Elastic Block Store (EBS) volumes
    • for OpenStack - Nova volumes.
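
The composition listed above can be pictured as a small data model. The following is only an illustrative sketch: the class names, field names, and identifiers are hypothetical and are not taken from the CloudMan source code.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class StorageUnit:
    """A persistent storage unit: a data volume or a snapshot."""
    resource_id: str          # hypothetical identifier, e.g. "snap-tools"
    kind: str                 # "volume" or "snapshot"
    read_only: bool = False   # snapshot-backed tool volumes are used read-only


@dataclass
class ConfigRepository:
    """Per-instance configuration repository (S3 on AWS)."""
    cloudman_source: str      # source code for CloudMan itself
    boot_script: str          # boot-time contextualization script
    storage_refs: List[str] = field(default_factory=list)
    app_config_files: List[str] = field(default_factory=list)


@dataclass
class InstanceDefinition:
    """The three infrastructure-level parts of one instance."""
    machine_image: str
    config_repo: ConfigRepository
    storage: List[StorageUnit] = field(default_factory=list)

    def unresolved_refs(self) -> List[str]:
        """Return storage references that match no known storage unit."""
        known = {unit.resource_id for unit in self.storage}
        return [ref for ref in self.config_repo.storage_refs if ref not in known]


# All identifiers below are made up for illustration.
instance = InstanceDefinition(
    machine_image="image-base",
    config_repo=ConfigRepository(
        cloudman_source="cm.tar.gz",
        boot_script="boot.py",
        storage_refs=["snap-tools", "vol-data"],
    ),
    storage=[
        StorageUnit("snap-tools", "snapshot", read_only=True),
        StorageUnit("vol-data", "volume"),
    ],
)
print(instance.unresolved_refs())
```

The `unresolved_refs` check is an assumption added for the example; it reflects the idea that the configuration repository only holds references to storage, so a deployment is complete only when every reference resolves to an actual volume or snapshot.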

Tools that are used but not modified at runtime are stored on the snapshots. At instance runtime, those snapshots are used to create temporary volumes, which are attached to instances and used in read-only mode. These snapshots can be modified to include any desired tool. Such modifications are performed at the file system level and the process of adding a tool is the same as installing a tool on any other comparable system.

Once the modifications are complete, a new snapshot of the modified volume is created. The instance configuration in the persistent configuration repository then simply needs to point to the new snapshot, and the instance will use it as part of its deployment.
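
The repointing step can be sketched as a pure function over a configuration record. This is a minimal sketch under stated assumptions: the dictionary layout (`filesystems`, `name`, `snapshot_id`) and the snapshot identifiers are hypothetical and do not reproduce CloudMan's actual configuration format.

```python
import copy


def repoint_snapshot(config: dict, filesystem: str, new_snapshot_id: str) -> dict:
    """Return a copy of `config` whose named file system uses a new snapshot."""
    updated = copy.deepcopy(config)  # leave the caller's configuration untouched
    for entry in updated["filesystems"]:
        if entry["name"] == filesystem:
            entry["snapshot_id"] = new_snapshot_id
            return updated
    raise KeyError(f"no file system named {filesystem!r} in configuration")


# Hypothetical configuration as it might be kept in the repository.
current = {
    "filesystems": [
        {"name": "tools", "snapshot_id": "snap-default-tools"},
        {"name": "indices", "snapshot_id": "snap-default-indices"},
    ]
}

customized = repoint_snapshot(current, "tools", "snap-custom-tools")
print(customized["filesystems"][0]["snapshot_id"])
print(current["filesystems"][0]["snapshot_id"])
```

Returning a modified copy rather than editing in place keeps the default configuration available, which mirrors the fact that other instances can continue to use the default snapshots.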

The process of persisting the custom cluster configuration has been integrated into CloudMan’s web interface, thus simplifying and automating this process.

Instance sharing

CloudMan automates instance sharing by creating a copy of the instance’s configuration and adjusting permissions on shared objects such as volume snapshots [2].

Sharing is realized as a point-in-time snapshot of the configuration and data, which allows an instance to be shared multiple times at different time points.

Each shared instance may have different access permissions set.
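
The sharing model described in this section can be illustrated with a small in-memory sketch. It assumes nothing about CloudMan's real implementation: the `Instance` and `SharedCopy` classes, their methods, and the user names are hypothetical, and actual sharing operates on cloud objects such as volume snapshots rather than Python dictionaries.

```python
import copy
from dataclasses import dataclass
from typing import FrozenSet, Iterable, List


@dataclass
class SharedCopy:
    """A point-in-time copy of an instance's configuration and data."""
    label: str
    config: dict
    public: bool
    allowed_users: FrozenSet[str]

    def can_access(self, user: str) -> bool:
        return self.public or user in self.allowed_users


class Instance:
    def __init__(self, config: dict):
        self.config = config
        self.shares: List[SharedCopy] = []

    def share(self, label: str, users: Iterable[str] = (), public: bool = False) -> SharedCopy:
        """Freeze the current state; later changes do not affect the copy."""
        shared = SharedCopy(label, copy.deepcopy(self.config), public, frozenset(users))
        self.shares.append(shared)
        return shared

    @classmethod
    def derive_from(cls, shared: SharedCopy, user: str) -> "Instance":
        """Start a new instance from a shared copy and build on top of it."""
        if not shared.can_access(user):
            raise PermissionError(f"{user} may not use share {shared.label!r}")
        return cls(copy.deepcopy(shared.config))


# Instance B is shared at two time points with different permissions.
b = Instance({"tools": ["bwa"]})
first = b.share("t1", users={"alice"})
b.config["tools"].append("samtools")
second = b.share("t2", public=True)

# Instance C is derived from the public share and then customized further.
c = Instance.derive_from(second, "bob")
c.config["tools"].append("bowtie")
print(first.config, second.config, c.config)
```

Each call to `share` takes its own deep copy, so the two shares of instance B differ, and the derived instance can change its own state without altering the share it started from.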

Instance sharing as described currently works on the AWS cloud. On the other cloud middleware that CloudMan is compatible with (OpenStack, OpenNebula, and soon Eucalyptus), it is not yet technically possible because of limitations in the functionality that middleware currently provides.

Web interface

Figure: CloudMan web interface

Publications

  1. [Afgan2012a] PMID 23181507
  2. [Afgan2012b] PMID 22068528
  3. [Afgan2012b] PMID 22700313
  4. [Afgan2010] PMID 21210983

Source code

Tool      URL                                      Comment
CloudMan  https://bitbucket.org/galaxy/cloudman/   Python (82 files, ~800 kb)