Author: Lakshay Anand ( lakshayanand15@gmail.com )

chromoMap provides interactive, configurable, and elegant visualizations of chromosomes or chromosomal regions, allowing users to map chromosome elements (such as genes, SNPs, etc.) onto the chromosome plot. Each chromosome is composed of loci (each representing a specific range determined by the chromosome length) that, on hover, show details about the annotations in that locus range. The plots can be saved as HTML documents that can be shared easily. In addition, you can include them in R Markdown documents or in R Shiny applications.

Some of the prominent features of the package are:

Getting Started

This vignette describes how you can use the various features of chromoMap to create fantastic annotation plots. If you want to know more about the applications of the plot, please check the publication or simply contact me. I recommend using RStudio, since the interactive plots can be viewed beautifully in its viewer pane, and it allows you to export the plot either as a static image or as a stand-alone HTML web page.

Install chromoMap

You can install the package by typing the following command:

install.packages("chromoMap")

Prepare Input Data Files

chromoMap can be used to visualize and annotate the chromosomes of any living organism, because it renders the chromosomes based on the co-ordinate information that you provide as input. So, if you have the genomic co-ordinates of the organism, you can create chromoMaps for it.

The input data are tab-delimited text files (similar to the BED file format). chromoMap takes separate files for the chromosomes and the annotations. The input files should not have column headers (however, I have explained each column type below).

Chromosome Files

This file contains the co-ordinates of the chromosomes. The columns of this file (in order) are described below (all columns are mandatory unless marked optional):

  • chromosome name: a character representing the chromosome/contig/region name like ‘chr1’ or ‘1’ or ‘ch1’
  • chromosome start: a numeric value specifying the chromosome (or chromosome region) start position. If you are considering an entire chromosome, this value is typically 1.
  • chromosome end: a numeric value specifying the chromosome/contig/region end position. Again, if you are considering an entire chromosome, this value is the length of the chromosome.
  • centromere start (optional): centromeres will be added automatically if you provide their start co-ordinates.

I have developed the algorithm so that it takes both the start and end co-ordinates of chromosomes, so users can also visualize a region of a chromosome (not necessarily starting at 1). You can use your imagination to visualize anything that has co-ordinates (like RNA as well).

Your chromosome file should look like:
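As an illustration, here is a minimal sketch that writes such a file from R. The co-ordinates are made up, and the names chromosomes and chr_path are hypothetical choices of mine, not part of chromoMap; only base R functions are used.

```r
# Hypothetical co-ordinates, for illustration only (not real genome data).
chromosomes <- data.frame(
  name       = c("chr1", "chr2", "chr3"),
  start      = c(1L, 1L, 1L),
  end        = c(5000L, 4200L, 3100L),
  centromere = c(2100L, 1800L, 1500L),  # optional fourth column
  stringsAsFactors = FALSE
)

# Written to a temporary folder here; use any path you like.
chr_path <- file.path(tempdir(), "chromosome_file.txt")

# Tab-delimited, no header row, no row names, no quoting.
write.table(chromosomes, chr_path,
            sep = "\t", quote = FALSE,
            row.names = FALSE, col.names = FALSE)

# Show the file content exactly as it was written.
cat(readLines(chr_path), sep = "\n")
```

Each printed line holds one chromosome, with the columns in the order described above.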

Annotation Files

Once you have the chromosome co-ordinates in a file, the next thing is to have data for annotation. Annotation elements could be anything that has co-ordinates, like genes, SNPs, etc. The data is also provided in the same tab-delimited format, with the following columns (in order):

  • Element Name: a character value that uniquely identifies the element. This can be an identifier, a symbol, etc.
  • Chromosome Name: a character value specifying the chromosome name. [NOTE: the chromosome names should be consistent in the chromosome and annotation files.]
  • Element Start: a numeric value specifying the element start position.
  • Element End: a numeric value specifying the element end position.
  • Data (optional): a numeric or character value specifying the data value.
  • Hyperlinks (optional): a character value specifying the URL of the element.

Your annotation file should look like:
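As a hedged sketch, an annotation file with the four mandatory columns plus the optional data column could be written from R like this. The element names, positions, and group labels are invented for illustration, and annotations and anno_path are hypothetical names.

```r
# Hypothetical elements, for illustration only.
annotations <- data.frame(
  element = c("gene_A", "gene_B", "gene_C", "gene_D"),
  chrom   = c("chr1", "chr1", "chr2", "chr3"),  # must match the chromosome file
  start   = c(200L, 1500L, 900L, 2500L),
  end     = c(450L, 1900L, 1300L, 2800L),
  data    = c("group1", "group2", "group1", "group2"),  # optional column
  stringsAsFactors = FALSE
)

anno_path <- file.path(tempdir(), "annotation_file.txt")

# Tab-delimited, no header row, no row names, no quoting.
write.table(annotations, anno_path,
            sep = "\t", quote = FALSE,
            row.names = FALSE, col.names = FALSE)

cat(readLines(anno_path), sep = "\n")
```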



To help you avoid some possible errors, here are a few points to take care of while preparing the files:

  • Do not include column headers in files.
  • Chromosome names should be consistent in both files.
  • Elements and chromosome names (first column of both files) should be unique.
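
These points can be checked before plotting. The helper below is a rough sketch of my own, not a chromoMap function: check_chromomap_inputs is a hypothetical name, and the header test simply assumes that a header row would put text where a start co-ordinate belongs.

```r
# Hypothetical helper (not part of chromoMap): returns a character
# vector describing the problems found, or an empty vector if none.
check_chromomap_inputs <- function(chr_file, anno_file) {
  read_tab <- function(path) {
    read.delim(path, header = FALSE, stringsAsFactors = FALSE,
               colClasses = "character")
  }
  chr  <- read_tab(chr_file)
  anno <- read_tab(anno_file)
  problems <- character(0)

  # A header row would put text where a start co-ordinate belongs.
  if (is.na(suppressWarnings(as.numeric(chr[1, 2])))) {
    problems <- c(problems, "chromosome file seems to contain a header row")
  }
  if (is.na(suppressWarnings(as.numeric(anno[1, 3])))) {
    problems <- c(problems, "annotation file seems to contain a header row")
  }

  # The first column of both files should be unique.
  if (anyDuplicated(chr[[1]]) > 0) {
    problems <- c(problems, "duplicated chromosome names")
  }
  if (anyDuplicated(anno[[1]]) > 0) {
    problems <- c(problems, "duplicated element names")
  }

  # Every chromosome named in the annotations must exist in the chromosome file.
  unknown <- setdiff(anno[[2]], chr[[1]])
  if (length(unknown) > 0) {
    problems <- c(problems,
                  paste("annotation file uses chromosome names missing from",
                        "the chromosome file:", paste(unknown, collapse = ", ")))
  }
  problems
}

# Small demonstration with two deliberately faulty temporary files.
chr_path  <- tempfile(fileext = ".txt")
anno_path <- tempfile(fileext = ".txt")
writeLines(c("chr1\t1\t5000", "chr2\t1\t4200"), chr_path)
writeLines(c("gene_A\tchr1\t200\t450", "gene_A\tchrX\t900\t1300"), anno_path)

problems <- check_chromomap_inputs(chr_path, anno_path)
print(problems)
```

In this demonstration the element name gene_A is repeated and chrX does not appear in the chromosome file, so both issues are reported.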

TIP: You can use MS Excel to create your files and then save them using the tab-delimited text option.

My first chromoMap

Once you have your input files ready, begin creating chromosome plots like a pro. The simplest annotation plot can be created using the following command:

library(chromoMap)
chromoMap("chromosome_file.txt","annotation_file.txt")

This will create a plot with default properties. The images included in this vignette are non-interactive, but you should see an interactive plot in RStudio’s viewer pane.

That’s it. You have created a simple annotation plot. Now hover over the annotated loci to see the magic. You should see a tooltip describing:

If you have added hyperlinks to the elements, you can click the element labels in the tooltip to access the web page.

On hover, the tooltip appears on the screen as long as your pointer is over the locus. It will disappear if you move the pointer away. I know, you must be thinking that the tooltip disappears before you can click the element’s hyperlink. Don’t worry.

TIP: You can click the locus to keep a stable tooltip on screen. Click again on the same or another locus to hide it again.

If you are not satisfied with the default look of the plot (which I’m sure you won’t be), you can play around with some of the properties to style your plot, described under the section ‘configuring chromoMap’ in this vignette.

Polyploidy

Biologically speaking, chromosomes occur in sets. So, just visualizing a single set of chromosomes (called haploid) wouldn’t be sufficient in some scenarios. Hence, I added the feature of passing sets of chromosomes as separate sets of files. Don’t forget to set the ploidy argument to the number of sets you are passing.

chromoMap(c("chromosome_file_set_1.txt","chromosome_file_set_2.txt"),
          c("annotation_file_set_1.txt","annotation_file_set_2.txt"),
          ploidy = 2)

Polyploidy turned out to be a powerful feature that can be used in multiple ways. The sets of chromosomes are rendered independently of each other and, hence, can differ in number and size. Using this feature, you can visualize polyploid sets, haploid sets of different species on the same plot, or even different samples of the same species for comparison. Be creative and adapt this feature to your own requirements. I have included some interesting examples in my paper.
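
Since the ploidy value has to match the number of sets, one way to avoid a mismatch is to derive it from the file vectors. This is only a sketch: ploidy_from_sets is a hypothetical helper of mine, and it assumes one annotation file per chromosome file, as in the call shown above. The plot is attempted only if chromoMap is installed and the files exist.

```r
# Hypothetical helper (not part of chromoMap): derive ploidy from the
# number of file sets, assuming one annotation file per chromosome file.
ploidy_from_sets <- function(chr_files, anno_files) {
  if (length(chr_files) != length(anno_files)) {
    stop("Pass one annotation file per chromosome file set.")
  }
  length(chr_files)
}

chr_sets  <- c("chromosome_file_set_1.txt", "chromosome_file_set_2.txt")
anno_sets <- c("annotation_file_set_1.txt", "annotation_file_set_2.txt")
n_sets <- ploidy_from_sets(chr_sets, anno_sets)

# Only attempt the plot when the package is installed and the files exist.
if (requireNamespace("chromoMap", quietly = TRUE) &&
    all(file.exists(c(chr_sets, anno_sets)))) {
  chromoMap::chromoMap(chr_sets, anno_sets, ploidy = n_sets)
}
```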

Point and Segment-annotation plots

I have provided two types of annotation algorithms that visualize the annotations differently. Point-annotation will annotate an element on a single locus, ignoring its size, while the segment-annotation algorithm considers the size and visualizes the annotation as a segment.

The default is point-annotation. To use segment-annotation, set the argument segment_annotation to TRUE. Segment-annotations are advantageous in cases like displaying gene structure.

chromoMap("chromosome_file.txt","annotation_file.txt",segment_annotation = TRUE)

Here’s a hypothetical example.

Data-based annotation plots

A huge volume of biological data is being produced in today’s world. I thought it would be nice to visualize the data associated with chromosome regions or elements. You can do this by creating data-based color annotations in chromoMap. Before going forward, let’s look at the data types chromoMap can handle. You can use either numeric data or character/categorical data for annotations. Depending on the type of data you are using, you need to set the argument data_type to either numeric or categorical. Also, to use this category of plot, you need to set data_based_color_map to TRUE. Now let’s explore the two major types of plots you can create.

Group-annotation plots

As the name suggests, this type of plot can be used if your annotations are categorized into groups. This plot will assign a distinct color to each group. The data column of your annotation file should have a group assigned to each element as a character value.

IMPORTANT: the data_colors argument specifies the color for each group and must be passed as a list() of vectors, with one vector per chromosome set. For example, if the ploidy is 2, the list must contain two vectors.

chromoMap("chromosome_file.txt","annotation_file.txt",
          data_based_color_map = TRUE,
          data_type = "categorical",
          data_colors = list(c("orange","yellow")))

The best thing is, it will also create a legend for each group, with the labels you used as group names. Isn’t it amazing? :) [see more under the ‘legends’ section]
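
To make the list-of-vectors rule concrete, here is a sketch of how data_colors might look for a ploidy of 2. The colors are arbitrary, the file names reuse the ones from the polyploidy example, and I am assuming each set has two groups. The plot is attempted only if chromoMap is installed and the files exist.

```r
ploidy <- 2

# One vector of group colors per chromosome set (colors chosen arbitrarily).
data_colors <- list(
  c("orange", "yellow"),  # groups in set 1
  c("red", "blue")        # groups in set 2
)

# The list must hold exactly one vector per set.
stopifnot(is.list(data_colors), length(data_colors) == ploidy)

chr_sets  <- c("chromosome_file_set_1.txt", "chromosome_file_set_2.txt")
anno_sets <- c("annotation_file_set_1.txt", "annotation_file_set_2.txt")

# Only attempt the plot when the package is installed and the files exist.
if (requireNamespace("chromoMap", quietly = TRUE) &&
    all(file.exists(c(chr_sets, anno_sets)))) {
  chromoMap::chromoMap(chr_sets, anno_sets,
                       ploidy = ploidy,
                       data_based_color_map = TRUE,
                       data_type = "categorical",
                       data_colors = data_colors)
}
```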

Chromosome heatmaps

Now, let’s create the best plot of the year (just kidding :D). FYI, chromosome heatmaps were the major inspiration for me to start developing this package. Anyway, chromosome heatmaps allow you to visualize numeric data as heat colors. In your annotation file, add numeric data in the data column.

chromoMap("chromosome_file.txt","annotation_file.txt",
          data_based_color_map = TRUE,
          data_type = "numeric")

Yes, the legends are shown in this plot too.
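
For reference, an annotation file suitable for a heatmap differs from the categorical one only in its data column, which holds numbers. A minimal sketch with invented values (heat_annotations and heat_path are hypothetical names):

```r
# Hypothetical elements with a numeric data column, for illustration only.
heat_annotations <- data.frame(
  element = c("gene_A", "gene_B", "gene_C"),
  chrom   = c("chr1", "chr1", "chr2"),
  start   = c(200L, 1500L, 900L),
  end     = c(450L, 1900L, 1300L),
  data    = c(2.5, -1.2, 0.8),  # numeric values to be shown as heat colors
  stringsAsFactors = FALSE
)

heat_path <- file.path(tempdir(), "annotation_file_numeric.txt")

# Tab-delimited, no header row, no row names, no quoting.
write.table(heat_annotations, heat_path,
            sep = "\t", quote = FALSE,
            row.names = FALSE, col.names = FALSE)

cat(readLines(heat_path), sep = "\n")
```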

Let’s look at the tooltip: