Quick Start

library(cinaR)
data("atac_seq_consensus_bm")

Bed formatted consensus matrix (chr, start, end and samples)

dim(bed)
## [1] 1000   25
# bed formatted file
head(bed[,1:4])
##         Chr     Start       End B6-18mo-M-BM-47-GT18-01783
## 52834  chr5  24841478  24845196                       1592
## 29780 chr17   8162955   8164380                        109
## 67290  chr8  40577584  40578029                         72
## 51295  chr4 145277698 145278483                        110
## 4267   chr1 180808752 180815472                       2452
## 45102  chr3  88732151  88732652                         49

Create the contrasts you want to compare, here we create contrasts for 22 mice samples from different strains.

# create contrast vector which will be compared.
contrasts<- c("B6", "B6", "B6", "B6", "B6", "NZO", "NZO", "NZO", "NZO", "NZO", "NZO", 
              "B6", "B6", "B6", "B6", "B6", "NZO", "NZO", "NZO", "NZO", "NZO", "NZO")

cinaR function directly computes the differentially accessible peaks.

# If reference genome is not set hg38 will be used!
results <- cinaR(bed, contrasts, reference.genome = "mm10")
## >> preparing features information...      2021-01-05 15:29:22 
## >> identifying nearest features...        2021-01-05 15:29:24 
## >> calculating distance from peak to TSS...   2021-01-05 15:29:25 
## >> assigning genomic annotation...        2021-01-05 15:29:25 
## >> assigning chromosome lengths           2021-01-05 15:29:52 
## >> done...                    2021-01-05 15:29:52

Now, you can access differential accessibility (DA) and enrichment results.

names(results)
## [1] "DA.results"         "Enrichment.Results"

Inside DA.results, you have the consensus peaks (cp) and differentially accessible (DA) peaks. If batch correction was run, then cp will be a batch-corrected consensus matrix, otherwise it is the filtered and normalized version of original consensus peaks you provided.

names(results$DA.results)
## [1] "cp"       "DA.peaks"

There are many information cinaR provides such as adjusted p value, log fold-changes, gene names etc for each peak:

colnames(results$DA.results$DA.peaks$B6_NZO)
##  [1] "Row.names"     "seqnames"      "start"         "end"          
##  [5] "width"         "strand"        "annotation"    "geneChr"      
##  [9] "geneStart"     "geneEnd"       "geneLength"    "geneStrand"   
## [13] "geneId"        "transcriptId"  "distanceToTSS" "gene_name"    
## [17] "logFC"         "FDR"

Here is an overview of those DA peaks:

head(results$DA.results$DA.peaks$B6_NZO[,1:5])
##                   Row.names seqnames     start       end width
## 1 chr10_105840598_105842176    chr10 105840598 105842176  1579
## 2   chr10_59950325_59952673    chr10  59950325  59952673  2349
## 3   chr10_63176490_63176839    chr10  63176490  63176839   350
## 4   chr10_77220928_77221910    chr10  77220928  77221910   983
## 5   chr10_79751429_79751786    chr10  79751429  79751786   358
## 6   chr10_86021157_86023861    chr10  86021157  86023861  2705

Since the comparison is B6_NZO, if fold-changes are positive it means they are more accesible in B6 compared to NZO and vice versa for negative values!

and here is a little overview for enrichment analyses results:

head(results$Enrichment.Results$B6_NZO[,c("module.name","overlapping.genes", "adj.p")])
##                 module.name                   overlapping.genes      adj.p
## 1         Myeloid lineage 1 TFEB,FBXL5,PLXNC1,GM2A,AGTPBP1,CTSB 0.05914491
## 2  U_metabolism/replication             SLC2A6,GM2A,CTSB,PECAM1 0.05914491
## 3  U_mitochondrial proteins     PIK3R1,PAQR3,UBE3A,MAP4K4,PTPRC 0.32816305
## 4 U_proteasome/ubiquitin cx                  PIK3R1,IREB2,PTPRC 0.39112517
## 5   U_Immunity/cytoskeleton                          RPS6,RPS19 0.66488512
## 6         Myeloid lineage 2                        RNF157,MTUS1 0.66488512

PCA Plots

You can easily get the PCA plots of the samples:

pca_plot(results, contrasts, show.names = F)

You can overlay different information onto PCA plots as well!

# Overlaid information
overlaid.info <- c("B6-18mo", "B6-18mo", "B6-18mo", "B6-18mo", "B6-18mo", 
                   "NZO-18mo", "NZO-18mo", "NZO-18mo", "NZO-18mo", "NZO-18mo", "NZO-18mo", 
                   "B6-3mo", "B6-3mo", "B6-3mo", "B6-3mo", "B6-3mo", 
                   "NZO-3mo", "NZO-3mo", "NZO-3mo", "NZO-3mo", "NZO-3mo", "NZO-3mo")
# Sample IDs
sample.names <- c("S01783", "S01780", "S01781", "S01778", "S01779", 
"S03804", "S03805", "S03806", "S03807", "S03808", 
"S03809", "S04678", "S04679", "S04680", "S04681", 
"S04682", "S10918", "S10916", "S10919", "S10921", 
"S10917", "S10920")
pca_plot(results, overlaid.info, sample.names)
## Warning: ggrepel: 7 unlabeled data points (too many overlaps). Consider
## increasing max.overlaps

Heatmaps

Differential peaks

You can see the available comparisons using:

show_comparisons(results)
## [1] "B6_NZO"

Then, plot the differential peaks for a selected contrast using:

heatmap_differential(results, comparison = "B6_NZO")

Also, you can configure your heatmaps using the additional arguments of pheatmap function. For more information check out ?pheatmap.

heatmap_differential(results, comparison = "B6_NZO", show_colnames = FALSE)