expands: Expanding Ploidy and Allele-Frequency on Nested Subpopulations
Expanding Ploidy and Allele Frequency on Nested Subpopulations (expands) characterizes coexisting subpopulations in a single tumor sample using copy number and allele frequencies derived from exome or whole-genome sequencing input data (http://www.ncbi.nlm.nih.gov/pubmed/24177718). The model detects coexisting genotypes by leveraging run-specific tradeoffs between depth of coverage and breadth of coverage. This package predicts the number of clonal expansions, the size of the resulting subpopulations in the tumor bulk, the mutations specific to each subpopulation, and tumor purity. The main function runExPANdS() provides the complete functionality needed to predict coexisting subpopulations from single nucleotide variations (SNVs) and associated copy numbers. The robustness of subpopulation predictions increases with the number of mutations provided; at least 200 mutations are recommended as input to obtain stable results.
Updates in version 1.6 include: (1) Previously, each mutation was assigned to at most one subpopulation. However, a mutation may not be exclusive to its assigned subpopulation and may also be present in smaller, descendant subpopulations. Whether this is the case is now decided by leveraging the predicted phylogenetic structure of the subpopulation composition. (2) Homozygous deletion is now included as a potential scenario when modeling (SNV, CNV) pairs with overlapping genomic locations that are propagated during distinct clonal expansions. (3) The solution was optimized to improve sensitivity at the margins of the cell-frequency distribution. This was necessary because subpopulation detection sensitivity correlates with the centrality of the subpopulation size during clustering. Tolerance of copy number and allele frequency measurement errors must therefore be higher for marginal cell frequencies than for central ones, to counteract the reduced cluster detection sensitivity at the distribution margins. This applies only during subpopulation detection (SNV clustering); a cell-frequency-independent error tolerance still applies during SNV assignment. (4) Fixed a bug where incorrect data matrix conversion could occur when passing a non-numerical matrix as a parameter to runExPANdS().
Further documentation and an FAQ for this package are available at http://dna-discovery.stanford.edu/software/expands.
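The two input-related points above (the recommended minimum of 200 mutations, and the conversion problem with non-numerical matrices) can be checked before a run. The following is a minimal, language-agnostic sketch in Python and is not part of the expands package: the function names and the three-column row layout are hypothetical and chosen only for illustration.

```python
MIN_RECOMMENDED_MUTATIONS = 200  # threshold quoted in the package description


def to_numeric_matrix(rows):
    """Convert every cell to float, failing loudly on non-numeric values."""
    matrix = []
    for i, row in enumerate(rows):
        try:
            matrix.append([float(cell) for cell in row])
        except (TypeError, ValueError) as err:
            raise ValueError(
                f"row {i} contains a non-numeric value: {row!r}"
            ) from err
    return matrix


def has_enough_mutations(matrix, minimum=MIN_RECOMMENDED_MUTATIONS):
    """Return True when the table meets the recommended mutation count."""
    return len(matrix) >= minimum


# Hypothetical input: one row per SNV, read as text from a file.
# Assumed columns (illustrative only): chromosome, position, allele frequency.
rows = [["1", "1000", "0.25"], ["2", "2500", "0.40"]]
matrix = to_numeric_matrix(rows)

if not has_enough_mutations(matrix):
    print(f"only {len(matrix)} mutations; predictions may be unstable")
```

Converting explicitly and raising on the first bad cell keeps a text-valued table from being silently coerced. Under this sketch a non-numeric chromosome label such as "X" is rejected and would need to be recoded first.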