Added a section “ID classification” in the documentation for exported data catalog.row.order.
New argument suppress.discarded.variants.warnings in exported function AnnotateIDVCF with default value TRUE.
Added information on another paper to AddRunInformation: “Characterization of colibactin-associated mutational signature in an Asian oral squamous cell carcinoma and in other mucosal tumor types”, Genome Research 2020, https://doi.org/10.1101/gr.255620.119.
Changed the format of DOIs in DESCRIPTION according to CRAN policy.
Changed back the return value of ReadStrelkaIDVCFs, ReadStrelkaSBSVCFs, ReadMutectVCFs to a list of data frames with no variants discarded.
Combined all the discarded variants from ReadAndSplitMutectVCFs and ReadAndSplitStrelkaSBSVCFs under one element discarded.variants in the return value. An extra column discarded.reason was added to show the details.
Updated internal functions ReadVCF and ReadVCFs not to remove any discarded variants.
No more removal of “chr” in the CHROM column when reading in VCFs.
CheckAndReturnSBSMatrix, CheckAndReturnDBSMatrix, CreateOneColSBSMatrix, CreateOneColDBSMatrix, VCFsToSBSCatalogs, VCFsToDBSCatalogs.
CalculateExpressionLevel for the edge case.
CreateOneColIDMatrix when the ID.class contains a non-canonical representation of the ID mutation type.
The return value of exported function ReadStrelkaIDVCFs now sometimes contains a new element, discarded.variants. This appears when there are variants that were discarded immediately after reading in the VCFs. At present these are variants that have duplicated chromosome/positions and variants that have illegal chromosome names. This means that the user must check the return to see if discarded.variants is present and remove it before passing the return to a function that expects a list of VCFs. Code in ICAMS that takes lists of VCFs already checks for this element and removes it if present.
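The check described in the ReadStrelkaIDVCFs entry above can be sketched in base R. This is only an illustration, not ICAMS code: the list vcfs is a hand-made stand-in for the function's return value (the real structure of discarded.variants may differ), and SplitOffDiscarded is a hypothetical helper name.

```r
# Hand-made stand-in for a return value that carries the optional
# discarded.variants element alongside the per-sample VCF data frames.
vcfs <- list(
  sample1 = data.frame(CHROM = "1", POS = 100L, REF = "A", ALT = "AT"),
  sample2 = data.frame(CHROM = "2", POS = 200L, REF = "GC", ALT = "G"),
  discarded.variants = data.frame(CHROM = "1", POS = 100L, REF = "A", ALT = "AG")
)

# Hypothetical helper (not part of ICAMS): separate the discarded variants
# from the list of VCFs, whether or not the element is present.
SplitOffDiscarded <- function(x) {
  discarded <- x[["discarded.variants"]]  # NULL when the element is absent
  x[["discarded.variants"]] <- NULL       # no-op when the element is absent
  list(vcfs = x, discarded.variants = discarded)
}

res <- SplitOffDiscarded(vcfs)
names(res$vcfs)  # only the per-sample data frames remain

# The same call is safe on a list that has no discarded.variants element.
clean <- SplitOffDiscarded(vcfs[c("sample1", "sample2")])
is.null(clean$discarded.variants)
```

After the split, res$vcfs can be passed to code that expects a plain list of VCFs, while res$discarded.variants stays available for inspection.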
Added argument return.annotated.vcfs to exported function VCFsToIDCatalogs. The default value for the argument is FALSE to be consistent with other functions.
Argument return.annotated.vcfs in functions VCFsToSBSCatalogs, VCFsToDBSCatalogs, VCFsToIDCatalogs, MutectVCFFilesToCatalog, MutectVCFFilesToCatalogAndPlotToPdf, MutectVCFFilesToZipFile, StrelkaSBSVCFFilesToCatalog, StrelkaSBSVCFFilesToCatalogAndPlotToPdf, StrelkaSBSVCFFilesToZipFile, StrelkaIDVCFFilesToCatalog, StrelkaIDVCFFilesToCatalogAndPlotToPdf and StrelkaIDVCFFilesToZipFile.
Argument suppress.discarded.variants.warnings in functions ReadAndSplitMutectVCFs, ReadAndSplitStrelkaSBSVCFs, VCFsToSBSCatalogs, VCFsToDBSCatalogs, VCFsToIDCatalogs, MutectVCFFilesToCatalog, MutectVCFFilesToCatalogAndPlotToPdf, MutectVCFFilesToZipFile, StrelkaSBSVCFFilesToCatalog, StrelkaSBSVCFFilesToCatalogAndPlotToPdf, StrelkaSBSVCFFilesToZipFile, StrelkaIDVCFFilesToCatalog, StrelkaIDVCFFilesToCatalogAndPlotToPdf and StrelkaIDVCFFilesToZipFile.
Added documentation to exported functions ReadAndSplitStrelkaSBSVCFs, StrelkaSBSVCFFilesToCatalog, StrelkaSBSVCFFilesToCatalogAndPlotToPdf and StrelkaSBSVCFFilesToZipFile.
Added information on the “ID classification” in documentation of functions generating ID catalogs, FindDelMH and FindMaxRepeatDel.
Minor changes to documentation of functions PlotCatalog, PlotCatalogToPdf, StrelkaSBSVCFFilesToZipFile, StrelkaIDVCFFilesToZipFile and MutectVCFFilesToZipFile.
Updated documentation for the return value of functions StrelkaIDVCFFilesToCatalog, StrelkaIDVCFFilesToCatalogAndPlotToPdf, StrelkaIDVCFFilesToZipFile and VCFsToIDCatalogs to make it clearer to the user.
Added new exported data of catalog row order for SBS96, SBS1536 and DBS78 in SigProfiler format to catalog.row.order.sp.
New internal functions ConvertICAMSCatalogToSigProSBS96, ReadVCF and ReadVCFs.
New exported function GetFreebayesVAF for calculating variant allele frequencies from Freebayes VCF.
New test data for Strelka mixed VCF.
Added time zone information to file “run-information.txt” when calling functions MutectVCFFilesToZipFile, StrelkaSBSVCFFilesToZipFile and StrelkaIDVCFFilesToZipFile.
Enabled “counts” -> “counts.signature” catalog transformation when the source catalog has NULL abundance.
Added legend for SBS192 plot and changed the legend text for SBS12 plot.
Added a second element plot.object to the return list from function PlotCatalog for catalog types “SBS192Catalog”, “DBS78Catalog”, “DBS144Catalog” and “IndelCatalog”. The second element is a numeric vector giving the coordinates of the bar midpoints, useful for adding to the graph.
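Base R's barplot() also returns bar midpoints, so it can illustrate how such coordinates are used to add to a graph. The sketch below uses barplot() only as a stand-in: it does not call PlotCatalog, and the counts are made up.

```r
# Made-up counts; barplot() stands in for PlotCatalog here.
counts <- c(A = 3, B = 1, C = 2)

pdf(NULL)  # draw on a null PDF device so that no file is created
mids <- barplot(counts, ylim = c(0, 4))
# Use the returned midpoints to place a label above each bar.
text(x = mids, y = counts + 0.3, labels = counts)
invisible(dev.off())
```

The same pattern would apply to the midpoints in plot.object: draw the catalog first, then pass the coordinates as x positions to text(), points() or segments().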
Made the returns from PlotCatalog and PlotCatalogToPdf invisible.
Improved time performance of GetMutectVAF, CanonicalizeDBS, CanonicalizeQUAD.
if statements in GetCustomKmerCounts, GetStrandedKmerCounts and GetGenomeKmerCounts.
CreateOneColIDMatrix when there is an NA ID category.
GetMutectVAF to check if the VCF is indeed a Mutect VCF.
CreateOneColDBSMatrix when the VCF does not have any variant in the transcribed region.
CalculatePValues when there is only a single expression value.
Created an internal function MakeDataFrameFromVCF to read in data lines of a VCF.
New argument name.of.VCF in internal function CheckAndFixChrNames to make the error message more informative.
New argument name.of.VCF in exported function AnnotateIDVCF to make the error message more informative.
ReadStrelkaIDVCF to make the error message more informative.
AnnotateIDVCF to a list. The first element annotated.vcf contains the annotated VCF. If there are rows that are discarded, the function will generate a warning and a second element discarded.variants will be included in the returned list.
flag.mismatches deprecated in exported function AnnotateIDVCF. If there are mismatches to references, the function will automatically discard these rows. Users can refer to the element discarded.variants in the return value for the discarded variants.
SplitStrelkaSBSVCF when there are no non.SBS mutations in the input.
MakeDataFrameFromMutectVCF when a Mutect VCF has no meta-information lines.
CreateOneColSBSMatrix when an annotated SBS VCF has variants on transcribed regions that all fall on transcripts on both strands.
CreateOneColDBSMatrix when an annotated DBS VCF has variants on transcribed regions that all fall on transcripts on both strands.
ReadAndSplitStrelkaSBSVCFs.
MutectVCFFilesToZipFile, StrelkaSBSVCFFilesToZipFile and StrelkaIDVCFFilesToZipFile.
trans.ranges to make it optional.
name.of.VCF in internal functions ReadStrelkaSBSVCF, ReadStrelkaIDVCF and exported function GetStrelkaVAF.
flag.mismatches in functions VCFsToIDCatalogs, MutectVCFFilesToCatalog, MutectVCFFilesToCatalogAndPlotToPdf, MutectVCFFilesToZipFile, StrelkaIDVCFFilesToCatalog, StrelkaIDVCFFilesToCatalogAndPlotToPdf and StrelkaIDVCFFilesToZipFile.
GetStrelkaVAF and GetMutectVAF to a data frame which contains the VAF and read depth information.
PlotCatalogToPdf a list. The first element is a logical value indicating whether the plot is successful. The second element is a list containing the strand bias statistics (only for SBS192Catalog with “counts” catalog.type and non-NULL abundance and argument plot.SBS12 = TRUE).
PlotCatalog and PlotCatalogToPdf: For class SBS96Catalog: (New) Allow setting ylim and cex. (New) For PlotCatalog (not PlotCatalogToPdf), allow plotting of a 96 x 2 catalog, in which case behavior is a stacked bar chart. (New) Plot x axis tick marks if xlabels is not TRUE; set par(tck = 0) to suppress. For class IndelCatalog: (New) Allow setting ylim.
GetCustomKmerCounts.
PlotTransBiasGeneExpToPdf so that ymax on the plot will be changed based on plot.type.
flat.abundance from “numeric” to “integer”.
TransformCatalog; see documentation for rationale.
TransformCatalog and updated its documentation for parameter target.abundance.
CheckAndFixChrNames and updated the automated tests.
TransformCatalog.
GetMutectVAF and updated the warning message to make it more informative.
cbind to check the attributes of the incoming catalogs and assign attributes accordingly.
TransformCatalog to check the attributes of the catalog to be transformed in the first place.
AnnotateSBSVCF, AnnotateDBSVCF and AnnotateIDVCF.
PlotTransBiasGeneExp and PlotTransBiasGeneExpToPdf.
names.of.VCFs in functions ReadAndSplitMutectVCFs, ReadAndSplitStrelkaSBSVCFs, ReadStrelkaIDVCFs, MutectVCFFilesToCatalog, MutectVCFFilesToCatalogAndPlotToPdf, StrelkaIDVCFFilesToCatalog, StrelkaIDVCFFilesToCatalogAndPlotToPdf, StrelkaSBSVCFFilesToCatalog and StrelkaSBSVCFFilesToCatalogAndPlotToPdf for users to specify the names of samples in the VCF files.
as.catalog.
gene.expression.data.HepG2 and gene.expression.data.MCF10A.
tumor.col.names in functions ReadAndSplitMutectVCFs, MutectVCFFilesToCatalog and MutectVCFFilesToCatalogAndPlotToPdf to specify the column of the VCF that contains sequencing statistics such as sequencing depth; this column is often called “unknown” in Mutect.
MutectVCFFilesToCatalog, MutectVCFFilesToCatalogAndPlotToPdf, StrelkaSBSVCFFilesToCatalog, StrelkaSBSVCFFilesToCatalogAndPlotToPdf, VCFsToSBSCatalogs, VCFsToDBSCatalogs, ReadCatalog informing the user how to change attributes of the generated catalog.
VCFsToIDCatalogs, StrelkaIDVCFFilesToCatalog and StrelkaIDVCFFilesToCatalogAndPlotToPdf a list; 1st element is the spectrum catalog (previously the only return); 2nd element is a list of VCFs with additional annotations.
PlotCatalog a list. The first element is a logical value indicating whether the plot is successful. The second element is a numeric vector giving the coordinates of all the bar midpoints drawn, useful for adding to the graph (only implemented for SBS96Catalog).
output.file argument in MutectVCFFilesToCatalogAndPlotToPdf, StrelkaSBSVCFFilesToCatalogAndPlotToPdf, and StrelkaIDVCFFilesToCatalogAndPlotToPdf so that an indicator of the catalog type plus “.pdf” is simply appended to the base output.file name. Also made this argument optional with sensible default behavior.
trans.ranges.GRCh37, trans.ranges.GRCh38 and trans.ranges.GRCm38.
FindDelMH, cryptic repeats (i.e. un-normalized deletions in a repeat such as GAGG deleted from CCCAGGGAGGGTCCC, which should be normalized to a deletion of AGGG) are now ignored with a warning rather than causing a stop.
FindDelMH, which previously did not flag the cryptic repeat in what is now the second example in the function documentation.
as.catalog supports creation of the catalog from a vector (interpreted as a 1-column matrix) and optionally infers the class from the number of rows in the input.
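
The cryptic-repeat example in the FindDelMH entry above (GAGG deleted from CCCAGGGAGGGTCCC, normalized to a deletion of AGGG) can be checked with a generic left-alignment sketch. This is not how FindDelMH works internally; per the entry, FindDelMH only ignores such deletions with a warning. LeftAlignDeletion and ApplyDeletion are hypothetical helpers, and positions are 1-based.

```r
# Hypothetical helper: remove `width` bases starting at `start`.
ApplyDeletion <- function(context, start, width) {
  paste0(substr(context, 1, start - 1),
         substr(context, start + width, nchar(context)))
}

# Hypothetical helper: shift a deletion left for as long as the base just
# before it equals the last deleted base, which leaves the result unchanged.
LeftAlignDeletion <- function(context, start, width) {
  while (start > 1 &&
         substr(context, start - 1, start - 1) ==
         substr(context, start + width - 1, start + width - 1)) {
    start <- start - 1
  }
  list(start = start, deleted = substr(context, start, start + width - 1))
}

context <- "CCCAGGGAGGGTCCC"
raw.start <- 7  # GAGG occupies positions 7 to 10
norm <- LeftAlignDeletion(context, raw.start, 4)
norm$deleted  # "AGGG"

# Both representations produce the same sequence after the deletion.
ApplyDeletion(context, raw.start, 4) == ApplyDeletion(context, norm$start, 4)
```

Because the two deletions yield identical sequences, only the left-aligned form has a well-defined repeat count and microhomology, which is why the un-normalized form needs special handling.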