SherlockHolmes: Building a Concordance of Terms in a Series of Texts

Compute the frequency distribution of a search term in a series of texts. For example, Arthur Conan Doyle wrote a total of 60 Sherlock Holmes stories, comprising 56 short stories and 4 longer novels. I wanted to test my own subjective impression that, in many of the stories, Sherlock Holmes' popularity was used as bait to induce the reader to read a story that is not primarily a Sherlock Holmes story. I used the term "Holmes" as a search pattern, since Watson frequently addresses him by name or uses his name to describe something he is doing. My hypothesis is that the frequency distribution of the search pattern "Holmes" is a good proxy for the degree to which a story is truly a Sherlock Holmes story. The results are presented in a manuscript that is available as a vignette and online at <>.
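
The idea of counting a search term across a series of texts can be illustrated with a minimal base R sketch. It does not use or reproduce this package's own functions: the toy corpus and the helper names `count_term` and `count_words` are hypothetical, chosen only for illustration, and the per-1000-words rate is one assumed way to make stories of different lengths comparable.

```r
# Hypothetical toy corpus; a real analysis would use the full story texts.
texts <- c(
  story_a = "Holmes smiled. 'You see, Watson,' said Holmes, 'it is simple.'",
  story_b = "The colonel rode across the moor for many hours before Holmes appeared."
)

# Count non-overlapping literal matches of `term` in each text (base R only).
count_term <- function(texts, term) {
  matches <- gregexpr(term, texts, fixed = TRUE)
  # gregexpr() returns -1 when there is no match, so count positive positions.
  counts <- vapply(matches, function(m) sum(m > 0), integer(1))
  names(counts) <- names(texts)
  counts
}

# Count words in each text, splitting on runs of whitespace.
count_words <- function(texts) {
  lengths(strsplit(trimws(texts), "\\s+"))
}

hits <- count_term(texts, "Holmes")
words <- count_words(texts)

# Occurrences per 1000 words, so that long and short texts are comparable.
rate_per_1000 <- 1000 * hits / words

print(data.frame(hits = hits, words = words,
                 rate_per_1000 = round(rate_per_1000, 1)))
```

A story in which the rate is unusually low would, under the hypothesis above, be a candidate for one that is not primarily about Holmes.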

Version: 1.0.1
Depends: R (≥ 4.2.0)
Imports: qpdf, stringr, dpseg, tableHTML, plotrix, zoo, stargazer, utils, graphics, grDevices, stats, textBoxPlacement, plot.matrix, devtools
Suggests: knitr, rmarkdown, testthat (≥ 3.0.0)
Published: 2023-03-28
Author: Barry Zeeberg [aut, cre]
Maintainer: Barry Zeeberg <barryz2013 at>
License: GPL-2 | GPL-3 [expanded from: GPL (≥ 2)]
NeedsCompilation: no
CRAN checks: SherlockHolmes results


Reference manual: SherlockHolmes.pdf
Vignettes: SherlockHolmes Part I, SherlockHolmes Part II


Package source: SherlockHolmes_1.0.1.tar.gz
Windows binaries: r-devel:, r-release:, r-oldrel:
macOS binaries: r-release (arm64): SherlockHolmes_1.0.1.tgz, r-oldrel (arm64): SherlockHolmes_1.0.1.tgz, r-release (x86_64): SherlockHolmes_1.0.1.tgz, r-oldrel (x86_64): SherlockHolmes_1.0.1.tgz
Old sources: SherlockHolmes archive

