stringdist: Approximate String Matching and String Distance Functions

Implements an approximate string matching version of R's native 'match' function. Can calculate various string distances based on edits (Damerau-Levenshtein, Hamming, Levenshtein, optimal string alignment), q-grams (q-gram, cosine, Jaccard distance) or heuristic metrics (Jaro, Jaro-Winkler). An implementation of soundex is provided as well. Distances can be computed between character vectors, taking proper care of encoding, or between integer vectors representing generic sequences. This package is built for speed and runs in parallel by using 'OpenMP'. An API for C or C++ is exposed as well.
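
To make the idea of edit-based approximate matching concrete, the following is a minimal Python sketch, not the package's own code or API. The function names `levenshtein` and `approx_match` and the parameter `max_dist` are hypothetical, chosen only for illustration. The sketch computes the Levenshtein distance between two sequences (strings or integer lists) and uses it to look up the closest table entry within a distance threshold, loosely mirroring what an approximate version of 'match' does. Unlike R, it returns 0-based positions.

```python
from typing import Optional, Sequence


def levenshtein(a: Sequence, b: Sequence) -> int:
    """Edit distance counting insertions, deletions, and substitutions."""
    # prev[j] is the distance between the prefix of a seen so far and b[:j].
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to the empty prefix of b
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1,          # delete x from a
                            curr[j - 1] + 1,      # insert y into a
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def approx_match(x: str, table: Sequence[str], max_dist: int = 1) -> Optional[int]:
    """Hypothetical helper: 0-based index of the closest entry in `table`
    whose distance to `x` is at most `max_dist`, or None if there is none.
    Ties are resolved in favour of the earliest entry."""
    best_index, best_dist = None, max_dist + 1
    for k, candidate in enumerate(table):
        d = levenshtein(x, candidate)
        if d < best_dist:
            best_index, best_dist = k, d
    return best_index
```

For example, `approx_match("leia", ["uhura", "leela"], max_dist=2)` selects the second entry, because "leia" is two edits away from "leela", while the same call with `max_dist=1` finds no match. Because `levenshtein` only compares elements for equality, it also accepts integer lists, in the spirit of the generic-sequence support described above.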

Depends: R (≥ 2.15.3)
Imports: parallel
Suggests: testthat
Published: 2018-06-08
Author: Mark van der Loo [aut, cre], Jan van der Laan [ctb], R Core Team [ctb], Nick Logan [ctb], Chris Muir [ctb]
Maintainer: Mark van der Loo <mark.vanderloo at>
License: GPL-3
NeedsCompilation: yes
Citation: stringdist citation info
Materials: NEWS
In views: NaturalLanguageProcessing, OfficialStatistics
CRAN checks: stringdist results


Reference manual: stringdist.pdf
Package source: stringdist_0.9.5.1.tar.gz
Windows binaries: r-devel:, r-release:, r-oldrel:
OS X binaries: r-release: stringdist_0.9.5.1.tgz, r-oldrel: stringdist_0.9.4.7.tgz
Old sources: stringdist archive

Reverse dependencies:

Reverse depends: AurieLSHGaussian, blink, vwr
Reverse imports: available, bcRep, bdlp, bibliometrix, deductive, diffrprojects, fastLink, fcuk, flora, fuzzyjoin, genBaRcode, GetLattesData, LexisNexisTools, lime, lingtypology, lintr, PGRdup, qdap, rabi, reclin, refinr, revtools, SentimentAnalysis, sjmisc, taxlist, tcR, tidystringdist, TSTr, utilsIPEA
Reverse linking to: refinr
Reverse suggests: googleLanguageR, rlist, spew

