textTinyR: Text Processing for Small or Big Data Files

Processes large text data files efficiently in batches. For this purpose, it offers functions for splitting, parsing, and tokenizing text and for creating a vocabulary. It also includes functions for building either a document-term matrix or a term-document matrix and for extracting information from them (term associations, most frequent terms). Finally, it provides functions for calculating token statistics (collocations, look-up tables, string dissimilarities) and for working with sparse matrices. The source code is written in 'C++11' and exposed to R through the 'Rcpp', 'RcppArmadillo' and 'BH' packages.
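The batch-wise workflow described above (read in chunks, tokenize, grow a vocabulary, accumulate sparse document-term counts) can be illustrated with a minimal sketch. It is written in plain Python and does not use or reproduce the textTinyR API; the names `iter_batches`, `tokenize` and `build_document_term_counts` are hypothetical, and the tokenizer is a deliberately simple assumption (lowercasing, whitespace splitting, stripping common punctuation).

```python
from collections import Counter
from itertools import islice
import io

PUNCT = ".,;:!?()\"'"

def iter_batches(lines, batch_size):
    """Yield lists of at most batch_size lines from any iterable of lines."""
    it = iter(lines)
    while True:
        batch = list(islice(it, batch_size))
        if not batch:
            return
        yield batch

def tokenize(line):
    """Hypothetical tokenizer: lowercase, split on whitespace, strip punctuation."""
    tokens = (t.strip(PUNCT) for t in line.lower().split())
    return [t for t in tokens if t]

def build_document_term_counts(lines, batch_size=2):
    """Treat each line as one document and return (vocab, rows).

    vocab maps term -> column index; rows holds one sparse row per
    document as a Counter {column index: count}.
    """
    vocab = {}
    rows = []
    for batch in iter_batches(lines, batch_size):
        # Only one batch of lines is held in memory at a time.
        for line in batch:
            row = Counter()
            for tok in tokenize(line):
                col = vocab.setdefault(tok, len(vocab))
                row[col] += 1
            rows.append(row)
    return vocab, rows

# A small in-memory "file" stands in for a large text file.
docs = io.StringIO("The cat sat.\nThe dog sat on the cat.\nA bird.\n")
vocab, rows = build_document_term_counts(docs, batch_size=2)
```

Storing each row as a mapping from column index to count keeps the matrix sparse, which is the reason a document-term matrix over a large vocabulary stays manageable in memory.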

Version: 1.0.3
Depends: R (≥ 3.2.3), Matrix
Imports: Rcpp (≥ 0.12.5), R6, data.table, utils
LinkingTo: Rcpp, RcppArmadillo (≥ 0.7.5), BH
Suggests: testthat, covr, knitr, rmarkdown
Published: 2017-01-29
Author: Lampros Mouselimis
Maintainer: Lampros Mouselimis <mouselimislampros at gmail.com>
BugReports: https://github.com/mlampros/textTinyR/issues
License: GPL-3
Copyright: inst/COPYRIGHTS (textTinyR copyright details)
URL: https://github.com/mlampros/textTinyR
NeedsCompilation: yes
SystemRequirements: The package requires the following two components: a C++11 compiler and, on a Unix OS, the boost-locale headers and libraries (boost >= 1.55.0, www.boost.org). Debian/Ubuntu: libboost-locale-dev; Fedora: yum install boost-devel; OS X/brew: detailed installation instructions can be found in the README file.
Materials: README, NEWS
CRAN checks: textTinyR results

Downloads:

Reference manual: textTinyR.pdf
Vignettes: Functionality of the textTinyR package
Package source: textTinyR_1.0.3.tar.gz
Windows binaries: r-devel: textTinyR_1.0.3.zip, r-release: textTinyR_1.0.3.zip, r-oldrel: not available
OS X Mavericks binaries: r-release: not available, r-oldrel: not available
Old sources: textTinyR archive

Linking:

Please use the canonical form https://CRAN.R-project.org/package=textTinyR to link to this page.