udpipe: Tokenization, Parts of Speech Tagging, Lemmatization and Dependency Parsing with the 'UDPipe' 'NLP' Toolkit

This natural language processing toolkit provides language-agnostic 'tokenization', 'parts of speech tagging', 'lemmatization' and 'dependency parsing' of raw text. Besides annotating text, the package also lets you train annotation models on 'treebanks' in 'CoNLL-U' format, as described at <http://universaldependencies.org/format.html>. The techniques are explained in detail in the paper 'Tokenizing, POS Tagging, Lemmatizing and Parsing UD 2.0 with UDPipe', available at <doi:10.18653/v1/K17-3009>.
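
To make the 'CoNLL-U' format mentioned above more concrete, here is a minimal sketch in base R that reads a small hand-written fragment into a data frame with one row per token. It does not use the package's own functions. The helper name `parse_conllu` and the two example sentences are assumptions made up for illustration. The ten column names follow the Universal Dependencies format description, and the sketch ignores multiword-token ranges and other edge cases.

```r
# A two-sentence CoNLL-U fragment, written by hand for illustration only.
# Each token line has ten tab-separated fields; "#" lines are comments and
# a blank line ends a sentence.
conllu <- c(
  "# sent_id = 1",
  "# text = Dogs bark.",
  "1\tDogs\tdog\tNOUN\t_\tNumber=Plur\t2\tnsubj\t_\t_",
  "2\tbark\tbark\tVERB\t_\t_\t0\troot\t_\tSpaceAfter=No",
  "3\t.\t.\tPUNCT\t_\t_\t2\tpunct\t_\t_",
  "",
  "# sent_id = 2",
  "# text = Cats sleep.",
  "1\tCats\tcat\tNOUN\t_\tNumber=Plur\t2\tnsubj\t_\t_",
  "2\tsleep\tsleep\tVERB\t_\t_\t0\troot\t_\tSpaceAfter=No",
  "3\t.\t.\tPUNCT\t_\t_\t2\tpunct\t_\t_"
)

# Hypothetical helper (not part of the udpipe package): turn CoNLL-U lines
# into a data frame of tokens.
parse_conllu <- function(lines) {
  cols <- c("id", "form", "lemma", "upos", "xpos",
            "feats", "head", "deprel", "deps", "misc")

  # Number sentences by counting the blank separator lines seen so far.
  sentence_id <- cumsum(lines == "") + 1L

  # Keep token lines only: drop blank lines and "#" comment lines.
  keep <- lines != "" & !startsWith(lines, "#")

  fields <- strsplit(lines[keep], "\t", fixed = TRUE)
  stopifnot(all(lengths(fields) == length(cols)))

  tokens <- as.data.frame(do.call(rbind, fields), stringsAsFactors = FALSE)
  names(tokens) <- cols
  cbind(sentence_id = sentence_id[keep], tokens, stringsAsFactors = FALSE)
}

tokens <- parse_conllu(conllu)
print(tokens[, c("sentence_id", "form", "lemma", "upos", "head", "deprel")])
```

In this layout `form` holds the tokenization, `upos` the part-of-speech tag, `lemma` the lemmatization, and `head` together with `deprel` the dependency parse, which are the four annotation layers named in the description.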

Version: 0.1.1
Depends: R (≥ 2.10)
Imports: Rcpp (≥ 0.11.5), data.table (≥ 1.9.6)
LinkingTo: Rcpp
Suggests: knitr
Published: 2017-09-13
Author: Jan Wijffels [aut, cre, cph], BNOSAC [cph], Institute of Formal and Applied Linguistics, Faculty of Mathematics and Physics, Charles University in Prague, Czech Republic [cph], Milan Straka [cph], Jana Straková [cph]
Maintainer: Jan Wijffels <jwijffels at bnosac.be>
License: MPL-2.0
URL: https://github.com/bnosac/udpipe
NeedsCompilation: yes
SystemRequirements: C++11
Materials: README NEWS
CRAN checks: udpipe results

Downloads:

Reference manual: udpipe.pdf
Vignettes: UDPipe Natural Language Processing - Annotating text, UDPipe Natural Language Processing - Model Building
Package source: udpipe_0.1.1.tar.gz
Windows binaries: r-devel: udpipe_0.1.1.zip, r-release: udpipe_0.1.1.zip, r-oldrel: udpipe_0.1.1.zip
OS X El Capitan binaries: r-release: udpipe_0.1.1.tgz
OS X Mavericks binaries: r-oldrel: udpipe_0.1.1.tgz
Old sources: udpipe archive

Linking:

Please use the canonical form https://CRAN.R-project.org/package=udpipe to link to this page.