taxize

taxize lets users search many taxonomic data sources for species names (scientific and common) and download upstream and downstream taxonomic hierarchy information, among other things.

The taxize tutorial can be found at http://ropensci.org/tutorials/taxize_tutorial.html

Contributors

The functions in the package that hit a specific API have a prefix and suffix separated by an underscore. They follow the format of service_whatitdoes. For example, gnr_resolve uses the Global Names Resolver API to resolve species names. General functions in the package that don't hit a specific API don't have two words separated by an underscore, e.g., classification.
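As a rough illustration of this naming convention, the base R sketch below splits a function name at its first underscore into a service prefix and an action. The helper split_fn_name is hypothetical and not part of taxize; the function names passed to it are taken from this README.

```r
# Hypothetical helper (not part of taxize): split a function name of the
# form service_whatitdoes into its service prefix and its action.
split_fn_name <- function(fn) {
  pos <- regexpr("_", fn, fixed = TRUE)
  if (pos == -1) {
    # No underscore: a general function that is not tied to one API
    return(list(service = NA_character_, action = fn))
  }
  list(service = substr(fn, 1, pos - 1),
       action  = substr(fn, pos + 1, nchar(fn)))
}

split_fn_name("gnr_resolve")     # service "gnr", action "resolve"
split_fn_name("tpl_families")    # service "tpl", action "families"
split_fn_name("classification")  # service NA, action "classification"
```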

You need API keys for Encyclopedia of Life (EOL), the Universal Biological Indexer and Organizer (uBio), Tropicos, and Plantminer.
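One common way to make keys available is to set them once per session with options(), for example from your .Rprofile. The option names below (eolApiKey, ubioApiKey, tropicosApiKey, pmApiKey) are assumptions based on older taxize releases, and the key values are placeholders; check the documentation of your installed version for the names it actually reads.

```r
# Assumed option names; replace the placeholder strings with your own keys.
options(
  eolApiKey      = "your-eol-key",
  ubioApiKey     = "your-ubio-key",
  tropicosApiKey = "your-tropicos-key",
  pmApiKey       = "your-plantminer-key"
)

# Confirm that a key is set before calling a function that needs it
getOption("tropicosApiKey")
```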

The following are URLs for API documentation, where to get API keys, and the prefix each data source has in function names.

SOAP

Note that a few data sources require SOAP web services, which are difficult to support. Thus, data sources that require SOAP web services are included in a full version of the package, but are only available by installing from GitHub (see installation notes below). All the remaining data sources are available on the master branch and on CRAN. See the column "soap branch only" in the table below.

Currently implemented in taxize

Source | Function prefix | API Docs | API key | soap branch only
--- | --- | --- | --- | ---
Encyclopedia of Life | eol | link | link | false
Taxonomic Name Resolution Service | tnrs | link | none | false
Integrated Taxonomic Information System | itis | link | none | false
Phylomatic | phylomatic | link | none | false
uBio | ubio | link | link | false
Global Names Resolver | gnr | link | none | false
Global Names Index | gni | link | none | false
IUCN Red List | iucn | link | none | false
Tropicos | tp | link | link | false
Plantminer | plantminer | link | link | false
Theplantlist dot org | tpl | ** | none | false
Catalogue of Life | col | link | none | false
Global Invasive Species Database | gisd | *** | none | false
National Center for Biotechnology Information | ncbi | none | none | false
CANADENSYS Vascan name search API | vascan | link | none | false
International Plant Names Index (IPNI) | ipni | link | none | false
World Register of Marine Species (WoRMS) | worms | link | none | true
Barcode of Life Data Systems (BOLD) | bold | link | none | false
Pan-European Species directories Infrastructure (PESI) | pesi | link | none | true
Mycobank | myco | link | none | true

**: There are no API docs! We suggest using the TPL and TPLck functions in the Taxonstand package. We provide two functions to get bulk data: tpl_families and tpl_get.

***: There are no API docs! The function scrapes the web directly.

May be in taxize in the future...

Quickstart

For more examples click here

Install taxize

Stable version from CRAN:

install.packages("taxize")
library('taxize')

Development version from GitHub:

You'll need the devtools package. This installs from the master branch.

install.packages("devtools")
devtools::install_github("ropensci/taxize")
library('taxize')

Version with SOAP data sources

You'll need the devtools package, plus the XMLSchema and SSOAP packages. The canonical source of XMLSchema and SSOAP is here, but they were cloned to this other GitHub location (github.com/sckott) to ease installation.

install.packages("devtools")
devtools::install_github(c("sckott/XMLSchema", "sckott/SSOAP"))

Install from the soap branch

devtools::install_github("ropensci/taxize", ref = "soap")
library('taxize')

Get unique taxonomic identifier from NCBI

uids <- get_uid(c("Chironomus riparius", "Chaetopteryx"))

Retrieve classifications

out <- classification(uids)
lapply(out, head)
[[1]]
                name         rank
1 cellular organisms      no rank
2          Eukaryota superkingdom
3       Opisthokonta      no rank
4            Metazoa      kingdom
5          Eumetazoa      no rank
6          Bilateria      no rank

[[2]]
                name         rank
1 cellular organisms      no rank
2          Eukaryota superkingdom
3       Opisthokonta      no rank
4            Metazoa      kingdom
5          Eumetazoa      no rank
6          Bilateria      no rank
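Because classification returns one table of names and ranks per taxon, base R is enough to pull out a single rank. The sketch below assumes the list-of-data-frames shape printed above and rebuilds a small stand-in object (out_demo, hypothetical) from that output so it runs without a network call. get_rank is also a hypothetical helper, not a taxize function.

```r
# Stand-in for one element of `out`, copied from the printed output above
demo_tbl <- data.frame(
  name = c("cellular organisms", "Eukaryota", "Opisthokonta",
           "Metazoa", "Eumetazoa", "Bilateria"),
  rank = c("no rank", "superkingdom", "no rank",
           "kingdom", "no rank", "no rank"),
  stringsAsFactors = FALSE
)
out_demo <- list(demo_tbl, demo_tbl)

# Hypothetical helper: name at a given rank, or NA if the rank is absent
get_rank <- function(tbl, rank) {
  hit <- tbl$name[tbl$rank == rank]
  if (length(hit) == 0) NA_character_ else hit[1]
}

# One kingdom name per taxon in the list
vapply(out_demo, get_rank, character(1), rank = "kingdom")
```

With the real object, replacing out_demo with out should work the same way, provided each element still has name and rank columns.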

Get synonyms

synonyms("Poa annua", db="itis")
$`Poa annua`
                          name    tsn
1      Poa annua var. aquatica 538978
2       Poa annua var. reptans 538979
3                  Aira pumila 785854
4             Catabrosa pumila 787993
5               Ochlopoa annua 791574
6               Poa aestivalis 793946
7                   Poa algida 793954
8         Poa annua var. annua 802116
9     Poa annua var. eriolepis 802117
10 Poa annua var. rigidiuscula 802119
11        Poa annua f. reptans 803667

Get taxonomic IDs from many sources

get_ids(names="Chironomus riparius", db = c('ncbi','itis','col'), verbose=FALSE)
$ncbi
Chironomus riparius
                    "315576"
attr(,"match")
[1] "found"
attr(,"uri")
[1] "http://www.ncbi.nlm.nih.gov/taxonomy/315576"
attr(,"class")
[1] "uid"

$itis
Chironomus riparius
                    "129313"
attr(,"match")
[1] "found"
attr(,"uri")
[1] "http://www.itis.gov/servlet/SingleRpt/SingleRpt?search_topic=TSN&search_value=129313"
attr(,"class")
[1] "tsn"

$col
Chironomus riparius
                    "8663146"
attr(,"class")
[1] "colid"
attr(,"uri")
[1] "http://www.catalogueoflife.org/col/details/species/id/8663146"

attr(,"class")
[1] "ids"

Get common names from scientific names

sci2comm(scinames='Helianthus annuus', db='itis')
$`Helianthus annuus`
[1] "common sunflower" "sunflower"        "wild sunflower"   "annual sunflower"

Meta

Please report any issues or bugs.

License: MIT

This package is part of the rOpenSci project.

To cite taxize in publications use:

  Scott Chamberlain and Eduard Szocs (2013). taxize - taxonomic search
  and retrieval in R. F1000Research, 2:191. URL:
  http://f1000research.com/articles/2-191/v2.

A BibTeX entry for LaTeX users is

  @Article{,
    title = {taxize - taxonomic search and retrieval in R},
    journal = {F1000Research},
    author = {{Scott Chamberlain} and {Eduard Szocs}},
    year = {2013},
    url = {http://f1000research.com/articles/2-191/v2},
  }

Get citation information for taxize in R by running citation(package = 'taxize').
