CRAN Task View: Web Technologies and Services

Maintainer: Thomas Leeper, Scott Chamberlain, Patrick Mair, Karthik Ram, Christopher Gandrud
Contact: thosjleeper at gmail.com
Version: 2016-03-21

This Task View contains information about how to use R and the World Wide Web together. The base version of R does not ship with many tools for interacting with the web. Thankfully, there is an increasingly large number of add-on packages that fill this gap. This task view focuses on packages for obtaining web-based data and information, frameworks for building web-based R applications, and online services that can be accessed from R. A list of available packages and functions is presented below, grouped by the type of activity. The Open Data Task View provides further discussion of online data sources that can be accessed from R.

If you have any comments or suggestions for additions or improvements for this Task View, go to GitHub and submit an issue, or make some changes and submit a pull request. If you can't contribute on GitHub, send an email to the contact address listed above. If you have an issue with one of the packages discussed below, please contact the maintainer of that package. If you know of a web service, API, data source, or other online resource that is not yet supported by an R package, consider adding it to the package development to-do list on GitHub.

Tools for Working with the Web from R

Core Tools For HTTP Requests

There are two packages that should cover most use cases of interacting with the web from R. httr provides a user-friendly interface for executing HTTP methods (GET, POST, PUT, HEAD, DELETE, etc.) and supports modern web authentication protocols (OAuth 1.0, OAuth 2.0). HTTP status codes are helpful for debugging HTTP calls. httr makes them easier to use with, for example, stop_for_status(), which reads the HTTP status code from a response object and raises an R error if the request was not successful. (See also warn_for_status(), which issues a warning instead.) Note that you can pass additional libcurl options to the config parameter in HTTP calls. RCurl is a lower-level package that provides a closer interface between R and the libcurl C library, but is less user-friendly. It may be useful for operations on web-based XML or to perform FTP operations. For more specific situations, the following resources may be useful:
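As a brief illustration of the httr workflow described above, the following is a minimal sketch rather than a recommended pattern. The helper name fetch_text, its seconds argument, and the placeholder URL in the usage comment are assumptions made for this example; GET(), config(), timeout(), stop_for_status(), and content() are httr functions.

```r
library(httr)

# Hypothetical helper (not part of httr): fetch a URL and return its body as text.
fetch_text <- function(url, seconds = 10) {
  resp <- GET(
    url,
    config = config(followlocation = TRUE),  # a libcurl option passed via `config`
    timeout(seconds)                         # give up after `seconds` seconds
  )
  stop_for_status(resp)  # turn a 4xx or 5xx response into an R error
  content(resp, as = "text", encoding = "UTF-8")
}

# Usage (requires network access; the URL is only a placeholder):
# body <- fetch_text("https://httpbin.org/get")
```

Replacing stop_for_status() with warn_for_status() would let the function continue and return the body of an error response while still flagging the problem.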

Parsing Structured Web Data

The vast majority of web-based data is structured as plain text, HTML, XML, or JSON (JavaScript Object Notation). Web service APIs increasingly rely on JSON, but XML is still prevalent in many applications. There are several packages designed specifically for working with these formats. Their functions can be used to interact directly with insecure webpages or to parse locally stored or in-memory web files.
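To make the idea of parsing in-memory web data concrete, here is a small sketch. It assumes the jsonlite and xml2 packages, which are not named in the paragraph above and are only one possible choice; the JSON and XML strings are invented for illustration.

```r
library(jsonlite)
library(xml2)

# Parse a JSON string held in memory (illustrative data).
json_txt <- '{"name": "httr", "depends": ["curl", "jsonlite"]}'
pkg <- fromJSON(json_txt)
pkg$name     # the "name" field as a character string
pkg$depends  # the JSON array, simplified to a character vector

# Parse an XML string held in memory and extract attributes with XPath.
xml_txt <- "<pkgs><pkg name='httr'/><pkg name='RCurl'/></pkgs>"
doc <- read_xml(xml_txt)
pkg_names <- xml_attr(xml_find_all(doc, "//pkg"), "name")
```

The same functions also accept a file path in place of the literal string, so locally stored files can be parsed the same way.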

Tools for Working with URLs

Tools for Working with Scraped Webpage Contents

Other Useful Packages and Functions

Web and Server Frameworks

Web Services

Cloud Computing and Storage

Document and Code Sharing

Data Analysis and Processing Services

Social Media Clients

Web Analytics Services

Other Web Services