Base R ships with a lot of functionality useful for computational
econometrics, in particular in the stats package. This
functionality is complemented by many packages on CRAN; a brief overview
is given below. There is also considerable overlap between the tools
for econometrics in this view and for finance in the Finance view.
Furthermore, R-SIG-Finance is a suitable mailing list for obtaining help
and discussing questions about both computational finance and econometrics.
Finally, there is also some overlap with the SocialSciences view, which
covers a broad variety of tools for the social sciences, including political science.
The packages in this view can be roughly structured into the following topics.
If you think that some package is missing from the list, please let me know.
Linear regression models
- Linear models can be fitted (via OLS) with lm()
(from stats) and standard tests for model comparisons are available in various
methods such as summary() and anova().
- Analogous functions
that also support asymptotic tests (z instead of t tests, and
Chi-squared instead of F tests) and plug-in of other covariance
matrices are coeftest() and waldtest() in lmtest.
- Tests of more general linear hypotheses are implemented in
linearHypothesis() in car.
- HC and HAC covariance matrices that can be plugged
into these functions are available in sandwich.
- Diagnostic checking: The packages
car and lmtest provide a large collection
of regression diagnostics and diagnostic tests.
- Instrumental variables regression (two-stage least squares) is provided by
ivreg() in AER; another implementation is
tsls() in package sem.
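The workflow described in the bullets above can be sketched with base R alone. This is a minimal illustration, not a recommended analysis: the choice of the built-in mtcars data and of the regressors is an assumption made purely for demonstration, and the HC0 (White) covariance is computed by hand here only to show what "plugging in" another covariance matrix means. The sandwich and lmtest packages provide tested implementations of this and related estimators.

```r
# OLS fits on the built-in mtcars data (illustrative choice of variables).
fit_small <- lm(mpg ~ wt, data = mtcars)
fit_large <- lm(mpg ~ wt + hp, data = mtcars)

# Coefficient table with the usual t tests.
print(summary(fit_large))

# F test comparing the two nested models.
cmp <- anova(fit_small, fit_large)
print(cmp)

# Hand-rolled HC0 sandwich covariance: (X'X)^-1 X' diag(e^2) X (X'X)^-1.
X <- model.matrix(fit_large)
e <- residuals(fit_large)
bread <- solve(crossprod(X))
meat <- crossprod(X * e)          # each row of X is scaled by its residual
vcov_hc0 <- bread %*% meat %*% bread

# Asymptotic z tests based on the robust covariance.
robust_se <- sqrt(diag(vcov_hc0))
z <- coef(fit_large) / robust_se
p <- 2 * pnorm(-abs(z))
print(cbind(estimate = coef(fit_large), robust_se = robust_se, z = z, p = p))
```

Comparing the robust standard errors with those reported by summary() gives a quick impression of how much the homoskedasticity assumption matters for a given fit.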
Microeconometrics
- Many standard microeconometric models belong to the
family of generalized linear models (GLM) and can be fitted by
glm() from package stats. This includes in particular logit and probit models
for modeling choice data and Poisson models for count data. Effects for typical
values of regressors in these models can be obtained and visualized using effects.
- Negative binomial GLMs are available via
glm.nb() in package MASS.
Another implementation of negative binomial models
is provided by aod, which also contains other models for overdispersed data.
- Zero-inflated and hurdle count models are provided in package pscl.
- Multinomial responses: Multinomial models
with individual-specific covariates only are available in
multinom() from package nnet. An implementation with both individual- and
choice-specific variables is mlogit. Generalized additive models
(GAMs) for multinomial responses can be fitted with the VGAM package.
A Bayesian approach to multinomial probit models is provided by MNP.
Various Bayesian multinomial models (including logit and probit) are available
in bayesm. Furthermore, the package RSGHB fits various
hierarchical Bayesian specifications based on direct specification of the likelihood function.
- Ordered responses: Proportional-odds regression for ordered responses is implemented in
polr() from package MASS. The package ordinal
provides cumulative link models for ordered data, which encompass proportional
odds models but also include more general specifications. Bayesian ordered probit
models are provided by bayesm.
- Censored responses: Basic censored regression models (e.g., tobit models)
can be fitted by
survreg() in survival; a convenience interface
tobit() is in package AER. Further censored
regression models, including models for panel data, are provided in censReg.
Interval regression models are in intReg. Censored regression models with
conditional heteroskedasticity are in crch.
Furthermore, hurdle models for left-censored data at zero can be estimated with
mhurdle. Models for sample selection are available in sampleSelection.
- Instrumental variables for binary responses: The LARF package estimates
local average response functions for binary treatments and binary instruments.
- Multivariate probit models: Estimation and marginal effect computations can be
carried out with mvProbit.
- Miscellaneous: Further, more refined tools for microeconometrics are provided in
the micEcon family of packages: Analysis with
Cobb-Douglas, translog, and quadratic functions is in micEcon;
the constant elasticity of substitution (CES) function is in micEconCES;
the symmetric normalized quadratic profit (SNQP) function is in micEconSNQP.
The almost ideal demand system (AIDS) is in micEconAids.
Stochastic frontier analysis is in frontier.
The package bayesm implements a Bayesian
approach to microeconometrics and marketing. Inference for relative
distributions is contained in package reldist.
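As a minimal sketch of the GLM-based models mentioned at the start of this list, the following fits a logit, a probit, and a Poisson regression with glm() from stats. The data sets (mtcars and warpbreaks, both shipped with R) and the model formulas are assumptions chosen only because they are always available; they are not econometric applications in their own right.

```r
# Binary response: logit and probit models for the transmission type.
logit_fit  <- glm(am ~ wt, data = mtcars, family = binomial(link = "logit"))
probit_fit <- glm(am ~ wt, data = mtcars, family = binomial(link = "probit"))

# Count response: Poisson regression for the number of warp breaks.
pois_fit <- glm(breaks ~ wool + tension, data = warpbreaks, family = poisson)

# Coefficients are on the scale of the linear predictor.
print(coef(summary(logit_fit)))
print(coef(summary(probit_fit)))
print(coef(summary(pois_fit)))

# Predicted probabilities at a few hypothetical regressor values.
new_cars <- data.frame(wt = c(2.5, 3.0, 3.5))
prob_hat <- predict(logit_fit, newdata = new_cars, type = "response")
print(prob_hat)

# Rough overdispersion check for the count model:
# a ratio well above 1 suggests looking at negative binomial alternatives.
dispersion <- deviance(pois_fit) / df.residual(pois_fit)
print(dispersion)
```

When the dispersion ratio is far from 1, the negative binomial and other overdispersion models listed above are the natural next step.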
Further regression models
- Nonlinear least squares modeling is available in
nls() in package stats.
- Quantile regression: quantreg (including linear, nonlinear, censored,
locally polynomial and additive quantile regressions).
- Linear models for panel data: plm, providing a wide range of within,
between, and random-effect methods (among others) along with corrected standard
errors, tests, etc. For panel-corrected standard errors in OLS and GEE models,
see geepack and pcse. Estimation of linear models with
multiple group fixed effects is contained in lfe.
- Generalized method of moments (GMM) and generalized empirical likelihood (GEL): gmm.
- Spatial econometric models: The Spatial view gives details about
handling spatial data, along with information about (regression) modeling. In particular,
spatial regression models can be fitted using spdep and sphet (the
latter using a GMM approach). splm is a package for spatial panel
models. Spatial probit models are available in spatialprobit.
- Linear structural equation models: sem (including two-stage least squares).
- Simultaneous equation estimation: systemfit.
- Nonparametric kernel methods: np.
- Beta regression: betareg and gamlss.
- Truncated (Gaussian) regression: truncreg.
- Nonlinear mixed-effect models: nlme and lme4.
- Generalized additive models (GAMs): mgcv, gam, gamlss.
- Mixed data sampling regression: midasr.
- Miscellaneous: The packages VGAM, rms and Hmisc provide several tools for extended
handling of (generalized) linear regression models. Zelig is a unified
easy-to-use interface to a wide range of regression models.
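To illustrate the nonlinear least squares entry at the top of this list, here is a small sketch using nls() from stats on simulated data. The exponential mean function, the true parameter values, and the starting values are all assumptions made for the example; with real data, choosing sensible starting values is usually the hard part.

```r
# Simulate data from y = a * exp(b * x) + noise (hypothetical parameters).
set.seed(42)
a_true <- 2
b_true <- 0.8
x <- seq(0, 2, length.out = 50)
y <- a_true * exp(b_true * x) + rnorm(length(x), sd = 0.1)
sim <- data.frame(x = x, y = y)

# Fit the model by nonlinear least squares; 'start' supplies initial guesses.
nls_fit <- nls(y ~ a * exp(b * x), data = sim, start = list(a = 1, b = 1))

# Estimates with standard errors and t tests.
print(summary(nls_fit))

# Fitted values at new regressor values.
y_hat <- predict(nls_fit, newdata = data.frame(x = c(0.5, 1.5)))
print(y_hat)
```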
Basic time series infrastructure
- The TimeSeries task view provides much more detailed
information. Here, only the most important aspects are briefly mentioned.
- The class "ts" in package stats is R's standard class for
regularly spaced time series (especially annual, quarterly, and monthly data).
- Time series in "ts" format can be coerced back and forth without loss of
information to "zooreg" from package zoo. zoo provides infrastructure for
both regularly and irregularly spaced time series (the latter via the class
"zoo") where the time information can be of arbitrary class.
This includes daily series (typically with a "Date" time index)
or intra-day series (e.g., with a "POSIXct" time index).
- Other implementations of irregular time series building on the "POSIXt"
time-date class are available in its, tseries and
timeSeries (previously: fSeries), which are all aimed particularly at
finance applications. See the Finance task view for details.
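The "ts" class described above can be sketched as follows. The series values and the name gdp are made up for illustration; only functions from stats and base R are used.

```r
# A hypothetical quarterly series starting in the first quarter of 2020.
gdp <- ts(c(100, 102, 101, 105, 107, 110, 108, 112),
          start = c(2020, 1), frequency = 4)
print(gdp)

# The time index is implied by start and frequency.
print(frequency(gdp))
print(start(gdp))
print(end(gdp))

# Subsetting by time rather than by position.
gdp_2021 <- window(gdp, start = c(2021, 1))
print(gdp_2021)

# Quarterly growth rates as log differences; the result is again a "ts".
growth <- diff(log(gdp))
print(growth)
```

Because the time index of a "ts" object is fully determined by its start and frequency, this class cannot represent irregularly spaced observations; that is the gap the zoo infrastructure fills.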
Time series modeling
- The TimeSeries task view contains detailed information about time series analysis in R.
Time series models for financial econometrics (e.g., GARCH, stochastic volatility models, or
stochastic differential equations) are described in the Finance task view. Here, only a brief overview
of the most important methods for econometrics is given.
- Classical time series modeling tools are
contained in the stats package and include
arima() for ARIMA modeling
and Box-Jenkins-type analysis.
- Fitting linear regression models with AR error terms via GLS is possible
using gls() from nlme.
- Structural time series models are provided by
StructTS() in stats.
- Filtering and decomposition for time series is available in
decompose() and HoltWinters() in stats.
- Extensions to these
methods, in particular for forecasting and model selection, are provided in
the forecast package.
- Miscellaneous time series filters are available in mFilter.
- For estimating VAR models, several
methods are available: simple models can be fitted by
ar() in stats, more
elaborate models are provided in the packages vars and
dse, and a Bayesian approach is available in MSBVAR. A
convenient interface for fitting dynamic regression models via OLS is available
in dynlm; a different approach
that also works with other regression functions is implemented in dyn.
- More advanced dynamic system equations can be fitted using dse.
- Various linear and nonlinear autoregressive time series models are provided by tsDyn.
- Periodic autoregressive models are provided by partsm.
- Gaussian linear state space models can be fitted using dlm (via maximum likelihood,
Kalman filtering/smoothing and Bayesian methods).
- Unit root and cointegration techniques are available in urca,
tseries, CADFtest, and tsDyn.
- Time series factor analysis is available in tsfa.
- Asymmetric price transmission modeling is available in apt.
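As a minimal sketch of the classical tools in stats mentioned in this list, the following simulates an AR(1) process, fits it with arima(), and produces short-horizon forecasts. The AR coefficient, sample size, and seed are assumptions chosen for the example; the forecast package listed above offers more convenient forecasting and model selection on top of these functions.

```r
# Simulate a stationary AR(1) series with a hypothetical coefficient of 0.6.
set.seed(123)
phi_true <- 0.6
y <- arima.sim(model = list(ar = phi_true), n = 500)

# Fit an ARIMA(1, 0, 0) model, i.e. an AR(1) with intercept.
ar1_fit <- arima(y, order = c(1, 0, 0))
print(ar1_fit)

# Order selection by AIC among pure AR models, for comparison.
ar_sel <- ar(y, order.max = 5)
print(ar_sel$order)

# Point forecasts and standard errors four steps ahead.
fc <- predict(ar1_fit, n.ahead = 4)
print(fc$pred)
print(fc$se)
```

The forecast standard errors increase with the horizon, reflecting the accumulation of future shocks.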
Data sets
- Packages AER and Ecdat
contain comprehensive collections of data sets from various standard econometric
textbooks as well as several data sets from the Journal of
Applied Econometrics and the Journal of Business & Economic Statistics.
- AER additionally provides an extensive set of
examples reproducing analyses from the textbooks/papers, illustrating
various econometric methods.
- FinTS is the R companion to Tsay's 'Analysis of
Financial Time Series' (2nd ed., 2005, Wiley) containing data sets, functions
and script files required to work some of the examples.
- CDNmoney provides Canadian monetary aggregates.
- pwt provides the Penn World Table from versions 5.6, 6.x, 7.x. The version 8.x
data are available in pwt8.
- The packages expsmooth, fma, and Mcomp are
data packages with time series data
from the books 'Forecasting with Exponential Smoothing: The State Space Approach'
(Hyndman, Koehler, Ord, Snyder, 2008, Springer) and 'Forecasting: Methods and Applications'
(Makridakis, Wheelwright, Hyndman, 3rd ed., 1998, Wiley) and the M-competitions, respectively.
- Package erer contains functions and datasets for the book
'Empirical Research in Economics: Growing up with R' (Sun, forthcoming).
- The package psidR available from GitHub can build panel data
sets from the Panel Study of Income Dynamics (PSID).
Miscellaneous
- Matrix manipulations: As a vector- and matrix-based language, base R
ships with many powerful tools for doing matrix manipulations, which are
complemented by the packages Matrix and SparseM.
- Optimization and mathematical programming: R and many of its contributed
packages provide many specialized functions for solving particular optimization
problems, e.g., in regression as discussed above. Further functionality for
solving more general optimization problems, e.g., likelihood maximization, is
discussed in the Optimization task view.
- Bootstrap: In addition to the recommended boot package,
there are some other general bootstrapping techniques available in
bootstrap or simpleboot as well as some bootstrap techniques
designed for time-series data, such as the maximum entropy bootstrap in
meboot or
tsbootstrap() from tseries.
- Inequality: For measuring inequality, concentration and poverty the
package ineq provides some basic tools such as Lorenz curves,
Pen's parade, the Gini coefficient and many more.
- Structural change: R is particularly strong when dealing with
structural changes and changepoints in parametric models, see
strucchange and segmented.
- Exchange rate regimes: Methods for inference about exchange
rate regimes, in particular in a structural change setting, are provided