tally now produces counts by default for all formula shapes. Proportions or percentages must be requested explicitly. This is to avoid common errors, especially when feeding the results into
msummary. Usually this is identical to
summary, but for a few kinds of objects it provides modified output that is less verbose.
do * lm( ) will now keep track of the F statistic, too.
confint applied to an object produced using
do now does more appropriate things.
prop.test now sets
success = 1 by default
on 0-1 data to treat 0 like failure and 1 like success. Similarly,
level = 1 by default.
CIsim can now produce plots and does so by default when
samples <= 200.
add=TRUE improved for
swap which is useful for creating randomization
distributions for paired designs. The current implementation is a bit slow.
We'll improve that by implementing part of the code in C++.
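The idea behind a randomization distribution for a paired design can be sketched in base R. This is an illustrative sketch only, not the implementation of swap; the helper shuffled_mean_diff and the data are hypothetical.

```r
# Base-R sketch of a paired-design randomization distribution.
# Illustrative only: this is not how mosaic implements swap.
set.seed(1)
before <- c(12, 15, 11, 18, 14, 16)
after  <- c(14, 15, 13, 21, 13, 19)

# Hypothetical helper: flip the sign of each paired difference with
# probability 1/2, which is equivalent to swapping the two values in a pair.
shuffled_mean_diff <- function(x, y) {
  flip <- sample(c(-1, 1), length(x), replace = TRUE)
  mean(flip * (y - x))
}

observed  <- mean(after - before)
null_dist <- replicate(2000, shuffled_mean_diff(before, after))

# Two-sided randomization p-value
p_value <- mean(abs(null_dist) >= abs(observed))
p_value
```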
Some additional functions are now formula-aware:
docFile introduced to simplify accessing files included
with package documentation.
read.file() enhanced to take a package
as an argument and look among package documentation files.
factorize introduced as a way to convert vectors with
few unique values into factors. Can be applied to an entire data frame.
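A rough base-R sketch of the same idea follows. The helper as_factor_if_few and its max_levels cutoff are hypothetical; factorize may use different rules and argument names.

```r
# Hypothetical helper: convert a vector to a factor only when it has
# few unique values. Not mosaic's factorize().
as_factor_if_few <- function(x, max_levels = 5) {
  if (length(unique(x)) <= max_levels) factor(x) else x
}

d <- data.frame(
  cyl = c(4, 6, 8, 4, 6, 8, 4, 8),
  mpg = c(21.0, 22.8, 21.4, 18.7, 18.1, 14.3, 24.4, 19.2)
)

# Apply to every column of the data frame, keeping it a data frame.
d[] <- lapply(d, as_factor_if_few, max_levels = 3)
str(d)
```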
The data sets formerly in this package have been separated out into two
additional packages: NHANES contains the
NHANES data set
and mosaicData contains the other data sets.
SAD were added to compute mean and sum
of all pairs of absolute differences.
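For reference, the quantities described can be computed in base R as below. This sketch assumes each unordered pair is counted once; the mosaic functions may define or scale the result differently.

```r
x <- c(1, 4, 6, 10)

# dist() on a numeric vector returns |x_i - x_j| for each pair i < j.
pair_diffs <- as.vector(dist(x))

sum_abs_diff  <- sum(pair_diffs)    # sum over all pairs
mean_abs_diff <- mean(pair_diffs)   # mean over all pairs
c(sum = sum_abs_diff, mean = mean_abs_diff)
```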
Facilities for making choropleth maps have been added. The API for these tools is still under development and may change in future releases.
rspin has been added to simulate spinning a spinner.
Two additional vignettes are included. Less Volume, More Creativity outlines how to use the mosaic package to simplify R for beginners. The other vignette illustrates many of the plotting features added by the mosaic package.
The mosaic package now contains two RMarkdown templates (one fancy and one plain).
plotFun has been improved so that it does a better job of selecting points
where the function is evaluated and no longer warns about
while exploring the domain of the function.
oddsRatio has been redesigned and
relrisk has been added.
summary() methods or
verbose=TRUE to see more information
(including confidence intervals).
Birthdays data set.
mplot and several instances have been
added to make a number of plots easy to generate. There are methods
for objects of classes
"hclust". For several of these there are also
fortify methods that return the data frame created to
read.file now handles (some?) https URLs and accepts
an optional argument
filetype that can be used to
declare the type of data file when it is not identified
The default for
useNA in the
has changed to
mosaic now depends on dplyr both to use some
of its functionality and to avoid naming collisions with functions
do, allowing mosaic and dplyr
to coexist more happily.
some improvements to dot plots with
dotPlot. In particular,
the size of the dots is determined differently and works better
more of the time. Dots were also shifted down by .5 units so that they
don't hover above the x-axis so much. This means that (with default
sizing) the tops of the dots are approximately located at a height
equivalent to the number of dots rather than the center of the dots.
fixed a bug in
do that caused it to scope incorrectly
in some edge cases when a variable had the same name as a function.
ntiles has been reimplemented and now has more
derivedFactor for creating factors
from logical "cases".
HELP data set has been removed from the package.
It was deprecated in version 0.5. Use
plotDist now accepts
under=TRUE, making it easy to add
plots of distributions over (or under) plots of data (e.g., histograms, densityplots, etc.)
or other distributions.
Plotting functions with the option
add=TRUE have been reimplemented using
layer from latticeExtra. See documentation of these functions for details.
ladd has been completely reimplemented using
layer() from latticeExtra. See
ladd() for details, including some behavior changes.
aggregating functions (
var, et al) now use
getOption("na.rm") to determine the default value of
change the default behavior to remove
NAs and options(na.rm=NULL) to restore
do has been largely rewritten with an eye toward improved
efficiency. In particular,
do will take advantage of multiple cores
if the parallel package is available. At this point, sluggishness in applications of
most likely due to the sluggishness of what is being done, not to
Added an additional method to
deltaMethod from the car package to make it easier to do propagation of uncertainty in some situations
that arise commonly in the physical sciences and engineering.
cdist to compute critical values for the central portion
of a distribution.
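The computation behind critical values for a central portion can be sketched with the base-R quantile functions. The helper central_quantiles is hypothetical and does not reflect the signature of cdist.

```r
# Hypothetical helper: endpoints of the central proportion p of a
# distribution, given its quantile function (qnorm, qt, ...).
central_quantiles <- function(qfun, p, ...) {
  tail_area <- (1 - p) / 2
  qfun(c(tail_area, 1 - tail_area), ...)
}

central_quantiles(qnorm, 0.95)        # roughly -1.96 and 1.96
central_quantiles(qt, 0.90, df = 10)  # symmetric about 0
```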
Some changes to the API for
qdata. For interactive use, this should
not cause any problems, but old programmatic uses of
qdata should be
checked as the object returned is now different.
Fixed a bug that caused aggregating functions (
sd, etc.) to produce counterintuitive results (but with a warning). The results are now what one would expect (and the warning is removed).
rsquared for extracting r-squared from models and model-like objects (
r.squared has been deprecated).
do now handles ANOVA-like objects better
maggregate is now built on some improved behind the scenes functions. Among
other features, the
groups argument is now incorporated as an alternative method
of specifying the groups to aggregate over and the
method argument can be set to
"ddply" to use
ddply from the plyr package for aggregation. This results
in a different output format that may be desired in some applications.
qdata functions have been largely rewritten. In addition,
qdata_f are provided which produce similar results
but have a formula in the first argument slot.
Fixed bug in vignette generation. Static PDFs are now installed in
doc/ and so
are available from within the package as well as via links to external files.
fetchGapminder for fetching data sets originally from
cdata for finding end points of a central portion of a variable.
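The end points of a central portion of a variable can be approximated in base R with quantile(). This is only a sketch of the idea; cdata may use a different quantile definition and return format.

```r
x <- 1:101
central <- 0.90                      # keep the central 90% of the data
tail_area <- (1 - central) / 2

# Lower and upper end points of the central portion
end_points <- unname(quantile(x, probs = c(tail_area, 1 - tail_area)))
end_points
```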
Name changes in functions like
prop to avoid internal
: which makes downstream processing messier.
Improved detection of the availability of
Surface plots produced by
plotFun can be used without
manipulate. This makes it possible to put surface plots into RMarkdown or Rnw files or to generate them outside of RStudio.
do() * rflip() now records proportion heads as well
as counts of heads and tails.
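A base-R sketch of the kind of result this produces, with one row per simulated set of flips. The column names here are illustrative and may not match those returned by do() * rflip().

```r
set.seed(42)
n_flips <- 10

# 500 simulated sets of 10 fair coin flips
flips <- data.frame(heads = rbinom(500, size = n_flips, prob = 0.5))
flips$tails <- n_flips - flips$heads
flips$prop  <- flips$heads / n_flips   # proportion of heads

head(flips)
```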
restoreLatticeOptions to switch back and forth between
lattice defaults and
dotPlot uses a different algorithm to determine dot sizes.
(Still not perfect, but
cex can be used to further scale the dots.)
histogram so that
nint matches the number of bins used more accurately.
fixed coding error in the HELP datasets so that
i2: max number of drinks is at least as large as
i1: the average number of drinks.
removed the deprecated HELP dataset (now called HELPrct)
Various minor bug fixes and internal improvements.
Various improvements and bug fixes to
mPlot provides an interactive environment for
creating lattice and ggplot2 plots.
Some support for producing maps has been introduced, notably
sp2df for converting SpatialPolygonDataFrames to regular data frames (which is useful for plotting with ggplot2, for example). Also the
Countries data frame facilitates
mapping country names among different sources of map data.
Data frames returned by
do are now marked as such so that
can behave differently for such data frames and for "regular" data frames.
t.test can now perform a 1-sample t-test specified using a formula.
Aggregating functions (e.g.
var, etc. using a formula
interface) have been completely
reimplemented and additional aggregating functions are provided.
ntiles function has been added to facilitate creating
factors based on quantile ranges.
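Creating a factor from quantile ranges can be sketched in base R with quantile() and cut(). This illustrates the idea only; the arguments and labels used by ntiles may differ.

```r
x <- c(3, 8, 1, 9, 4, 7, 2, 6)

# Quartile break points, then bin each value into its quartile.
breaks <- quantile(x, probs = seq(0, 1, by = 0.25))
quartile <- cut(x, breaks = breaks, include.lowest = TRUE,
                labels = paste0("Q", 1:4))

table(quartile)
```

Note that cut() fails if the data have so many ties that two break points coincide; a real implementation needs to handle that case.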
Changes in format to
Minor changes in documentation.
Added vignettes: Starting with R and A Compendium of Commands to Teach Statistics.
Plan to deprecate datasets from the Carnegie Mellon University Online Learning Initiative Statistics Modules in next release.
xhistogram is now deprecated. Use
Added vignette: Minimal R for Intro Stats.
Implemented symbolic integration for simple functions.
Aggregating functions (
var, etc.) now use
to determine default behavior.
Various bug fixes in
var() allow it to work in a wider range of situations.
TukeyHSD so that explicit use of
aov is no longer required
panel.lmbands for plotting confidence and prediction bands in linear regression
Some data cleaning in the Carnegie Mellon University Online Learning Initiative Statistics Modules. In particular,
the name collision with
MASS has been
removed by renaming the data set
freqpolygon for making frequency polygons.
r.squared for extracting r-squared from models and model-like objects.
Modified names of data frame produced by
do so that hyphens ('-') are turned into dots ('.')
We're still in beta, but we hope things are beginning to stabilize as we settle on syntax and coding idioms for the package. Here are some of the key updates since 0.4:
removed dependency on RCurl since it caused installation problems for some PC users. (Code requiring RCurl now checks at run time whether the package is available.)
further improvements to formula interfaces to common functions. The conditional | now works in more situations and & has been replaced by + so that formulas look more like the formulas used with
lm() and its cousins.
inclusion of the datasets from the Carnegie Mellon University Online Learning Initiative Statistics modules. These are in alpha form and some additional data cleaning and renaming may happen in the near future.
makeFun() now has methods for glm and nls objects
D() improved to use symbolic differentiation in more cases and allow pass through to
stats::D() when that makes sense. This allows functions like deltaMethod() from the car package
to work properly even when the mosaic package is loaded.
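For comparison, this is what the underlying stats::D() does on its own. Calling it with the stats:: prefix avoids any masking by another package's D().

```r
# Symbolic differentiation of an expression with respect to x
expr  <- quote(sin(x) * x^2)
dexpr <- stats::D(expr, "x")
dexpr

# Evaluate the derivative at x = pi
value <- eval(dexpr, list(x = pi))
value
```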
The API for
antiD() has been modified somewhat. This may go through another revision
if/when we add in symbolic differentiation, but we think we are now close to the end state.
The HELP dataset has been replaced by the HELPrct dataset, and the former will be deprecated in the next release.
The CPS data set has been renamed CPS85.
fitModel() have been added as wrappers around linear models using ns(), bs(), and nls().
Each of these returns the model fit as a function.
improvements to the vignettes.
renamed mtable() to tally(), added new functionality
reimplemented D() and antiD()
improvements to statTally()
new confint() functionality
makeFun() and plotFun() interface to plotting using formulas
added new vignette on Teaching Calculus using R
added new vignette on Resampling-Based Inference using R
changed the default behavior of the na.rm option for aggregating functions so that it defaults to the usual behavior unless a formula is given as an argument