-------------------
Version: 5.13-20
Date: 2012-02-06

CHANGES:

- lift() was expanded into lift.formula() for calculating the plot points and xyplot.lift() to create the plot.
- The package vignettes were altered to stop loading external RData files.
- A few match.call() changes were made to pass new R CMD check tests.

ADDITIONS:

- calibration(), calibration.formula() and xyplot.calibration() were created to make probability calibration plots.
- Model types 'xyf' and 'bdk' from the kohonen package were added.
- update.train() was added so that tuning parameters can be manually set if the automated approach to setting their values is insufficient.

-------------------
Version: 5.11-06
Date: 2012-01-12

CHANGES:

- When using method = "pls" in train(), the plsr() function used the default PLS algorithm ("kernelpls"). Now, the full orthogonal scores method is used. This results in the same model, but a more extensive set of values is calculated that enables VIP calculations (without much of a loss in computational efficiency).
- A check was added to preProcess() to ensure valid values of 'method' were used.

ADDITIONS:

- A new method, "kernelpls", was added.
- 'residuals' and 'summary' methods were added to 'train' objects that pass the final model to their respective functions.

-------------------
Version: 5.11-06
Date: 2012-01-12

BUGS:

- Bugs were fixed that prevented hold-out predictions from being returned.

ADDITIONS:

- SOM models were added ('xyf' and 'bdk') from the kohonen package.

-------------------
Version: 5.10-13
Date: 2012-01-02

BUGS:

- A bug in roc() was found when the classes were completely separable.

CHANGES:

- The ROC calculations for twoClassSummary() and filterVarImp() were changed to use the pROC package. This, and other changes, have increased efficiency. For filterVarImp() on the cell segmentation data, this led to a 54-fold decrease in execution time. For the Glass data in the mlbench package, the speedup was 37-fold.
Warnings were added for roc(), aucRoc() and rocPoint() regarding their deprecation.

ADDITIONS:

- random ferns (package rFerns) were added
- another sparse LDA model (from the penalizedLDA package) was also added

-------------------
Version: 5.09-012
Date: 2011-12-10

BUGS:

- Fixed a bug which occurred when plsda() models were used with class probabilities

CHANGES:

- As of 8/15/11, the glmnet() function was updated to return a character vector. Because of this, train() required modification and a version requirement was put in the package description file.

-------------------
Version: 5.09-006
Date: 2011-12-06

CHANGES:

- Shea X made a suggestion and provided code to improve the speed of prediction when sequential parameters are used for gbm() models.
- Andrew Ziem suggested an error check with metric = "ROC" and classProbs = FALSE.

BUGS:

- Andrew Ziem found a bug in how train() obtained earth() class probabilities

-------------------
Version: 5.08-011
Date: 2011-11-24

BUGS:

- Andrew Ziem found another small bug with parallel processing and train() (functions in the caret namespace cannot be found).
- Ben Hoffman found a bug in pickSizeTolerance() that was fixed.
- Jiaye Yu found (and fixed) a bug in getting predictions back from rfe()

-------------------
Version: 5.07-024
Date: 2011-11-13

BUGS:

- Using 'saveDetails' in sbfControl() or rfeControl() will save the predictions on the hold-out sets (Jiaye Yu wins the prize for finding that one).

ADDITIONS:

- trainControl() now has a logical to save the hold-out predictions.

-------------------
Version: 5.07-005
Date: 2011-11-07

ADDITIONS:

- type = "prob" was added for avNNet prediction.
- A warning was added when a model from RWeka is used with train() and (it appears that) multicore is being used for parallel processing. The session will crash, so don't do that.

BUGS:

- A bug was fixed where the extrapolation limits were being applied in predict.train but not in extractPrediction. Thanks to Antoine Stevens for finding this.
- Modifications were made to some of the workflow code to expose internal functions. When parallel processing was used with doMPI or doSMP, foreach did not find some caret internals (but doMC did).

-------------------
Version: 5.07-001
Date: 2011-10-21

CHANGES:

- changed calls to predict.mvr since the pls package now has a namespace.

-------------------
Version: 5.06-002
Date: 2011-10-13

CHANGES:

- a beta version of custom models with train() is included. The "caretTrain" vignette was updated with a new section that defines how to make custom models.

-------------------
Version: 5.05-004
Date: 2011-10-11

CHANGES:

- laying some of the groundwork for custom models
- updates to get away from deprecated functionality (mean and sd on data frames)

BUGS:

- The pre-processing in train() bug of the last version was not entirely squashed. Now it is.

-------------------
Version: 5.04-007
Date: 2011-09-26

CHANGES:

- panel.lift() was moved out of the examples in ?lift and into the package along with another function, panel.lift2.
- lift() now uses panel.lift2 by default

ADDITIONS:

- Added robust regularized linear discriminant analysis from the rrlda package
- Added evtree:::evtree

BUGS:

- A weird bug was fixed that occurred when some models were run with sequential parameters that were fixed to single values (thanks to Antoine Stevens for finding this issue).
- Another bug was fixed where pre-processing with train() could fail

-------------------
Version: 5.03-003
Date: 2011-09-23

BUGS:

- pre-processing in train() did not occur for the final model fit =[

-------------------
Version: 5.02-011
Date: 2011-09-19

ADDITIONS:

- A function, lift(), was added to create lattice objects for lift plots.
- Several models were added from the obliqueRF package: 'ORFridge' (linear combinations created using L2 regularization), 'ORFpls' (using partial least squares), 'ORFsvm' (linear support vector machines), and 'ORFlog' (using logistic regression). As of now, the package only supports classification.

CHANGES:

- Added regression models 'simpls' and 'widekernelpls'. These are new models since both train() and plsr() have an argument called 'method', so the computational algorithm could not be passed through using the three dots.
- Model 'rpart' was added that uses cp as the tuning parameter. To make the model codes more consistent, 'rpart' and 'ctree' correspond to the nominal tuning parameters (cp and mincriterion, respectively) and 'rpart2' and 'ctree2' are the alternate versions using 'maxdepth'.
- The text for 'ctree's tuning parameter was changed to '1 - P-Value Threshold'

BUGS:

- The argument 'controls' was not being properly passed through in models 'ctree' and 'ctree2'.

-------------------
Version: 5.01-001
Date: 2011-08-02

BUGS:

- 'controls' was not being set properly for cforest() models in train()
- The print methods for train(), rfe() and sbf() did not recognize LOOCV
- avNNet sometimes failed with categorical outcomes with bag = FALSE
- A bug in preProcess was fixed that was triggered by matrices without dimnames (found by Allan Engelhardt)
- bagged MARS models with factor outcomes now work
- cforest was using the argument 'control' instead of 'controls'
- a few bugs for class probabilities were fixed for 'slda', 'hdda', 'glmStepAIC', 'nodeHarvest', 'avNNet' and 'sda'

CHANGES:

- When looping over models and resamples, the foreach package is now being used. Now, when using parallel processing, the caret code stays the same and parallelism is invoked using one of the "do" packages (e.g. doMC, doMPI, etc). This affects train(), rfe() and sbf(). Their respective man pages have been revised to illustrate this change.
- The order of the results produced by defaultSummary() was changed so that the ROC AUC is first
- A few man and C files were updated to eliminate R CMD check warnings
- Now that we are using foreach, the verbose options in trainControl(), rfeControl() and sbfControl() now default to FALSE
- rfe() now returns the variable ranks in a single data frame (previously there were data frames in lists of lists) for ease of use. This will break code from previous versions. The built-in RFE functions were also modified
- confusionMatrix methods for rfe() and sbf() were added
- NULL values of 'method' in preProcess are no longer allowed
- a model for ridge regression was added (method = 'ridge') based on enet()

-------------------
Version: 4.98
Date: 2011-07-25

BUGS:

- A bug was fixed in a few of the bagging aggregation functions (found by Harlan Harris).
- Fixed a bug spotted by Richard Marchese Robinson in createFolds when the outcome was numeric. The issue is that createFolds was trying to randomize n/4 numeric samples to k folds. With < 40 samples, it could not always do this and would generate fewer than k folds in some cases. The change will adjust the number of groups based on n and k. For small sample sizes, it will not use stratification. For larger data sets, it will at most group the data into 4 quartiles.

ADDITIONS:

- A function confusionMatrix.train() was added to get an average confusion matrix across resampled hold-outs when using the train() function for classification.
- Added another model, "avNNet", that fits several neural networks via the nnet package using different seeds, then averages the predictions of the networks. There is an additional bagging option.

CHANGES:

- The default value of the 'var' argument of bag() was changed.
- As requested, most options can be passed from train() to preProcess(). The trainControl() function was re-factored and several options (e.g.
'k', 'thresh') were combined into a single list option called 'preProcOptions'. The default is consistent with the original configuration:

    preProcOptions = list(thresh = 0.95, ICAcomp = 3, k = 5)

Also, another option was added to preProcess(). The 'pcaComp' option can be used to set exactly how many components are used (as opposed to just a threshold). It defaults to NULL so that the threshold method is still used by default, but a non-null value of 'pcaComp' over-rides 'thresh'.
- When created within train(), the call for preProcess() is now modified to be a text string ("scrubbed") because the call could be very large.
- Removed two deprecated functions: applyProcessing and processData.
- A new version of the cell segmentation data was saved and the original version was moved to the package website (see ?segmentationData for location). First, several discrete versions of some of the predictors (with the suffix "Status") were removed. Second, there are several skewed predictors with minimum values of zero (that would benefit from some transformation, such as the log). A constant value of 1 was added to these fields: AvgIntenCh2, FiberAlign2Ch3, FiberAlign2Ch4, SpotFiberCountCh4 and TotalIntenCh2.

-------------------
Version: 4.92
Date: 2011-07-05

Some tweaks were made to plot.train in an effort to get the group key to look less horrid.

train(), rfe() and sbf() are now able to estimate the time that these models take to predict new samples. Their respective control objects have a new option, timingSamps, that indicates how many of the training set samples should be used for prediction (the default of zero means do not estimate the prediction time).

xyplot.resamples was modified.
A new argument, "what", has values:

- "scatter" plots the resampled performance values for two models
- "BlandAltman" plots the difference between two models against their average (aka an MA plot)
- "tTime", "mTime", "pTime" plot the total model building/tuning time ("t") or the final model building time ("m") or the time to produce predictions ("p") against a confidence interval for the average performance. 2+ models can be used.

Three new model types were added to train() using leaps:::regsubsets: "leapForward", "leapBackward" and "leapSeq". The tuning parameter, nvmax, is the maximum number of terms in the subset.

Two bug fixes:

- the seed was accidentally set when preProcess used ICA (spotted by Allan Engelhardt)
- preProcess was always being called (even to do nothing) (found by Guozhu Wen)

-------------------
Version: 4.91
Date: 2011-06-09

Added a few new models associated with the bst package: bstTree, bstLs and bstSm.

A model denoted as "M5" was added that combines M5P and M5Rules from the RWeka package. This new model uses either of these functions depending on the tuning parameter "rules".

-------------------
Version: 4.90
Date: 2011-06-01

Fixed a bug with train() and method = "penalized". Thanks to Fedor for finding it.

-------------------
Version: 4.89
Date: 2011-05-26

A new tuning parameter was added for M5Rules controlling smoothing.

The Laplace correction value for Naive Bayes was also added as a tuning parameter.

varImp.RandomForest was updated to work =] It now requires a recent version of the party package.

-------------------
Version: 4.88
Date: 2011-04-28

A variable importance method was created for Cubist models.

-------------------
Version: 4.87
Date: 2011-04-27

Altered the earth/MARS/FDA labels to be more exact.

Added cubist models from the Cubist package.

A new option to trainControl was added to allow users to constrain the possible predicted values of the model to the range seen in the training set or a user-defined range.
One-sided ranges are also allowed.

-------------------
Version: 4.85
Date: 2011-04-01

Two typos fixed in print.rfe and print.sbf (thanks to Jan Lammertyn)

-------------------
Version: 4.83
Date: 2011-03-31

Bug fixes:

- dummyVars failed with formulas using "." (all.vars does not handle this well)
- ctree2 was failing for some classification models

When SVM classification models are used with class.weights, the option prob.model is automatically set to FALSE (otherwise, it is always set to TRUE). A warning is issued that the model will not be able to create class probabilities.

Also, for SVM classification models, there are cases when the probability model generates negative class probabilities. In these cases, we assign a probability of zero then coerce the probabilities to sum to one.

Several typos in the help pages were fixed (thanks to Andrew Ziem).

Added a new model, svmRadialCost, that fits the SVM model and estimates the sigma parameter for each resample (to properly capture the uncertainty).

preProcess() has a new method called "range" that scales the predictors to [0, 1] (which is approximate for new samples if the training set range is narrow in comparison).

A check was added to train() to make sure that, when the user passes a data frame to tuneGrid, the names are correct and complete.

print.train prints the number of classes and levels for classification models.

-------------------
Version: 4.78
Date: 2011-02-08

Added a few bagging modules. See ?bag.

Added basic timings of the entire call to train(), rfe() and sbf() as well as the fit time of the final model. These are stored in an element called "times".

The data files were updated to use better compression, which added a higher R version dependency.

plot.train was pretty much re-written to more effectively use trellis theme defaults and to allow arguments (e.g. axis labels, keys, etc) to be passed in to over-ride the defaults.

Fixed bugs for:

- lda bagging function
- print.train when preProc is NULL
- predict.BoxCoxTrans would go all kablooey if there were missing values
- varImp.rpart was failing with some models (thanks to Maria Delgado)

-------------------
Version: 4.77
Date: 2011-01-21

A new class, BoxCoxTrans, was added for estimating and *applying* the Box-Cox transformation to data. This is also included as an option to transform *predictor* variables. Although the Box-Tidwell transformation was invented for this purpose, the Box-Cox transformation is more straightforward, less prone to numerical issues and just as effective. This method was also added to preProcess.

Fixed mis-labelled x axis in plot.train when a transformation is applied for models with three tuning parameters.

When plotting a train object with method == "gbm" and multiple values of the shrinkage parameter, the ordering of panels was improved.

Fixed bugs for regression prediction using partDSA and qrf.

Another bug, reported by Jan Lammertyn, related to extractPrediction with a single predictor was also fixed.

-------------------
Version: 4.76
Date: 2011-01-07

Fixed a bug where linear SVM models were not working for classification

-------------------
Version: 4.75
Date: 2010-12-27

New model types:

- 'gcvEarth' which is the basic MARS model. The pruning procedure is the nominal one based on GCV; only the degree is tuned by train().
- 'qrnn' for quantile regression neural networks from the qrnn package.
- 'Boruta' for random forest models with feature selection via the Boruta package.
-------------------
Version: 4.74
Date: 2010-12-27

Some changes to print.train:

- the call is not automatically printed (but can be when print.train is explicitly invoked)
- the "Selected" column is also not automatically printed (but can be)
- non-table text now respects options("width")
- only significant digits are now printed when tuning parameters are kept at a constant value

-------------------
Version: 4.73
Date: 2010-12-21

Bug fixes to preProcess related to complete.cases and a single predictor.

For knn models (knn3 and knnreg), added automatic conversion of data frames to matrices

-------------------
Version: 4.72
Date: 2010-12-17

A new function for RFE with gam:::gam was added.

"Down-sampling" was implemented with bag() so that, for classification models, each class has the same number of samples as the smallest class.

Added a new class, dummyVars, that creates an entire set of binary dummy variables (instead of the reduced, full rank set). The initial code was suggested by Gabor Grothendieck on R-Help. The predict method is used to create dummy variables for any data set.

Added R2 and RMSE functions for evaluating regression models

Bug fixes:

- varImp.gam() failed to recognize objects from mgcv
- a small fix to test a logical vector in filterVarImp()
- when diff.resamples calculated the number of comparisons, the "models" argument was ignored.
- predict.bag was ignoring type = "prob"
- minor updates to conform to R 2.13.0

-------------------
Version: 4.70
Date: 2010-11-09

Added a warning to train when class levels are not valid R variable names.

Fixed a bug in the variable importance function for multinom objects.

Added p-value adjustments to summary.diff.resamples. Confidence intervals in dotplot.diff.resamples are adjusted accordingly if the Bonferroni adjustment is used.

For dotplot.resamples(), no point was plotted when the upper and/or lower interval values were NaN. Now, the point is plotted but without the interval bars.

Updated print.rfe to correctly describe new resampling methods.

-------------------
Version: 4.69
Date: 2010-10-28

Fixed a bug in predict.rfe where an error was thrown even though the required predictors were in newdata.

Changed preProcess() so that centering and scaling are both automatic when PCA or ICA are requested.

-------------------
Version: 4.68
Date: 2010-10-21

Added two functions, checkResamples() and checkConditionalX() that identify predictor data with degenerate distributions when conditioned on a factor.

Added a high content screening data set (segmentedData) from Hill et al. Impact of image segmentation on high-content screening data quality for SK-BR-3 cells. BMC bioinformatics (2007) vol. 8 (1) pp. 340.

Fixed bugs in how sbf objects were printed (when using repeated CV) and classification models with earth and classProbs = TRUE.

-------------------
Version: 4.67
Date: 2010-10-13

Added predict.rfe

Added imputation using bagged regression trees to preProcess().

Fixed bug in varImp.rfe that caused incorrect results (thanks to Lawrence Mosley for the find).

-------------------
Version: 4.65
Date: 2010-10-08

Fixed a bug where train() would not allow knn imputation.

filterVarImp() and roc() now check for missing values and use complete data for each predictor (instead of case-wise deletion across all predictors).

-------------------
Version: 4.64
Date: 2010-10-06

Fixed bug introduced in the last version with createDataPartition(... list = FALSE).

Fixed a bug predicting class probabilities when using earth/glm models

Fixed a bug that occurred when train() was used with ctree or ctree2 methods.

Fixed bugs in rfe() and sbf() when running in parallel; not all the resampling results were saved

-------------------
Version: 4.63
Date: 2010-09-19

Misc:

o A p-value from McNemar's test was added to confusionMatrix.
o Updated print.train so that constant parameters are not shown in the table (but a note is written below the table instead).
Also, the output was changed slightly to be more easily read (I hope)
o Adapted varImp.gam to work with either mgcv or gam packages.
o Expanded the tuning parameters for lvq.
o Some of the examples in the Model Building vignette were changed

Resampling:

o Added bootstrap 632 rule and repeated cross-validation to trainControl. A new function, createMultiFolds, is used to generate indices for repeated CV.
o The various resampling functions now have *named* lists as output (with prefixes "Fold" for cv and repeated cv and "Resample" otherwise)

Pre-processing:

o Pre-processing has been added to train with the preProcess argument. This has been tested when caret functions are used with rfe() and sbf() (via caretFuncs and caretSBF, respectively).
o When preProcess(method = "spatialSign"), centering and scaling are done automatically too. Also, a bug was fixed that stopped the transformation from being executed.
o knn imputation was added to preProcess(). The RANN package is used to find the neighbors (the knn impute function in the impute library was consistently generating segmentation faults, so we wrote our own).
o Changed the behavior of preProcess in situations where scaling is requested but there is no variation in the predictor. Previously, the method would fail. Now a warning is issued and the value of the standard deviation is coerced to be one (so that scaling has no effect).

-------------------
Version: 4.62
Date: 2010-09-08

Added mgcv:::gam (with smoothing splines and feature selection) and gam:::gam (with basic splines and loess) smoothers. For these models, a formula is derived from the data where "near zero variance" predictors (see ?nearZeroVar) are excluded and predictors with fewer than 10 distinct values are entered as linear (i.e. unsmoothed) terms.

-------------------
Version: 4.61
Date: 2010-09-05

Changed earth fit for classification models to use the glm argument with a binomial family.

Added varImp.multinom, which is based on the absolute values of the model coefficients.

-------------------
Version: 4.60
Date: 2010-09-02

The feature selection vignette was updated slightly (again).

-------------------
Version: 4.59
Date: 2010-09-01

Updated rfe() and sbf() to include class probabilities in performance calculations.

Also, the names of the resampling indices were harmonized across train(), rfe() and sbf().

The feature selection vignette was updated slightly.

-------------------
Version: 4.58
Date: 2010-08-28

Added the ability to include class probabilities in performance calculations. See ?trainControl and ?twoClassSummary.

Updated and restructured the main vignette.

-------------------
Version: 4.57
Date: 2010-08-25

Internal changes related to how predictions from models are stored and summarized. With the exception of loo, the model performance values are calculated by the workers instead of the main program. This should reduce i/o and lay some groundwork for upcoming changes.

The default grid for relaxo models was changed based on an initial model fit.

partDSA model predictions were modified; there were cases where the user might request X partitions, but the model only produced Y < X. In these cases, the partitions for missing models were replaced with the largest model that was fit.

The function modelLookup() was put in the namespace and a man file was added.

The names of the resample indices are automatically reset, even if the user specified them.

-------------------
Version: 4.56
Date: 2010-08-19

Fixed a bug generated a few versions ago where varImp for plsda and fda objects crashed.
-------------------
Version: 4.55
Date: 2010-08-18

When computing the scale parameter for RBF kernels, the option to automatically scale the data was changed to TRUE.

-------------------
Version: 4.54
Date: 2010-08-17

Added logic.bagging in logicFS with method = "logicBag"

-------------------
Version: 4.53
Date: 2010-08-16

Fixed a bug in varImp.train related to nearest shrunken centroid models.

Added logic regression and logic forests

-------------------
Version: 4.51
Date: 2010-08-10

Added an option to splom.resamples so that the variables in the scatter plots are models or metrics.

-------------------
Version: 4.50
Date: 2010-08-09

Added dotplot.resamples plus acknowledgements to Hothorn et al (2005) and Eugster et al (2008)

-------------------
Version: 4.49
Date: 2010-08-08

Enhanced the tuneGrid option to allow a function to be passed in.

-------------------
Version: 4.48
Date: 2010-08-06

Added a prcomp method for the resamples class

-------------------
Version: 4.47
Date: 2010-08-05

Extended resamples() to work with rfe() and sbf()

-------------------
Version: 4.46
Date: 2010-08-04

Cleaned up some of the man files for the resamples class and added parallel.resamples.

Fixed a bug in diff.resamples where ... were not being passed to the test statistic function.

Added more log messages in train() when running verbose.

Added the German credit data set.

-------------------
Version: 4.45
Date: 2010-07-25

Added a general framework for bagging models via the bag() function. Also, model type "hdda" from the HDclassif package was added.

-------------------
Version: 4.44
Date: 2010-07-19

Added neuralnet, quantregForest and rda:::rda to train(). Since there is a naming conflict with mda:::rda, the rda:::rda model was given a method value of "scrda".

-------------------
Version: 4.43
Date: 2010-06-30

Bug fix release:

o the resampling estimate of the standard deviation given by train() since v 4.39 was wrong
o a new field was added to varImp.mvr called "estimate".
In cases where the mvr model had multiple estimates of performance (e.g. training set, CV, etc) the user can now select which estimate they want to be used in the importance calculation (thanks to Sophie Bréand for finding this)

-------------------
Version: 4.42
Date: 2010-06-09

Added predict.sbf and modified the structure of the sbf helper functions. The "score" function only computes the metric used to filter and the filter function does the actual filtering. This was changed so that FDR corrections or other operations that use all of the p-values can be computed.

Also, the formatting of p-values in print.confusionMatrix was changed and an argument was added to maxDissim so that the variable name is returned instead of the index.

Independent component analysis was added to the list of pre-processing operations and a new model ("icr") was added to fit a pcr-like model with the ICA components.

-------------------
Version: 4.40
Date: 2010-05-19

Added hda and cleaned up the caret training vignette

-------------------
Version: 4.39
Date: 2010-05-15

Added several classes for examining the resampling results. There are methods for estimating pair-wise differences and lattice functions for visualization. The training vignette has a new section describing the new features.

-------------------
Version: 4.38
Date: 2010-05-12

Added partDSA and stepAIC for linear models and generalized linear models

-------------------
Version: 4.37
Date: 2010-04-18

Fixed a new bug in how resampling results are exported

-------------------
Version: 4.36
Date: 2010-04-17

Added penalized linear models from the foba package

-------------------
Version: 4.35
Date: 2010-04-15

Added rocc classification and fixed a typo.
-------------------
Version: 4.34
Date: 2010-03-17

Added two new data sets: dhfr and cars

-------------------
Version: 4.33
Date: 2010-03-01

Added GAMens (ensembles using gams)

Fixed a bug in roc() that, for some data cases, would reverse the "positive" class and report sensitivity as specificity and vice-versa.

-------------------
Version: 4.32
Date: 2009-12-24

Added a parallel random forest method in train() using the foreach package.

Also added penalized logistic regression using the plr() function in the stepPlr package.

-------------------
Version: 4.31
Date: 2009-12-07

Added a new feature selection function, sbf (for selection by filter).

Fixed bug in rfe that did not affect the results, but did produce a warning.

A new model function, nullModel, was added. This model fits either the mean only model for regression or the majority class model for classification.

Also, ldaFuncs had a bug fixed.

Minor changes to Rd files

-------------------
Version: 4.30
Date: 2009-11-09

For whatever reason, there is now a function in the spls package by the name of splsda that does the same thing. A few functions and a man page were changed to ensure backwards compatibility.

-------------------
Version: 4.29
Date: 2009-11-06

Added stepwise variable selection for lda and qda using the stepclass function in klaR

-------------------
Version: 4.28
Date: 2009-11-05

Added robust linear and quadratic discriminant analysis functions from rrcov.

Also added another column to the output of extractProb and extractPrediction that saves the name of the model object so that you can have multiple models of the same type and tell which predictions came from which model.

Changes were made to plotClassProbs: new parameters were added and densityplots can now be produced.

-------------------
Version: 4.27
Date: 2009-11-01

Added nodeHarvest

-------------------
Version: 4.26
Date: 2009-10-26

Fixed a bug in caretFunc() that led to NaN variable rankings, so that the first k terms were always selected.
-------------------
Version: 4.25
Date: 2009-10-03

Added parallel processing functionality for rfe()

-------------------
Version: 4.24
Date: 2009-09-29

Added the ability to use custom metrics with rfe()

-------------------
Version: 4.22
Date: 2009-09-23

Many Rd changes to work with updated parser.

-------------------
Version: 4.21
Date: 2009-09-18

Re-saved data in more compressed format

-------------------
Version: 4.20
Date: 2009-07-19

Added pcr as a method

-------------------
Version: 4.19
Date: 2009-06-30

A weights argument was added to train for models that accept weights.

Also, a bug was fixed for lasso regression (wrong lambda specification) and another for prediction in naive Bayes models with a single predictor.

-------------------
Version: 4.18
Date: 2009-06-17

Fixed bug in new nearZeroVar and updated format.earth so that it does not automatically print the formula

-------------------
Version: 4.17
Date: 2009-06-04

Added a new version of nearZeroVar from Allan Engelhardt that is much faster

-------------------
Version: 4.16
Date: 2009-05-21

Fixed bugs in extractProb (for glmnet) and filterVarImp.

For glmnet, the user can now pass in their own value of family to train (otherwise train will set it depending on the mode of the outcome). However, glmnet doesn't have much support for families at this time, so you can't change links or try other distributions.
-------------------
Version: 4.15
Date: 2009-05-13

Fixed bug in createFolds when the smallest y value is more than 25% of the data

-------------------
Version: 4.14
Date: 2009-05-13

Fixed bug in print.train

-------------------
Version: 4.13
Date: 2009-05-12

Added vbmp from vbmp package

-------------------
Version: 4.12
Date: 2009-05-07

Added additional error check to confusionMatrix and fixed an absurd typo in print.confusionMatrix

-------------------
Version: 4.11
Date: 2009-04-25

Added

- linear kernels for svm, rvm and Gaussian processes
- rlm from MASS
- a knn regression model, knnreg
- a set of functions (class "classDist") to compute the class centroids and covariance matrix for a training set for determining Mahalanobis distances of samples to each class centroid
- a set of functions (rfe) for doing recursive feature selection (aka backwards selection). A new vignette was added for more details

-------------------
Version: 4.10
Date: 2009-03-20

Added OneR and PART from RWeka

-------------------
Version: 4.09
Date: 2009-03-20

- Fixed error in documentation for confusionMatrix:

    old:     Detection Prevalence = \frac{A}{A+B}
    correct: Detection Prevalence = \frac{A+B}{A+B+C+D}

  The underlying code was correct.
- Added lars (fraction and step as parameters)

-------------------
Version: 4.08
Date: 2009-02-18

Updated train and bagEarth to allow earth for classification models

-------------------
Version: 4.07
Date: 2009-01-25

Added glmnet models

-------------------
Version: 4.06
Date: 2009-01-24

Added code for sparse PLS classification.

Fixed a bug in prediction for caTools::LogitBoost

-------------------
Version: 4.05
Date: 2009-01-23

Updated again for more stringent R CMD check tests in R-devel 2.9

-------------------
Version: 4.04
Date: 2009-01-22

Updated for more stringent R CMD check tests in R-devel 2.9

-------------------
Version: 4.03
Date: 2009-01-20

Significant internal changes were made to how the models are fit.
Now, the function used to compute the models is passed in as a parameter (defaulting to lapply). In this way, users can use their own parallel processing software without new versions of caret. Examples are given in ?train. Also, fixed a bug where the MSE (instead of RMSE) was reported for random forest OOB resampling. Changes to confusionMatrix, sensitivity, specificity and the predictive value functions: - each was made more generic with default and table methods - confusionMatrix "extractor" functions for matrices and tables were added - the pos/neg predicted value computations were changed to incorporate prevalence - prevalence was added as an option to several functions - detection rate and prevalence statistics were added to confusionMatrix - the examples were expanded in the help files This version of caret will break compatibility with caretLSF and caretNWS. However, these packages will not be needed now and will be deprecated. ------------------- Version: 3.51 Date: 2008-12-03 Updated the man files and manuals. ------------------- Version: 3.50 Date: 2008-12-02 Added qda, mda and pda. ------------------- Version: 3.49 Date: 2008-11-30 Fixed bug in resampleHist. Also added a check in the train functions that error trapped with glm models and > 2 classes. ------------------- Version: 3.48 Date: 2008-11-30 Added glms. Also, added varImp.bagEarth to the namespace. ------------------- Version: 3.47 Date: 2008-11-24 Added sda from the sda package. There was a naming conflict between sda::sda and sparseLDA:::sda. The method value for sparseLDA was changed from "sda" to "sparseLDA". ------------------- Version: 3.46 Date: 2008-11-11 Added spls from the spls package ------------------- Version: 3.45 Date: 2008-10-17 Added caching of RWeka objects so that they can be saved to the file system and used in other sessions.
(changes per Kurt Hornik on 2008-10-05) ------------------- Version: 3.44 Date: 2008-10-15 Added sda from the sparseLDA package (not on CRAN). Also, a bug was fixed where the ellipses were not passed into a few of the newer models (such as penalized and ppr) ------------------- Version: 3.43 Date: 2008-10-03 Added the penalized model from the penalized package. In caret, it is regression only although the package allows for classification via glm models. However, it does not allow the user to pass the classes in (just an indicator matrix). Because of this, it doesn't really work with the rest of the classification tools in the package. ------------------- Version: 3.42 Date: 2008-09-26 Added a little more formatting to print.train ------------------- Version: 3.41 Date: 2008-09-19 For gbm, let the user over-ride the default value of the distribution argument (brought to us by Peter Tait via RHelp). ------------------- Version: 3.40 Date: 2008-09-18 Changed predict.preProcess so that it doesn't crash if newdata does not have all of the variables used to originally pre-process *unless* PCA processing was requested. ------------------- Version: 3.39 Date: 2008-09-18 Fixed bug in varImp.rpart when the model had only primary splits. Minor changes to the Affy normalization code. Fixed typo in predictors man page ------------------- Version: 3.38 Date: 2008-09-09 Added a new class called predictors that returns the names of the predictors that were used in the final model. Also added ppr from the stats package. Minor update to the project web page to deal with IE issues ------------------- Version: 3.37 Date: 2008-09-04 Added the ability of train to use custom made performance functions so that the tuning parameters can be chosen on the basis of things other than RMSE/R-squared and Accuracy/Kappa. Specific changes: - a new argument was added to trainControl called "summaryFunction" that is used to specify the function used to compute performance metrics.
The default function preserves the functionality prior to this new version - a new argument to train is "maximize" which is a logical for whether the performance measure specified in the "metric" argument to train should be maximized or minimized. - the selection function specified in trainControl carries the maximize argument with it so that customized performance metrics can be used. Other changes: - a bug was fixed in confusionMatrix (thanks to Gabor Grothendieck) - another bug was fixed related to predictions from least squares SVMs ------------------- Version: 3.36 Date: 2008-08-29 Added superpc from the superpc package. One note: the data argument that is passed to superpc is saved in the object that results from superpc.train. This is used later in the prediction function. ------------------- Version: 3.35 Date: 2008-08-27 Added slda from ipred ------------------- Version: 3.34 Date: 2008-08-25 Fixed a few bugs related to the lattice plots from version 3.33. Also added the ripper (aka JRip) and logistic model trees from RWeka ------------------- Version: 3.33 Date: 2008-08-22 Added xyplot.train, densityplot.train, histogram.train and stripplot.train. These are all functions to plot the resampling points. There is some overlap between these functions, plot.train and resampleHist. plot.train gives the average metrics only while these plot all of the resampled performance metrics. resampleHist could plot all of the points, but only for the final optimal set of predictors. To use these functions, there is a new argument in trainControl called returnResamp which takes one of the values "none", "final" or "all". The default is "final" to be consistent with previous versions, but "all" should be specified to use these new functions to their fullest. ------------------- Version: 3.32 Date: 2008-07-28 The functions "predict.train" and "predict.list" were added to use as alternatives to the extractPrediction and extractProbs functions.
Added C4.5 (aka J48) and rules-based models (M5 prime) from RWeka. Also added logitBoost from the caTools package. This package doesn't have a namespace and RWeka has a function with the same name. It was suggested to use the "::" prefix to differentiate them (but we'll see how this works).