Fixed a minor bug in prepDocuments() that arose when the vocab contained elements that do not appear in the data.
Fixed a minor bug in frex calculation that caused some models not to label.
Fixed a minor bug in searchK() that caused heldout results to be reported incorrectly.
Rewrote plot.estimateEffect(), which fixed a bug in some interaction models. It now also returns results invisibly for creating custom plots.
Increased the stability of the spectral methods for stm initialization.
Complete rewrite of plotRemoved() which makes it much faster for larger datasets.
A minor patch to deal with textProcessor() in older versions of R.
Large changes, many of which are not backwards compatible.
Numerous speed improvements to the core algorithm.
Introduction of several new options for the core stm function including spectral initialization, memoized inference, and model restarts.
Content covariate models are now estimated using the distributed multinomial formulation which is dramatically faster. Default prior also changed to L1.
Handling of document-level convergence was changed to ensure positive definiteness in the document-level covariance matrices.
Fixed bug in binary/binary interactions.
Numerous new diagnostic and summary functions.
Expanded the console printing of many of the preprocessing functions.
Fixed an error with vignettes building on Linux machines.
sageLabels exported but not documented.
factorCheck diagnostic function exported.
Bug fix in the semanticCoherence() function that affected content covariate models.
Bug fix to plot.STM() where, for content covariate models with only a subset of topics requested, the labels would show up as mostly NA. Thanks to Jetson Leder-Luis for pointing this out.
Bug fix for the readCorpus() function with txtorg vocab. Thanks to Justin Farrell for pointing this out.
Added some diagnostics to notify the user when words have been dropped in preprocessing.
Automatically coerce dates to numeric in spline function.
Very minor change to textProcessor() to accommodate an API change in tm version 0.6.
New option for plot.STM() which plots the distribution of theta values. Thanks to Antonio Coppola for coauthoring this component.
Deprecated option "custom" in "labeltype" of plot.STM(). Now you can simply specify the labels. Also added the ability to specify custom topic names rather than the default "Topic #:".
Bug fixes to various portions of plot.STM() that would cause labels to not print.
Added numerous error messages.
Added permutationTest() function and associated plot capabilities.
Updates to the vignette.
Added functionality to a few plotting functions.
When using summary() and labelTopics(), content covariate models now have labels thresholded by a small value. Thus one may see no labels or very few labels, particularly for topic-covariate interactions, which indicates that there are no sizable positive deviations from the baseline.
S3 method for findThoughts and ability to threshold by theta.
Allow estimateEffect() to receive a data frame. Thanks to Baoqiang Cao for pointing this out.
Major updates to the vignette.
Minor updates to several plotting functions.
Fixed an error where labelTopics() would mislabel when passed topic numbers out of order. Thanks to Jetson Leder-Luis for pointing this out.
Introduction of the termitewriter function.
Version for submission to CRAN (2/28/2014)
Introduced new dataset poliblog5k and shrunk the footprint of the package.
Changed numerous alternate options and made some slight syntax changes to stm to finalize the API.
New build 2/14/2014
Fixed a small bug introduced in the last version which kept the defaults of manyTopics() from working.
Updated version posted to Github (2/13/2014)
Various improvements to plotting functions.
Setting the seed in selectModel() threw an error. This is now corrected. Thanks to Mark Bell for pointing this out.
First public version released on Github (2/5/2014)
This is a beta release and we may change some of the API before submission to CRAN.