Versions 0.01 - 0.19
This is an experimental package for exploring various empirical Bayes
problems usually involving Kiefer-Wolfowitz non parametric ML estimation
of various mixture problems. It supplants my earlier MeddeR package that
had some similar capabilities, but employed an interface to Mosek via Matlab.
Note that there is a .Rprofile file that may be needed to specify the
license file for mosek. This may be needed to get mosek() to agree to do something.
At some point Rmosek became capable of doing multicore things and simulations
became much more convenient using foreach().
Version 0.20
1. Added the tuning parameter rtol to KWDual and friends to control the
convergence tolerance. The default value of 1e-6 is the same as Mosek
but in some cases tightening it to say 1e-10 can produce a better solution.
There is a demo called "tannenbaum" that illustrates this.
Version 0.21
1. Fixed a bug in GVmix that was caused by my neglect of a Jacobian (mea
culpa!). This shouldn't affect simulation experience simv[123].R since it
was just a multiplicative factor in the loglikelihood. But this needs to
be checked.
Version 0.22
1. Added Bmix function for Binomial mixtures and a demo of this using the
Beckett and Diaconis tack data taken (shamelessly) from the DPpackage.
2. Added rtol parameter to the call for WGLVmix and WGVmix.
3. Some further cleaning up of bball: name matching and removal of observations on
players with fewer than 3 half seasons.
Version 0.23
1. Fixed the documentation for tacks data.
2. added option to collapse binomial data into cell counts in Bmix.
Version 0.26
1. For hist = TRUE option in GLmix the binning is now done with equally
spaced bins on the full support of the data. The parameter m can be used
to choose the bin width. And v can be given to separately control grid for
fit.
Version 0.27
1. Changed KWDual to return weighted log likelihood. This was ok as it
was when the weights were always 1/n but for the histogram binning, it needed
to be changed. Now it returns the average loglikelihood so for example it
GLmix it gets multiplied by n to get the usual loglikelihood value.
Version 0.28
1. Added TLmix to do mixtures of Student t's with known df. Note that the
equal quantile spacing of the v's is experimental.
Version 0.30
1. Cleanup of some issues in the medde.Rd file to prepare a version for CRAN.
Version 0.31
1. Reduced size of example in medde.Rd and moved other examples that ran
demos out of examples and into details section.
Version 0.32
1. Reduced size of examples again in medde.Rd and added SystemRequirements line to
Description file.
Version 0.35
1. Bug in TLmix that messed up the default v grid. (Damn missing minus signs!)
2. Changed the default grid in TLmix back to equally spaced from quantile
based. This probably needs further investigation.
3. Added pv to return in GLVmix.R and to WGLVmix.R
4. Added 2012 data to bball
5. Added g to the returned list from GVmix and WGVmix
6. Added TLVmix and WTLVmix to do Gaussian mixture estimation with Student
t formulation of the likelihood.
7. Added Gompertzmix and Weibullmix and the medflies data set.
Version 0.37
1. Changed the definition of g output by GLmix to agree with its man page.
If we ever get around to unifying all these functions this needs to be checked
carefully, there are various issues especially when there are weights
involved.
Version 0.38
1. Fixed a bug in the medde function for the monotonized Bayes rule
estimator.
Version 0.41
1. Removed GLVmix and WGLVmix since they are supplanted by TLVmix and
WTLVmix.
2. Added a predict.GLmix function as an illustration for how to do prediction
from the posterior for Lp norm loss with p in {0,1,2}.
3. Added a sigma argument to GLmix so that one can specify non standard
Gaussian noise.
4. Added a demo for prediction to illustrate (3.).
5. Added check for zero indices in the Loss = 1 predict.GLmix function
Version 0.47
1. Fixed the hist option in GLmix to deal with heterogeneous sigmas.
Version 0.48
1. Added control option to allow users to pass mosek control parameters,
mainly at this point to control num_threads for use in simulations.
Version 0.49
1. Added WGLVmix for bivariate panel problems like the income dynamics paper.
2. Added reference to JSS Convex Opt paper to medde man page
Version 0.50
1. Fixed Gompertzmix.R in accordance with Brian's advice.
Version 0.52
1. Major cleanup of code and conversion of documentation to roxygen2 format.
2. Added function to compute Cosslett (1983) estimator for binary choice
model.
3. Rationalized likelihood and Bayes Rule computations for KW functions.
Version 0.54
1. Removed dependence on SparseM, which was causing some difficulties with
R CMD check and NAMESPACE conflicts. With any luck this should also produce
some efficiency gain perhaps very slight.
2. Cleaned up Description file and added Jiaying and Ivan to authors list.
3. Added importFrom directives for stats graphics and methods packages.
4. Added option to compute KW solutions via the POGS procedure. This is
quite experimental at this stage. Try demo(GLmix2) to illustrate use and
performance.
Version 0.55
1. Reintroduced weights for all of the fitting routines that didn't already
have them. (As a consequence of trying to reconstruct medfly figures which
required them, as pointed out by Jiaying.)
Version 0.58
1. Commented out all the POGS options since CRAN didn't like package
dependencies that didn't have a proper repo.
Version 0.59
1. Fixed weights for GLmix which didn't work right when hist = TRUE option
was used.
2. Also fixed weights in Bmix and Weibull.
Version 0.61
1. Fixed bug in KWDual that inhibited passing rtol
2. Fixed date on Koenker-Mizera pointed about by Kurt
Version 0.62 submitted Feb 2 2016
1. Made lambda = 1 the default in Weibullmix
2. Added Norberg life insurance data.
3. Modified Weibullmix to allow lambda to be a vector to accommodate a linear
predictor for profile likelihood settings.
4. Modified Pmix to allow exposure variable
5. Added vignette rebayes.pdf
Version 0.63
1. added importFrom("stats", "dpois") to NAMESPACE.
2. added WTLmix to do iterative Normal/Gamma estimation for the independent
prior longitudinal model.
3. Fixed sign bug in mesh1 inside medde and added a demo to recreate Figure
from the paper Convex Optimization Shape Constraints, Compound Decisions and
Empirical Bayes Rules. XJ had the wrong sign coming out of mesh1.
4. Added Rxiv function for archiving table and figure files, still needs
work.
Version 0.65-8
1. Fixed nasty bug in Hellinger, 1:p not rep(1,p) in opro[2,] in medde.
2. Added weights option to medde.
3. Added a Silverman density estimation example to demos
4. Added predict methods for Pmix and Bmix
5. Added quantile loss option to predict methods
6. Added the vignette rebayes.Rnw and rebayes.bib files
7. Added another fitted Silverman curve to the Silverman demo to
illustrate that when lambda is very small the fit oscillates around a bit.
Version 0.70
1. Added Guvenen.rda to prepare for medde vignette.
2. Added medde vignette
Version 0.75
1. modified Cosslett to return log-likelihood
2. Added a profile likelihood demo for Cosslett
Version 0.80
1. Added the Bayesian Deconvolution vignette.
2. Updated medde vignette
Version 0.82
1. Added option for bivariate medde fitting, highly experimental,
since the triogram computation of the penalty contribution seems to be fishy.
Version 0.85
1. Drastic rewrite of medde following closely Ivan's nddcc approach, at this
point no bivariate option, but the 1d version seems to be much more stable.
Version 0.86
1. Fixed formula for the primal problem P_\alpha in the vignette as noted
by Ivan (email May 9, 2017).
Version 1.2
1. Updated some references due to publication of the JSS vignette.
2. Added demo meddep that illustrates that the primal and dual solutions
overplot.
Version 1.3
1. Modified some fitting functions to address the observation of Dave Zhao
that we should always set d = 1, reflecting that we are estimating a discrete
df, rather than a histogram-type density.
Version 1.4
1. Modified the Pmix function to allow truncation as illustrated in the
added demo Shakespeare.R which is described further in our comment on Efron's
2019 Statistical Science paper.
Version 1.5
1. Modified all the XXmix functions to specify d == 1, rather than d =
diff(v) as in prior versions. This reflects discussions with Dave Zhao
(UIUC) and the realization that KW is really estimating a CDF not a pdf.
This may have unintended consequences for some old code that explicitly
renormalized the solution for f. Check sum(fit$y) as an initial sanity check.
2. Modified the KWDual function to (optionally) use the Mosek V9 exponential
cone constraint formulation. This option is done automatically based on
a test of packageVersion("Rmosek") that reveals which version of Mosek is
installed.
Version 1.9
1. Repaired a bug in the V9 code for medde() with alpha = 1.
2. Added a Grenander demo to show that the classical Grenander NPMLE
for a decreasing density can be done several different ways.
3. Removed the primal code for meddep that was intended to confirm
that the dual and primal version of medde were computing the same thing.
This could be reintroduced, but it would require adapting the primal code
in the prior version of the demo to work for V9. The simplified demo
now is called medde1 rather than meddep and just compares estimates of
a Student t density with 4 different values of alpha.
Version 2.0
1. Changed the stop() statement when the Mosek termination code isn't 0,
to a warning(), since in Mosek 9 we were occasionally getting stalled
solutions that proved to be essentially the same as those produced by Mosek 8.
Version 2.1
1. Added an option to predict.Pmix and a new demo Pmix1 to
illustrate it..
Version 2.2
1. Simplified the medde specification of the mosek problem for alpha < 0
cases.
2. Removed check of response$code for mosek solution which produced lots
of annoying mosek stalled messages that didn't seem to be very meaningful.
Version 2.3
1. Added B2mix function for bivariate binomial mixtures
Version 2.31
1. Replaced an undefined mu variable in predict.B2mix.R.
2. Fixed the definition of the OraclelogLik in B2mix1 demo so that it is
evaluated at the true mixture distribution, rather than assuming that the
oracle knows which p vector is associated with each observation.
Version 2.33-6
1. Added a KWPrimal function to present mosek with primal rather than dual
problems. This _should_ produce equivalent solutions, but badly needs further
testing. Intended initially to facilitate development of new code to enable
one to add further linear equality constraints on the mixing distribution.
2. Added new function for fitting a binomial model with possibly dependent
Poisson number of trials, and a demo BPmix1.
3. Added another demo fitting binomial model with possible dependent numbers
of trials, however in this case no Poisson assumption is made, we only hope
(!) to infer the dependence from the marginal distribution of the trial data.
This is highly experimental.
4. Added new function for Gaussian mixtures with standard deviations
proportional to 1/sqrt(m) where m is a Poisson random variable and possibly
dependent on the mean of the Gaussians. This is analogous to code{BPmix}.
There is an associated demo NPmix1.
Version 2.37
1. Added new function BDGLmix to implement Efron's Bayesian deconvolution
(log-spline) estimator. It returns an object that is compatible with the
GLmix object, so predict can be used to compute estimates of posterior means
and quantiles. This seems particularly useful for confidence intervals as
shown here: http://www.econ.uiuc.edu/~roger/research/ebayes/cieb.pdf
Version 2.38
1. Added several utility functions rKW, qKW, bwKW for computing random
samples, quantiles, and smoothing bandwidths respectively for KW objects.
Version 2.39
1. Added GLVmix and associated predict and plot methods. This is like
the longitudinal WGLVmix fitting function except that the input data
structure is simplified for the case where the raw longitudinal data is
not available.
Version 2.40
1. Added KW2smooth and associated bandwidth function bwKW2
Version 2.41
1. Modified WGLVmix and GLVmix so that they return solutions in the "right"
order, that is such that contour(u,v,matrix(fuv,pu,pv)) works properly.
NB This change means that older plotting code such as used for the income dynamics
paper and the baseball paper would need to be modified slightly.
Version 2.43
1. Fixed bug in WGLVmix in construction of B matrix
2. Added predict method for WGLVmix and added "GLVmix" to the class
inheritance so that the plot method for GLVmix should work for WGLVmix
objects.
3. Changed predict methods for WGLVmix and GLVmix so that if newdata is
missing and Loss = 2 then just return object$du.
4. WGLVmix and GLVmix now return A matrix.
5. Added Lfdr generic and methods for GLVmix and WGLVmix objects.
6. Added qKW2, Finv and ThreshFDR to the misc.R file
Version 2.44
1. Modified Lfdr.R so Left option uses weak inequality. Jiaying's email
of 29/8/20.
Version 2.46
1. In version 2.45 a bug was introduced in predict.GLmix, attributable
to an inadvisable matrix vector multiplication. This bug was fixed here
and in other predict functions, in particular their quantile options were
also modified.
Version 2.47
1. Modified bwKW so bandwidth is bounded away from zero.
Version 2.48
1. Modified Gammamix to add eps to calling sequence since prior default
wasn't small enough. The new default is eps = 1e-10.
Version 2.50
1. Added the function RLR to estimate logistic regression models subject to
an l1 penalty on a linear transformation of the parameters.
2. Added logLik to the return object of RLR.
2.51
1. Bmix has added option to check uniqueness of solution and further
explanation about this in documentation.
2. GVmix has tighter eps to avoid problem with dgamma evaluation and outer
brute force evaluation of A matrix replaced by loop and system dgamma().
This was motivated by numerical problems with previous version for Daniel
Wilhelm's testing.
3. GLVmix and WGLVmix also replace brute force dgamma with system function.
4. Added v <- v[v > 0] for these scale mixture grids to avoid dgamma errors.
5. Reliance on dgompertz from reliaR package was removed due to anticipated
archival of that package announced 21 March 2022 by Kurt.
2.52
1. Added censoring option for Weibullmix
2.54
1. Removed mention of the pogs option for KWDual in the documention.
2. Changed default grid for TLmix to avoid extreme support points for grid.
3. Added error checking to KWDual to report Mosek warnings and errors. I
tried to distinguish between Mosek errors and warnings, but was defeated when
I found that the STALL warning was not 100006 as documented, but 10006. So
all Mosek messages are warnings as things currently stand. This change was
prompted by two inquiries that reported obscure R messages when the Mosek
message that it couldn't find the license file would have been better.
4. Turns out that the Mosek STALL message is more common than I originally
thought, so I've altered the changes in 3.) so that this particular warning
is only made when verb > 0.
5. Fixed various utils::packageVersion("Rmosek") < 9 to replace 9 by "9" per
Kurt's memo of 18/07/23.
6. fixed arg check in Gompertzmix