This vignette demonstrates an example of how to use the logitr()
function to estimate multinomial logit (MNL) models with preference space and WTP space utility parameterizations.
This example uses the yogurt data set from Jain et al. (1994). The data set contains 2,412 choice observations from a series of yogurt purchases by a panel of 100 households in Springfield, Missouri, over a roughly two-year period. The data were collected by optical scanners and contain information about the price, brand, and a “feature” variable, which identifies whether a newspaper advertisement was shown to the customer. There are four brands of yogurt: Yoplait, Dannon, Weight Watchers, and Hiland, with market shares of 34%, 40%, 23% and 3%, respectively.
In the utility models described below, the data variables are represented as follows:
Symbol | Variable |
---|---|
\(p\) | The price in US dollars. |
\(x_{j}^{\mathrm{Feat}}\) | Dummy variable for whether the newspaper advertisement was shown to the customer. |
\(x_{j}^{\mathrm{Hiland}}\) | Dummy variable for the “Highland” brand. |
\(x_{j}^{\mathrm{Yoplait}}\) | Dummy variable for the “Yoplait” brand. |
\(x_{j}^{\mathrm{Dannon}}\) | Dummy variable for the “Dannon” brand. |
This example will estimate the following homogeneous multinomial logit model in the preference space:
\[\begin{equation} u_{j} = \alpha p_{j} + \beta_1 x_{j}^{\mathrm{Feat}} + \beta_2 x_{j}^{\mathrm{Hiland}} + \beta_3 x_{j}^{\mathrm{Yoplait}} + \beta_4 x_{j}^{\mathrm{Dannon}} + \varepsilon_{j} \label{eq:mnlPrefExample} \end{equation}\]
where the parameters \(\alpha\), \(\beta_1\), \(\beta_2\), \(\beta_3\), and \(\beta_4\) have units of utility.
To estimate the model, first load the logitr package:
library(logitr)
Estimate the model using the logitr()
function:
<- logitr(
mnl_pref data = yogurt,
choiceName = 'choice',
obsIDName = 'obsID',
parNames = c('price', 'feat', 'hiland', 'yoplait', 'dannon')
)
#> Running Model...
#> Done!
Print a summary of the results:
summary(mnl_pref)
#> =================================================
#> MODEL SUMMARY:
#>
#> Model Space: Preference
#> Model Run: 1 of 1
#> Iterations: 21
#> Elapsed Time: 0h:0m:0.12s
#> Weights Used?: FALSE
#>
#> Model Coefficients:
#> Estimate StdError tStat pVal signif
#> price -0.366581 0.024366 -15.0448 0 ***
#> feat 0.491436 0.120063 4.0932 0 ***
#> hiland -3.074385 0.145383 -21.1468 0 ***
#> yoplait 1.375768 0.088982 15.4613 0 ***
#> dannon 0.641174 0.054498 11.7650 0 ***
#> ---
#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#>
#> Model Fit Values:
#>
#> Log.Likelihood. -2656.8878782
#> Null.Log.Likelihood. -3343.7419990
#> AIC. 5323.7758000
#> BIC. 5352.7168000
#> McFadden.R2. 0.2054148
#> Adj..McFadden.R2 0.2039195
#> Number.of.Observations. 2412.0000000
View the estimated model coefficients:
coef(mnl_pref)
#> price feat hiland yoplait dannon
#> -0.3665807 0.4914356 -3.0743854 1.3757681 0.6411738
Compute the WTP implied from the preference space model:
<- wtp(mnl_pref, priceName = "price") wtp_mnl_pref
wtp_mnl_pref#> Estimate StdError tStat pVal signif
#> lambda 0.366581 0.024358 15.0495 0e+00 ***
#> feat 1.340593 0.358979 3.7345 2e-04 ***
#> hiland -8.386654 0.509515 -16.4601 0e+00 ***
#> yoplait 3.752975 0.470260 7.9806 0e+00 ***
#> dannon 1.749066 0.199298 8.7761 0e+00 ***
This example will estimate the following homogeneous multinomial logit model in the WTP space:
\[\begin{equation} u_{j} = \lambda ( \omega_1 x_{j}^{\mathrm{Feat}} + \omega_2 x_{j}^{\mathrm{Hiland}} + \omega_3 x_{j}^{\mathrm{Yoplait}} + \omega_4 x_{j}^{\mathrm{Dannon}} - p_{j}) + \varepsilon_{j} \label{eq:mnlWtpExample} \end{equation}\]
where the parameters \(\omega_1\), \(\omega_2\), \(\omega_3\), and \(\omega_4\) have units of dollars and \(\lambda\) is the scale parameter.
Estimate the model using the logitr()
function:
# Extract the WTP computed from the preference space model
# to use as the initial starting values
<- wtp_mnl_pref$Estimate
startingValues
<- logitr(
mnl_wtp data = yogurt,
choiceName = 'choice',
obsIDName = 'obsID',
parNames = c('feat', 'hiland', 'yoplait', 'dannon'),
priceName = 'price',
modelSpace = 'wtp',
options = list(
# Since WTP space models are non-convex, run a multistart:
numMultiStarts = 10,
# If you want to view the results from each multistart run,
# set keepAllRuns=TRUE:
keepAllRuns = TRUE,
# Use the computed WTP from the preference space model as the starting
# values for the first run:
startVals = startingValues,
# Because the computed WTP from the preference space model has values
# as large as 8, I increase the boundaries of the random starting values:
startParBounds = c(-10, 10)))
#> Running Multistart 1 of 10...
#> NOTE: Using user-provided starting values for this run
#> Running Multistart 2 of 10...
#> Running Multistart 3 of 10...
#> Running Multistart 4 of 10...
#> Running Multistart 5 of 10...
#> Running Multistart 6 of 10...
#> Running Multistart 7 of 10...
#> Running Multistart 8 of 10...
#> Running Multistart 9 of 10...
#> Running Multistart 10 of 10...
#> Done!
Print a summary of the results:
summary(mnl_wtp)
#> =================================================
#> SUMMARY OF ALL MULTISTART RUNS:
#>
#> run logLik iterations status
#> 1 1 -2791.167 37 3
#> 2 2 -2804.468 83 3
#> 3 3 -2803.789 82 3
#> 4 4 -2656.888 115 3
#> 5 5 -2900.218 47 -1
#> 6 6 -2656.888 40 3
#> 7 7 -2850.196 61 -1
#> 8 8 -2803.533 86 3
#> 9 9 -2803.581 102 3
#> 10 10 -37481.797 12 -1
#> ---
#> Use statusCodes() to view the meaning of the status codes
#>
#> Below is the summary of run 6 of 10 multistart runs
#> (the run with the largest log-likelihood value)
#> =================================================
#> MODEL SUMMARY:
#>
#> Model Space: Willingness-to-Pay
#> Model Run: 6 of 10
#> Iterations: 40
#> Elapsed Time: 0h:0m:0.24s
#> Weights Used?: FALSE
#>
#> Model Coefficients:
#> Estimate StdError tStat pVal signif
#> lambda 0.366591 0.024367 15.0446 0e+00 ***
#> feat 1.340561 0.355862 3.7671 2e-04 ***
#> hiland -8.386701 0.502486 -16.6904 0e+00 ***
#> yoplait 3.752872 0.168118 22.3229 0e+00 ***
#> dannon 1.749022 0.179892 9.7226 0e+00 ***
#> ---
#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#>
#> Model Fit Values:
#>
#> Log.Likelihood. -2656.8878781
#> Null.Log.Likelihood. -3343.7419990
#> AIC. 5323.7758000
#> BIC. 5352.7168000
#> McFadden.R2. 0.2054148
#> Adj..McFadden.R2 0.2039195
#> Number.of.Observations. 2412.0000000
View the estimated model coefficients:
coef(mnl_wtp)
#> lambda feat hiland yoplait dannon
#> 0.3665908 1.3405612 -8.3867008 3.7528724 1.7490222
Since WTP space models are non-convex, you cannot be certain that the model reached a global solution, even when using a multi-start. However, homogeneous models in the preference space are convex, so you are guaranteed to find the global solution in that space. Therefore, it can be useful to compute the WTP from the preference space model and compare it against the WTP from the WTP space model. If the WTP values and log-likelihood values from the two model spaces are equal, then the WTP space model is likely at a global solution.
To compare the WTP and log-likelihood values between the preference space and WTP space models, use the wtpCompare()
function:
<- wtpCompare(mnl_pref, mnl_wtp, priceName = 'price') wtp_mnl_comparison
wtp_mnl_comparison#> pref wtp difference
#> lambda 0.366581 0.3665908 0.00000976
#> feat 1.340593 1.3405612 -0.00003180
#> hiland -8.386654 -8.3867008 -0.00004677
#> yoplait 3.752975 3.7528724 -0.00010263
#> dannon 1.749066 1.7490222 -0.00004380
#> logLik -2656.887878 -2656.8878781 0.00000011