# Estimating Weighted Logit Models

This vignette demonstrates an example of how to use the logitr() function with the weightsName argument to estimate weighted logit models.

# The data

This example uses the cars_us data set from Helveston et al. (2015) containing 384 stated choice observations from US car buyers. Conjoint surveys were fielded in 2012 online in the US on Amazon Mechanical Turk and in person at the 2013 Pittsburgh Auto show. Participants were asked to select a vehicle from a set of three alternatives. Each participant answered 15 choice questions.

In the utility models described below, the data variables are represented as follows:

Symbol Variable
$$p$$ The price in US dollars.
$$x_{j}^{\mathrm{hev}}$$ Dummy variable for HEV vehicle type
$$x_{j}^{\mathrm{phev10}}$$ Dummy variable for PHEV10 vehicle type
$$x_{j}^{\mathrm{phev20}}$$ Dummy variable for PHEV20 vehicle type
$$x_{j}^{\mathrm{phev40}}$$ Dummy variable for PHEV40 vehicle type
$$x_{j}^{\mathrm{bev75}}$$ Dummy variable for BEV75 vehicle type
$$x_{j}^{\mathrm{bev100}}$$ Dummy variable for BEV100 vehicle type
$$x_{j}^{\mathrm{bev150}}$$ Dummy variable for BEV150 vehicle type
$$x_{j}^{\mathrm{phevFastcharge}}$$ Dummy variable for if the PHEV has a fast charging capability
$$x_{j}^{\mathrm{bevFastcharge}}$$ Dummy variable for if the BEV has a fast charging capability
$$x_{j}^{\mathrm{opCost}}$$ The vehicle operating costs (cents / mile)
$$x_{j}^{\mathrm{accelTime}}$$ The vehicle 0-60mph acceleration time
$$x_{j}^{\mathrm{american}}$$ Dummy variable for an American brand
$$x_{j}^{\mathrm{japanese}}$$ Dummy variable for a Japanese brand
$$x_{j}^{\mathrm{chinese}}$$ Dummy variable for a Chinese brand
$$x_{j}^{\mathrm{skorean}}$$ Dummy variable for a S. Korean brand

# The utility model

In this example, we’ll estimate two versions of the following utility model in the WTP space: one without weights and one with weights. Notation is taken from Helveston et al. (2015):

$$$\begin{split} &u_{j} = \lambda (\\ &\omega_1 x_{j}^{\mathrm{hev}} + \omega_2 x_{j}^{\mathrm{phev10}} + \omega_3 x_{j}^{\mathrm{phev20}} + \omega_4 x_{j}^{\mathrm{phev40}} +\\ &\omega_5 x_{j}^{\mathrm{bev75}} + \omega_6 x_{j}^{\mathrm{bev100}} + \omega_7 x_{j}^{\mathrm{bev150}} +\\ &\omega_8 x_{j}^{\mathrm{phevFastcharge}} + \omega_9 x_{j}^{\mathrm{bevFastcharge}} + \omega_{10} x_{j}^{\mathrm{opCost}} + \omega_{11} x_{j}^{\mathrm{accelTime}} +\\ &\omega_{12} x_{j}^{\mathrm{american}} + \omega_{13} x_{j}^{\mathrm{japanese}} + \omega_{14} x_{j}^{\mathrm{chinese}} + \omega_{15} x_{j}^{\mathrm{skorean}} - p_{j}\\ &) +\varepsilon_{j} \end{split} \label{eq:mnlWtpCarsExample}$$$

where all the $$\omega$$ parameters have units of dollars and $$\lambda$$ is the scale parameter.

# Unweighted model

To estimate the model, first load the logitr package:

library(logitr)

Estimate the unweighted model using the logitr() function:

mnl_wtp_unweighted <- logitr(
data       = cars_us,
choiceName = 'choice',
obsIDName  = 'obsnum',
parNames   = c(
'hev', 'phev10', 'phev20', 'phev40', 'bev75', 'bev100', 'bev150',
'american', 'japanese', 'chinese', 'skorean', 'phevFastcharge',
'bevFastcharge','opCost', 'accelTime'),
priceName = 'price',
modelSpace = 'wtp',
options = list(
# Since WTP space models are non-convex, run a multistart:
numMultiStarts = 10,
# Because the computed WTP from the preference space model has values
# as large as 8, I increase the boundaries of the random starting values:
startParBounds = c(-10, 10))
)
#> Running Multistart 1 of 10...
#> Running Multistart 2 of 10...
#> Running Multistart 3 of 10...
#> Running Multistart 4 of 10...
#> Running Multistart 5 of 10...
#> Running Multistart 6 of 10...
#> Running Multistart 7 of 10...
#> Running Multistart 8 of 10...
#> Running Multistart 9 of 10...
#> Running Multistart 10 of 10...
#> Done!
summary(mnl_wtp_unweighted)
#> =================================================
#> SUMMARY OF ALL MULTISTART RUNS:
#>
#>    run      logLik iterations status
#> 1    1   -4616.952         20      3
#> 2    2  -91921.671         12     -1
#> 3    3 -217550.757         12     -1
#> 4    4 -227771.717         12     -1
#> 5    5 -399278.274         12     -1
#> 6    6 -399225.516         12     -1
#> 7    7  -47900.544         12     -1
#> 8    8 -138607.411         12     -1
#> 9    9 -159194.787         12     -1
#> 10  10   -6753.815         12     -1
#> ---
#> Use statusCodes() to view the meaning of the status codes
#>
#> Below is the summary of run 1 of 10 multistart runs
#> (the run with the largest log-likelihood value)
#> =================================================
#> MODEL SUMMARY:
#>
#> Model Space:   Willingness-to-Pay
#> Model Run:                1 of 10
#> Iterations:                    20
#> Elapsed Time:            0h:0m:1s
#> Weights Used?:              FALSE
#>
#> Model Coefficients:
#>                  Estimate StdError    tStat   pVal signif
#> lambda           0.073874 0.002049  36.0587 0.0000    ***
#> hev              0.809009 0.996838   0.8116 0.4171
#> phev10           1.168442 1.065878   1.0962 0.2730
#> phev20           1.650408 1.078537   1.5302 0.1260
#> phev40           2.582623 1.070764   2.4119 0.0159      *
#> bev75          -16.045739 1.215388 -13.2022 0.0000    ***
#> bev100         -13.002791 1.197046 -10.8624 0.0000    ***
#> bev150          -9.572838 1.151531  -8.3131 0.0000    ***
#> american         2.343738 0.796228   2.9436 0.0033     **
#> japanese        -0.372092 0.792110  -0.4697 0.6386
#> chinese        -10.269311 0.869738 -11.8074 0.0000    ***
#> skorean         -6.028549 0.832726  -7.2395 0.0000    ***
#> phevFastcharge   2.878382 0.812240   3.5438 0.0004    ***
#> bevFastcharge    2.919785 0.906811   3.2198 0.0013     **
#> opCost          -1.636199 0.066921 -24.4499 0.0000    ***
#> accelTime       -1.697230 0.159503 -10.6408 0.0000    ***
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#>
#> Model Fit Values:
#>
#> Log.Likelihood.         -4616.9518008
#> Null.Log.Likelihood.    -6328.0067827
#> AIC.                     9265.9036000
#> BIC.                     9372.4427000
#> Number.of.Observations.  5760.0000000

# Weighted model

To estimate the weighted model, simply add the weightsName argument to the call to logitr(), referring to the column of weights that will be used to weight each choice observation. In this example, the weights used in the weights column range from 0.2 to 5:

summary(cars_us$weights) #> Min. 1st Qu. Median Mean 3rd Qu. Max. #> 0.2000 0.2000 0.2000 0.6891 0.2000 5.0000 mnl_wtp_weighted <- logitr( data = cars_us, choiceName = 'choice', obsIDName = 'obsnum', parNames = c( 'hev', 'phev10', 'phev20', 'phev40', 'bev75', 'bev100', 'bev150', 'american', 'japanese', 'chinese', 'skorean', 'phevFastcharge', 'bevFastcharge','opCost', 'accelTime'), priceName = 'price', modelSpace = 'wtp', weightsName = 'weights', # This is the key argument for enabling weights options = list( numMultiStarts = 10, startParBounds = c(-10, 10)) ) #> Running Multistart 1 of 10... #> Running Multistart 2 of 10... #> Running Multistart 3 of 10... #> Running Multistart 4 of 10... #> Running Multistart 5 of 10... #> Running Multistart 6 of 10... #> Running Multistart 7 of 10... #> Running Multistart 8 of 10... #> Running Multistart 9 of 10... #> Running Multistart 10 of 10... #> Done! Print a summary of the results: summary(mnl_wtp_weighted) #> ================================================= #> SUMMARY OF ALL MULTISTART RUNS: #> #> run logLik iterations status #> 1 1 -3425.630 20 3 #> 2 2 -68903.400 12 -1 #> 3 3 -171394.794 12 -1 #> 4 4 -91447.066 12 -1 #> 5 5 -3882.399 57 -1 #> 6 6 -320972.429 12 -1 #> 7 7 -207883.379 12 -1 #> 8 8 -11220.800 12 -1 #> 9 9 -94653.348 12 -1 #> 10 10 -101932.682 12 -1 #> --- #> Use statusCodes() to view the meaning of the status codes #> #> Below is the summary of run 1 of 10 multistart runs #> (the run with the largest log-likelihood value) #> ================================================= #> MODEL SUMMARY: #> #> Model Space: Willingness-to-Pay #> Model Run: 1 of 10 #> Iterations: 20 #> Elapsed Time: 0h:0m:0.56s #> Weights Used?: TRUE #> #> Model Coefficients: #> Estimate StdError tStat pVal signif #> lambda 0.052287 0.002092 24.9882 0.0000 *** #> hev -1.173829 1.611264 -0.7285 0.4663 #> phev10 0.026970 1.781966 0.0151 0.9879 #> phev20 1.695364 1.751177 0.9681 0.3330 #> phev40 2.650029 1.774308 1.4936 0.1353 #> bev75 -20.123043 1.977590 -10.1755 0.0000 *** #> bev100 -19.483936 1.983661 -9.8222 0.0000 *** #> bev150 -13.682128 1.958643 -6.9855 0.0000 *** #> american 8.189383 1.288470 6.3559 0.0000 *** #> japanese 0.933201 1.288933 0.7240 0.4691 #> chinese -19.003111 1.549420 -12.2647 0.0000 *** #> skorean -9.508394 1.397768 -6.8026 0.0000 *** #> phevFastcharge 3.948589 1.330074 2.9687 0.0030 ** #> bevFastcharge 3.332094 1.477859 2.2547 0.0242 * #> opCost -1.597512 0.106356 -15.0205 0.0000 *** #> accelTime -1.171728 0.255409 -4.5876 0.0000 *** #> --- #> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 #> #> Model Fit Values: #> #> Log.Likelihood. -3425.6303371 #> Null.Log.Likelihood. -4360.5909275 #> AIC. 6883.2607000 #> BIC. 6989.7998000 #> McFadden.R2. 0.2144114 #> Adj..McFadden.R2 0.2107422 #> Number.of.Observations. 5760.0000000 # Compare results Here is a comparison of the coefficients between the weighted and unweighted models. All of the significant coefficients have the same sign, but the magnitudes shift some based on the differential weighting of each individual choice in the weighted model: coef_compare <- data.frame( Unweighted = coef(mnl_wtp_unweighted), Weighted = coef(mnl_wtp_weighted)) coef_compare #> Unweighted Weighted #> lambda 0.07387402 0.05228733 #> hev 0.80900893 -1.17382947 #> phev10 1.16844217 0.02697028 #> phev20 1.65040790 1.69536415 #> phev40 2.58262306 2.65002859 #> bev75 -16.04573856 -20.12304300 #> bev100 -13.00279092 -19.48393620 #> bev150 -9.57283825 -13.68212753 #> american 2.34373764 8.18938288 #> japanese -0.37209245 0.93320075 #> chinese -10.26931078 -19.00311133 #> skorean -6.02854890 -9.50839420 #> phevFastcharge 2.87838215 3.94858853 #> bevFastcharge 2.91978513 3.33209388 #> opCost -1.63619932 -1.59751197 #> accelTime -1.69722990 -1.17172825 Here is a comparison of the log-likelihood for the weighted and unweighted models: logLik_compare <- c( "Unweighted" = mnl_wtp_unweighted$logLik,
"Weighted" = mnl_wtp_weighted\$logLik)
logLik_compare
#> Unweighted   Weighted
#>  -4616.952  -3425.630

# References

Helveston, John Paul, Yimin Liu, Eleanor Mcdonnell Feit, Erica R. H. Fuchs, Erica Klampfl, and Jeremy J. Michalek. 2015. Will Subsidies Drive Electric Vehicle Adoption? Measuring Consumer Preferences in the U.S. and China.” Transportation Research Part A: Policy and Practice 73: 96–112. https://doi.org/10.1016/j.tra.2015.01.002.