# Reproducing Results

#### 2016-07-17

library(heuristica)

# Prior results

Heuristica can be used to reproduce results from prior research, making sure to have the same data set and heuristics. Consider the city population results on page 103 of Simple Heuristics That Make Us Smart (citation below). It reported the following percent correct for models fitting city populations– that is, when the models are given all 83 cities to fit and then are asked to estimate the poulation of the same 83 cities:

Regression Dawes Take The Best Minimalist
74 74 74 70

# Getting the same data and models

## Data set

To use the exact same data set as that research, we need city_population_original, even though it had some transcription errors from the almanac. (city_population corrected the transciption errors.)

## Models

Here is how the models from prior research map the heuristica models:

Regression Dawes’ model Take The Best Minimalist
regInterceptModel unitWeightModel ttbModel minModel

Two of the mappings are non-obvious.

regInterceptModel By default, regression models include the intercept, so prior research included it, and that maps to regInterceptModel. However, when comparing two estimates, the intercepts cancel out, so fitting the intercept just wastes a degree of freedom. Future research should use regModel.

unitWeightModel Because the term “dawes model” did not catch on, this package uses the more commonly-known descriptive name, unitWeightModel. The models are the same.

(Likewise Franklin’s model from the book maps to validityWeightModel in heuristica.)

# Simulation

Now fit the models to the data. Regression will determine its beta weights, ttb will determine cue order, and unit weight linear and minimalist will determine cue direction (whether having a soccer team is associate with higher or lower population).

data_set <- city_population_original
criterion_col <- 3    # Population
cols_to_fit <- 4:ncol(data_set) # The 9 cues

reg <- regInterceptModel(data_set, criterion_col, cols_to_fit)
ttb <- ttbModel(data_set, criterion_col, cols_to_fit)
unit <- unitWeightModel(data_set, criterion_col, cols_to_fit)
min <- minModel(data_set, criterion_col, cols_to_fit)

To determin the percent of correct inference in fitting, pass the exact same data set into the percent correct function with the fitted models.

out <- percentCorrectList(data_set, list(reg, ttb, unit, min))
# Round values to make comparison easier.
round(out)
##   regInterceptModel ttbModel unitWeightModel minModel
## 1                74       74              74       71

The results match the results reported in the book– 74% correct for all models except minimalist, which got only 70% correct.

# Prediction

The code abov simulations fitting. To simulate prediction, fit the models with a subset of the cities and run percentCorrect with the other cities. Then repeat for various ways to split the data and run summary statistics on the results. A demonstration of how to do this is in the cross-validation vignette.