# Checking and Improving Results of package Synth

## Introduction

This vignette illustrates the usage of improveSynth. For a more general introduction to package MSCMT see its main vignette.

Estimating a synthetic control method (SCM) model involves searching for an approximate solution of a nested optimization problem. Although the formulation of the optimization problem is quite simple, finding a (good approximate) solution can be hard for several reasons; see Becker and Klößner (2017) and Becker and Klößner (2018). While implementing package MSCMT, we put a lot of effort into the design of a smart and robust (but still fast) optimization procedure.
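For orientation, the nested problem can be written down as follows. This is a sketch in the usual SCM notation and an assumption on our part, since this vignette does not state the formulas itself: X_1, X_0, Z_1 and Z_0 are the matrices named in the Synth output further down, J is the number of control units, T is the number of periods over which the dependent loss is measured, and V is a diagonal matrix of non-negative predictor weights (normalized to sum to 1). Scaling conventions, such as the 1/T factor or a rescaling of the predictors, may differ between implementations.

```latex
\begin{aligned}
W^*(V) &= \operatorname*{arg\,min}_{W \in \Delta_J} \; (X_1 - X_0 W)^\top V \,(X_1 - X_0 W)
  && \text{(inner problem, `loss W')} \\
V^* &= \operatorname*{arg\,min}_{V} \; \tfrac{1}{T}\,\bigl(Z_1 - Z_0 W^*(V)\bigr)^\top \bigl(Z_1 - Z_0 W^*(V)\bigr)
  && \text{(outer problem, `loss V')} \\
\Delta_J &= \Bigl\{ W \in \mathbb{R}^J : w_j \ge 0,\ \textstyle\sum_{j=1}^{J} w_j = 1 \Bigr\}
\end{aligned}
```

In this notation, a reported W is feasible only if it actually solves the inner problem for the reported V. The dependent loss is meaningful only when evaluated at such a W.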

Apart from function mscmt for the estimation of SCM models based on our model syntax, we also included the convenience function improveSynth, which implements checks for feasibility and optimality of results delivered by package Synth. Below, we illustrate how to use improveSynth.

## First Example

We exemplify the usage of improveSynth based on the first example of function synth in package Synth.

### Generating the result of package Synth

The following code is thus essentially borrowed from the example section of the corresponding help page (all comments have been removed):

```r
library(Synth)
## ##
## ## Synth Package: Implements Synthetic Control Methods.
## ## See http://www.mit.edu/~jhainm/software.htm for additional information.
data(synth.data)
dataprep.out <-
  dataprep(
    foo = synth.data,
    predictors = c("X1", "X2", "X3"),
    predictors.op = "mean",
    dependent = "Y",
    unit.variable = "unit.num",
    time.variable = "year",
    special.predictors = list(
      list("Y", 1991, "mean"),
      list("Y", 1985, "mean"),
      list("Y", 1980, "mean")
    ),
    treatment.identifier = 7,
    controls.identifier = c(29, 2, 13, 17, 32, 38),
    time.predictors.prior = c(1984:1989),
    time.optimize.ssr = c(1984:1990),
    unit.names.variable = "name",
    time.plot = 1984:1996
  )
```

```r
synth.out <- synth(dataprep.out)
##
## X1, X0, Z1, Z0 all come directly from dataprep object.
##
##
## ****************
##  searching for synthetic control unit
##
##
## ****************
## ****************
## ****************
##
## MSPE (LOSS V): 4.714688
##
## solution.v:
##  0.00490263 0.003884407 0.1972011 0.2707289 0.0007091301 0.5225738
##
## solution.w:
##  0.0001407318 0.004851527 0.1697786 0.2173031 0.6079231 2.9419e-06
```

### Checking the result

We check the result by applying function improveSynth to synth.out and dataprep.out:

```r
library(MSCMT)
synth2.out <- improveSynth(synth.out, dataprep.out)
## Results reported by package Synth
## =================================
##
## Optimal V    : 0.0049026303620646 0.00388440715187941 0.197201084472783
##                0.270728900094351 0.000709130113708991 0.522573847805214
## Optimal W*(V): 0.000140731762002351 0.00485152709141158 0.169778625515031
##                0.217303120466497 0.607923052750901 2.94191546847469e-06
## with corresponding predictor loss ('loss W') of 0.00588247
## and corresponding dependent loss ('loss V') of 4.714688.
##
##
## Feasibility of W*(V)
## ====================
##
## GOOD: W*(V) is (essentially) optimal (new loss W: 0.00587508910734561).
##
##
## Optimality of V
## ===============
##
## WARNING: 'Optimal' V (as reported by package Synth) is not optimal, (one
##          of potentially many) 'true' optimal V* (with sum(V*)=1):
## Optimal V*    : 5.6000357074924e-09 0.56000357074924 1.77473264640914e-08
##                 5.6000357074924e-09 5.60285881866157e-08 0.439996344274774
## Optimal W*(V*): 0 0 0.0456880744691394 0.250556360555894 0.627179685612279
##                 0.0765758793626869
## with corresponding predictor loss ('loss W') of 7.597302e-09
## and corresponding dependent loss ('loss V') of 4.440946.
```

As the output shows, package Synth generated a feasible solution, but this solution is suboptimal: the original dependent loss of 4.714688 is considerably larger than the dependent loss of 4.440946 obtained by improveSynth.
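To put the gap in numbers, the two 'loss V' values can be compared directly. The snippet below only copies the two values from the output above; the variable names are illustrative and not part of either package.

```r
# 'loss V' values copied from the improveSynth output above
loss_synth    <- 4.714688  # dependent loss reported by package Synth
loss_improved <- 4.440946  # dependent loss found by improveSynth

abs_gap <- loss_synth - loss_improved  # absolute reduction of the dependent loss
rel_gap <- abs_gap / loss_synth        # reduction relative to the Synth result

round(abs_gap, 6)        # 0.273742
round(100 * rel_gap, 1)  # 5.8 (percent)
```

In other words, the improved solution lowers the dependent loss by roughly 5.8 percent relative to the value reported by Synth.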

## Second Example

In the second example, we modify the first example by allowing package Synth to use genoud as the (outer) optimization algorithm.

### Generating the result of package Synth

genoud is switched on via the argument genoud = TRUE of function synth. We capture the output with capture.output because it is very verbose. Furthermore, the calculation is quite lengthy, so the results have been cached (see note 1 at the end of this vignette).

```r
if (file.exists("synth3.out.RData")) {
  load("synth3.out.RData")
} else {
  set.seed(42)
  out <- capture.output(synth3.out <- synth(dataprep.out, genoud = TRUE))
}
```
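The caching idea can be sketched generically as follows. This is a minimal illustration and not the mechanism used by this vignette: the helper name cached is hypothetical, and it stores a single object with saveRDS/readRDS rather than an .RData file read with load.

```r
# Hypothetical helper: return the cached result if 'file' exists,
# otherwise run 'compute()', store its result in 'file', and return it.
cached <- function(file, compute) {
  if (file.exists(file)) {
    return(readRDS(file))
  }
  result <- compute()
  saveRDS(result, file)
  result
}

tmp <- tempfile(fileext = ".rds")

# First call: the file does not exist yet, so the computation runs.
first <- cached(tmp, function() sum(1:10))

# Second call: the file exists, so 'compute' is never evaluated.
second <- cached(tmp, function() stop("not evaluated: result comes from the cache"))

unlink(tmp)  # deleting the cache file forces a fresh computation next time
```

As with the cached file of this vignette, deleting the file is what triggers a recomputation from scratch.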

### Checking the result

We again check the result by applying function improveSynth to synth3.out and dataprep.out:

```r
synth4.out <- improveSynth(synth3.out, dataprep.out)
## Results reported by package Synth
## =================================
##
## Optimal V    : 2.0267533885719e-10 0.282598702518196 0.0021778890504857
##                0.00278189732871801 0.000373699249646763 0.712067811650278
## Optimal W*(V): 0.0420138249827668 0.0112849868071455 0.0223432791829297
##                0.218134146153805 0.595395073764248 0.110828688687666
## with corresponding predictor loss ('loss W') of 0.0001350879
## and corresponding dependent loss ('loss V') of 4.328506.
##
##
## Feasibility of W*(V)
## ====================
##
## WARNING: W*(V) is NOT optimal and thus infeasible!
## 'True' W*(V): 0 0 0 0.229091487590933 0.715755416007496 0.0551530964015713
## with corresponding predictor loss ('loss W') of 4.4355e-05
## and corresponding dependent loss ('loss V') of 6.09574.
##
##
## Optimality of V
## ===============
##
## WARNING: 'Optimal' V (as reported by package Synth) is not optimal (W*(V)
##          was infeasible), (one of potentially many) 'true' optimal V*
##          (with sum(V*)=1):
## Optimal V*    : 5.6000357074924e-09 0.56000357074924 1.77473264640914e-08
##                 5.6000357074924e-09 5.60285881866157e-08 0.439996344274774
## Optimal W*(V*): 0 0 0.0456880744691394 0.250556360555894 0.627179685612279
##                 0.0765758793626869
## with corresponding predictor loss ('loss W') of 7.597302e-09
## and corresponding dependent loss ('loss V') of 4.440946.
```

Now, package Synth generated a solution with a dependent loss of 4.328506, which is even smaller than the dependent loss of 4.440946 obtained by improveSynth. However, the solution generated by Synth is infeasible: the inner optimization failed and returned a suboptimal weight vector w for the control units, which in turn led to a wrong calculation of the dependent loss (which, of course, depends on w). Plugging in the true optimal w (which depends on v) raises the dependent loss from 4.328506 to 6.09574, which uncovers the suboptimality of v.
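The mechanism can be reproduced on a tiny toy problem. The sketch below uses made-up data (not the Synth example data) and hypothetical helper names; it also omits any rescaling of predictors that the packages may apply. It is constructed so that the treated unit's predictors coincide with those of the first control unit, which makes w = (1, 0, 0) the unique solution of the inner problem.

```r
# Toy data (hypothetical): 2 predictors, 2 periods, 3 control units.
X0 <- matrix(c(1, 0,  0, 1,  1, 1), nrow = 2)  # predictors of controls, one column per unit
X1 <- c(1, 0)                                   # predictors of the treated unit
Z0 <- matrix(c(2, 2,  4, 4,  0, 0), nrow = 2)  # outcomes of controls, one column per unit
Z1 <- c(3, 3)                                   # outcomes of the treated unit
v  <- c(0.5, 0.5)                               # predictor weights (diagonal of V)

# 'loss W': V-weighted squared distance between predictors
predictor_loss <- function(w) sum(v * as.vector(X1 - X0 %*% w)^2)
# 'loss V': mean squared distance between outcomes
dependent_loss <- function(w) mean(as.vector(Z1 - Z0 %*% w)^2)

w_reported <- c(0.5, 0.5, 0)  # weights from a failed inner optimization
w_true     <- c(1, 0, 0)      # actual minimizer of the predictor loss

predictor_loss(w_reported)  # 0.25: not optimal for the inner problem
predictor_loss(w_true)      # 0
dependent_loss(w_reported)  # 0: looks excellent, but w_reported is infeasible
dependent_loss(w_true)      # 1: the dependent loss that v really delivers
```

Judged by the dependent loss alone, w_reported looks better, but it is not what the inner problem returns for this v. The same pattern appears in the output above, where correcting w raises the dependent loss.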

improveSynth is able to detect this severe problem and calculates an improved and feasible solution (the improved solution matches the solution obtained from the first call to improveSynth above, with a dependent loss of 4.440946).

## Summary

Issues with the inner and outer optimizers used in synth from package Synth may lead to infeasible or suboptimal solutions. This vignette illustrated the usage of the convenience function improveSynth from package MSCMT for checking and potentially improving results obtained from synth.

## References

Becker, Martin, and Stefan Klößner. 2017. “Estimating the Economic Costs of Organized Crime by Synthetic Control Methods.” Journal of Applied Econometrics 32 (7): 1367–1369. https://doi.org/10.1002/jae.2572.

Becker, Martin, and Stefan Klößner. 2018. “Fast and Reliable Computation of Generalized Synthetic Controls.” Econometrics and Statistics 5: 1–19. https://doi.org/10.1016/j.ecosta.2017.08.002.

1. To reproduce from scratch, please delete "synth3.out.RData" from the vignettes folder.