BLE_SRS

library(BayesSampling)

Application of the BLE to the Simple Random Sample design

(From Section 2.3.1 of Gonçalves, Moura and Migon, “Bayes linear estimation for finite population with emphasis on categorical data”)

In the simplest model, with no auxiliary variable and a simple random sample drawn from the population, the BLE_SRS() function computes the Bayes Linear Estimator for the individuals of the population. It takes the following parameters:

• $$y_s$$ - either a vector containing the observed values or just the value for the sample mean ($$\sigma$$ and $$n$$ parameters will be required in this case);
• $$N$$ - total size of the population;
• $$m$$ - prior mean. If NULL, sample mean will be used (non-informative prior);
• $$v$$ - prior variance of an element from the population ($$> \sigma^2$$). If NULL, it will tend to infinity (non-informative prior);
• $$\sigma$$ - prior estimate of variability (standard deviation) within the population. If NULL, sample variance will be used;
• $$n$$ - sample size. Necessary only if $$y_s$$ is the sample mean (ignored otherwise).
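
How these parameters interact can be sketched with a worked formula. The sketch assumes an exchangeable prior with $$E(y_i) = m$$, $$Var(y_i) = v$$ and $$Cov(y_i, y_j) = v - \sigma^2$$ for $$i \neq j$$. This is an assumption, chosen because it is consistent with the constraint $$v > \sigma^2$$ and with the outputs shown in the Examples; it is not quoted from the package source. The symbols $$\bar{y}_s$$ (sample mean), $$\hat{\beta}$$ and $$\hat{T}$$ are labels introduced here for the sample mean and for the quantities returned as est.beta and est.tot.

$$\hat{\beta} = m + \frac{v - \sigma^2}{(v - \sigma^2) + \sigma^2 / n} \, (\bar{y}_s - m)$$

$$\hat{V}(\hat{\beta}) = \frac{(v - \sigma^2) \, \sigma^2 / n}{(v - \sigma^2) + \sigma^2 / n}$$

$$\hat{T} = n \, \bar{y}_s + (N - n) \, \hat{\beta}$$

Under this reading, the estimate is a weighted average of the prior mean and the sample mean, and the weight on the sample mean grows with both $$v$$ and $$n$$.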

Vague Prior Distribution

Letting $$v \to \infty$$ while keeping $$\sigma^2$$ fixed, that is, assuming prior ignorance, yields the same estimator as the design-based approach for simple random sampling.

This can be achieved with the BLE_SRS() function by omitting the prior mean, the prior variance, or both, that is:

• $$m =$$ NULL - the sample mean will be used
• $$v =$$ NULL - prior variance will tend to infinity
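
The limiting behaviour can be illustrated numerically. The sketch below does not call BLE_SRS(); it computes by hand a shrinkage weight that is assumed here to describe the estimator (prior variance of the common mean taken as $$v - \sigma^2$$), using the sample from the help-page example. The function name `weight` is hypothetical.

```r
# Hand-computed illustration; an assumed formula, not BayesSampling code.
ys    <- c(5, 6, 8)   # sample from the help-page example
m     <- 6            # prior mean
sigma <- 1            # standard deviation within the population
n     <- length(ys)

# Assumed weight given to the sample mean for a prior variance v (v > sigma^2).
weight <- function(v) (v - sigma^2) / ((v - sigma^2) + sigma^2 / n)

# As v grows, the weight approaches 1 and the estimate approaches mean(ys).
for (v in c(5, 50, 5e3, 5e6)) {
  est <- m + weight(v) * (mean(ys) - m)
  cat(sprintf("v = %8.0f  weight = %.6f  estimate = %.6f\n", v, weight(v), est))
}
```

With a large $$v$$ the prior mean has almost no influence, which is the sense in which omitting $$v$$ corresponds to a non-informative prior.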

Examples

1. We will use the BigCity dataset from the TeachingSampling package for this example (we first draw a subsample of size $$10000$$ from it so that R can perform the calculations). Suppose we want to estimate the mean or the total Expenditure of this population from a simple random sample of only 20 individuals, while incorporating prior information (from a previous study or an expert’s judgment) about the mean expenditure (prior mean = $$300$$).
library(TeachingSampling)  # provides the BigCity dataset
data(BigCity)
set.seed(1)
Expend <- sample(BigCity$Expenditure, 10000)
mean(Expend)  # real mean expenditure, the target of the estimation
#> [1] 375.586
ys <- sample(Expend, size = 20, replace = FALSE)

Our design-based estimator for the mean will be the sample mean:

mean(ys)
#> [1] 479.869

Applying the prior information about the population, we can get a better estimate, especially when only a small sample is available:

Estimator <- BLE_SRS(ys, N = 10000, m = 300, v = 10.1^5, sigma = sqrt(10^5))
Estimator$est.beta
#>       Beta
#> 1 390.8338
Estimator$Vest.beta
#>         V1
#> 1 2524.999
Estimator$est.mean[1,]
#> [1] 390.8338
Estimator$Vest.mean[1:5,1:5]
#>           V1         V2         V3         V4         V5
#> 1 102524.999   2524.999   2524.999   2524.999   2524.999
#> 2   2524.999 102524.999   2524.999   2524.999   2524.999
#> 3   2524.999   2524.999 102524.999   2524.999   2524.999
#> 4   2524.999   2524.999   2524.999 102524.999   2524.999
#> 5   2524.999   2524.999   2524.999   2524.999 102524.999

2. Example from the help page

ys <- c(5,6,8)
N <- 5
m <- 6
v <- 5
sigma <- 1
Estimator <- BLE_SRS(ys, N, m, v, sigma)
Estimator
#> $est.beta
#>       Beta
#> 1 6.307692
#>
#> $Vest.beta
#>          V1
#> 1 0.3076923
#>
#> $est.mean
#>     y_nots
#> 1 6.307692
#> 2 6.307692
#>
#> $Vest.mean
#>          V1        V2
#> 1 1.3076923 0.3076923
#> 2 0.3076923 1.3076923
#>
#> $est.tot
#> [1] 31.61538
#>
#> $Vest.tot
#> [1] 3.230769

3. Example from the help page, but providing the sample mean and sample size instead of the sample observations

ys <- mean(c(5,6,8))
n <- 3
N <- 5
m <- 6
v <- 5
sigma <- 1
Estimator <- BLE_SRS(ys, N, m, v, sigma, n)
#> sample mean informed instead of sample observations, parameters 'n' and 'sigma' will be necessary
Estimator
#> $est.beta
#>       Beta
#> 1 6.307692
#>
#> $Vest.beta
#>          V1
#> 1 0.3076923
#>
#> $est.mean
#>     y_nots
#> 1 6.307692
#> 2 6.307692
#>
#> $Vest.mean
#>          V1        V2
#> 1 1.3076923 0.3076923
#> 2 0.3076923 1.3076923
#>
#> $est.tot
#> [1] 31.61538
#>
#> $Vest.tot
#> [1] 3.230769
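
The help-page numbers can be checked by hand. The sketch below is not the package's implementation; `ble_srs_by_hand` is a hypothetical helper that assumes the prior variance of the common mean is $$v - \sigma^2$$ and that the variance of the sample mean around it is $$\sigma^2 / n$$. It only needs the sample mean and sample size, which is why supplying the observations or their mean gives the same result.

```r
# Hypothetical helper for checking the output by hand; not part of BayesSampling.
ble_srs_by_hand <- function(ybar, n, N, m, v, sigma) {
  var_beta <- v - sigma^2    # assumed prior variance of the common mean
  var_ybar <- sigma^2 / n    # variance of the sample mean around that mean
  w <- var_beta / (var_beta + var_ybar)   # weight given to the sample mean

  est_beta  <- m + w * (ybar - m)
  vest_beta <- var_beta * var_ybar / (var_beta + var_ybar)

  n_out <- N - n             # number of individuals outside the sample
  list(
    est.beta  = est_beta,
    Vest.beta = vest_beta,
    # observed total plus the predicted values for the unsampled individuals
    est.tot   = n * ybar + n_out * est_beta,
    # sum of a matrix with sigma^2 + vest_beta on the diagonal, vest_beta elsewhere
    Vest.tot  = n_out * sigma^2 + n_out^2 * vest_beta
  )
}

check <- ble_srs_by_hand(ybar = mean(c(5, 6, 8)), n = 3, N = 5,
                         m = 6, v = 5, sigma = 1)
print(unlist(check))
```

Under these assumptions the four values agree with est.beta, Vest.beta, est.tot and Vest.tot in the output above (6.307692, 0.3076923, 31.61538 and 3.230769).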