OxCOVID19 Database Access

Ernest Guevarra

In this vignette, we show how to perform the OxCOVID19 Database access examples shown here using R and the oxcovid19 package.

Epidemiology table

List of available sources

The task here is to list out all the unique epidemiology table sources sorted alphabetically.

library(oxcovid19)
library(magrittr)
library(dplyr)

connect_oxcovid19() %>%                      ## Connect to PostgreSQL server
  get_table(tbl_name = "epidemiology") %>%   ## Retrieve epidemiology table
  arrange(source) %>%                        ## Sort the table by source
  select(source) %>%                         ## Select the source column
  distinct() %>%                             ## Get only unique sources
  pull(source)                                 
#>  [1] "AUS_C1A"    "BEL_LE"     "BEL_SCI"    "BEL_WY"     "BRA_MSHM"  
#>  [6] "CAN_GOV"    "CHE_OPGOV"  "CHN_ICL"    "DEU_JPGG"   "ESP_MS"    
#> [11] "ESP_MSVP"   "EU_ZH"      "FRA_SPF"    "FRA_SPFCG"  "GBR_NIDH"  
#> [16] "GBR_PHE"    "GBR_PHS"    "GBR_PHTW"   "GBR_PHW"    "IDN_GTPPC" 
#> [21] "IND_COVIND" "IRL_HSPC"   "IRQ_GOV"    "ITA_PC"     "ITA_PCDM"  
#> [26] "JPN_C1JACD" "KOR_DS4C"   "LAT_DSRP"   "LBN_GOV"    "MYS_MHYS"  
#> [31] "NGA_CDC"    "NGA_SO"     "NLD_CW"     "PAK_GOV"    "POL_WIKI"  
#> [36] "PRT_MSDS"   "RUS_GOV"    "SAU_GOV"    "SWE_GM"     "SWE_SIR"   
#> [41] "THA_STAT"   "TUR_MHOE"   "UAE_GOV"    "USA_CTP"    "USA_NYT"   
#> [46] "WRD_ECDC"   "WRD_WHO"    "WRD_WHOJHU" "ZAF_DSFSI"

Data for single source

In this example, the task is to retrieve the epidemiology table and then get only the data from source GBR_PHTW and then sort resulting dataset by decreasing date.

connect_oxcovid19() %>%                      ## Connect to PostgreSQL server
  get_table(tbl_name = "epidemiology") %>%   ## Retrieve epidemiology table
  filter(source == "GBR_PHTW") %>%           ## Select specific source
  arrange(desc(date))                        ## Sort by date
#> # Source:     lazy query [?? x 15]
#> # Database:   postgres [covid19@covid19db.org:5432/covid19]
#> # Ordered by: desc(date)
#>    source date       country countrycode adm_area_1 adm_area_2 adm_area_3 tested
#>    <chr>  <date>     <chr>   <chr>       <chr>      <chr>      <chr>       <int>
#>  1 GBR_P… 2020-08-01 United… GBR         Scotland   NHS Weste… <NA>           NA
#>  2 GBR_P… 2020-08-01 United… GBR         Scotland   NHS Taysi… <NA>           NA
#>  3 GBR_P… 2020-08-01 United… GBR         Scotland   NHS Shetl… <NA>           NA
#>  4 GBR_P… 2020-08-01 United… GBR         Scotland   NHS Orkney <NA>           NA
#>  5 GBR_P… 2020-08-01 United… GBR         Scotland   NHS Lothi… <NA>           NA
#>  6 GBR_P… 2020-08-01 United… GBR         Scotland   NHS Lanar… <NA>           NA
#>  7 GBR_P… 2020-08-01 United… GBR         Scotland   NHS Highl… <NA>           NA
#>  8 GBR_P… 2020-08-01 United… GBR         Scotland   NHS Great… <NA>           NA
#>  9 GBR_P… 2020-08-01 United… GBR         Scotland   NHS Gramp… <NA>           NA
#> 10 GBR_P… 2020-08-01 United… GBR         Scotland   NHS Forth… <NA>           NA
#> # … with more rows, and 7 more variables: confirmed <int>, recovered <int>,
#> #   dead <int>, hospitalised <int>, hospitalised_icu <int>, quarantined <int>,
#> #   gid <chr>

For both tasks, the results in the example were replicated in R using the oxcovid19 functions.

Mobility

The task here was to retrieve the mobility table from the PostgreSQL server and then extract only those with GOOGLE_MOBILITY as the source and GBR as the country code. Finally, the resulting table is sorted by date.

connect_oxcovid19() %>%                      ## Connect to PostgreSQL server
  get_table(tbl_name = "mobility") %>%       ## Retrieve mobility table
  filter(source == "GOOGLE_MOBILITY",        ## Get only data from `Google`
         countrycode == "GBR") %>%           ## Get only data from `GBR`
  arrange(desc(date))                        ## Sort by date
#> # Source:     lazy query [?? x 17]
#> # Database:   postgres [covid19@covid19db.org:5432/covid19]
#> # Ordered by: desc(date)
#>    source date       country countrycode adm_area_1 adm_area_2 adm_area_3
#>    <chr>  <date>     <chr>   <chr>       <chr>      <chr>      <chr>     
#>  1 GOOGL… 2020-08-07 United… GBR         Wales      Wrexham    <NA>      
#>  2 GOOGL… 2020-08-07 United… GBR         Wales      Vale of G… <NA>      
#>  3 GOOGL… 2020-08-07 United… GBR         Wales      Torfaen    <NA>      
#>  4 GOOGL… 2020-08-07 United… GBR         Wales      Swansea    <NA>      
#>  5 GOOGL… 2020-08-07 United… GBR         Wales      Rhondda, … <NA>      
#>  6 GOOGL… 2020-08-07 United… GBR         Wales      Powys      <NA>      
#>  7 GOOGL… 2020-08-07 United… GBR         Wales      Pembrokes… <NA>      
#>  8 GOOGL… 2020-08-07 United… GBR         Wales      Newport    <NA>      
#>  9 GOOGL… 2020-08-07 United… GBR         Wales      Neath Por… <NA>      
#> 10 GOOGL… 2020-08-07 United… GBR         Wales      Monmouths… <NA>      
#> # … with more rows, and 10 more variables: transit_stations <dbl>,
#> #   residential <dbl>, workplace <dbl>, parks <dbl>, retail_recreation <dbl>,
#> #   grocery_pharmacy <dbl>, gid <chr>, transit <dbl>, walking <dbl>,
#> #   driving <dbl>

The results match the results shown in the examples.