Using the EpiGraphDB R package

library("epigraphdb")

Methods to query EpiGraphDB

We provide functions that mirror the upstream API endpoints. For endpoints that don’t have an equivalent function, use the general-purpose query_epigraphdb function.

Here we show both approaches to querying Mendelian randomization (MR) data from EpiGraphDB: calling the mr function, and calling query_epigraphdb against the GET /mr endpoint.

mr

df <- mr(
  exposure_trait = "Body mass index",
  outcome_trait = "Coronary heart disease",
  mode = "table"
)
df
#> # A tibble: 6 x 10
#>   exposure.id exposure.trait outcome.id outcome.trait  mr.b  mr.se  mr.pval
#>   <chr>       <chr>          <chr>      <chr>         <dbl>  <dbl>    <dbl>
#> 1 ieu-a-974   Body mass ind… ieu-a-7    Coronary hea… 0.389 0.0493 3.42e-15
#> 2 ieu-a-2     Body mass ind… ieu-a-7    Coronary hea… 0.397 0.0727 4.79e- 8
#> 3 ieu-a-95    Body mass ind… ieu-a-7    Coronary hea… 0.455 0.0931 1.02e- 6
#> 4 ieu-a-2     Body mass ind… ieu-a-8    Coronary hea… 0.331 0.0684 1.31e- 6
#> 5 ieu-a-835   Body mass ind… ieu-a-7    Coronary hea… 0.360 0.0756 1.94e- 6
#> 6 ieu-a-974   Body mass ind… ieu-a-8    Coronary hea… 0.328 0.0731 6.97e- 6
#> # … with 3 more variables: mr.method <chr>, mr.selection <chr>,
#> #   mr.moescore <dbl>

GET /mr

df <- query_epigraphdb(
  route = "/mr",
  params = list(
    exposure_trait = "Body mass index",
    outcome_trait = "Coronary heart disease"
  ),
  mode = "table"
)

df
#> # A tibble: 6 x 10
#>   exposure.id exposure.trait outcome.id outcome.trait  mr.b  mr.se  mr.pval
#>   <chr>       <chr>          <chr>      <chr>         <dbl>  <dbl>    <dbl>
#> 1 ieu-a-974   Body mass ind… ieu-a-7    Coronary hea… 0.389 0.0493 3.42e-15
#> 2 ieu-a-2     Body mass ind… ieu-a-7    Coronary hea… 0.397 0.0727 4.79e- 8
#> 3 ieu-a-95    Body mass ind… ieu-a-7    Coronary hea… 0.455 0.0931 1.02e- 6
#> 4 ieu-a-2     Body mass ind… ieu-a-8    Coronary hea… 0.331 0.0684 1.31e- 6
#> 5 ieu-a-835   Body mass ind… ieu-a-7    Coronary hea… 0.360 0.0756 1.94e- 6
#> 6 ieu-a-974   Body mass ind… ieu-a-8    Coronary hea… 0.328 0.0731 6.97e- 6
#> # … with 3 more variables: mr.method <chr>, mr.selection <chr>,
#> #   mr.moescore <dbl>
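To make the relationship between the route, the params list, and the underlying GET request more concrete, here is a minimal sketch in base R. It is not the package's implementation: build_query_url is a hypothetical helper, and the base URL is a placeholder rather than the real API address.

```r
# Hypothetical helper (not part of epigraphdb): shows how a route and a named
# list of parameters combine into the URL of a GET request.
# The default base_url is a placeholder, not the real API address.
build_query_url <- function(route, params, base_url = "https://api.example.org") {
  # Percent-encode each value so that spaces and reserved characters are safe.
  encoded <- vapply(
    params,
    function(value) utils::URLencode(as.character(value), reserved = TRUE),
    character(1)
  )
  query_string <- paste(names(params), encoded, sep = "=", collapse = "&")
  paste0(base_url, route, "?", query_string)
}

url <- build_query_url(
  route = "/mr",
  params = list(
    exposure_trait = "Body mass index",
    outcome_trait = "Coronary heart disease"
  )
)
url
```

Each name in params becomes a query-string key, which is why the names passed to query_epigraphdb need to match the parameter names that the endpoint expects.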

For more information on the API endpoints, please see the EpiGraphDB API documentation.

Returned data format

Query functions offer two modes for the returned data: a table mode (the default), which returns a data frame, and a raw mode, which preserves the hierarchical structure of the upstream JSON data and includes additional metadata about the query.

mode = "table"

By default, for ease of use, the query returns a data frame, specifically a tidyverse tibble:

df <- mr(
  exposure_trait = "Body mass index",
  outcome_trait = "Coronary heart disease"
)
df
#> # A tibble: 6 x 10
#>   exposure.id exposure.trait outcome.id outcome.trait  mr.b  mr.se  mr.pval
#>   <chr>       <chr>          <chr>      <chr>         <dbl>  <dbl>    <dbl>
#> 1 ieu-a-974   Body mass ind… ieu-a-7    Coronary hea… 0.389 0.0493 3.42e-15
#> 2 ieu-a-2     Body mass ind… ieu-a-7    Coronary hea… 0.397 0.0727 4.79e- 8
#> 3 ieu-a-95    Body mass ind… ieu-a-7    Coronary hea… 0.455 0.0931 1.02e- 6
#> 4 ieu-a-2     Body mass ind… ieu-a-8    Coronary hea… 0.331 0.0684 1.31e- 6
#> 5 ieu-a-835   Body mass ind… ieu-a-7    Coronary hea… 0.360 0.0756 1.94e- 6
#> 6 ieu-a-974   Body mass ind… ieu-a-8    Coronary hea… 0.328 0.0731 6.97e- 6
#> # … with 3 more variables: mr.method <chr>, mr.selection <chr>,
#> #   mr.moescore <dbl>
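Because table mode returns an ordinary data frame, the usual subsetting and ordering tools apply. The sketch below rebuilds the seven columns printed above as a plain base R data frame (so it runs without querying the API) and then keeps the estimates for a single outcome study, ordered by p-value. The variable name subset_df is illustrative only.

```r
# Rebuild the columns shown in the printed output as a plain data frame,
# so that the example runs offline.
df <- data.frame(
  exposure.id = c("ieu-a-974", "ieu-a-2", "ieu-a-95", "ieu-a-2", "ieu-a-835", "ieu-a-974"),
  exposure.trait = "Body mass index",
  outcome.id = c("ieu-a-7", "ieu-a-7", "ieu-a-7", "ieu-a-8", "ieu-a-7", "ieu-a-8"),
  outcome.trait = "Coronary heart disease",
  mr.b = c(0.389, 0.397, 0.455, 0.331, 0.360, 0.328),
  mr.se = c(0.0493, 0.0727, 0.0931, 0.0684, 0.0756, 0.0731),
  mr.pval = c(3.42e-15, 4.79e-8, 1.02e-6, 1.31e-6, 1.94e-6, 6.97e-6),
  stringsAsFactors = FALSE
)

# Keep the estimates for one outcome study and order them by p-value.
subset_df <- df[df$outcome.id == "ieu-a-7", ]
subset_df <- subset_df[order(subset_df$mr.pval), ]
subset_df[, c("exposure.id", "outcome.id", "mr.b", "mr.pval")]
```

The same steps work unchanged on the tibble returned by mr.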

mode = "raw"

Alternatively, you can use mode = "raw" to get the unformatted response from the EpiGraphDB API.

response <- mr(
  exposure_trait = "Body mass index",
  outcome_trait = "Coronary heart disease",
  mode = "raw"
)
# Show the top two levels of the nested list (base R, no pipe required)
str(response, max.level = 2)
#> List of 2
#>  $ metadata:List of 3
#>   ..$ query        : chr "MATCH (exposure:Gwas)-[mr:MR]->(outcome:Gwas) WHERE exposure.trait = \"Body mass index\" AND outcome.trait = \""| __truncated__
#>   ..$ total_seconds: num 0.0128
#>   ..$ empty_results: logi FALSE
#>  $ results :List of 6
#>   ..$ :List of 3
#>   ..$ :List of 3
#>   ..$ :List of 3
#>   ..$ :List of 3
#>   ..$ :List of 3
#>   ..$ :List of 3

Raw mode can be useful for two reasons:

  1. The results component preserves the upstream hierarchical JSON structure, which is useful for tasks such as rendering network plots or batch post-processing the returned data at scale.

  2. The query component (under metadata) returns the Cypher query that fetches data from the EpiGraphDB Neo4j databases. EpiGraphDB will offer functionality (forthcoming) for sending Cypher queries to the web API, which can return more complex query structures (visit our web app for examples). Once you are sufficiently well-versed in Cypher, you can construct your own refined queries to better suit your needs.
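As an illustration of working with a raw response, the sketch below uses a small mock list in place of a live query. The top-level layout (metadata and results) follows the str output above, but the inner nesting of each result into exposure, outcome, and mr is an assumption inferred from the table-mode column names, and the query string is a placeholder. The helper flatten_result is hypothetical and not part of the package.

```r
# Mock of a raw-mode response. The exposure / outcome / mr nesting of each
# result is an assumption; the query text is a placeholder.
response <- list(
  metadata = list(
    query = "<Cypher query text>",
    total_seconds = 0.0128,
    empty_results = FALSE
  ),
  results = list(
    list(
      exposure = list(id = "ieu-a-974", trait = "Body mass index"),
      outcome = list(id = "ieu-a-7", trait = "Coronary heart disease"),
      mr = list(b = 0.389, se = 0.0493, pval = 3.42e-15)
    ),
    list(
      exposure = list(id = "ieu-a-2", trait = "Body mass index"),
      outcome = list(id = "ieu-a-7", trait = "Coronary heart disease"),
      mr = list(b = 0.397, se = 0.0727, pval = 4.79e-8)
    )
  )
)

# The Cypher query that produced the results is stored in the metadata.
cypher_query <- response$metadata$query

# Hypothetical helper: flatten one nested result into a one-row data frame,
# prefixing each field with the name of its parent (e.g. mr.b).
flatten_result <- function(result) {
  parts <- lapply(names(result), function(name) {
    part <- as.data.frame(result[[name]], stringsAsFactors = FALSE)
    names(part) <- paste(name, names(part), sep = ".")
    part
  })
  do.call(cbind, parts)
}

# Only flatten when the query actually returned something.
if (!response$metadata$empty_results) {
  flat <- do.call(rbind, lapply(response$results, flatten_result))
  flat
}
```

Flattening one level at a time keeps numeric fields such as mr.b numeric, whereas calling unlist on a whole result would coerce every value to character.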