library(rstac)

Introduction

This tutorial will use the open-source package rstac to search data in Planetary Computer’s SpatioTemporal Asset Catalog (STAC) service. STAC services can be accessed through STAC API endpoints, which allow users to search datasets using various parameters such as space and time. In addition to demonstrating the use of rstac, the tutorial will explain the Common Query Language (CQL2) filter extension to narrow the search results and find datasets that meet specific criteria in the STAC API.

This tutorial is based on reading STAC API data in Python.

Reading data from STAC API

To access Planetary Computer STAC API, we’ll create a rstac query.

planetary_computer <- stac("https://planetarycomputer.microsoft.com/api/stac/v1")
planetary_computer
#> ###RSTACQuery
#> - url: https://planetarycomputer.microsoft.com/api/stac/v1
#> - params:
#> - field(s): version, base_url, endpoint, params, verb, encode

Listing supported properties in CQL2

CQL2 expressions can be constructed using properties that refer to attributes of items. A list of all properties supported by a collection can be obtained by accessing the /collections/<collection_id>/queryables endpoint. Filter expressions can use properties listed in this endpoint.

In this example, we will search for Landsat Collection 2 Level-2 imagery of the Microsoft main campus from December 2020. The name of this collection in STAC service is landsat-c2-l2. Here we’ll prepare a query to retrieve its queriables and make a GET request to the service.

planetary_computer |>
  collections("landsat-c2-l2") |> 
  queryables() |> 
  get_request()
#> ###Queryables
#> - properties
#>   - datetime
#>   - id
#>   - eo:cloud_cover
#>   - view:off_nadir
#>   - view:sun_azimuth
#>   - view:sun_elevation
#>   - landsat:wrs_row
#>   - landsat:wrs_path
#>   - landsat:scene_id
#>   - landsat:cloud_cover_land
#>   - landsat:collection_category
#>   - platform
#>   - instrument
#> - field(s): $schema, $id, type, title, properties

Searching with CQL2

Now we can use rstac to make a search query with CQL2 filter extension to obtain the items.

time_range <- cql2_interval("2020-12-01", "2020-12-31")
bbox <- c(-122.2751, 47.5469, -121.9613, 47.7458)
area_of_interest = cql2_bbox_as_geojson(bbox)

stac_items <- planetary_computer |>
  ext_filter(
    collection == "landsat-c2-l2" &&
      t_intersects(datetime, {{time_range}}) &&
      s_intersects(geometry, {{area_of_interest}})
  ) |>
  post_request()

In that example, our filter expression used a temporal (t_intersects) and a spatial (s_intersects) operators. t_intersects() only accepts interval as it second argument, which we created using function cql2_interval(). s_intersects() spatial operator only accepts GeoJSON objects as its arguments. This is why we had to convert the bounding box vector (bbox) into a structure representing a GeoJSON object using the function cql2_bbox_as_geojson(). We embrace the arguments using {{ to evaluate them before make the request.

items is a STACItemCollection object containing 8 items that matched our search criteria.

stac_items
#> ###STACItemCollection
#> - features (8 item(s)):
#>   - LC08_L2SP_046027_20201229_02_T2
#>   - LE07_L2SP_047027_20201228_02_T1
#>   - LE07_L2SP_046027_20201221_02_T2
#>   - LC08_L2SP_047027_20201220_02_T2
#>   - LC08_L2SP_046027_20201213_02_T2
#>   - LE07_L2SP_047027_20201212_02_T1
#>   - LE07_L2SP_046027_20201205_02_T1
#>   - LC08_L2SP_047027_20201204_02_T1
#> - assets: 
#> ang, atmos_opacity, atran, blue, cdist, cloud_qa, coastal, drad, emis, emsd, green, lwir, lwir11, mtl.json, mtl.txt, mtl.xml, nir08, qa, qa_aerosol, qa_pixel, qa_radsat, red, rendered_preview, swir16, swir22, tilejson, trad, urad
#> - item's fields: 
#> assets, bbox, collection, geometry, id, links, properties, stac_extensions, stac_version, type

Exploring data

A STACItemCollection is a regular GeoJSON object. It is a collection of STACItem entries that stores metadata on assets. Users can convert a STACItemCollection to a sf object containing the properties field as columns. Here we depict the items footprint.

sf <- items_as_sf(stac_items)

# create a function to plot a map
plot_map <- function(x) {
  library(tmap)
  library(leaflet)
  current.mode <- tmap_mode("view")
  tm_basemap(providers[["Stamen.Watercolor"]]) +
    tm_shape(x) + 
    tm_borders()
}

plot_map(sf)
#> tmap mode set to interactive viewing

Some collections use the eo extension, which allows us to sort items by attributes like cloud coverage. The next example selects the item with lowest cloud_cover attribute:

cloud_cover <- stac_items |>
  items_reap(field = c("properties", "eo:cloud_cover"))
selected_item <- stac_items$features[[which.min(cloud_cover)]]

We use function items_reap() to extract cloud cover values from all features.

Each STAC item have an assets field which describes files and provides link to access them.

items_assets(selected_item)
#>  [1] "ang"              "atran"            "blue"             "cdist"           
#>  [5] "coastal"          "drad"             "emis"             "emsd"            
#>  [9] "green"            "lwir11"           "mtl.json"         "mtl.txt"         
#> [13] "mtl.xml"          "nir08"            "qa"               "qa_aerosol"      
#> [17] "qa_pixel"         "qa_radsat"        "red"              "rendered_preview"
#> [21] "swir16"           "swir22"           "tilejson"         "trad"            
#> [25] "urad"

purrr::map_dfr(items_assets(selected_item), function(key) {
  tibble::tibble(asset = key, description = selected_item$assets[[key]]$title)
})
#> # A tibble: 25 × 2
#>    asset   description                       
#>    <chr>   <chr>                             
#>  1 ang     Angle Coefficients File           
#>  2 atran   Atmospheric Transmittance Band    
#>  3 blue    Blue Band                         
#>  4 cdist   Cloud Distance Band               
#>  5 coastal Coastal/Aerosol Band              
#>  6 drad    Downwelled Radiance Band          
#>  7 emis    Emissivity Band                   
#>  8 emsd    Emissivity Standard Deviation Band
#>  9 green   Green Band                        
#> 10 lwir11  Surface Temperature Band          
#> # … with 15 more rows

Here, we’ll inspect the rendered_preview asset. To plot this asset, we can use the helper function preview_plot() and provide a URL to be plotted. We use the function assets_url() to get the URL. This function extracts all available URLs in items.

selected_item$assets[["rendered_preview"]]$href
#> [1] "https://planetarycomputer.microsoft.com/api/data/v1/item/preview.png?collection=landsat-c2-l2&item=LC08_L2SP_047027_20201204_02_T1&assets=red&assets=green&assets=blue&color_formula=gamma+RGB+2.7%2C+saturation+1.5%2C+sigmoidal+RGB+15+0.55&format=png"

selected_item |> 
  assets_url(asset_names = "rendered_preview") |>
  preview_plot()