Pulling Reports with rdfp

Steven M. Mortimer

2018-03-29

First, we load rdfp and specify the DFP network we would like to connect to. Then we authenticate by using dfp_auth(). Any existing cached token would be used or we will be prompted to authenticate via the browser.

library(rdfp)
options(rdfp.network_code = 123456789)
dfp_auth()

Delivery by Advertiser

A common report request is to check how each line item is delivering for an advertiser. This type of request can be run through a DFP report job. Report jobs have a special flow, but there is a single function, dfp_full_report_wrapper(), in the rdfp package that will manage that entire flow and run a requested job for you.

In this example, we specify dimensions including the line, order, advertiser, and ad unit. The 'FLAT' ad unit view just ensures that we get tabular data with one row per ad unit in case we’d like to total them up. If you’d like to run multiple reports with slight variations in date or dimensions, it is encouraged that you keep you criteria and use a loop or vectorized function to generate each report since the request data must be formatted as a list in each call.

request_data <- list(reportJob =
                        list(reportQuery =
                               list(dimensions = 'MONTH_AND_YEAR',
                                    dimensions = 'AD_UNIT_ID',
                                    dimensions = 'AD_UNIT_NAME',
                                    dimensions = 'ADVERTISER_NAME',
                                    dimensions = 'ORDER_NAME',
                                    dimensions = 'LINE_ITEM_NAME',
                                    adUnitView = 'FLAT',
                                    columns = 'AD_SERVER_IMPRESSIONS', 
                                    columns = 'AD_SERVER_CLICKS',
                                    dateRangeType = 'LAST_WEEK')
                             )
                      )
report_data <- dfp_full_report_wrapper(request_data)
report_data[,c('Dimension.MONTH_AND_YEAR', 'Dimension.AD_UNIT_ID', 'Column.AD_SERVER_CLICKS')]
#> # A tibble: 7,040 x 3
#>    Dimension.MONTH_AND_YEAR Dimension.AD_UNIT_ID Column.AD_SERVER_CLICKS
#>    <chr>                                   <dbl>                   <dbl>
#>  1 2018-12                           21677451947                       1
#>  2 2018-12                           21677451947                      23
#>  3 2018-12                           21677451947                       9
#>  4 2018-12                           21677451947                     150
#>  5 2018-12                           21677451947                       0
#>  6 2018-12                           21677451947                      41
#>  7 2018-12                           21677451947                      73
#>  8 2018-12                           21677451947                      37
#>  9 2018-12                           21677451947                      23
#> 10 2018-12                           21677451947                      32
#> # ... with 7,030 more rows

Working with Saved Queries

DFP has a feature that allows for saving query criteria. These queries can be pulled back via the API and the criteria submitted back so that you always run the same report each time, or can easily replicate a query that another person has created in the UI.

# look for a particular saved query
request_data <- list(filterStatement=list(query="WHERE id = 936165016"))
this_result <- dfp_getSavedQueriesByStatement(request_data, as_df=FALSE)
this_report_query <- this_result[[1]]$reportQuery

# resubmit the report job with the saved query
report_data <- list(reportJob=list(reportQuery = this_report_query))
report_data <- dfp_full_report_wrapper(report_data)
report_data[,c('Dimension.AD_UNIT_ID', 'Column.AD_SERVER_CLICKS')]
#> # A tibble: 7,228 x 2
#>    Dimension.AD_UNIT_ID Column.AD_SERVER_CLICKS
#>                   <dbl>                   <dbl>
#>  1          21677451947                       9
#>  2          21677451947                     204
#>  3          21677451947                      19
#>  4          21677451947                     668
#>  5          21677451947                       2
#>  6          21677451947                     112
#>  7          21677451947                     182
#>  8          21677451947                      75
#>  9          21677451947                     211
#> 10          21677451947                     222
#> # ... with 7,218 more rows

A More Detailed Explanation of the Report Process

Reports actually require 3 steps from the [ReportService] (https://developers.google.com/ad-manager/api/reference/v201811/ReportService): 1) to request the report, 2) check on its status, and 3) download. This basic process flow is required for all reports requested via this service. The wrapper function used above named dfp_full_report_wrapper manages all aspects of reporting, so this level of detail is not needed unless the wrapper service does not quite fit your needs.

# In order to run a report you must specify how the report should be structured 
# by specifying a reportQuery inside a reportJob. All of the dimensions, columns, 
# date range options, etc. are documented at:
# https://developers.google.com/ad-manager/api/reference/v201811/ReportService.ReportQuery
request_data <- list(reportJob=list(reportQuery=list(dimensions='MONTH_AND_YEAR', 
                                                     dimensions='AD_UNIT_ID',
                                                     adUnitView='FLAT',
                                                     columns='AD_SERVER_CLICKS', 
                                                     dateRangeType='LAST_WEEK'
                                                     )))

# the result is a list and most importantly the ID is included for checking its status
dfp_runReportJob_result <- dfp_runReportJob(request_data)
dfp_runReportJob_result$id
#> [1] 11076593120

# to check the status repeatedly you just need to provide the id
# dfp_getReportJobStatus_result returns a character string of the reportJob status
request_data <- list(reportJobId=dfp_runReportJob_result$id)
dfp_getReportJobStatus_result <- dfp_getReportJobStatus(request_data)
dfp_getReportJobStatus_result
#> # A tibble: 1 x 1
#>   V1         
#>   <chr>      
#> 1 IN_PROGRESS

# a simple while loop can keep checking a long running request until ready
counter <- 0
while(dfp_getReportJobStatus_result$V1 != 'COMPLETED' & counter < 10){
  dfp_getReportJobStatus_result <- dfp_getReportJobStatus(request_data)
  Sys.sleep(3)
  counter <- counter + 1
}

# once the status is "COMPLETED" the data download URL can be retrieved
request_data <- list(reportJobId=dfp_runReportJob_result$id, exportFormat='CSV_DUMP')
dfp_getReportDownloadURL_result <- dfp_getReportDownloadURL(request_data)

# this function has been provided to download the data from URL and convert to a tbl_df
# supported exportFormats are currently c('CSV_DUMP', 'TSV', 'TSV_EXCEL')
report_data <- dfp_report_url_to_dataframe(report_url=dfp_getReportDownloadURL_result$V1, 
                                           exportFormat='CSV_DUMP')
report_data[,c('Dimension.MONTH_AND_YEAR', 'Dimension.AD_UNIT_ID', 'Column.AD_SERVER_CLICKS')]
#> # A tibble: 9 x 3
#>   Dimension.MONTH_AND_YEAR Dimension.AD_UNIT_ID Column.AD_SERVER_CLICKS
#>   <chr>                                   <dbl>                   <dbl>
#> 1 2018-12                           21677451947                     864
#> 2 2018-12                           21677451950                     210
#> 3 2018-12                           21677553898                    5049
#> 4 2018-12                           21677553901                     174
#> 5 2018-12                           21677553904                    3474
#> 6 2018-12                           21677451953                    1687
#> 7 2018-12                           21677553910                      65
#> 8 2018-12                           21677553913                    2589
#> 9 2018-12                           21677554012                      75