The iglu package was developed to assist in the analysis of data from Continuous Glucose Monitors (CGMs). CGMs are small wearable devices that measure glucose levels continuously throughout the day, with some monitors taking measurements as often as every 5 minutes. Data from these monitors provide a detailed quantification of the variation in blood glucose levels over the course of the day, and thus CGMs play an increasing role in clinical practice. For more on CGMs, see Rodbard (2016) “Continuous Glucose Monitoring: A Review of Successes, Challenges, and Opportunities.”
Multiple CGM-derived metrics have been developed to assess the quality of glycemic control and glycemic variability, many of which are summarized in Rodbard (2009) “Interpretation of continuous glucose monitoring data: glycemic variability and quality of glycemic control.” The iglu package streamlines the calculation of these metrics by providing clearly named functions that output metric values with one line of code.
The iglu package is designed to work with Continuous Glucose Monitor (CGM) data in the form of a data frame with the following three columns present:
Blood glucose level measurement [in mg/dL] (gl)
Timestamp for glucose measurement (time)
Subject identification (id)
The iglu package comes with example data from 5 subjects with Type II diabetes whose glucose levels were measured using a Dexcom G4 CGM. These data are part of a larger study analyzed in Gaynanova et al. (2020).
Example data with 1 subject can be loaded with:
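A minimal sketch of the loading step, assuming the iglu package is already installed:

```r
library(iglu)  # assumes iglu is installed, e.g. via install.packages("iglu")

# Load the bundled single-subject example data into the workspace
data("example_data_1_subject")
```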
This dataset contains 2915 observations of 3 columns corresponding to the three components listed in the introduction:
"id" - Factor (character string) column for subject identification
"time" - Factor (character string) column that can be converted to DateTime for measurement timestamp
"gl" - Numeric column for glucose measurement
Data used with iglu functions may have additional columns, but the columns for id, time and glucose values must be named as above.
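As an illustration of the naming requirement, the sketch below renames the columns of a made-up raw export to id, time and gl using base R only. The data frame `raw`, its column names, and the UTC time zone are hypothetical and chosen purely for the example.

```r
# Hypothetical raw export: these column names are invented for illustration
raw <- data.frame(
  patient   = c("Subject A", "Subject A", "Subject A"),
  timestamp = c("2015-06-06 16:50:27", "2015-06-06 16:55:27", "2015-06-06 17:00:27"),
  glucose   = c(153, 137, 128),
  device    = "sensor-1",          # extra columns are allowed
  stringsAsFactors = FALSE
)

# Rename the three required columns to the names iglu expects
cgm <- raw
names(cgm)[names(cgm) == "patient"]   <- "id"
names(cgm)[names(cgm) == "timestamp"] <- "time"
names(cgm)[names(cgm) == "glucose"]   <- "gl"

# Convert timestamp strings to POSIXct; the time zone here is an assumption
cgm$time <- as.POSIXct(cgm$time, format = "%Y-%m-%d %H:%M:%S", tz = "UTC")

str(cgm)
```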
dim(example_data_1_subject)
#> [1] 2915    3
str(example_data_1_subject)
#> 'data.frame': 2915 obs. of 3 variables:
#> $ id  : Factor w/ 1 level "Subject 1": 1 1 1 1 1 1 1 1 1 1 ...
#> $ time: POSIXct, format: "2015-06-06 16:50:27" "2015-06-06 17:05:27" ...
#> $ gl  : int 153 137 128 121 120 138 155 159 154 152 ...
head(example_data_1_subject)
#>          id                time  gl
#> 1 Subject 1 2015-06-06 16:50:27 153
#> 2 Subject 1 2015-06-06 17:05:27 137
#> 3 Subject 1 2015-06-06 17:10:27 128
#> 4 Subject 1 2015-06-06 17:15:28 121
#> 5 Subject 1 2015-06-06 17:25:27 120
#> 6 Subject 1 2015-06-06 17:45:27 138
Example data with multiple subjects can be loaded with:
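A minimal sketch of the loading step, again assuming iglu is installed:

```r
library(iglu)

# Load the bundled five-subject example data
data("example_data_5_subject")

# Number of distinct subjects in the id column
nlevels(factor(example_data_5_subject$id))
```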
This dataset contains the same 3 columns as the dataset in the single subject case, but now with 13866 observations from 5 subjects. The first subject in this multiple subject dataset is the same as the single subject from the previous examples.
dim(example_data_5_subject)
#> [1] 13866     3
str(example_data_5_subject)
#> 'data.frame': 13866 obs. of 3 variables:
#> $ id  : Factor w/ 5 levels "Subject 1","Subject 2",..: 1 1 1 1 1 1 1 1 1 1 ...
#> $ time: POSIXct, format: "2015-06-06 16:50:27" "2015-06-06 17:05:27" ...
#> $ gl  : int 153 137 128 121 120 138 155 159 154 152 ...
All the metrics implemented in the package can be divided into two categories: time-independent and time-dependent. Time-independent metrics do not use any linear interpolation because the time component of the data is not used in their calculations. Because the time component is not necessary, when working with a single subject only a glucose vector is required. If a glucose vector for multiple subjects is supplied, or if a data frame that doesn’t have all three columns is supplied, these functions will treat all glucose values as though they are from the same subject.
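To make the idea of a time-independent metric concrete, here is a base R sketch that computes the percent of readings above a few thresholds from a glucose vector alone. It is not iglu's implementation; the readings and thresholds are synthetic, and the strict `>` comparison is an assumption.

```r
# Synthetic glucose readings in mg/dL (not taken from the package's example data)
gl <- c(95, 110, 145, 182, 260, 130, 175, 90)
targets <- c(140, 180, 250)  # illustrative thresholds

# Percent of readings strictly above each threshold; timestamps are never used
pct_above <- sapply(targets, function(t) 100 * mean(gl > t))
names(pct_above) <- paste0("above_", targets)
pct_above
```

Because only the values enter the calculation, concatenating readings from several subjects into one vector would give a single pooled result, which is the behavior described in the paragraph above.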
All metric functions in iglu will produce the output in a tibble form. See documentation on tibbles with vignette('tibble').
Some metric functions, like above_percent(), will return multiple values for a single subject.
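A minimal sketch of such a call on the single-subject example data, assuming iglu is installed and the default thresholds are used:

```r
library(iglu)
data("example_data_1_subject")

# One row for the subject, with one column per default threshold
res_df <- above_percent(example_data_1_subject)
res_df
```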
When a data frame is passed, subject id will always be printed in the id column, and metrics will be printed in the following columns.
As discussed above, just the glucose vector can be supplied for the single subject case.
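A sketch of the vector form, assuming iglu is installed; only the gl column is passed:

```r
library(iglu)
data("example_data_1_subject")

# Pass only the glucose values; no subject information is available to the function
res_vec <- above_percent(example_data_1_subject$gl)
res_vec
```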
However, it is not recommended to pass just glucose values whenever the time and subject are also available, because this output will not contain the subject ID.
The list of target values for the above_percent metric is a parameter that can be changed:
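A sketch of passing custom thresholds, assuming iglu is installed. The thresholds are supplied positionally as the second argument because the argument name may differ between package versions; check ?above_percent for the name in your installation. The values 100, 200 and 300 are illustrative.

```r
library(iglu)
data("example_data_1_subject")

# Second argument: vector of thresholds in mg/dL (passed by position, name not assumed)
res_custom <- above_percent(example_data_1_subject, c(100, 200, 300))
res_custom
```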
Many metrics have parameters that can be changed. To see available parameters for a given metric, see the documentation, e.g. ?above_percent or help(above_percent).
Not all metric functions return multiple values. Many, like MAGE() (Mean Amplitude of Glycemic Excursions), will return just a single value for each subject, producing a column for value and a column for subject id (if a data frame is passed), as well as a row for each subject.
Another example of a time-independent metric is the Hyperglycemia index; the corresponding hyper_index() function returns a single value for each subject.
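A minimal sketch of the call on the five-subject example data, assuming iglu is installed and the default parameters are used:

```r
library(iglu)
data("example_data_5_subject")

# One row per subject: an id column and a single Hyperglycemia index value
res_hyper <- hyper_index(example_data_5_subject)
res_hyper
```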
In this example, Subject 2 has the largest Hyperglycemia index, indicating the worst hyperglycemia. This is reflected in the percent of time Subject 2 spends above fixed glucose targets (see results of above_percent).
Observe that the timestamps in the first rows are not evenly spaced due to missing measurements. To address this challenge, we developed the CGMS2DayByDay function, which linearly interpolates glucose measures for each subject on an equally spaced time grid from day to day. To prevent extrapolation, missing values are inserted between any two measurements that are more than inter_gap minutes apart (the default value is 45 minutes and can be changed by the user). This function is automatically called by all metrics that require such interpolation; however, it is also available to the user directly. The function is designed to work with data from one subject at a time. The structure of the function output is shown below.
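A minimal sketch of inspecting that structure, assuming iglu is installed and the default arguments are used:

```r
library(iglu)
data("example_data_1_subject")

# Interpolate the single-subject data onto a day-by-day grid and inspect the result
interp <- CGMS2DayByDay(example_data_1_subject)
str(interp)
```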
The first part of the output, gd2d, is the interpolated grid of values. Each row corresponds to one day of measurements, and the columns correspond to an equidistant time grid covering a 24 hour time span. The grid is chosen to match the frequency of the sensor (5 minutes in this example, leading to \((24 \times 60)/5 = 288\) columns), which is returned as dt0. The returned actual_dates allow mapping the rows in gd2d back to the original dates. The achieved alignment of glucose measurement times across the days enables both the calculation of corresponding metrics and the creation of lasagna plots. The default frequency can be adjusted as follows.
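A sketch of changing the grid frequency, assuming iglu is installed; the value dt0 = 10 is chosen to match the 144-column result discussed next, since \((24 \times 60)/10 = 144\).

```r
library(iglu)
data("example_data_1_subject")

# Request a 10-minute grid instead of the sensor's own frequency
interp10 <- CGMS2DayByDay(example_data_1_subject, dt0 = 10)

ncol(interp10$gd2d)  # number of grid points per day
interp10$dt0         # the frequency that was used
```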
Note that the final part of the output reflects our input, and there are now only 144 columns instead of 288.
The CGMS2DayByDay function also allows specification of the maximum allowable gap to interpolate values across (default is 45 minutes) and a string corresponding to time zone (default is the timezone of the user’s system).
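The role of the maximum allowable gap can be illustrated with a small base R sketch. This is not the CGMS2DayByDay source, only an approximation of the idea under stated assumptions: readings are linearly interpolated onto a regular grid with `approx()`, and grid points that fall strictly inside a gap longer than the limit are set to missing. All values are synthetic.

```r
# Synthetic readings: minutes since start and glucose in mg/dL
t_obs  <- c(0, 5, 10, 70, 75)       # note the 60-minute gap between 10 and 70
gl_obs <- c(100, 110, 120, 180, 170)

inter_gap <- 45                     # maximum gap (minutes) to interpolate across
grid <- seq(0, 75, by = 5)          # equally spaced 5-minute grid

# Plain linear interpolation onto the grid
gl_grid <- approx(t_obs, gl_obs, xout = grid)$y

# Blank out grid points lying strictly inside any gap longer than inter_gap
gaps <- which(diff(t_obs) > inter_gap)
for (i in gaps) {
  inside <- grid > t_obs[i] & grid < t_obs[i + 1]
  gl_grid[inside] <- NA
}

data.frame(minute = grid, gl = gl_grid)
```

With a limit of 45 minutes the 60-minute gap is left as missing values rather than being bridged by a straight line; raising the limit above 60 would fill it.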
All functions for metrics requiring linear interpolation will accept the following three parameters that are passed on to CGMS2DayByDay:
"dt0" - Time frequency (numeric) for interpolation. Default will automatically match the frequency of the data
"inter_gap" - Maximum allowable gap in minutes (numeric) for interpolation
"tz" - String corresponding to timezone where the data's measurements were recorded
In the example_data_5_subject dataset, it is important to specify tz = 'EST', because a Daylight Savings Time shift can cause miscalculations if the wrong timezone is used. A proper call for this dataset, being recorded in EST, would be:
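A sketch of such a call, assuming iglu is installed. sd_measures() is used only as one example of an interpolation-based metric named in this document; dt0 = 5 and inter_gap = 45 are illustrative values that match the sensor frequency and the default gap described above.

```r
library(iglu)
data("example_data_5_subject")

# Pass the interpolation settings explicitly, including the recording time zone
res_sd <- sd_measures(example_data_5_subject, dt0 = 5, inter_gap = 45, tz = "EST")
res_sd
```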
Examples of proper metric function calls will be shown in the next section.
Some metric functions, like conga() (Continuous Overlapping Net Glycemic Action), will return just a single value for each subject, resulting in a 2 column tibble (1 column for id and 1 for the single value).
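A minimal sketch of the call on the single-subject example data, assuming iglu is installed and all other arguments are left at their defaults:

```r
library(iglu)
data("example_data_1_subject")

# Time-dependent metric: the full data frame and the time zone are supplied
res_conga <- conga(example_data_1_subject, tz = "EST")
res_conga
```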
Note that even though we are working with a single subject, a data frame with glucose values, time, and subject ids must be passed. Functions for metrics requiring the time component for calculation cannot be passed a vector of glucose values.
sd_measures(), which computes 6 unique standard deviation subtypes, requires linear interpolation and returns multiple values for each subject.
sd_measures(example_data_5_subject)
#> # A tibble: 5 x 7
#>   id          SdW SdHHMM SdWSH  SdDM   SdB SdBDM
#>   <fct>     <dbl>  <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1 Subject 1  26.4   19.6  6.54  16.7  27.9  24.0
#> 2 Subject 2  36.7   22.8  7.62  52.0  48.0  35.9
#> 3 Subject 3  42.9   14.4  9.51  12.4  42.8  42.5
#> 4 Subject 4  24.5   12.9  6.72  16.9  25.5  22.0
#> 5 Subject 5  50.0   29.6 12.8   23.3  50.3  45.9
Notice the high fluctuations in Subject 5, with all but one of the standard deviation subtypes (SdDM) being the largest for Subject 5. This provides an additional level of CGM data interpretation, since frequent or large glucose fluctuations may contribute to diabetes-related complications independently from chronic hyperglycemia.
The iglu package supports multiple plot types, which are summarized below.
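| Function call | Visualization description | Main parameters |
|---|---|---|
| plot_glu | Multiple plot types: time series and lasagna | plottype, lasagnatype |
| plot_lasagna | Lasagna plot of glucose values for multiple subjects | datatype, lasagnatype |
| plot_lasagna_1subject | Lasagna plot of glucose values for a single subject | lasagnatype |
| plot_roc | Time series of glucose values colored by rate of change (ROC) | subjects, timelag |
| hist_roc | Histogram of rate of change (ROC) values | subjects, timelag |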
The time series plot is the default type for the function plot_glu. This plot type can support both single and multiple subjects.
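A minimal sketch of a time series plot for the five-subject example data, assuming iglu is installed:

```r
library(iglu)
data("example_data_5_subject")

# Time series plot (the default plot type), with the recording time zone given explicitly
p_ts <- plot_glu(example_data_5_subject, plottype = "tsplot", tz = "EST")
p_ts
```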
We set the ‘tz’ (timezone) parameter to be EST because the data was collected in the eastern time zone. If left blank, the time zone used for plotting will be the system’s time zone. Time zone is mainly an issue in cases where daylight savings time might make it appear as though there were duplicate values at some time points.
To just plot a single subject of interest from the grid of time series plots, set the ‘subjects’ parameter to be that subject’s ID.
The red lines can be shifted to any Lower and Upper Target Range Limits with the ‘LLTR’ and ‘ULTR’ arguments.
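A sketch combining the two options above, assuming iglu is installed. The subject "Subject 3" and the limits 80 and 150 mg/dL are illustrative choices, not recommendations.

```r
library(iglu)
data("example_data_5_subject")

# Plot one subject only and move the target range lines
p_one <- plot_glu(example_data_5_subject, plottype = "tsplot",
                  subjects = "Subject 3", LLTR = 80, ULTR = 150, tz = "EST")
p_one
```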
The plot_glu function also supports lasagna plots by changing the ‘plottype’ parameter. For more on lasagna plots, see Swihart et al. (2010) “Lasagna Plots: A Saucy Alternative to Spaghetti Plots.”
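A minimal sketch of a lasagna plot with default settings, assuming iglu is installed:

```r
library(iglu)
data("example_data_5_subject")

# Switch the plot type from the default time series to a lasagna plot
p_las <- plot_glu(example_data_5_subject, plottype = "lasagna", tz = "EST")
p_las
```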
By default, this will produce an unsorted lasagna plot using up to 14 days' worth of data displayed separately. To average across days at each time point, we can use datatype = 'average'.
We can additionally sort the values at each time point across the five subjects by setting lasagnatype = 'timesorted'.
When working with a single subject, setting datatype = 'single' will produce plots where rows represent days instead of subjects.
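The three variations described above can be sketched as follows, assuming iglu is installed. The combination of options in each call is illustrative.

```r
library(iglu)
data("example_data_5_subject")
data("example_data_1_subject")

# Average across days at each time point
p_avg <- plot_glu(example_data_5_subject, plottype = "lasagna",
                  datatype = "average", tz = "EST")

# Additionally sort values across subjects within each time point
p_sorted <- plot_glu(example_data_5_subject, plottype = "lasagna",
                     datatype = "average", lasagnatype = "timesorted", tz = "EST")

# Single subject: rows represent days instead of subjects
p_single <- plot_glu(example_data_1_subject, plottype = "lasagna",
                     datatype = "single", tz = "EST")

p_avg
p_sorted
p_single
```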
For further customization of lasagna plots, use the plot_lasagna and plot_lasagna_1subject functions.
plot_lasagna allows for multi-subject lasagna plots with the additional option of sorting the hours by glucose values for each subject, i.e. horizontal sorting, by setting lasagnatype = 'subjectsorted'.
plot_lasagna also supports changing the maximum number of days to display, as well as the upper and lower target range limits (LLTR and ULTR), midpoint, and minimum and maximum values to display, all of which will affect the colorbar.
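A sketch of a customized multi-subject lasagna plot, assuming iglu is installed. The numeric values for LLTR, ULTR, midpoint and limits are illustrative only.

```r
library(iglu)
data("example_data_5_subject")

# Subject-sorted lasagna plot with a shifted target range and colorbar settings
p_cust <- plot_lasagna(example_data_5_subject, datatype = "average",
                       lasagnatype = "subjectsorted",
                       LLTR = 100, ULTR = 180,
                       midpoint = 150, limits = c(80, 500), tz = "EST")
p_cust
```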
plot_lasagna_1subject allows for customization of the more detailed single subject lasagna plots. There is no datatype parameter for plot_lasagna_1subject, but there are three types of plots available, accessed with the lasagnatype parameter.
As with the plot_lasagna function, changing the LLTR, ULTR, midpoint, and limits parameters will affect the colorbar.
plot_lasagna_1subject(example_data_1_subject, lasagnatype = 'daysorted', midpoint = 150, limits = c(80,500), tz = 'EST')
#> Warning in regularize.values(x, y, ties, missing(ties), na.rm = na.rm):
#> collapsing to unique 'x' values
#> Warning in regularize.values(x, y, ties, missing(ties), na.rm = na.rm):
#> collapsing to unique 'x' values
#> Warning in regularize.values(x, y, ties, missing(ties), na.rm = na.rm):
#> collapsing to unique 'x' values
In addition to visualizing absolute glucose values, iglu also allows visualizing local changes in glucose variability as measured by rate of change (Clarke et al., 2009). There are two types of visualizations associated with rate of change. The first is a time series plot of glucose values where each point is colored by the rate of change at that given time. Points colored in white have a stable rate of change, meaning the glucose is neither significantly increasing nor decreasing at that time point. Points colored red or blue represent times at which the glucose is significantly rising or falling, respectively. Thus colored points represent times of glucose variability, while white points represent glucose stability. The figure below shows a side by side comparison of rate of change time series plots for two subjects. Subject 1 shows significantly less glucose variability than Subject 5.
The next figure shows a side by side comparison of rate of change histogram plots for the same subjects. Once again, the colors show in what direction and how quickly the glucose is changing. The histogram plots allow the variation in rate of change to be assessed immediately. Extreme values on either end of the histogram indicate very rapid rises or drops in glucose, that is, a high degree of local variability. Here, Subject 1 once again shows lower glucose variability by having a narrower histogram with most values falling between -2 mg/dl/min and 2 mg/dl/min. Subject 5 has a shorter, more widely distributed histogram, indicating greater glucose variability.
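To make the quantity behind both plots concrete, the base R sketch below computes a rate of change in mg/dl/min from synthetic readings and labels each value as falling, stable or rising. It is only an illustration: the readings are invented, the simple consecutive-difference formula and the cutoffs of -2 and 2 mg/dl/min are assumptions, and iglu's own calculation may differ.

```r
# Synthetic readings on a 5-minute grid: minutes since start and glucose in mg/dL
time_min <- c(0, 5, 10, 15, 20, 25)
gl       <- c(120, 122, 135, 150, 148, 130)

# Rate of change between consecutive readings, in mg/dl/min
roc <- diff(gl) / diff(time_min)

# Label each value; the cutoffs of -2 and 2 mg/dl/min are illustrative
trend <- cut(roc, breaks = c(-Inf, -2, 2, Inf),
             labels = c("falling", "stable", "rising"))

data.frame(roc = roc, trend = trend)
```

A wide spread of `roc` values, with many labeled rising or falling, corresponds to the broader histogram described for Subject 5.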
The iglu package comes with a shiny app containing all of the metric calculations as well as all plot types of the package itself.
The full app can be accessed by running iglu::iglu_shiny() (iglu must be installed to use the iglu_shiny function).
The app itself has a demo available at https://stevebroll.shinyapps.io/shinyigludemo/ with data pre-loaded.