GGIR is an R package for processing multi-day raw accelerometer data for physical activity and sleep research. GGIR writes all of its output files into two sub-directories, ./meta and ./results. It is increasingly used by academic institutes across the world.
mMARCH.AC is an R package for processing accelerometer data after GGIR has been run. Specifically, it generates all the R/Rmd/shell files needed for this post-GGIR processing. Module 1 reads, transforms, and merges all csv files in the GGIR output directory. Module 2 checks the GGIR output files and summarizes them in one Excel sheet. Module 3 cleans the merged data according to the number of valid hours on each night and the number of valid days for each subject. Module 4 imputes the cleaned activity data using the average ENMO over all the valid days for each subject. Finally, modules 5-7 create a comprehensive report of the data processing using R Markdown; the report includes a few exploratory plots and multiple commonly used features extracted from minute-level actigraphy data. This vignette provides a general introduction to mMARCH.AC.
The R package mMARCH.AC is released on CRAN under the open-source GPL-3 license and runs on Windows and Linux. Parallel computing on Linux is recommended because of the memory required to read multiple large data files. The package contains one primary function for users which, when run, generates all the R, R Markdown, and shell executable files needed for processing accelerometer data after running GGIR. The generated workflow loads, reads, transforms, and merges the long activity data; examines and summarizes the GGIR outputs; cleans the merged activity data according to the number of valid hours per night and the number of valid days per subject; imputes the activity data by taking the average across the valid days for each subject; builds a comprehensive report of the data processing with exploratory plots; and extracts multiple commonly used features and studies the feature structure by covariance decomposition. Figure 1 presents a flowchart of these steps, each of which is described in greater detail below. The procedure, R functions, inputs, and outputs are all described in this package vignette. More documentation and example data can be found in the mMARCH.AC repository on GitHub (URL: https://github.com/WeiGuoNIMH/mMARCH.AC).
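The cleaning and imputation steps can be illustrated with a minimal R sketch on toy data. This is not the package's implementation: the matrix layout (days in rows, time slots in columns), the hypothetical thresholds `min_valid_fraction` and `min_valid_days`, and the choice to fill each gap with the mean of the same time slot over the valid days are all assumptions made for illustration.

```r
# Toy ENMO values for one subject: rows = days, columns = time slots.
# NA marks non-wear or otherwise invalid data.
enmo <- matrix(c(10, 20, NA, 40,
                 12, NA, 30, 44,
                 NA, NA, NA, NA,   # a day with no usable data
                 14, 24, 34, NA),
               nrow = 4, byrow = TRUE)

# Hypothetical QC thresholds (assumptions, not package defaults).
min_valid_fraction <- 0.5  # share of time slots that must be observed per day
min_valid_days     <- 3    # valid days required to keep the subject

# Cleaning: keep only days with enough observed data.
valid_day    <- rowMeans(!is.na(enmo)) >= min_valid_fraction
cleaned      <- enmo[valid_day, , drop = FALSE]
keep_subject <- sum(valid_day) >= min_valid_days

# Imputation: replace each remaining gap with the mean of that
# time slot across the subject's valid days.
slot_mean <- colMeans(cleaned, na.rm = TRUE)
imputed   <- cleaned
gaps      <- which(is.na(imputed), arr.ind = TRUE)
imputed[gaps] <- slot_mean[gaps[, "col"]]

print(keep_subject)
print(imputed)
```

In this toy example the third day is dropped during cleaning, and the three remaining gaps are filled from the other valid days of the same subject.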
Mirroring the GGIR structure of processing individual data files in multiple parts, the mMARCH.AC package is split into seven modules, numbered 1 to 7, that group functionality in logical processing order. Modules 1 to 4 are dedicated to data processing. Modules 5 to 7 produce R Markdown reports of data cleaning, feature extraction, and unsupervised covariance decomposition via the joint and individual variance explained (JIVE) method, respectively. The seven modules are carried out sequentially, with milestone data automatically saved locally.
To use mMARCH.AC, users first install and load the package and then run the create.shell() function, which creates a single R script. Users then edit this script, STUDYNAME_module0.maincall.R, to specify the arguments relevant to each of the seven modules. All optional arguments and their defaults are described in the package vignette. In addition, for users with access to a cluster for parallel processing, a shell script named module9_swarm.sh is created; with minor modifications by the user, it parallelizes the processing of individual files. These modifications are described in the package vignette.
Computationally, module 1, in which the activity data in .csv format are transformed and merged, is the most time-consuming task, taking up at least 60% of the processing time. Module 1 generally takes about 10 to 30 minutes to process a file with 14 days of data recorded at 30 Hz on a GeneActiv device, using 36 processor cores at 2.3 GHz (Intel Gold 6140). All output created by each module is described in the package vignette. Briefly, the outputs of modules 1 and 2 are saved in a directory structure with a depth of two, containing the output data and summaries for all participants. The reports for modules 5 to 7 are saved in .html format and are generated from R Markdown (.Rmd) files.
These .Rmd files are included in the output, giving users the flexibility to adapt the source code to their research purposes.
Figure 1: Overview of main steps and output in mMARCH.AC workflow.
All mMARCH.AC code is written in R, and the reports are generated with R Markdown. The R packages ActFrag and ActCR are used to calculate certain physical activity and circadian rhythmicity features. The R package r.jive is used to perform the feature interaction analysis and to study the joint and individual variation structure by JIVE.
Download and install RStudio (optional, but recommended)
Install mMARCH.AC with its dependencies. You can do this with one command from the R console:
install.packages("mMARCH.AC", dependencies = TRUE)
Alternatively, to install the latest development version, which includes the most recent bug fixes, use:
install.packages("remotes")
remotes::install_github("WeiGuoNIMH/mMARCH.AC")
library(mMARCH.AC)
create.shell()
The function creates a template script for mMARCH.AC in the current directory, named STUDYNAME_module0.maincall.R.
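The template listed in this section asks for a user-defined function that converts each raw file name into a subject ID. As a minimal sketch of that idea, the snippet below applies the "everything before the first dot" rule to two made-up file names. The helper name `filename_to_id` and the file names are hypothetical and chosen only for illustration; adapt the rule to the naming scheme of your own study.

```r
# Hypothetical helper: use the text before the first dot as the subject ID.
filename_to_id <- function(x) unlist(strsplit(x, "\\."))[1]

# Made-up raw file names, for illustration only.
files <- c("sub001.bin.csv", "sub002_visit2.bin")

# Apply the rule to every file name and return a plain character vector.
ids <- vapply(files, filename_to_id, character(1), USE.NAMES = FALSE)
print(ids)
```

If several file names map to the same ID (for example, repeated visits), a lookup table such as a csv file with one row per file name is usually a safer choice than a string rule.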
cat STUDYNAME_module0.maincall.R
argv = commandArgs(TRUE);
print(argv)
print(paste("length=",length(argv),sep=""))
mode<-as.numeric(argv[1])
print(c("mode =", mode))
# (Note) Please remove the above lines if you are running this within R console
# instead of submitting jobs to a cluster.
#########################################################################
# (user-define 1) you need to redefine this according different study!!!!
#########################################################################
# example 1
filename2id.1<-function(x) unlist(strsplit(x,"\\."))[1]

# example 2 (use csv file =c("filename","ggirID"))
filename2id.2<-function(x) {
  d<-read.csv("./mMARCH.AC/inst/extdata/example/filename2id.csv",head=1,stringsAsFactors=F)
  y1<-which(d[,"filename"]==x)
  if (length(y1)==0) stop(paste("Missing ",x," in filename2id.csv file",sep=""))
  if (length(y1)>=1) y2<-d[y1[1],"newID"]
  return(as.character(y2))
}
#########################################################################
# main call
#########################################################################
mMARCH.AC.shell<-function(mode,filename2id=NULL){

library(mMARCH.AC)
packageVersion("mMARCH.AC")
# ?mMARCH.AC.maincall # run help to see all argumengts
#########################################################################
# (user-define 2) Fill in parameters of your ggir output
##########################################################################
currentdir =
studyname =
bindir =
outputdir =
setwd(currentdir)

rmDup=FALSE # keep all subjects in mMARCH.AC
PA.threshold=c(50,100,400)
part5FN="WW_L50M100V400_T5A5"
epochIn = 5
epochOut = 60
use.cluster = FALSE
log.multiplier = 9250
QCdays.alpha = 7
QChours.alpha = 16
QCnights.feature.alpha = c(0,0,0,0)
DoubleHour= "average"
QC.sleepdur.avg=NULL
QC.nblocks.sleep.avg=NULL
useIDs.FN=NULL
Rversion="R"
desiredtz="US/Eastern"
RemoveDaySleeper=FALSE
NfileEachBundle=20
holidayFN=NULL
trace=FALSE
#########################################################################
# remove duplicate sample IDs for plotting and feature extraction
#########################################################################
if (mode==3 & rmDup){
# step 1: read ./summary/*remove_temp.csv file (output of mode=2)
keep.last<-TRUE #keep the latest visit for each sample
sumdir<-paste(currentdir,"/summary",sep="")
setwd(sumdir)
inFN<-paste(studyname,"_samples_remove_temp.csv",sep="")
useIDs.FN<-paste(sumdir,"/",studyname,"_samples_remove.csv",sep="")

#########################################################################
# (user-define 3 as rmDup=TRUE) create useIDs.FN file
#########################################################################
# step 2: create the ./summary/*remove.csv file manually or by R commands
d<-read.csv(inFN,head=1,stringsAsFactors=F)
d<-d[order(d[,"Date"]),]
d<-d[order(d[,"newID"]),]
d[which(is.na(d[,"newID"])),]
S<-duplicated(d[,"newID"],fromLast=keep.last) #keep the last copy for nccr
d[S,"duplicate"]<-"remove"
write.csv(d,file=useIDs.FN,row.names=F)
}
#########################################################################
# call afterggir
#########################################################################
setwd(currentdir)
mMARCH.AC.maincall(mode=mode,
useIDs.FN=useIDs.FN,
currentdir=currentdir,
studyname=studyname,
bindir=bindir,
outputdir=outputdir,
epochIn=epochIn,
epochOut=epochOut,
log.multiplier=log.multiplier,
use.cluster=use.cluster,
QCdays.alpha=QCdays.alpha,
QChours.alpha=QChours.alpha,
QCnights.feature.alpha=QCnights.feature.alpha,
DoubleHour= DoubleHour,
QC.sleepdur.avg=QC.sleepdur.avg,
QC.nblocks.sleep.avg=QC.nblocks.sleep.avg,
Rversion=Rversion,
filename2id=fi