# 24 July 2019, 2019 IMSM Workshop: Analysis and Visualization of Continuous Glucose Monitoring Da…

– All right, so we are the Rho group, with my coworkers (mumbles) back there; raise your hands. We've been working with continuously monitored glucose data. This problem was motivated by the fact that many wearable devices, such as fitness trackers and heart rate monitors, can now collect large amounts of continuously monitored data, and there is interest in understanding how to analyze these data without aggregating over the time observations. For example, a continuous glucose monitor, or CGM, records an individual's glucose level every five minutes. Over a day, that allows 288 observations per person; within a one-week period that's around 2,000 observations, and over about a month you're going to get around 8,000 observations. If I click on this link, this animation shows the glucose values being recorded over the course of a day for ten different individuals.
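That sampling arithmetic can be checked in a couple of lines; a trivial sketch, where the only assumption is the 5-minute recording interval:

```python
# A CGM records one glucose value every 5 minutes.
READINGS_PER_DAY = 24 * 60 // 5          # 288 observations per person per day

per_week = READINGS_PER_DAY * 7          # ~2,000 observations in a week
per_month = READINGS_PER_DAY * 30       # ~8,000 observations in a month

print(READINGS_PER_DAY, per_week, per_month)  # 288 2016 8640
```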

So you can see how, with studies involving a large number of individuals over weeks or months, the amount of data accumulates very quickly. The dataset we've been working with is from a clinical trial that compared two ways of monitoring glucose levels in individuals with Type 1 diabetes. The first is the CGM I already mentioned. The second is using the CGM together with the more traditional approach, BGM, or blood glucose monitoring. The trial was interested in whether using CGM on its own is as safe and effective as using both together. This dataset has actually been analyzed before, using the more traditional methods for large datasets, where you compute summary statistics such as treatment means or area under the curve for the two groups, something like the image on the left here, and then compare the means between the groups. We, though, are interested in leaving the data in their original format and, instead of aggregating, estimating functions that we can compare between the two treatment groups.

That way you can maybe gain some additional insight into the data that you wouldn't get by aggregating over the time period. To describe the clinical trial a little more: the first part included a run-in phase in which individuals came in for training and baseline measurements. At the time point referred to as week zero they were randomized into one of two groups, either CGM alone or CGM plus BGM, with twice the probability of being assigned to the CGM-only group. They were then required to come in for five visits after that point, spread over the following weeks. There were 226 individuals in the study, and the original dataset contains around 15 million observations.

We chose to work with only about three days around each of the visits. This image gives you an idea of the data we're working with; it's for one individual. Each line represents glucose values measured over the course of a day. The data are divided into the six visit periods that we're working with, and there are three lines within each visit period, representing three different days that we're going to use for replication. As you can see, there's a lot of variability even within one individual, so we're hoping the model can separate the signal from the noise in the data. OK. (mumbles)

– So our group has been developing several methods. I will introduce the first part, the intensive longitudinal data analysis. Longitudinal models don't actually require the whole dataset; they allow for missing values. And please keep in mind that our data have a multilevel structure: observations are nested in ***, and nested in subjects. Our first model is a marginal model fit with the GEE approach. This model solves estimating equation (2) to get the estimates, so you will find it is a non-parametric method and very robust, but it can only handle two levels of data structure.

The second model is a mixed model with different levels of variation, so we can use it to fully suit our three-stage structure. Here we have equation (4) for the likelihood. It requires more computation, because there are two integrals in the likelihood, but this model may be more suitable and accurate when you have a multilevel data structure. These models have fixed parameters, though, and we want to model a group difference that is time-varying. Time-varying means we turn the time coefficient into a function using a basis expansion. In our approach we always prepare a baseline, which is a piecewise polynomial function, and (cough drowns speech) for the longitudinal modeling I chose cubic splines, degree three, with seven knots.
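To make the basis-expansion idea concrete, here is a sketch of building a cubic B-spline basis for the time-varying coefficient with scipy; the knot placement is illustrative, not the talk's actual choice:

```python
import numpy as np
from scipy.interpolate import BSpline

# 288 five-minute time points over one day (hours), as in the CGM data.
time = np.arange(288) * 5 / 60
interior = np.linspace(3, 21, 7)   # seven interior knots (illustrative)
k = 3                              # cubic splines: degree three

# Full knot vector: boundary knots repeated k+1 times at each end.
t = np.r_[[0.0] * (k + 1), interior, [24.0] * (k + 1)]

# Design matrix: each column is one basis function evaluated on the grid.
# The time-varying coefficient is then beta(time) = B @ c for coefficients c.
B = BSpline.design_matrix(time, t, k).toarray()
print(B.shape)   # (288, 11): 7 interior knots + k + 1 = 11 basis functions
```

Multiplying this design matrix into the group indicator turns a single fixed group effect into a smooth curve over the day.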

If you have any questions, we can explain later. Next, I want to show some (sneeze drowns speech) from the first model, the marginal GEE model. As I said, GEE can only handle two levels of data structure, so I fixed the day here. There are two lines, meaning two groups, and you can see that the two groups actually differ a little in the very early morning but not during the daytime. The data I used were from *** patients, and they are (mumbles) observed (mumbles). We also fit the same dataset with the mixed model, with random effects introduced at the subject level and the visit level. It likewise detects a very small group difference at the beginning of the day and not much during the daytime. I would say these results need to be further tested on the full set of days; we have a very large sample of whole days that we did not use. It also needs to be checked whether the (mumbles). You may ask what the difference is between these two models. They each have their own advantages and disadvantages; which one is better just depends on the data structure and the research questions. But they seem to detect similar patterns in our data, which is a useful verification. My teammate has also treated the data with another method, functional data analysis, to give you some new ideas.

– So, like Biyi said, we are also interested in investigating a different kind of approach, to see if we can uncover different kinds of patterns in these data, and that comes from the framework of functional data analysis. I'll sketch the general idea of FDA; I'm not going to talk about the actual modeling results, Jin is going to do that, but I'll discuss a couple of preprocessing steps we needed to work through before Jin could run her FDA models. This slide gives a quick flowchart of the data preprocessing steps we used before fitting the FDA models. FDA requires a bit of a perspective shift. Instead of thinking about patient data as collections of pointwise observations with dependency structures attached to them, we now think of patient data as functional curves that have simply been discretized over some fixed timeframe. It's a shift in how we think about the kind of data we're working with and what form it takes. So, a couple of steps: we had to do some missing data imputation and then some functional alignment of the patient curves before fitting the FDA model. I'll talk through those in turn.

First up is the missing data imputation problem. In our dataset, approximately 1.5% of the glucose measurements were missing across all of our patients. The issue is that no missing values are allowed in our selected functional alignment framework, or in one of the FDA models (mumbles) that we tried. The solution was a simple one: we used linear interpolation to impute the missing observations in the patient glucose curves, filling in the time gaps. Here's a quick example to give you some flavor of what that linear interpolation looks like. At top we have an example patient curve prior to imputation. You can see that these devices often drop glucose measurements for short, or potentially longer, time intervals, depending on the user. Below, we've filled in those gaps with linear imputation, and for small gaps something this simple seems not totally unreasonable.
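A minimal sketch of that gap-filling, using pandas' interpolation over a 5-minute time index (the timestamps and values here are made up for illustration):

```python
import numpy as np
import pandas as pd

# One patient's readings with dropped stretches (NaN = sensor gap).
times = pd.date_range("2019-07-24 00:00", periods=12, freq="5min")
glucose = pd.Series(
    [110, 112, np.nan, np.nan, 121, 125, 124, np.nan, 118, 115, 113, 112.0],
    index=times,
)

# Time-based linear interpolation fills interior gaps with straight lines.
filled = glucose.interpolate(method="time")
print(filled.isna().sum())  # 0
```

For long gaps this produces exactly the suspicious straight segments discussed in the Q&A, which is why some high-level filtering of patients with large missing stretches was done first.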

After the missing data imputation, we move on to the functional alignment step, and it's really an important one in functional data analysis. The idea is that we want to remove phase variation, as we call it, from a collection of observed functions that we assume are generated by the same underlying process. Another way of thinking about it: we want to align these noisy clock-time functional observations to some global system time that we think truly dictates the underlying process generating the patient curves. There are a number of functional alignment frameworks out there for doing this; we chose a square-root velocity function (SRVF) framework to align our patient curves. It's well studied in the shape analysis literature, and it confers nice properties that some other frameworks do not. So let's look at what the SRVF alignment actually does with some example patient data.
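The core of the framework is the SRVF transform itself, q(t) = sign(f'(t)) * sqrt(|f'(t)|), under which comparing curves reduces to an L2 distance between their SRVFs. Here is a from-scratch sketch of the transform on two toy curves; libraries such as fdasrsf implement the full alignment, including the search over warping functions, which this sketch does not attempt:

```python
import numpy as np

def srvf(f, t):
    """Square-root velocity function of a curve f sampled at times t."""
    df = np.gradient(f, t)                    # numerical derivative f'(t)
    return np.sign(df) * np.sqrt(np.abs(df))

# Two toy "patient curves": the same meal peak, shifted in clock time.
t = np.linspace(0, 24, 500)
f1 = 120 + 40 * np.exp(-0.5 * ((t - 8) / 1.5) ** 2)
f2 = 120 + 40 * np.exp(-0.5 * ((t - 11) / 1.5) ** 2)

q1, q2 = srvf(f1, t), srvf(f2, t)

# L2 distance between SRVFs; alignment would then search over warps gamma
# to minimize || q1 - (q2 o gamma) * sqrt(gamma') ||.
dist = np.sqrt(np.sum((q1 - q2) ** 2) * (t[1] - t[0]))
print(dist > 0)
```

After an optimal warp is found, each curve is evaluated on the warped grid, which is what produces the aligned "system time" curves shown next.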

Here we've got a collection of 10 patient curves prior to alignment, over a 24-hour period, and it looks like a colorful plate of spaghetti. These patients have different lifestyle habits: they eat meals at different times, which produces peaks at varying times, and they expend energy at different times, based on differing activity levels and schedules, which corresponds to some of the drops. Now imagine you want to take some kind of summary statistic from these functional data. You would hope that summary statistic, maybe a mean, is reflective of the trends you see in each individual curve. But if you take a simple cross-sectional mean of this pile of functions as it stands, you might get something that doesn't adequately capture any of the peaks and valleys happening across all these patient curves at different time points. At bottom, once we do the functional alignment, you can see we're uncovering those peaks and drops in a much nicer way. If we take a cross-sectional mean of this bottom collection of functions, after alignment, the resulting mean function is likely to be a more useful summary statistic, because it captures the global features shared across these separate individuals that were simply masked by the time-dimension effect. SRVF alignment is a (mumbles) non-linear process: we're stretching and compressing the functions over different intervals along the length of each curve. This picture plots the warping path associated with each function, going from before alignment to after alignment. The wiggliness you see in each curve about the 45-degree line suggests, again, a high degree of non-linearity in the warping going from before alignment to after alignment.

– So now we move on to the functional data analysis after alignment. We would like to estimate the curve of glucose levels over the 24-hour day for each individual, from each group, at a given visit day. For estimation, first of all, we do not assume any underlying error distribution. Instead we use cubic B-splines with knots placed every 30 minutes across the 24 hours, because we would like to fit the data as closely as possible. Since we are using about 50 piecewise functions to estimate each individual curve, the model becomes quite complex, so we add a roughness penalty to trade smoothness against model complexity, so that the fit is suitably smooth. Based on a generalized (mumbles) cross-validation procedure, we chose the smoothing parameter lambda to be 100. That way we can keep a large number of basis functions while still controlling the smoothness of the fit.
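A rough sketch of this penalized-spline idea using scipy. `make_smoothing_spline` fits a cubic spline with a second-derivative roughness penalty, and when `lam` is omitted it chooses lambda by generalized cross-validation; the `lam=100` below just mirrors the value quoted in the talk and is not a recommendation, since the right scale depends on the units of the data:

```python
import numpy as np
from scipy.interpolate import make_smoothing_spline

rng = np.random.default_rng(1)

# Noisy 24-hour glucose-like signal sampled every 5 minutes.
t = np.linspace(0, 24, 288)
truth = 120 + 25 * np.sin(2 * np.pi * t / 24) + 15 * np.sin(4 * np.pi * t / 24)
y = truth + rng.normal(0, 10, t.size)

# Cubic smoothing spline with an explicit roughness penalty lam;
# lam=None would instead pick it by generalized cross-validation (GCV).
spline = make_smoothing_spline(t, y, lam=100.0)
fitted = spline(t)

# The penalized fit should sit closer to the truth than the raw noisy data.
rmse = np.sqrt(np.mean((fitted - truth) ** 2))
print(round(rmse, 2))
```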

This graph shows the estimated curve for one individual from the CGM and CGM plus BGM groups, after alignment, at visit four. The x-axis is system time, not the noisy clock time, and the y-axis is the glucose level. The points are observations and the line is the estimated glucose curve. You can see that the estimated curve fits the observations quite well; the deviation between the observations and the estimated curve is quite small, so our model fits very well.

The next slide shows the estimated group mean curves with plus or minus one standard deviation bands; the red line is the CGM-only group and the blue line is the CGM plus BGM group. As you notice, with the better alignment we found very little vertical deviation between the two estimated group mean curves. This suggests there is not much evidence of a difference between the two groups in terms of system time. We wanted to test that claim formally using a functional t-test. Since we do not assume any error distribution, we used a non-parametric approach, the permutation approach, to carry out the functional t-test. In the plot, the red line is the observed test statistic, the blue dashed line is the pointwise .05 critical value from the permutation approach, and the black line is the maximum .05 critical value. As you notice, the observed t statistic does not exceed either the pointwise .05 critical value or the maximum .05 critical value, so there is no significant deviation between the two group mean curves across system time.
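The permutation version of the functional t-test can be sketched directly; this is a simplified illustration on synthetic aligned curves, not the exact procedure used in the talk: group labels are shuffled, the pointwise t statistic is recomputed, and the maximum over time gives the conservative critical value:

```python
import numpy as np

rng = np.random.default_rng(2)

def pointwise_t(a, b):
    """Pointwise two-sample |t| statistic between two groups of curves.
    a, b: arrays of shape (n_curves, n_timepoints)."""
    na, nb = len(a), len(b)
    se = np.sqrt(a.var(axis=0, ddof=1) / na + b.var(axis=0, ddof=1) / nb)
    return np.abs(a.mean(axis=0) - b.mean(axis=0)) / se

# Synthetic aligned curves: both groups drawn from the same mean curve,
# mimicking the "no group difference" situation in the results.
t = np.linspace(0, 24, 96)
mean_curve = 120 + 25 * np.sin(2 * np.pi * t / 24)
group_a = mean_curve + rng.normal(0, 10, (30, t.size))
group_b = mean_curve + rng.normal(0, 10, (20, t.size))

observed = pointwise_t(group_a, group_b)

# Permutation null: shuffle group labels, recompute the max-over-time |t|.
pooled = np.vstack([group_a, group_b])
max_null = []
for _ in range(500):
    idx = rng.permutation(len(pooled))
    max_null.append(pointwise_t(pooled[idx[:30]], pooled[idx[30:]]).max())

crit_max = np.quantile(max_null, 0.95)   # maximum .05 critical value
print(observed.max(), crit_max)
```

Comparing the observed curve of |t| values against both the pointwise and the max-based critical lines is exactly the plot described above.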

This is consistent with the previous graph, where we also noticed very little deviation between the two estimated group mean curves. And now Katherine will talk about the summary and future work.

– All right, so to summarize what we did: we applied GEE, GAMM, and FDA approaches to the curves to compare functional trends between the two groups. We ran into some limitations, which lead to future work. We were only able to use portions of the data, because a lot of these methods are very computationally intensive. It was also difficult to capture the full correlation structure with these models. There could be better ways to impute and handle the missing data. And lastly, we didn't get around to including any covariates in the model, which might be of interest. OK, and finally, there's a picture of us at the Hunt library. (quiet laughter) Any questions?

(applause) – [Man] Questions for the Rho group. – [Assistant] Oh sorry.

(laughter) I thought you'd be over here first.

– [Audience Member] On slide 21, the functional data analysis part, (mumbles) about warping, I have a question. Did you use the warp-path analysis to try to categorize or classify either body types or lifestyles? The amount of warping, or the amount of alignment, needed to bring someone in line with the rest of the group: can that tell you something about people?

– Hmm, that's an interesting thought. (he talks over audience member) We hadn't really considered that, but that's a cool way of thinking about using this information in a more meaningful way. Honestly, what we did is slightly different from how the shape analysis literature would fully integrate SRVF alignment into a full-on statistical analysis. We selected a baseline patient curve based on the idea that you probably need three meals a day, generally, plus maybe a snack, so we expect to see three or four big peaks over a 24-hour period for an ideal patient day. We chose a baseline patient curve from our dataset that reflected that idea qualitatively, and then aligned the rest of the patient curves to that function. We did that because doing it the more vertically integrated way with SRVF alignment is super computationally intensive: it requires working in the quotient space after you've accounted for the warping functions, computing some kind of ***, and then using a dynamic programming algorithm to align all the functions to some global mean. I think if we went down that route, we could really start to use exactly the idea you suggested, extracting more meaning out of the warp paths and using them in (cough drowns out speech); some kind of distance computation on the quotient space with these warp paths would be really interesting to look at. Yep, great idea. (mumbles)

– [Audience Member] Probing along similar lines, I was trying to think of how you could use this for prediction, because what you essentially seem to be doing is transferring all the uncertainty from the glucose data into the warping...

– Totally. Yeah, totally, exactly. And again, a limitation of our approach was the computational ceiling we had to deal with; you know, we were running these on our laptops, not with serious compute resources. Yeah, I think the point both of you are making is that we could probably make much better use of these warping paths themselves. – [Audience Member] ***

Were you able to track, for a patient across different visits, or across different days, whether their warp paths are similar? Basically the same?

– So, we didn't look at that so much. Based on the way we approached modeling the correlation structure, we ultimately ended up looking within individual visits, and so we didn't use that information in a way that would probably have been nice to.

– So we did incorporate correlation structure in the *** models. Actually, there may still be other correlation that needs to be further examined, but I think that is another avenue for the (faintly speaking).

– [Audience Member] I've got another question on it. Can you put up the slide where you showed the alignment, before and after?

– [Speaker] Is this one?

– That one, yeah. So, the bottom looks really nice, but I guess this warping means you're skewing things horizontally, in this direction.

– [Male Speaker] Yeah, in the time direction.

– So in a way you're losing some time-varying features, like the rate at which a curve drops, right? You're losing that information.

– Yeah, yeah. We used the data in the way they were originally recorded.

– Sure, so what I'm wondering is: is it not possible to create a surrogate for the dataset, where you just record the heights of the peaks, the troughs, and the time intervals between them? It would be a lot less data, and it would still capture the features that you're losing anyway by doing the warping. I know it's not this project, but effectively you're losing a bunch of information, which suggests you could create a feature (noise).

– I think I understand what you're saying, but I think the advantage of this SRVF alignment is that we're still respecting the time dimension in some way. When we're casting things in the FDA framework, the kinds of summary objects we might want to use, we would ideally like them to be functions that still respect some kind of time...

– And because you keep the warping information.

– [Male Speaker] Yeah, right, right. Exactly.

– There's no loss of mapping.

– [Male Speaker] Right, right.

– That's fair enough. All right, any other questions? Go ahead.

– [Audience Member] You talked about how you had some dropouts in the data and that you linearly interpolated over them. Did you have any rule for the maximum time period you thought it would be realistic to interpolate over?

– So we did some high-level filtering. Patients who were missing much of a visit period we eliminated from the get-go, because for them the imputation would have been a horizontal line; you might not even have had a starting position for some of those patients. Beyond that, we did a qualitative check of how silly the imputation looked, and in some cases it was quite silly, right? A patient might drop observations because, I don't know, they rolled over onto the device in bed and it stopped recording. In those cases, if the gap was at the end of the three-day consecutive period, the imputation was just a line. So there's a question, I think: if we go back to using this SRVF framework as it's originally intended, and we can get some kind of mean function within a patient, say across visits, that might give us some sense of their individual behavior, and then we could probably do the imputation in a much better way. That's something we thought about for sure, but didn't get around to doing. (laughter)

– (mumbles) and everything.

– [Man 1] (faint speech)

– [Man 2] It's been a really exciting set of results from the students' work. We are confronted with these types of data more and more often as we gather more and more devices, and I feel that their ideas will allow us to bring some of these approaches to the analysis of our (faint speech) data. So we're very pleased, and I want to congratulate them for their efforts.

– [Man 1] Right. OK, so let's thank them. (applause)