24 July 2019, 2019 IMSM Workshop: Analysis and Visualization of Continuous Glucose Monitoring Data

January 12, 2020, by Jose Scott


– All right, so we are the Rho group, with my coworkers and (mumbles) back there, raise your hands. We've been working with continuously monitored glucose data. This problem is motivated by the fact that many wearable devices, such as fitness trackers and heart rate monitors, can now collect large amounts of continuously monitored data, and there's interest in understanding how we can analyze that data without aggregating over the time observations. For example, a continuous glucose monitor, or CGM, records an individual's glucose level every five minutes. Over one day that gives 288 observations per person; over a week, around 2,000 observations; and over a month, around 8,000 observations. If I click on this link, this animation shows the glucose values recorded over the course of a day for 10 different individuals. So you can see how, in a study with a large number of individuals followed over weeks or months, the amount of data accumulates very quickly.
The dataset we've been working with comes from a clinical trial that compared two ways of monitoring glucose levels in individuals with type 1 diabetes. The first is the CGM I already mentioned. The second uses both CGM and the more traditional approach, called BGM, or blood glucose monitoring. The investigators wanted to know whether using CGM on its own is as safe and effective as using both together.

This data has actually been analyzed before, using the more traditional approach to these large data sets: compute summary statistics, such as treatment means or area under the curve, for the two groups, something like what you see in the image on the left, and then compare the means between the groups. We're interested instead in leaving the data in its original format and, rather than aggregating over it, estimating functions that we can compare between the two treatment groups, so that we can maybe gain some additional insight that aggregating over the time period would hide.

To describe the clinical trial a little more: the first part included a run-in phase in which individuals came in for training and baseline measurements.
At the time point referred to as week zero, they were randomized into one of the two groups, either CGM only or CGM plus BGM, with individuals twice as likely to be assigned to the CGM-only group. They were then required to come back for five visits over the following weeks. There were 226 individuals in the study, and the original data set contains around 15 million observations. We chose to work with only about three days around each of the visits.
This image gives you an idea of the data we're working with; it shows one individual. Each line represents the glucose values measured over the course of a day. The panel is divided into the six visit periods we're working with, and there are three lines within each visit period, representing the three different days we're using for replication. As you can see, there's a lot of variability even within one individual, so we're hoping the models can separate the signal from the noise in the data. OK. (mumbles)
– So our group has developed several methods. I'll introduce the first part, the intensive longitudinal data analysis. The longitudinal models don't require the data set to be complete; they allow for missing values. And please keep in mind that our data have a multilevel structure: observations from different days are nested within visits (***), and visits are nested within subjects. Our first model is a marginal model fit with the GEE approach. This model is estimated by solving the estimating equations (equation 2 on the slide). It's a semi-parametric method that makes no full distributional assumptions, and it's very robust, but it can only handle two levels of clustering in the data structure.
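As a minimal sketch of what a marginal GEE fit like this could look like in Python's statsmodels (the talk doesn't specify the group's software; the column names glucose, minute, group, subject and the file name below are placeholders, and the spline and correlation choices are illustrative):

```python
# Sketch of a marginal (population-average) fit with GEE. Observations are
# clustered by subject only, since GEE handles a single level of clustering.
# Columns glucose, minute, group, subject and the file name are placeholders.
import pandas as pd
import statsmodels.api as sm
import statsmodels.formula.api as smf

df = pd.read_csv("cgm_long.csv")  # long format: one row per 5-minute reading

model = smf.gee(
    "glucose ~ bs(minute, df=7, degree=3) * group",  # spline basis lets the group effect vary over the day
    groups="subject",
    data=df,
    cov_struct=sm.cov_struct.Exchangeable(),         # working correlation within subject
    family=sm.families.Gaussian(),
)
result = model.fit()
print(result.summary())
```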
The second model is a mixed model with different levels of variation, so it can fully accommodate our three-level data structure. Here we have equation 4 (on the slide) for the likelihood. The computation is heavier because there are two integrals in the likelihood, but this model may be more suitable and accurate when you have a multilevel data structure.

Those are models with fixed coefficients, though, and we want to model a group difference that is time-varying. Time-varying means we turn the time coefficient into a function using a basis expansion. In our approach we use a spline basis, which is a piecewise polynomial function, and (cough drowns speech); for the longitudinal modeling I chose a cubic spline, degree three, with seven knots. If you have any questions about that, we can explain later.
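A comparable sketch of a mixed model with subject- and visit-level random effects in statsmodels; this approximates the multilevel structure described here, not the group's exact specification, and the column names are again placeholders:

```python
# Sketch of a linear mixed model with subject-level random intercepts and a
# visit-within-subject variance component, mirroring the three-level
# day/visit/subject structure. Column names are placeholders.
import pandas as pd
import statsmodels.formula.api as smf

df = pd.read_csv("cgm_long.csv")  # long format: one row per 5-minute reading

model = smf.mixedlm(
    "glucose ~ bs(minute, df=7, degree=3) * group",  # cubic spline for the time-varying group difference
    data=df,
    groups="subject",                                # top level: subject
    re_formula="1",                                  # random intercept per subject
    vc_formula={"visit": "0 + C(visit)"},            # variance component for visits within subject
)
result = model.fit()
print(result.summary())
```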
Next I want to show some results (sneeze drowns speech) from the first model, the marginal GEE model. As I said, GEE can only handle two levels of data structure, so I fixed the day here. There are two lines, one per group. You can see that the two groups actually differ a little bit in the early morning, but not during the daytime. The data I used here are from *** patients, and they are (mumbles) observed (mumbles).

We also put the same data into the mixed model, with random effects introduced at the subject level and the visit level. It also detects a very small group difference at the beginning of the day and not much during the daytime. These results still need to be tested further on the whole data set; we have a much larger sample of whole days that we did not use here. It also needs to be checked whether the (mumbles) that (mumbles).

You may ask what the difference is between these two models. I'd say they each have their own advantages and disadvantages, and which one is better depends on the data structure and the research questions. But they seem to detect similar patterns in our data. My teammates also analyzed the data with another method, functional data analysis, which may give you some new ideas.
– So like Biyi said, we're also interested in investigating a different kind of approach, to see whether we can uncover different kinds of patterns in these data, and that's the framework of functional data analysis. I'll set up the general idea of FDA; I'm not going to talk about the actual modeling results, Jin is going to do that, but I'll discuss the couple of preprocessing steps we needed to work through before Jin could run her FDA models.

This slide gives a quick flowchart of the data preprocessing steps we used before fitting the FDA models. It requires a bit of a perspective shift when we start thinking about functional data analysis. Instead of thinking of patient data as collections of pointwise observations with some dependency structure, we now think of patient data as functional curves that have simply been discretized over some fixed time frame. So it's a shift in how we think about the kind of data we're working with and what form it takes. There were a couple of steps, some missing data imputation and then some functional alignment of the patient curves, prior to fitting the FDA model. I'll talk through those in turn.
First up is the missing data imputation problem. In the data set we used, approximately 1.5% of the glucose measurements were missing across all of our patients. The issue is that no missing values are allowed in our selected functional alignment framework, or in one of the FDA models (mumbles) that we tried. So the solution was a simple one: we use linear interpolation to impute the missing observations in the patient glucose curves, so we can fill in these time gaps.

Here's a quick example of what that linear interpolation actually looks like. At the top we have an example patient curve prior to imputation. You can see that these devices often drop glucose measurements for short, or sometimes longer, time intervals, depending on the user. Afterwards, you can see that we've filled in these gaps with linear imputation, and for small gaps it doesn't seem totally unreasonable to go with something this simple.
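A minimal sketch of this linear-interpolation imputation on a single simulated patient day, using pandas; the 5-minute grid and the gap location are toy values:

```python
# Toy example of the linear interpolation described above, on one simulated
# patient day (288 readings on a 5-minute grid) with a 30-minute dropout.
import numpy as np
import pandas as pd

times = pd.date_range("2019-01-01", periods=288, freq="5min")
glucose = pd.Series(120 + 30 * np.sin(np.linspace(0, 2 * np.pi, 288)), index=times)
glucose.iloc[40:46] = np.nan                  # simulate a 30-minute gap in the readings

imputed = glucose.interpolate(method="time")  # straight line across the gap
```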
After the missing data imputation, we move on to the functional alignment step, and it's a really important one in functional data analysis. The idea is that we want to remove phase variation, as we call it, from a collection of observed functions that we assume are generated by the same underlying process. Another way of thinking about this is that we're aligning noisy clock-time functional observations to some global system time that we think truly dictates the underlying process generating these patient curves. There are a number of functional alignment frameworks for doing this. We ended up choosing a square-root velocity function (SRVF) framework to align our patient curves. It's well studied in the shape analysis literature, and it confers nice properties in ways that some other frameworks do not.
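For concreteness, a small numpy sketch of the square-root velocity transform that underlies this framework; full alignment additionally optimizes over warping functions (typically by dynamic programming, e.g. in the fdasrsf package), which isn't shown here, and the curve below is a toy stand-in:

```python
# Minimal sketch of the square-root velocity function (SRVF) transform:
# q(t) = sign(f'(t)) * sqrt(|f'(t)|). Aligning two curves then amounts to
# minimizing the L2 distance between their SRVFs over warping functions.
import numpy as np

def srvf(f: np.ndarray, t: np.ndarray) -> np.ndarray:
    """SRVF of one discretized curve f sampled at times t."""
    df = np.gradient(f, t)                  # numerical derivative f'(t)
    return np.sign(df) * np.sqrt(np.abs(df))

t = np.linspace(0, 24, 288)                 # 24 hours on a 5-minute grid
f = 120 + 30 * np.sin(2 * np.pi * t / 24)   # toy glucose-like curve
q = srvf(f, t)
```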
We can take a look at what this SRVF alignment is actually doing with some example patient data. Here we have a collection of 10 patient curves prior to alignment, over a 24-hour period, and it looks like a colorful plate of spaghetti. These patients have different lifestyle habits: they eat meals at different times, which causes peaks at varying times, and they expend energy at different times, based on differing activity levels and different daily schedules, which corresponds to some of the drops. Now imagine you want to take some kind of summary statistic from these functional data. You would hope that summary statistic, maybe a mean, reflects the kinds of trends you see in each individual curve. But if you take a simple cross-sectional mean of this pile of functions as it stands, you might get something that doesn't adequately capture the peaks and valleys we see happening across all these patient curves, just at different time points.

At the bottom, once we do the functional alignment, you can see that we're uncovering some of those peaks and drops in a much nicer way. If we take a cross-sectional mean of this bottom collection of functions, after alignment, the resulting mean function may be a much more useful summary statistic, because it captures some of the global features we see among the separate individuals that were simply masked by the time-dimension effect.

SRVF alignment is a (mumbles) non-linear process: we're stretching and compressing these functions over different intervals along the length of each curve. This picture is just a plot of the warping paths associated with each function, going from before alignment to after alignment. The wiggliness you see in each of these curves about the 45-degree line suggests there's a high degree of non-linearity in the warping paths.
– So now we move on to the functional data analysis after alignment. We would like to estimate the curve of glucose levels over the 24-hour day for each individual, from each group, at a given visit day. For the estimation, first of all, we do not assume any underlying error distribution. Instead we use a cubic B-spline with knots placed every 30 minutes over the 24 hours, because we would like to fit the data as closely as possible. Since we're using about 50 piecewise functions to estimate each individual curve, the model becomes more complex, so we add a roughness penalty to balance smoothness against model complexity and keep the fitted curves smooth. Based on a generalized cross-validation procedure, we chose the smoothing parameter lambda to be 100. That way we can keep a large number of basis functions while still controlling the smoothness of the model.
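A rough SciPy sketch of this kind of penalized cubic B-spline fit; the knot spacing matches the 30-minute placement described, while the GCV-chosen penalty below is SciPy's own rather than the lambda = 100 reported, and the data are simulated:

```python
# Rough SciPy analogue of the spline fit described here: a cubic B-spline with
# interior knots every 30 minutes, plus a penalized smoothing-spline fit whose
# roughness penalty is chosen by generalized cross-validation (SciPy >= 1.10).
import numpy as np
from scipy.interpolate import LSQUnivariateSpline, make_smoothing_spline

t = np.linspace(0, 24, 288)                          # 5-minute grid over 24 hours
y = 120 + 30 * np.sin(2 * np.pi * t / 24) + np.random.normal(0, 10, t.size)

knots = np.arange(0.5, 24, 0.5)                      # interior knots every 30 minutes
basis_fit = LSQUnivariateSpline(t, y, knots, k=3)    # unpenalized cubic B-spline

gcv_fit = make_smoothing_spline(t, y)                # lam=None -> GCV-selected penalty
smoothed = gcv_fit(t)
```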
This graph shows the estimated curve for one individual from the CGM-plus-BGM group, after alignment, at visit four. The x-axis is the system time, not the noisy clock time, and the y-axis is the glucose level. The points are the observations and the line is the estimated glucose curve. You can see the estimated curve fits the observations quite well; the deviation between the observations and the estimated curve is very small, so the model fits well.

The next slide shows the estimated group mean curves with plus or minus one standard deviation bands. The red line is the CGM-only group and the blue line is the CGM-plus-BGM group. As you can see, after alignment there is very little vertical deviation between the two estimated group mean curves. This tells us there is not much evidence of a difference between the two groups in terms of system time.
We wanted to actually test this claim using a functional t-test. Since we do not assume any error distribution, we use a non-parametric approach, namely a permutation approach: we carry out the functional t-test through permutation. The red line is the observed test statistic, the blue dashed line gives the pointwise 0.05 critical values from the permutation approach, and the black line is the maximum 0.05 critical value. As you can see, the observed t statistics never exceed either the pointwise 0.05 critical values or the maximum 0.05 critical value, so there is no significant deviation between the two group mean curves across system time. This agrees with the previous graph, where we also noticed very little deviation between the two estimated group mean curves.
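A sketch of this kind of permutation-based functional t-test on toy arrays; the real analysis would use the aligned, smoothed glucose curves for the two randomized groups:

```python
# Toy sketch of the permutation-based functional t-test: pointwise t statistics
# between two groups of curves, with pointwise and maximum 0.05 critical values
# from permuted group labels. Arrays stand in for the real aligned curves.
import numpy as np

rng = np.random.default_rng(0)
cgm = rng.normal(140, 20, size=(30, 288))        # CGM-only group (subjects x time points)
cgm_bgm = rng.normal(140, 20, size=(15, 288))    # CGM+BGM group

def pointwise_t(a, b):
    """Absolute two-sample t statistic at every time point (Welch form)."""
    se = np.sqrt(a.var(axis=0, ddof=1) / len(a) + b.var(axis=0, ddof=1) / len(b))
    return np.abs(a.mean(axis=0) - b.mean(axis=0)) / se

observed = pointwise_t(cgm, cgm_bgm)

pooled, n1 = np.vstack([cgm, cgm_bgm]), len(cgm)
perm = np.array([
    pointwise_t(pooled[p[:n1]], pooled[p[n1:]])
    for p in (rng.permutation(len(pooled)) for _ in range(1000))
])

pointwise_crit = np.quantile(perm, 0.95, axis=0)   # pointwise 0.05 critical values
max_crit = np.quantile(perm.max(axis=1), 0.95)     # maximum 0.05 critical value
exceeds = observed > max_crit                      # time points with a significant difference
```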
Now Katherine will talk about the summary and future work.

– All right, so to summarize: we applied GEE, GAMM, and FDA approaches to the glucose curves to compare functional trends between the two groups. We ran into some limitations, which lead to future work. We were only able to use portions of the data, because a lot of these methods are very computationally intensive. It was also difficult to capture the full correlation structure with these models. There could be better ways to impute and handle the missing data. And lastly, we didn't get around to including any covariates in the models, which might be of interest. OK, and finally there's a picture of us at the Hunt Library. (quiet laughter) Any questions?
(applause)

– [Man] Questions for the Rho group.

– [Assistant] Oh sorry. (laughter) I thought you'd be over here first.

– [Audience Member] On slide 21, the functional data analysis part of (mumbles) about warping, I have a question. Did you use warp path analysis to actually start to categorize or classify either body types or lifestyles? The amount of warping, or the amount of alignment, needed to put someone with the other group, can that tell you something about people?
– Hmm, that's an interesting thought. (he talks over audience member) We hadn't really considered that, but that's a cool way of thinking about using this information in a more meaningful way. Honestly, what we did is slightly different from what you'd do, in the shape analysis literature, if you wanted to fully integrate this SRVF alignment into a full-on statistical analysis. We selected a baseline patient curve based on the idea that you generally need three meals a day, plus maybe a snack, so we expect to see three or four big peaks throughout a 24-hour period for an idealized patient day. We chose a baseline patient curve from our dataset that qualitatively reflected that idea, and then aligned the rest of the patient curves to that function. We did that because doing it the more fully integrated way, using SRVF alignment in full, is super computationally intensive: it requires working in the quotient space after you've accounted for the warping functions, computing some kind of ***, and then using a dynamic programming algorithm to align all the functions to some global mean. If we go down that route, then we can really start thinking about exactly the idea you suggested, extracting more meaning from these warp paths and using them in (cough drowns out speech). So maybe some kind of distance computation on this quotient space with these warp paths would be really interesting to look at. Yep, great idea. (mumbles)
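As a sketch of the fuller pipeline described in this answer, assuming the fdasrsf package's fdawarp interface (method and attribute names follow its recent documentation and should be checked against the installed version); the data below are toy values:

```python
# Sketch of the fuller SRVF pipeline: align all curves to a Karcher mean and
# keep the warping functions for downstream use (distances, clustering, etc.).
# Assumes the fdasrsf package; names should be checked against your version.
import numpy as np
import fdasrsf as fs

t = np.linspace(0, 24, 288)                       # 5-minute grid over 24 hours
f = np.random.normal(140, 20, size=(288, 10))     # 10 toy curves, one per column

warp = fs.fdawarp(f, t)
warp.srsf_align()          # dynamic-programming alignment to a Karcher mean

aligned = warp.fn          # aligned curves
gammas = warp.gam          # estimated warping functions, one per curve
```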
– [Audience Member] Probing along similar lines, I was trying to think of how you could use this for prediction, because what you're essentially doing is transferring all the uncertainty from the glucose data into the warping–

– Totally. Yeah, totally, exactly. And again, a limitation of our approach was this computational ceiling we had to deal with. You know, we're running these on our laptops, not with nice compute resources. But yeah, I think the point both of you are making is that we could probably make much better use of these warping paths themselves.
– [Audience Member] *** Were you able to track, for a patient across different visits, or across different days, whether their warp paths are similar? Like, basically the same?

– So, we didn't look at that so much. Based on the way we approached modeling the correlation structure, we ultimately ended up looking within individual visits, and so we didn't use that information in a way that probably would have been nice to.

– We did incorporate correlation structure in the *** models. Actually, there may still be other correlation that needs to be examined further, but I think that's another avenue for the (faintly speaking).
– [Audience Member] I've got another question on that. Can you put up the slide where you showed the alignment, before and after?

– [Speaker] Is this one?

– That one, yeah. So, the bottom looks really nice, but I guess this warping means that you're skewing things horizontally, in this direction.

– [Male Speaker] Yeah, in the time direction.

– So in a way you're losing some time-varying things, like the rate at which a curve drops, right? You're losing that information.

– Yeah, yeah. We used it in the way that the data were originally recorded.

– Sure, so what I'm wondering is, is it not possible to create a surrogate for the dataset, where you just record the heights of the peaks, the troughs, and the time intervals between them? It would be a lot less data, and it would still capture the features that you're losing anyway by doing the warping. I know you've gotten (mumbles). It's not this project, but effectively you're losing a bunch of information, which suggests that you could create a feature (noise).

– I think I understand what you're saying, but I think the advantage of this SRVF alignment is that we're still respecting the time dimension in some way. When we're thinking about casting this in the FDA framework, the kinds of summary objects we might want to use, we would, I think, ideally like to use functions that still respect some kind of time–

– And because you keep the warping information.

– [Male Speaker] Yeah, right, right. Exactly, exactly.

– There's no loss of mapping.

– [Male Speaker] Right, right.

– That's fair enough. All right, any other questions? Go ahead.
– [Audience Member] You talked about how you had some dropouts in the data and that you linearly interpolated over them. Did you have any kind of rule for a maximum time period that you thought it would be realistic to interpolate over?

– So we did some high-level filtering. Patients who were missing much of a visit period we eliminated from the get-go, because you would have ended up with a horizontal line; you may not even have had a starting position for some of those patients to interpolate from. Beyond that, we did maybe a qualitative check to see how silly the imputation was, and in some cases it was quite silly, right? A patient might end up dropping observations, (mumbles) on the bed, I don't know, they rolled over onto the device or something, and it's not recording. In those cases, if it happened at the end of that three-day consecutive period, the imputation was just a straight line. And so there's a question, I think, if we go back to using this SRVF framework as it's originally intended, whether we could get some kind of mean function within a patient, say across visits, that gives us some sense of their individual behavior; then we could probably do this imputation in a much better way. That's something we thought about for sure, but didn't get around to doing. (laughter)
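A sketch of the kind of maximum-gap rule the question is getting at, on a toy series: interpolate only short outages and leave long ones missing (the 30-minute threshold and the gap locations are illustrative choices, not the group's rule):

```python
# Toy sketch of a maximum-gap rule: linearly interpolate only runs of missing
# readings up to max_gap points (here 6 readings = 30 minutes) and leave longer
# outages missing rather than drawing a long straight line through them.
import numpy as np
import pandas as pd

times = pd.date_range("2019-01-01", periods=288, freq="5min")
glucose = pd.Series(120 + 30 * np.sin(np.linspace(0, 2 * np.pi, 288)), index=times)
glucose.iloc[10:14] = np.nan      # 20-minute gap -> filled
glucose.iloc[100:150] = np.nan    # multi-hour outage -> left missing

def fill_short_gaps(s: pd.Series, max_gap: int) -> pd.Series:
    """Interpolate NaN runs of length <= max_gap; leave longer runs as NaN."""
    filled = s.interpolate(method="time", limit_area="inside")
    is_na = s.isna()
    run_len = is_na.groupby((~is_na).cumsum()).transform("sum")  # length of each NaN run
    filled[is_na & (run_len > max_gap)] = np.nan                 # undo fills in long outages
    return filled

imputed = fill_short_gaps(glucose, max_gap=6)
```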
– (mumbles) and everything.

– [Man 1] (faint speech)

– [Man 2] It's been a really exciting result from the students' work. We are confronted with these types of data more and more often as we gather more and more devices, and I feel that their ideas will allow us to bring some of these approaches into the analysis of our (faint speech) data. So we're very pleased, and I want to congratulate them for their efforts.

– [Man 1] Right. OK, so let's thank them. (applause)