Data science and predictive analytics are fast becoming part of our everyday lives. Election forecasts, Google’s suggested searches and even your Netflix home feed show how predictive technologies shape what we see and do each day. One of the biggest applications of predictive analytics is in the medical field.
Last year, my daughter was diagnosed with Type 1 diabetes. I’m using data science and predictive analytics to help her manage her condition.
Understanding Type 1 diabetes
Typically, when people hear the term “diabetic,” they immediately think of Type 2 diabetes (otherwise known as Adult-Onset or Noninsulin-Dependent diabetes). Type 2 diabetics can be treated through weight management, diet and the use of insulin. However, Type 1 diabetes is very different and is actually considered an autoimmune disorder.
Many people know that the body uses insulin to absorb sugars like glucose, which the brain uses for energy. But most people don’t realize that even a marginal disruption of this delicate balance can lead to life-threatening conditions, even death. This is especially true for Type 1 diabetics, who are 100% dependent on synthetic insulin injections.
Because they do not produce enough (or any) insulin on their own, Type 1 diabetics need multiple insulin injections every day for the rest of their lives.
To put this into perspective, the following picture shows the number of injections a typical Type 1 diabetic receives within a year.
Image from T-Hill’s Type 1 Blog
There’s a tremendous amount of medical research dedicated to diabetes, but there appears to be no immediate cure on the horizon. Thankfully, though, there are very active parents out there who refuse to wait. They may not have medical degrees but many of them come from very technical backgrounds, which is how groups like Nightscout have formed.
How Nightscout helped me track my daughter’s diabetes
Full disclosure: I am neither affiliated with nor a member of Nightscout.
Nightscout is an open source project that allows real-time access to CGM (Continuous Glucose Monitoring) data. The project was started by parents of children with Type 1 diabetes and continues to be developed, maintained and supported by volunteers. I came across it by accident while trying to better understand my daughter’s glucose levels, which are collected 24/7 by her Dexcom sensor. This small sensor measures glucose concentrations within the subdermal interstitial fluid.
Images from Animas
I was conducting research on her glucose levels because I was looking for patterns and clues about better ways to control her blood sugar levels.
In general, our bodies require our blood sugar to stay within a certain range (70-180 mg/dL). If the blood sugars go out of range, people can experience emotional, mental and physical symptoms that can range from mood changes to even death. To keep the blood sugars at ideal levels, the body employs incredibly complex feedback loops with the primary one being the production and release of insulin. For diabetics, this means understanding when and how much insulin should be delivered.
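To make the range concrete, here is a minimal Python sketch (not the analysis behind this post) that sorts readings into low, in-range and high bands using the 70-180 mg/dL limits quoted above. The function names and the sample readings are made up for illustration.

```python
LOW_MG_DL = 70    # lower bound of the range quoted above
HIGH_MG_DL = 180  # upper bound of the range quoted above


def classify(reading_mg_dl):
    """Label a single glucose reading as 'low', 'in_range' or 'high'."""
    if reading_mg_dl < LOW_MG_DL:
        return "low"
    if reading_mg_dl > HIGH_MG_DL:
        return "high"
    return "in_range"


def time_in_range(readings):
    """Return the fraction of readings that fall in each band."""
    if not readings:
        raise ValueError("need at least one reading")
    counts = {"low": 0, "in_range": 0, "high": 0}
    for reading in readings:
        counts[classify(reading)] += 1
    return {band: n / len(readings) for band, n in counts.items()}


# Illustrative values only, not real sensor data.
sample = [65, 82, 110, 150, 185, 210, 175, 140]
print(time_in_range(sample))
```

A summary like this says how often the limits were crossed, but not how quickly the readings were moving when they crossed.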
When I started looking at the data from my daughter’s sensor, I put my analytical skills to work. The first order of business was to understand the data the sensor was collecting. Nightscout does an excellent job of outlining the fundamentals of reading data from my daughter’s Dexcom sensor:
“Glucose levels in interstitial fluid always lag behind real blood glucose values by several minutes. The Dexcom receiver classifies the raw sensor data as being clean when values are stable or relatively predictable. When sensor data is less predictable or more variable, Dexcom classifies the data as having light, moderate or heavy noise. Light, moderate and heavy noise levels can be induced by a number of factors. These factors include real rises or drops in blood sugar, sensor compression, dehydration, excessive moisture around the sensor and transmitter, a poorly secured or failing sensor, and other factors.” – From Nightscout’s Dexcom data blog
With this information, I realized that managing my daughter’s glucose levels resembled both a time-series problem and a control problem. Control problems are incredibly challenging in both mathematical formulation and tuning, so I’m setting that aside for future discussion. More importantly, I was looking for simple solutions that I could implement quickly. This led me to focus on the time-series aspect, starting with a preliminary data mining exercise.
Reactive vs. proactive management
Fortunately, my daughter’s Dexcom monitor provides a handful of simple but effective alarms that can be used to help monitor blood sugar levels. There is a series of upper and lower limits, along with rate-of-change alarms. One of the major weaknesses of using only the upper and lower limits is that this type of monitoring is reactive; a diabetic person must react to the alarm by either administering insulin or consuming additional carbohydrates.
Unfortunately, because there is a delay both in the sensor reading and in when the effect of the reaction (insulin or carbohydrate consumption) is detected, blood sugar can swing wildly, as shown in my daughter’s sensor data below.
Image from Nightscout dashboard
It’s important to point out two things in this chart:
- At first glance, this may appear to be a stationary signal, but that is because we are intentionally trying to control the blood sugar within set limits as determined by her endocrinologist.
- The periodic signal is also a product of trying to control her blood sugar.
Fortunately, the Dexcom monitor provides rate-of-change limits, which provide some rudimentary proactive capabilities. For example, if my daughter’s blood sugar is climbing at a hypothetical rate of 2 mg/dL per minute, I can use this information to proactively administer insulin before she hits the upper limit. For folks with a control background, this may remind you of a Proportional-Integral-Derivative (PID) controller and/or Statistical Process Control (SPC), which is something I plan to explore in the future.
After spending some time analyzing my daughter’s data, I was able to establish a rate-of-change limit for her dropping blood sugar levels. For us, it was more important to focus on her blood sugar dropping to a dangerous level.
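As a rough illustration of the idea, and not the actual limit we settled on, the sketch below estimates the rate of change from the last few readings and projects how long it would take to reach a limit at that rate. It assumes one reading every five minutes and uses hypothetical thresholds of 2 mg/dL per minute in either direction; all function names are invented for this example.

```python
def rate_of_change(readings, interval_min=5.0):
    """Average change in mg/dL per minute between first and last reading."""
    if len(readings) < 2:
        raise ValueError("need at least two readings")
    elapsed_min = interval_min * (len(readings) - 1)
    return (readings[-1] - readings[0]) / elapsed_min


def minutes_until(current, rate, limit):
    """Minutes until `limit` is reached at a constant `rate`.

    Returns None when the reading is flat, moving away from the limit,
    or already past it.
    """
    if rate == 0:
        return None
    minutes = (limit - current) / rate
    return minutes if minutes > 0 else None


def check_trend(readings, low=70, high=180,
                fall_limit=-2.0, rise_limit=2.0, interval_min=5.0):
    """Return (label, rate, minutes_to_limit) for the latest readings.

    fall_limit and rise_limit are assumed thresholds in mg/dL per minute.
    """
    rate = rate_of_change(readings, interval_min)
    current = readings[-1]
    if rate <= fall_limit:
        return ("falling", rate, minutes_until(current, rate, low))
    if rate >= rise_limit:
        return ("rising", rate, minutes_until(current, rate, high))
    return ("steady", rate, None)


# Illustrative readings, five minutes apart (not real sensor data).
print(check_trend([150, 138, 126]))
```

A straight-line projection like this ignores sensor lag and noise, so it is only a starting point for choosing a limit.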
The following plots show a week’s worth of data, overlaid by day, before and after the limit change.
Successfully predicting blood sugar crashes with rate-of-change limits
The plots are all on the same scale and, as you can see, applying the rate-of-change limits helped us lower her average blood sugar by ~13% and tightened the standard deviation by ~2%.
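For anyone who wants to run the same kind of before-and-after comparison on their own data, a minimal sketch follows. It assumes the percentages are relative changes in the mean and in the sample standard deviation; the function names and the two short lists are placeholders, not my daughter’s readings.

```python
from statistics import mean, stdev


def percent_change(before, after):
    """Relative change from `before` to `after`, in percent."""
    return (after - before) / before * 100.0


def compare_weeks(before_readings, after_readings):
    """Percent change in mean and sample standard deviation."""
    return {
        "mean_change_pct": percent_change(mean(before_readings),
                                          mean(after_readings)),
        "std_change_pct": percent_change(stdev(before_readings),
                                         stdev(after_readings)),
    }


# Placeholder values only; substitute a week of readings for each list.
before_week = [210, 95, 160, 240, 120, 185]
after_week = [180, 100, 150, 200, 115, 160]
print(compare_weeks(before_week, after_week))
```

Negative numbers mean the statistic went down after the change.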
There are still multiple patterns present within the signals (e.g., an intraday periodic signal occurring in the early morning hours), and these will require different strategies.
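One simple way to look for an intraday pattern like this is to average readings by hour of day across several days. The sketch below is only one possible approach, with a hypothetical `hourly_profile` function and made-up timestamps and values.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean


def hourly_profile(samples):
    """Average glucose by hour of day.

    `samples` is an iterable of (datetime, mg_dl) pairs; the result maps
    each hour (0-23) that has data to the mean reading for that hour.
    """
    by_hour = defaultdict(list)
    for timestamp, mg_dl in samples:
        by_hour[timestamp.hour].append(mg_dl)
    return {hour: mean(values) for hour, values in sorted(by_hour.items())}


# Made-up samples spanning two days.
samples = [
    (datetime(2016, 1, 1, 3, 0), 100),
    (datetime(2016, 1, 2, 3, 30), 120),
    (datetime(2016, 1, 1, 14, 0), 150),
]
print(hourly_profile(samples))
```

Hours whose average stands apart from the rest of the day are candidates for a closer look.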
The power of data
Despite these encouraging results, I know the rate-of-change test was very limited in scope and came from a small sample size. I used this approach to weed out different items of concern, such as the quality of data collected, the type of data, different analytical approaches, etc. For a weekend’s worth of analysis, I found the results promising enough to continue with more sophisticated analytical methods and algorithms.
As a data scientist, I find the countless applications of predictive analytics astounding. Just thinking about how much data can help us manage my daughter’s condition gives me hope for her future. It also shows me just how important data – and the predictive tools we use to work with it – really is in today’s world.
Data science and predictive analytics can be used for more than optimizing assets or predicting future outcomes; they can be used to impact people’s lives on a daily basis.
In the coming weeks, I plan to apply new analytic methods to my daughter’s Type 1 diabetes to help her as much as possible. Hopefully, with the use of these predictive tools, we can help her manage her condition and improve her quality of life.
In the upcoming months, I will continue to share my work. If you’re interested, follow along or, if you’re trying to monitor your own case of Type 1, get involved!
—
Robert Chong is a Principal Data Scientist at Clockwork Solutions. He is very active in the Austin Data Science community and is currently Vice-Chair of the Austin Association for Computing Machinery SIGKDD group.