He shared the big data trends the data geeks are watching:
EMR is poised to provide the largest cache of clinical data ever. It will blow away the idea of a sample and create a holistic view of American health. Its eventual integration with social media / social intelligence will show the health connections between people and populations. Some researchers have been able to use social media data – without the EMR connection – to start to map population health. Adam Sadilek at the University of Rochester and his team analyzed 4.4 million GPS-tagged tweets from over 600,000 users in New York City over the course of one month. They trained their artificial intelligence algorithm to ignore tweets by healthy people, such as those claiming they were ‘sick’ of a particular song, and to find those who were really ill. They were able to create models that not only showed the incidence of disease, but also predicted who would catch the flu next.
Privacy remains a concern for most Americans. When it comes to health content, people are most concerned when they’re on their smartphones (primarily because of the portability and perceived risk of losing their device): 70% are concerned with health privacy on mobile, 48% on tablet, and 46% on PC. (ComScore, Jan 2012)
Crowdsourcing is a growth area for big data. One example is Google’s new smartphone mapping tools. As you follow directions, the map knows how fast you’re proceeding. That makes every user a sensor, sending constant, real-time information about traffic. The Nest thermostat has been able to do something similar for its customers by tracking the behavior of people and machines over time. For example, your air conditioner is made up of two energy-consuming parts: the compressor and the fan. The compressor uses a lot of electricity while the fan uses very little. Nest found that the compressor coils can generate cold air for 5-10 minutes after the compressor is off. Once it knew people’s behavior and when they planned to change the temperature, it was able to turn that compressor off sooner, saving electricity in individual homes and across the grid.
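To make the thermostat arithmetic concrete, here is a rough sketch of the early-shutoff idea. It is not Nest's actual algorithm; the function names, the 5-minute default, and the 3,000-watt compressor figure are all assumptions chosen for illustration.

```python
from datetime import datetime, timedelta


def compressor_off_time(planned_change: datetime,
                        residual_cooling_min: float = 5.0) -> datetime:
    """Return when to cut the compressor ahead of a planned temperature change.

    residual_cooling_min is the assumed time the coils keep producing cold
    air after shutoff (the talk cited 5-10 minutes).
    """
    if residual_cooling_min < 0:
        raise ValueError("residual_cooling_min must be non-negative")
    return planned_change - timedelta(minutes=residual_cooling_min)


def energy_saved_wh(compressor_watts: float,
                    residual_cooling_min: float) -> float:
    """Watt-hours saved by not running the compressor during the residual window.

    The fan runs either way, so only the compressor's draw counts as savings.
    """
    return compressor_watts * residual_cooling_min / 60.0


# Hypothetical example: the household usually turns cooling down at 6:00 pm.
planned = datetime(2012, 7, 1, 18, 0)
print(compressor_off_time(planned, 5))   # shut the compressor off at 5:55 pm
print(energy_saved_wh(3000, 5))          # assumed 3,000 W compressor
```

A few minutes per cycle is small for one home, which is why the effect matters most when it is added up across many homes on the grid.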
Micro and biorhythmic measurement – using tools like Fitbit – will create both personal trending and longitudinal pictures of personal action vs. health, potentially creating new insights into how to preserve and protect our health as we age.
Best practices of privacy-protecting research:
- Only capture what you need: People won’t share data that seems disconnected from your stated goal (it’s a trust factor)
- Aggregate data: Use the back design test to be sure it’s truly anonymous
- Use methodologies that obfuscate data: Limit what you collect at the point of collection to limit liability
- Disconnect PII from data: Keep the personal information in a different database than behavior
- Sometimes it’s better to ask: If it’s really sensitive, ask subjects to provide their own data
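Two of these practices, keeping PII in a separate store and reporting only aggregates, can be sketched in a few lines. This is an illustrative outline under assumed field names (`name`, `email`, `zip3`, `condition`) and an assumed minimum group size of 3. It is not a complete anonymization scheme and not a method described in the talk.

```python
import secrets
from collections import Counter


def split_pii(records):
    """Separate personal fields from behavioral fields, linked only by a random key."""
    pii_store = {}        # would live in its own, more tightly controlled database
    behavior_store = []   # what analysts actually query
    for rec in records:
        key = secrets.token_hex(8)  # random key, not derived from the person
        pii_store[key] = {"name": rec["name"], "email": rec["email"]}
        behavior_store.append(
            {"key": key, "zip3": rec["zip3"], "condition": rec["condition"]}
        )
    return pii_store, behavior_store


def aggregate(behavior_store, min_group_size=3):
    """Count people per (zip3, condition) and drop groups too small to report."""
    counts = Counter((row["zip3"], row["condition"]) for row in behavior_store)
    return {group: n for group, n in counts.items() if n >= min_group_size}


# Made-up sample records for illustration only.
sample = [
    {"name": "A", "email": "a@example.com", "zip3": "146", "condition": "flu"},
    {"name": "B", "email": "b@example.com", "zip3": "146", "condition": "flu"},
    {"name": "C", "email": "c@example.com", "zip3": "146", "condition": "flu"},
    {"name": "D", "email": "d@example.com", "zip3": "100", "condition": "cold"},
]
pii, behavior = split_pii(sample)
print(aggregate(behavior))
```

Suppressing small groups is one simple guard against re-identifying someone from an aggregate; the right threshold depends on the data and is a judgment call.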
*ComScore has one of the best views into our online behavior today. They have a panel of 2 million people who allow them to collect all of their online data, plus deep web analytics data from their clients, ongoing monthly surveys of hundreds of thousands of people, and more.