My vacation in Gatlinburg, TN had an inauspicious start on July 9th. A news flash on the TV in the hotel lobby indicated that the Department of Health was investigating a disease outbreak based on about 20 postings on the TripAdvisor and Facebook pages of a local zip lining attraction, and on "IWasPoisoned.com," a website that tracks reports of foodborne illness. On July 12th, the Department of Health announced that more than 500 cases of GI illness had been reported in connection with that attraction but that it was unable to confirm the cause of the illness. Stating that its water test results had come back clean, the attraction's management suggested that the problem could have been due to stomach flu. Within a couple more days, however, the Health Department confirmed E. coli contamination in the attraction's well as the source of the outbreak.
How does this episode relate to Health Data Matters? Social media generate copious amounts of big data. Every click of a "like" button or completion of a Facebook is information sold to data brokers who aggregate that with other information such as your credit card purchases or prescriptions to enable marketers to target ads based on your predicted preferences or insurance companies to predict your risk of hospitalization. Algorithms created outside of the public eye form the basis of value created by big data. The precise role played by the social media reports in halting the Tennessee outbreak isn't clear. But the opportunity for health departments to proactively mine crowdsourced data is. For more than three weeks prior to our arrival, attraction visitors reported on social media that they became violently ill after drinking water from the attraction's coolers.
Natural language processing of free text, such as social media postings or Google searches, and analytics on metadata are beginning to provide early identification of outbreaks, new diseases, or terrorist threats. A firm that tracks calls to a social service hotline demonstrated how a human trafficking cell could be identified by the emergence of a particular pattern of phone calls.
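To make the idea of mining free-text posts a little more concrete, here is a minimal Python sketch. It is not how any health department or vendor actually does this; it simply checks each post for a small set of symptom words and flags any day on which several such posts appear. The word list, the threshold, the function names, and the sample posts are all assumptions made up for illustration.

```python
import re
from collections import Counter
from datetime import date

# Hypothetical symptom vocabulary (an assumption); a real system would need
# a vetted, much larger lexicon and far better language handling.
SYMPTOM_TERMS = {"sick", "ill", "vomiting", "nausea", "diarrhea", "poisoned"}


def mentions_illness(text):
    """Return True if the post contains any symptom term as a whole word."""
    words = set(re.findall(r"[a-z]+", text.lower()))
    return bool(words & SYMPTOM_TERMS)


def flag_days(posts, threshold=3):
    """Return the sorted dates on which at least `threshold` posts mention illness.

    `posts` is an iterable of (date, text) pairs.
    """
    counts = Counter(day for day, text in posts if mentions_illness(text))
    return sorted(day for day, n in counts.items() if n >= threshold)


# Invented sample posts, for illustration only (not real data).
SAMPLE_POSTS = [
    (date(2024, 1, 1), "Beautiful views, will come back"),
    (date(2024, 1, 2), "Got violently ill after drinking from the cooler"),
    (date(2024, 1, 2), "Whole family was vomiting that night"),
    (date(2024, 1, 2), "Felt sick within hours of our visit"),
    (date(2024, 1, 3), "Kids were sick the next day"),
]

if __name__ == "__main__":
    for day in flag_days(SAMPLE_POSTS, threshold=3):
        print("Possible cluster on", day.isoformat())
```

A simple count like this would only be a prompt for human follow-up, not a confirmation of anything; the epidemiologic investigation and lab testing still have to happen.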
My point is that there is a growing need and opportunity for academics to engage with the public sector around understanding the algorithms that surround us and to create and use them for the public good. Concerns about how these algorithms reflect and reinforce bias (see Weapons of Math Destruction and AI Now) underscore the need for transparency and accountability that academic partners can bring. I understand why epidemiologic investigation and lab testing must be completed before an outbreak source can be confirmed and stopped. Yet surely there is a happy medium whereby academics work with public sector officials to create algorithms that can reduce the time needed for the investigation. Divergent patterns, such as those identified in the social service phone calls, can serve as sentinel indicators of a problem.
Data on our sister site, HDM.LiveStories.com, can be made available through or fed by APIs. APIs are key to making use of real-time data. We're eager to work with academic and public sector colleagues to obtain data that could help create these algorithms, or to help you develop algorithms and even create apps that leverage real-time data. Share your ideas with us!