Social Media as a Sensor – Leveraging Crowd-Sourced Data for Early Warning and Response
By
Co-authored with Bill Hyjek
A recent story published on Wired.com discussed the findings of a group of researchers at the Indiana University School of Informatics and Computing who developed a method for predicting changes in the Dow Jones Industrial Average through the analysis of Twitter updates. By leveraging open-source mood-tracking tools like OpinionFinder to sort Tweets into positive and negative bins based on emotionally charged words, the research team was able to predict the ups and downs of the stock market at closing bell three days later with 86.7% accuracy.
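To make the idea of sorting updates into positive and negative bins concrete, here is a minimal sketch. It is not the researchers' method or the actual mood-tracking tool: the word lists and the function names `classify_update` and `bin_updates` are hypothetical, chosen only for illustration, and a real lexicon would be far larger and validated.

```python
import re

# Hypothetical, tiny word lists for illustration only; real mood-tracking
# tools rely on much larger, validated lexicons.
POSITIVE_WORDS = {"happy", "great", "calm", "hopeful", "good"}
NEGATIVE_WORDS = {"afraid", "worried", "angry", "sad", "bad"}


def classify_update(text):
    """Return 'positive', 'negative', or 'neutral' for a single update."""
    words = re.findall(r"[a-z']+", text.lower())
    positive_hits = sum(word in POSITIVE_WORDS for word in words)
    negative_hits = sum(word in NEGATIVE_WORDS for word in words)
    if positive_hits > negative_hits:
        return "positive"
    if negative_hits > positive_hits:
        return "negative"
    return "neutral"


def bin_updates(updates):
    """Group a list of update texts into sentiment bins."""
    bins = {"positive": [], "negative": [], "neutral": []}
    for text in updates:
        bins[classify_update(text)].append(text)
    return bins
```

The ratio of positive to negative updates per day could then be tracked over time. Any predictive use would need far more careful modeling than this word count.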
Now consider leveraging data collected in this manner via Twitter and other social media tools for other types of predictions. The implications of this type of data collection for early warning and/or confirmation of information – social media as a sensor – are significant if applied to the field of public safety.
Earlier this year, Federal Computer Week highlighted a group of Namibian officials who, with assistance from an international team of experts including representatives from NASA and the National Oceanic and Atmospheric Administration (NOAA), developed a geospatial application tapping and combining satellite imagery and river-height sensors to get an early read on possible flooding in Namibia. Leveraging sensory data, officials are now able to predict, prepare for, and respond to events much sooner than previously possible. Furthermore, aggregating and geospatially depicting data provides contextual understanding of a large volume of information very quickly.
By combining social media data with geospatial analysis, officials may be able to prepare for and respond to a disaster faster than ever before. Sensory data like that collected via river-height gauges and seismic monitors, when combined with social media data and/or sentiment analysis, provides both the “what” (that an event has just occurred or is about to occur) and the “who,” the “why,” and the “how” (the context of an event, including the public’s level of understanding, its reaction to events, and its knowledge of factual information). This combination may even assist in predicting second- and third-level events that might arise as a result of the original disaster.
Emergency response officials already monitor seismic data provided by the United States Geological Survey (USGS) for early detection of earthquakes. Why not combine seismic data with keyword searches for “earthquake,” “shaking,” etc. within specific geographical locations? Going further, why not overlay both seismic data and geospatially mapped data from Twitter with historical event data, critical infrastructure data, hazard and mitigation data, etc.? The resulting mash-up could provide an unprecedented level of contextual understanding to response agencies experiencing resource cutbacks and struggling to keep up with the volume of information available on the internet.
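As a rough sketch of what combining a seismic reading with keyword searches in a specific location might look like, the example below keeps only geotagged posts that mention a keyword and fall within a chosen radius of a reported epicenter. The post format (dictionaries with `text`, `lat`, and `lon`), the 100 km default radius, and the function names are assumptions made for illustration; they do not reflect any USGS or Twitter interface.

```python
import math

EARTH_RADIUS_KM = 6371.0
KEYWORDS = ("earthquake", "shaking")  # search terms mentioned in the text


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, in kilometers."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    delta_phi = phi2 - phi1
    delta_lambda = math.radians(lon2 - lon1)
    a = (math.sin(delta_phi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(delta_lambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def corroborating_posts(posts, event_lat, event_lon, radius_km=100.0):
    """Return posts that mention a keyword and lie near the event.

    `posts` is assumed to be a list of dicts with 'text', 'lat', 'lon'.
    """
    matches = []
    for post in posts:
        text = post["text"].lower()
        if not any(keyword in text for keyword in KEYWORDS):
            continue  # no relevant keyword in this post
        distance = haversine_km(post["lat"], post["lon"], event_lat, event_lon)
        if distance <= radius_km:
            matches.append(post)
    return matches
```

A sudden rise in matching posts near a sensor reading could serve as a secondary confirmation signal, though the posts themselves would still need vetting.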
Despite the benefits of collecting crowd-sourced data during an emergency, the practice has not yet been adopted by incident response agencies, for a variety of reasons. Many in the incident response community are reluctant to treat social media data as a valid information source. In large part, this is due to the difficulty of vetting potentially vast amounts of data during a major operation. The inability to process this information, in turn, raises other issues for decision-makers, including potential liability concerns. To a lesser degree, the incident response community is steeped in tradition, with a strong proclivity to favor only proven methods and tools for the conduct of its mission. Dramatically divergent concepts are likely to meet with some cultural resistance.
For agencies to begin using social media and other types of sensory data for early warning and response, several changes must occur. First, rather than constantly monitoring TweetDeck or other similar tools and attempting to manually sift through the data that is rapidly coming in from social media, news wires, etc., imagine if a predetermined aggregation and filtering mechanism could automatically filter the information and geographically map it so you could look at all the information in the context of an event as it unfolds. Incoming Tweets and sensory data could then be visualized as points on a map, and additional tools could enable you to pull in relevant information from other sources, including government agencies, public information offices, and non-governmental organizations. This information too could be automatically sorted and mapped for further analysis. Additional tools could then enable more rapid and accurate analysis of the information, allowing for efficient and effective decision making. Virtual USA, the Department of Homeland Security’s flagship program, sponsored by the White House Open Government Initiative and DHS Secretary Napolitano, has already made these concepts a reality.
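As a rough illustration of the filter-and-map step described above (and not a description of how Virtual USA or any specific product works), the sketch below filters incoming records by keyword and converts those with coordinates into GeoJSON points that common mapping tools can display. The record fields (`source`, `text`, `lat`, `lon`) and the function names are hypothetical.

```python
def to_feature(record):
    """Convert one record into a GeoJSON Point feature."""
    return {
        "type": "Feature",
        # GeoJSON orders coordinates as [longitude, latitude].
        "geometry": {
            "type": "Point",
            "coordinates": [record["lon"], record["lat"]],
        },
        "properties": {"source": record["source"], "text": record["text"]},
    }


def build_map_layer(records, keywords):
    """Keep geotagged records mentioning any keyword; return a FeatureCollection."""
    lowered = [keyword.lower() for keyword in keywords]
    features = []
    for record in records:
        if record.get("lat") is None or record.get("lon") is None:
            continue  # cannot be mapped without coordinates
        text = record["text"].lower()
        if any(keyword in text for keyword in lowered):
            features.append(to_feature(record))
    return {"type": "FeatureCollection", "features": features}
```

Because the output is plain GeoJSON, the same layer could be overlaid with other sources, such as agency feeds, that are converted to the same format.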
Second, although social media tools provide access to a great deal of information from multiple sources prior to and during an incident, which can in turn greatly enhance decision making and situational awareness, the wide-scale use of social media during an event also presents significant challenges. Large amounts of data must be monitored and sorted so that information can be authenticated for real-time decision making.
To harness crowd-sourced and sensory data most effectively, agencies need the ability to successfully aggregate, filter, integrate, map, prioritize, assign, and follow up on data collected via these methods. Accomplishing this requires that:
- Data aggregation and analysis tools be developed to assist organizations in decision making;
- Data be geospatially enabled for additional context; and
- Applicable governance frameworks, policies, and challenges (e.g., liability and privacy issues) be identified and addressed.
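One simple way to approach the "prioritize" step mentioned above is to group geotagged reports into grid cells and rank the cells by report count, so that analysts look first at the areas generating the most reports. This is only a sketch under assumed inputs: the report format, the 0.1-degree cell size, and the function names are hypothetical, and counts alone say nothing about whether the reports are accurate.

```python
import math
from collections import Counter


def grid_cell(lat, lon, cell_deg=0.1):
    """Map a coordinate to the index of the grid cell that contains it."""
    return (math.floor(lat / cell_deg), math.floor(lon / cell_deg))


def prioritize_cells(reports, cell_deg=0.1):
    """Return (cell, report_count) pairs, busiest cells first.

    `reports` is assumed to be a list of dicts with 'lat' and 'lon'.
    """
    counts = Counter(
        grid_cell(report["lat"], report["lon"], cell_deg) for report in reports
    )
    return counts.most_common()
```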
Leveraging crowd-sourced and sensory data may prove useful for alert and early warning of several types of events, including:
- Shootings and other violent acts as they occur;
- Disease outbreaks and symptom clusters;
- Bird and other animal deaths;
- Floods, tornados, wildfires, and other natural events; and
- Traffic.
I am interested in hearing what others have to say about the types of data that might serve to assist public safety organizations in responding to events within their jurisdictions. Often it is the real-world application and identification of an information gap that drives the development of new and innovative technologies and methodologies.