The past few years have seen an explosion in the amount of data generated. Social network interactions, scientific experiments like the Large Hadron Collider and genome sequencing, government records, sensor readings, and consumer and sales data from retail and e-commerce companies are some of the largest sources of data available today. The twin challenges posed by this “big data” are making sense of it and using it to make decisions.
Along with the data explosion, there has been a rapid development of tools and techniques for working with big data and the rise of the field of “data science.” Both bring a raft of changes to the way businesses operate, including a shift to more open, flexible cloud-based systems and commoditized data management.
For example, a recent report in the New York Times mentions that retailers like Walmart constantly analyse sales, demographic, and even weather data to tailor product selections at individual stores and to determine pricing markdowns. Shipping companies like UPS mine their traffic data and delivery times to improve routing, and online match-making services constantly sift through personal data to refine their algorithms.
This predictive power of data is immense and is used in fields ranging from economic forecasting to public health. An interesting example of the latter is Google Flu Trends, which uses aggregated Google search data to detect flu outbreaks even before hospitals or health services report them.
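To make the flu example concrete, here is a minimal sketch of the underlying idea, using made-up numbers and a much simpler model than Google actually used: fit a linear relationship between flu-related search volume and officially reported cases, then apply it to the current week's searches to get an early estimate.

```python
# Toy illustration of "nowcasting" flu activity from search-query volume.
# The data and the single-variable linear model are purely illustrative;
# they are not Google Flu Trends' actual data or methodology.
import numpy as np

# Hypothetical history: weekly flu-related query volume (in thousands)
# and the flu cases later reported by health services for the same weeks.
query_volume = np.array([12.0, 15.0, 21.0, 30.0, 44.0, 58.0])
reported_cases = np.array([180, 230, 320, 470, 690, 900])

# Least-squares fit: cases is approximately slope * queries + intercept.
slope, intercept = np.polyfit(query_volume, reported_cases, deg=1)

# The current week's query volume is available immediately, whereas the
# official case count arrives later; the model gives an early estimate.
this_week_queries = 65.0
estimated_cases = slope * this_week_queries + intercept
print(f"Estimated flu cases this week: {estimated_cases:.0f}")
```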
Big data does come with a few caveats; the old aphorism of “lies, damned lies and statistics” still holds true. Despite that, big data is here to stay. As Hal Varian, Chief Economist at Google, says, “the ability to take data – to be able to understand it, to process it, to extract value from it, to visualize it, to communicate it – that’s going to be a hugely important skill in the next decades.”