Big Data Is Exactly What You Think It Isn’t
Written by Todd L. Michaud
Todd Michaud runs Power Thinking Media, which helps retailers and restaurants tackle the convergence of social, mobile and retail technologies. He spent nine years delivering technology solutions to more than 10,000 retail locations as VP of IT for Focus and Director of Retail Technology for Dunkin’ Brands.
Who cares about Big Data? You should. All of a sudden, Web logs that were kept simply for troubleshooting purposes can now be mined to determine valuable information about customers’ preferences.
Logs that are created by physical machines can now be analyzed en masse to look for information to help advance a business. Data from social networks can now be mined for customer sentiment. These problems were too big and too complex before. But now, answers are within reach.
The Big Data space is filled with so many posers, fakers and wannabes it’s ridiculous. Everybody is trying to catch the Big Data wave by getting their name attached to this hot new trend. Let’s start by all getting on the same page in terms of what I consider the definition of Big Data:
When someone in an organization gets the idea that they would like to pull useful information out of a bunch of data that is so large and complex that the CIO says, “Well, how the hell are we going to do that?” That is Big Data.
What’s new is that a bunch of really smart people have solved the two biggest challenges when it comes to analyzing data of large size and complexity. First, they have removed the need for giant, expensive, specialized hardware platforms (instead using a large number of small “commodity servers” or even cloud servers). And second, they have also removed the need to structure the data in a given format prior to running analysis.
These two technological advancements (and the dozens of other underlying technologies that support them) have unleashed a tremendous number of possibilities when it comes to gaining insight from information that would previously have required millions of dollars’ worth of hardware and a staff of data experts to process. In fact, with open-source software and Amazon EC2 virtual hardware, you can now process a job on 1,000 servers that, even if it lasted for two hours, would cost less than $200.
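For anyone who wants to check that math, here is a minimal sketch in Python. The function names are mine, invented for illustration, and the only inputs are the figures quoted above: 1,000 servers, two hours, a $200 ceiling. It does not look up real prices; actual EC2 rates vary by instance type and region, so treat the result as the per-server rate the claim implies, not a quote.

```python
def job_cost(servers: int, hours: float, hourly_rate: float) -> float:
    """Total rental cost in dollars for a job, given a per-server-hour rate."""
    return servers * hours * hourly_rate


def break_even_rate(budget: float, servers: int, hours: float) -> float:
    """Highest per-server-hour rate (in dollars) that keeps the job within budget."""
    return budget / (servers * hours)


# Figures taken from the paragraph above.
SERVERS = 1_000
HOURS = 2
BUDGET = 200.0

rate_ceiling = break_even_rate(BUDGET, SERVERS, HOURS)
print(f"Server-hours: {SERVERS * HOURS}")                    # Server-hours: 2000
print(f"Rate ceiling: ${rate_ceiling:.2f} per server-hour")  # Rate ceiling: $0.10 per server-hour
```

In other words, the claim holds for any server that rents for less than about ten cents an hour. Plug your own rate into `job_cost` to see where a given instance type would land.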
Being in the Big Data space, I try to read as much as I can that is published on the topic, and I find myself pleasantly surprised when I finish an article that doesn’t leave me wanting to get those 3 to 5 minutes of my life back.
Unlike the “cloud” hype cycle, where every company on the planet decided to start pitching the fact that it is, in fact, “in the cloud” (including helping the poor housewife catch her missed episode of TV), this insanity seems to be brought on by writers and analysts who are covering the space but do not have the slightest idea what they are writing about. Cramming buzzwords into a story may help with search engine rank, but it sure does give me a headache.
Regardless of the software platform, the database and the data analysis components that may win or lose in this land grab, I see Big Data going through a three-phase evolution:
- Decision Support. The first phase of Big Data will come as companies implement systems that create dashboards and reports to help them run their businesses. Big Data insights will help executives determine how best to steer their companies. This might take the form of determining new product launches, existing product enhancements, potential market opportunities, etc.