Any customer trying to make sense of Big Data has a tough set of tasks ahead. These days, when customers turn outside for help, many find themselves hit with all types of advice. Some of it is useful, but most of it is confusing because it comes from the hype created by vendors’ marketing organizations.
We’ve already blogged on our view of Big Data, but instead of talking about “what works for Big Data”, today we’d like to talk about what *doesn’t work* for Big Data.
Business Intelligence Illusionists
Business Intelligence has been around for decades, with some successes and many failures. The usability of Business Intelligence tools is notoriously horrific, and customers struggle to get regular employees to use the BI applications built for them: word on the street is that only about a quarter of the BI solutions built actually get adopted…and the situation is about to get worse.
As data volumes grow and requirements around larger, more complex datasets become more pervasive, organizations find it harder to succeed. Although Business Intelligence has proven very valuable for analysis in the sub-terabyte range, Gartner’s Magic Quadrant highlights that the average dataset size for most BI vendors is in the very low terabyte range (about 3TB). That’s not Big Data. As EMA research highlights, Big Data starts at 10 terabytes at the very least.
So, if traditional approaches for Business Intelligence didn’t work so well for traditional Business Intelligence problems, why should we believe they will work in the new world of data?
When deploying a data analysis solution for Big Data, there are 3 key factors you should be aware of. More factors are documented for you here, but the highlights follow.
Waiting for a sluggish BI system reduces user productivity and causes employees to miss out on important business opportunities. So does waiting days or weeks for IT to set up a sound analytics backbone. According to Oracle, companies that are not planning to leverage Big Data properly risk losing $71.2M.
The devil’s in the details
Hiding data is not good. Data aggregation might have been a tactic that worked in the past, because Business Intelligence tools couldn’t scale to large datasets. Now, users want to see all the details: this means drilling down to daily data and combining internal with external data for broader understanding. If your Big Data Analytics solution does not allow you to store and mash up all the details, you won’t be able to get the insights you need. Research shows that 1 question breeds at least another 6. What’s your data granularity today? Beware of the costs of “hidden data”!
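To make the cost of “hidden data” concrete, here is a minimal sketch in plain Python. Everything in it is assumed for illustration: the sales figures, the event calendar, and the helper names `monthly_totals` and `drill_down` are hypothetical and do not come from any particular BI product. It shows how a monthly aggregate can hide a one-day spike that only becomes explainable once daily rows are kept and combined with external data.

```python
from collections import defaultdict

# Hypothetical internal data: (date, revenue) at daily granularity.
daily_sales = [
    ("2024-03-01", 100),
    ("2024-03-02", 100),
    ("2024-03-03", 400),  # a one-day spike
    ("2024-03-04", 100),
]

# Hypothetical external data: local events keyed by date.
external_events = {"2024-03-03": "city marathon"}


def monthly_totals(rows):
    """Aggregate daily rows to one total per month (YYYY-MM)."""
    totals = defaultdict(int)
    for date, revenue in rows:
        totals[date[:7]] += revenue
    return dict(totals)


def drill_down(rows, month, events):
    """Return the daily rows for a month, each joined with any external event."""
    return [
        (date, revenue, events.get(date))
        for date, revenue in rows
        if date.startswith(month)
    ]


summary = monthly_totals(daily_sales)
details = drill_down(daily_sales, "2024-03", external_events)

print(summary)
for row in details:
    print(row)
```

The monthly summary only reports a single total, while the drill-down shows which day drove it and pairs that day with an outside event. That follow-up question can only be answered if the daily rows were stored in the first place.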
Think “inside” the box
The popular approach to solving Big Data analytics problems typically follows one of two patterns: either upgrade to proprietary and expensive boxes to scale to large datasets OR distribute query processing across networked commodity machines. What if there were a way to “scale inside a single box”? What if you could use commodity hardware and scale on one node?