The term “Big Data” has lost its relevance. The fact remains, though: every dataset is becoming a Big Data set, whether its owners and users know (and understand) that or not. Big Data isn’t just something that happens to other people or giant companies like Google and Amazon. It’s happening, right now, to companies like yours.
Yesterday at Eureka!, our annual client conference, I presented on the evolution of Big Data technologies, including the different approaches that support the vast and complex volumes of data organizations now deal with. In this post, I’ll break down some of my presentation and dig into the current state of Big Data, the trends driving its evolution, and one major shift that will deliver massive value for companies in the next wave of Big Data’s growth.
Big Data Today
As hinted at above, even if you don’t think your data is Big Data, it’s probably well on its way there or already has some of the attributes of Big Data. Take a minute and think about where your company’s data comes from. Do you have a large number of tables? Even if they don’t have millions or billions of rows, that’s a Big Data characteristic. Using multiple sources? What about live sources? Does your data come in at high speeds and change rapidly? Those are all Big Data challenges that traditional analytics and BI platforms just can’t adequately handle.
If you’re not thinking about how to scale your data and analytics infrastructure, you’ll find you don’t have the tools to survive as your datasets continue to grow in size, scope, depth, and complexity. If this sounds intense, that’s because it is: companies of all shapes and sizes that don’t reckon with the trends changing the data world will be in trouble.
Trends Changing Big Data
First off, IoT, the Internet of Things. The Internet has always, technically, run on “things”: servers, computers, routers, wires, and so on. But now sensors have enabled us to track, and thereby digitize, almost anything, in industries from retail to healthcare to banking and everything in between. These “things” probably have some kind of associated app where you can look at that data and manipulate it, getting insights on your products, customers, and operations in the past week, month, or year.
The IoT is everywhere and there are more pieces of technology connected to it every day. All these devices funnel more and more bits of data into warehouses and lakes the world over and that data is bought, sold, shared, sliced, diced, and drilled into to reveal a wide array of insights (it also gets totally ignored until someone figures out what to do with it).
Next up, the proliferation of ways we interact with and query data. We’re in a world where there are more ways of pulling insights out of a dataset than ever before. There are dozens of languages specifically for querying databases. The current wave of analytics platforms empowers non-technical users to query their data via simple language. Augmented analytics systems learn from prior user behavior and serve up insights without humans even asking for them. There are even platforms that allow users to query just by speaking to them in normal, human language. All of this demands both more storage and more capacity to query the data.
Lastly, the rise of the cloud. Remember the early days of cloud computing? It seemed like a risky place to store data at first. Businesses, as is often the case, were slow to adopt it. Everyone wanted their data where they could get to it. Fast-forward to today: everything is on the cloud (and with the rise of the IoT, every thing is also connected to the cloud, dumping data there). Your phone automatically uploads your pictures there; videogames, software, websites, and apps all live on the cloud. If you’re starting a business today and you deal with data (that is to say, you’re starting any kind of business), you’re going to put that data on the cloud or you are not going to have a business at all. We have come a long way: decisions today center on choosing technologies that are cloud-first or cloud-only.
If I could leave you with a word about the future, it would be data lakes.
Data lakes have become a key component in modern data-intensive environments, and yet many companies still struggle to turn these investments into tangible results. To succeed, these organizations should focus less on the mechanics of the lakes themselves and more on the business initiatives that turn data into action. The key is to take an agile approach to managing governance and modeling in data lakes, trying a few approaches in tandem that expose data quickly so that teams can explore and iterate.
Imagine building a machine learning model that can be tested on a real-time stream of data, leading to rapid insights and improvement. The ability to explore and analyze data quickly in lakes creates an opportunity to see in real time what’s going on in your business. The ability to understand the raw data becomes critical to realizing value from it. AI-assisted experiments like this, which pull insights out of otherwise unwieldy amounts of data, will be instrumental in helping organizations with Big Data sets thrive in the future.
Big Data has been replaced by Big Analytics. Data is a new driver of innovation, and the barrier is not skills, datasets, or infrastructure; it’s mindset. Bet on the cloud, digitize all you can, and leverage analytics to turn data into opportunity.