Data Management in the MultiCloud Analytics Era

What comes after big data?

Over the last decade, data has done more than explode: it has completely changed the way companies operate. When technology made it not only affordable but advantageous to store a wide variety of information, Big Data took the business world by storm. Once all of that data was wrangled and stored, organizations of all sizes came to terms with a new challenge: creating value with all of that information. The truth is that the ambition to collect data had far outpaced businesses’ ability to analyze it. Organizations across all industries had an extremely wide range of data inputs and an equally fragmented approach to storing all that data.

Coinciding with the sharp increase in data collection was another trend: cloud storage. 

Organizations today use a mixture of on-prem, public cloud, and private cloud environments to manage their data. Depending on their unique data needs, teams are faced with the challenge of combining all of those technologies into a data pipeline that can answer increasingly complex business questions. This type of setup, which merges different cloud technologies into a single workflow, is called a multicloud architecture.

Solving the traditional reliability/agility tradeoff

Traditionally, data infrastructure has been designed to deliver either reliability or agility. Being able to move quickly with data meant sacrificing uptime. Having a stable data environment limited a team’s ability to respond to change. Before multicloud infrastructures, data stacks existed somewhere on a spectrum with reliability at one end and agility at the other. Movement toward one end was also an equal move away from the other.

With the introduction of cloud-native strategies, that linear spectrum has turned into a quadrant, with the possibility of a position that optimizes both reliability and agility. Organizations that effectively structure multicloud environments benefit from both excellent performance and the ability to innovate quickly to adapt to changing market or customer demands.

The introduction of cloud-native techniques has brought on a development process that operates more like the traditional DevOps process used in software engineering functions. Cloud-based microservices in containers give teams flexibility that had never been possible before. There’s an easy-to-see future where data teams take advantage of this structure to deliver advanced data insights to stakeholders in real time. This ideal multicloud structure offers data teams a lot of attractive benefits: faster time to turn data into value, optimized costs, increased business agility, productivity optimizations, and a more robust end-user experience.

“Multicloud” vs. “cloud agnostic”

When it comes to generating value from different cloud providers, “multicloud” is a bit of a misnomer. It’s unrealistic to try to design a cloud environment that uses a little bit of each of the three big cloud providers, combining specific parts of each to build the most complete full stack. The costs of transferring data from one cloud to another make that structure unattractive. 

Instead, the most innovative modern organizations are building stacks that can operate smoothly in any of the big cloud providers. A cloud-agnostic stack runs efficiently on any of the three big cloud vendors, but can easily be moved together as a unit to any of the other vendors if that shift becomes advantageous for the business. When we talk about the benefits of a “multicloud” stack, what we really mean is a stack that can run holistically on multiple cloud vendors and incorporate on-prem elements, not one that runs on multiple clouds at the same time. 

In short, “multicloud” is about portability. The value of the multicloud architecture does not come from a wide stack that spans different cloud vendors; it comes from a vertical stack that can move laterally to any single cloud vendor whenever the CIO determines that move to be advantageous.
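As a minimal sketch of that portability idea, the Python below models a stack as one unit whose vendor can change while its components stay the same. The `Stack` class, the `move_stack` helper, the vendor keys, and the component names are all hypothetical and exist only for illustration.

```python
from dataclasses import dataclass, replace

# Hypothetical short keys for the three major cloud vendors.
SUPPORTED_VENDORS = {"aws", "azure", "gcp"}


@dataclass(frozen=True)
class Stack:
    """A vertical stack that lives on exactly one vendor at a time."""
    vendor: str
    components: tuple


def move_stack(stack, new_vendor):
    """Relocate the whole stack as a unit; the components do not change."""
    if new_vendor not in SUPPORTED_VENDORS:
        raise ValueError(f"unsupported vendor: {new_vendor}")
    return replace(stack, vendor=new_vendor)


stack = Stack(vendor="aws", components=("ingestion", "warehouse", "analytics"))
moved = move_stack(stack, "gcp")
```

Only the `vendor` field differs between `stack` and `moved`, which is the sense in which the stack moves laterally rather than being spread across clouds.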

Complications of a multicloud environment

A multicloud architecture offers a lot of advantages when it is executed well, but it’s not as easy as simply flipping a switch. The reliable and agile system described in the previous section is a desirable end state, but one that has several obstacles along the way, especially for companies with data infrastructures that predate cloud technology. Here are a few of the biggest blockers for modern companies that want to adopt a multicloud strategy:

  • Complexity/latency: Data comes from a lot of different sources and lives in a lot of different places, including different cloud environments. Moving all that data from place to place requires additional work from the data team and can introduce room for error with data dependencies or latency.
  • Existing data infrastructure: A stack of data tools that are entirely cloud-native is great, but most organizations began building their stacks long before cloud technology was available. Because of the connectivity required of each part of a stack, combining tools with varying levels of cloud accessibility is not always easy or even possible.
  • Compliance: With new technology often come new regulations to ensure that data is both protected and used appropriately. When new compliance measures are passed, companies have to invest a lot of resources in meeting those requirements.
  • Security: Adding new cloud technology can also be an easy way to introduce new threats to the security of your data. Organizations need to make sure that they have sufficient security practices in place before building a multicloud stack. 
  • Cost: Adopting new technologies, even if they are improvements, often means purchasing new tools and laying the groundwork for them to work together. While the end result can offer significant cost reductions, the expenses of the upfront structural work can still be prohibitive for some companies.

The current cloud landscape

Today’s cloud landscape has three major players: Amazon Web Services, Microsoft Azure, and Google Cloud Platform. From a cost perspective, it’s relatively affordable to transfer data inside a single cloud provider and relatively expensive to transfer data from one cloud to another. Those three platforms do a great job spinning up more and more targeted services for their customers to get more valuable experience from a single provider, but it’s still rare that an organization can find a single cloud provider to manage every part of their data stack.
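To make the cost asymmetry concrete, here is a rough back-of-the-envelope sketch. The per-GB prices and the daily volume are made-up placeholders, not published rates from any provider; real prices vary by vendor, region, and volume.

```python
# Hypothetical transfer prices in US cents per GB (placeholders, not real rates).
INTRA_CLOUD_CENTS_PER_GB = 1
CROSS_CLOUD_CENTS_PER_GB = 9


def monthly_transfer_cost(gb_per_day, cents_per_gb, days=30):
    """Return the transfer cost in dollars for a month of daily transfers."""
    return gb_per_day * cents_per_gb * days / 100


# Assumed workload: 500 GB moved every day.
intra_cost = monthly_transfer_cost(500, INTRA_CLOUD_CENTS_PER_GB)
cross_cost = monthly_transfer_cost(500, CROSS_CLOUD_CENTS_PER_GB)
cost_ratio = cross_cost / intra_cost
```

Under these assumed numbers, the same pipeline costs nine times as much when its data crosses a cloud boundary, which is why a stack that sits entirely within one provider at a time is usually the cheaper design.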

While a single provider can provide robust in-cloud data environments, moving all of an organization’s data to a single cloud provider can create a dependency on that cloud. This is a dilemma for CIOs. The value of having a single cloud provider for an entire stack needs to be balanced against the threat of becoming beholden to that specific vendor. Forward-looking CIOs have to keep an eye on how agile their stack can be and construct their architecture in a way that makes it easy to transfer from one cloud to another if that becomes necessary.

“Cloud-ready” vs. “cloud-native”

When discussing cloud technologies, it’s important to understand the difference between “cloud-ready” tools and “cloud-native” ones. Understanding those classifications is the only way to truly understand the relationships between tools in your stack and build something that takes advantage of all that multicloud has to offer. Here’s a short explanation: cloud-ready tools can run on the cloud in a limited capacity; cloud-native tools are intended to perform in a cloud environment in conjunction with other cloud-native technology. Cloud-ready technology might be able to emulate cloud-native tools, but the lack of nuanced scalability seen in cloud-native architectures prevents cloud-ready options from being genuinely considered equal.

Cloud-native technology is set apart largely because of microservices architectures that are readily supported via Kubernetes, running on Linux (which has become the default operating system for the cloud). Applications built to run on a Kubernetes stack are portable: they will run on a single laptop or on a much more powerful AWS-backed collection of resources. Cloud-ready tools that were built to run on a laptop or an on-prem box might be able to run in the cloud on a single virtual machine, but that sacrifices all of the fine-grained scalability that makes multicloud valuable.
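The portability described above can be sketched with a Kubernetes Deployment manifest built in Python. The service name and image URL are hypothetical, and this is only an illustration of the idea, not a production configuration: the same container definition is used for a laptop and for a large cluster, and only the replica count changes.

```python
import json


def deployment_manifest(name, image, replicas):
    """Build a minimal Kubernetes Deployment manifest as a plain dict."""
    labels = {"app": name}
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": name},
        "spec": {
            "replicas": replicas,
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {"containers": [{"name": name, "image": image}]},
            },
        },
    }


# Hypothetical service and image names, used only for illustration.
IMAGE = "registry.example.com/analytics-api:1.0"
laptop = deployment_manifest("analytics-api", IMAGE, replicas=1)
cluster = deployment_manifest("analytics-api", IMAGE, replicas=20)

# Kubernetes accepts manifests in JSON as well as YAML.
manifest_json = json.dumps(cluster, indent=2)
```

A tool that can only run on a single virtual machine has no equivalent of the `replicas` setting, which is the fine-grained scalability the paragraph refers to.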

Cloud-ready vendors offer limited functionality if you’re building a system that utilizes the full extent of multicloud technology. Consider a tool that was originally built for on-prem or outdated private clouds: That tool might be able to produce data that can be added to your cloud stack with read-only functionality, but if the rest of the pieces in your stack can all load data into the cloud, transform it there, and write back to other sources, your cloud-ready tool is already outdated and will only fall further behind as more cloud-native tools are added to that environment.

Building a stack of all cloud-native tools is the easiest way to make sure your data infrastructure is up-to-date. Depending on your organization’s unique data inputs and outputs, it’s possible to have a fully functional stack that includes a single piece of cloud-ready technology, but as you try to fit more and more of those pieces together, the difficulties compound and the performance of your system suffers.

Building a cloud-native stack

Ultimately, the choice about whether a tool is cloud-native or merely cloud-ready is one that is made by the vendor, not the customer. The decision to build a fully portable, cloud-native data infrastructure is one that an organization needs to make before it starts looking for tools to buy. To build the most modern cloud architecture, buyers need to make cloud-native structure a requirement for their vendor options.

Building a stack with some cloud-ready vendors might seem like a viable option at first, but if the data team decides to move its data from one cloud to another, that technology is going to prove problematic. For example, a tool like Tableau is built to run on-prem but can be cloud-ready in a multicloud stack. Workarounds might make it functional in the cloud, but they limit how much a team can take advantage of the flexibility that comes with a true multicloud setup.

Cloud-Native Sisense

Sisense’s cloud-native solution is built to give data teams the most flexibility in their multicloud stack. That architecture is built for Linux and runs on Kubernetes and Docker, using container-based microservices to deliver its benefits. That Kubernetes architecture allows organizations to use resources efficiently, coordinating the scaling and management of microservice lifecycles effortlessly without the complex server management needs of a legacy approach. 

Since all of today’s major cloud offerings run on Linux and support Kubernetes technology, Sisense can be a part of the most modern portable multicloud stack. It’s fully self-contained in a portable environment that operates the same no matter where your team needs it to run. Cloud-Native Sisense runs smoothly on a single stack with any of the three major cloud providers or on a custom Sisense Managed Cloud stack that a data team has tailored to its specific needs.

This incredibly agile tool fits seamlessly into any team’s Infrastructure-as-Code and DevOps processes, making it easier and faster to provision, upgrade, and scale nodes when needed. It also scales to suit the needs of any data team’s workflow, which allows companies to use their resources more efficiently and reduce the system’s TCO.

However your organization approaches the multicloud landscape, Cloud-Native Sisense can help modernize your data stack. To see a demo or talk to one of our cloud experts, explore Cloud-Native Sisense.