Why Great BI Needs Best-In-Breed Open Source Technologies
Introduction: What’s Happening in the World of Open Source?
“Open source” was once an approach that split opinion down the middle. But today it underpins many of the most exciting developments, products, software and entire technology companies in the world.
Linux is perhaps the best-known champion of open source, but other influential examples of companies and products built on open source that have taken the industry by storm include GitHub, MySQL, MongoDB and Docker.
Even companies like Microsoft, which once vehemently opposed the concept, have come full circle and now build some of their own products on the back of open source.
In fact, a survey by Black Duck found that an incredible 96% of new software products in 2016 used open source somewhere in the mix, while 78% of companies run on open source.
Today, it seems there’s nothing holding it back.
There are many good reasons for this enthusiasm. Open source technology can offer better security, easier deployment and greater scope for scalability than many proprietary equivalents.
Ultimately, today’s customers and users expect ever-increasing speed, quality, prioritization, discipline and adaptability from their software and networks. Open source helps developers to provide that.
That said, it’s crucial to understand that using open source isn’t just about delivering a certain type of software. It’s about the kind of development culture you create within your business.
Using open source helps you to embrace responsive design and adaptation, feedback loops and constant evolution. It allows you to take an MVP approach, for example, continually improving and updating a product rather than seeing it as reaching its definitive iteration the moment it hits the market. It helps you to build agile projects, which are shown to be 28% more successful than traditional ones.
Put simply, open source is agile and agile is the key to future-proofing your business.
In this white paper, we’ll walk you through the most important open source technologies when it comes to BI, explaining what they do, how they fit together, and what that means for your business.
What is Open Source?
Open source refers to any software whose source code (the code underlying the software) anyone can get hold of - and can then improve, manipulate and modify for their own purposes. It’s the opposite of proprietary software, which is owned by one company or person, who dictates who can copy, share, change or use it.
Open source software authors make their code freely available to anyone who wants to use it. It’s all about collaboration, transparency and open exchange. The great thing about this is that open source technology quickly becomes a global project, with many minds working on it at once. Any programmer can add features that improve the code or can fix any parts of it that don’t work correctly.
In the early days of open source, many companies (perhaps unsurprisingly) thought it was a terrible idea. They fiercely rejected the idea of giving away code for free or of modifying open source code that they would then need to make available to others. Today, though, an unexpected outcome of the open source movement is that it’s actually ended up driving innovation in the private sector.
That’s because, with so many people inspecting, using, and suggesting improvements to open source code, it has often become highly robust. As such, even those who were initially suspicious of open source have realized that using a tried-and-tested, incredibly strong codebase brings enormous value to their own work, sparing them from continually reinventing the wheel.
As a result, today, open source technology is driving DevOps across all kinds of sectors, underpinning swathes of IT developments - including in BI and data analytics.
What are the Benefits of Using Open Source?
Let’s take a look at the 7 top benefits of incorporating open source in your DevOps:
Quality & Stability
Imagine if you could gather up the world’s best developers, all experts on open source, and get them to check, test and hone your code - without even having to put them on your payroll? That’s what you’re getting with open source. And that means, too, that the code is getting better and better all the time.
Freedom & Customizability
The great thing about open source is that you can use it for whatever you want. The code is specifically designed to be taken and adapted, and all for free, so you can customize it for your business or your products in any way you see fit.
Open source projects tend to be modularly architected, which means they are more flexible to use: you can lift out the parts you want. Since it’s been designed that way from the outset, the code is also more robust and can cope with being manipulated.
As the name suggests, open source programs are designed to be as open as possible. They’re also non-profit. That means they aren’t dictated by licensing agreements with particular companies or programs, or designed to work better with certain systems for commercial reasons. As a result, they’re more likely than proprietary equivalents to work with lots of different operating systems and software.
One of the most attractive aspects of open source projects is that they typically come with a lower TCO and certainly lower upfront costs to implement. That’s because there’s no need to reinvent the wheel each time you want to use it. Many brains have already worked on the code and you can simply build on that. It also means you can shift workers from low value to high-value work, instead of getting them to focus on small problems like fixing glitches and bugs in the code.
Training & Support
Another great thing about open source is that your developers are probably already familiar with any systems you decide to adopt. Even if they aren’t, it’s built around recognisable, accessible software with freely available documentation, making it easier to bring them up to speed. What’s more, you have an enormous pool of experts and companies, located all over the world, who can provide support and training if required.
With so many people testing this out in different ways, the collective hive mind is constantly working to dig out bugs, problems and weaknesses on your behalf. That means you’re part of a network that keeps striving to improve the code you use and offer patches that you can implement without having to apply your own resources to it.
Cloud-Native and Open Source
Another trend rapidly gaining traction is “cloud-native”: applications and software that are written to work well in the cloud. These are great because they are resilient and behave predictably in the cloud environment, while building on cloud-native architecture that’s agile, reliable, and scalable.
You also get to avoid the headaches that come with moving applications in the cloud when they’ve all been written differently. Cloud-native apps and software are designed to work with the platform’s automated, container-driven infrastructure, making them much more compatible with each other.
Cloud-native applications are also abstracted from the underlying infrastructure, which means you spend less time maintaining, configuring and patching operating systems. The platform right-sizes capacity for you: it manages resource allocation, optimizes application lifecycle management, scales to meet demand and minimizes downtime by recovering much faster from failures.
Here’s where open source comes in.
Cloud-native computing uses an open-source software stack to deploy applications as microservices. This means that each part is packaged in its own container and orchestrated in a way that optimizes how resources are used. Meanwhile, the microservices architecture avoids interdependence and means developers can make changes to individual services independently of the whole.
It’s a highly collaborative way to work. You avoid creating silos and facilitate DevOps, in keeping with an Agile approach. You can continually deliver software updates as soon as they’re ready, shrinking feedback loops and allowing developers to incorporate customer demands in a fast, constant cycle.
In short, software developers can build great products quickly - and can keep improving them even once they’re out in the world.
What Is Linux?
Linux is perhaps the best known, most widely adopted open-source project in the world. It’s the preferred OS for the cloud and enterprise, with 90% of public cloud applications running on Linux today. What’s more, it’s regularly commercialized by companies such as Red Hat.
A big reason for this is that Linux is cloud-native. It’s also incredibly flexible, designed to support many use cases, devices, and target systems. What’s more, it’s far more customizable and compartmentalizable than any other operating system, giving it the flexibility and adaptability needed for agile application development.
Linux is also the most available and reliable solution for critical workloads in data centers and cloud computing environments.
Let’s now take a look at other tools, programs and software that are frequently used with this powerful open source project.
Docker Containers and Kubernetes Orchestration
Often used together, Docker and Kubernetes are two important parts of the software stack, providing the containerization and orchestration that a microservices architecture relies on.
What Are Docker Containers and Why Would You Use Them?
Docker is a program that performs virtualization at the operating system level. This is known as ‘containerization’ and you use it to run isolated software programs called containers.
Containers bundle together their own applications, tools and so on, but they are also able to communicate with each other through channels you have clearly defined. This allows you to abstract between the OS and processes taking place in the containers.
In this sense, containers function in a similar way to virtual machines - but they are far more efficient and lightweight. That’s because they all share the host’s single OS kernel, rather than each running a full operating system of its own.
Using containers means you are able to create identical environments across all your machines. You know exactly how a container will run, so there are no unexpected errors when you move it between machines or environments.
So why use Docker for this?
Well, it’s a great choice because it’s fast to start and deploy, and easy to manage and scale. It’s also very easy on your hardware, keeping its use of computing resources low. Plus, it’s supported by lots of different operating systems and works exceptionally well with Linux.
What is Kubernetes Orchestration and Why Would You Use It?
Kubernetes is an open source container orchestration engine that’s used to automate the deployment, scaling and management of applications that have been containerized.
Depending on how you use it, you could think of it as a container platform, a microservices platform or a portable cloud platform.
The great thing about Kubernetes is that it’s extremely reliable - and that’s the main reason so many different cloud providers use it as their de facto program for container orchestration. It makes network infrastructure more efficient and opens up development opportunities in multi-cloud environments. It’s certainly the best option for container management with Docker.
Benefits of Using Docker with Kubernetes Orchestration
The important thing to understand is that Docker and Kubernetes each operate at a different level of the stack.
The trickiest element of using Docker is getting the different containers to talk to each other. With Docker on its own, it can be quite challenging to manage different start times and storage issues while preventing failures along the chain.
This is where Kubernetes comes in. It continually checks on the state of deployment and spins up a new container if one goes down. That means you don’t have to track down the server where a particular container failed, saving you a ton of time and hassle.
Think of it this way: if each individual container is like an airplane, Kubernetes is like the airport, with air traffic control directing each plane, making sure it knows when and where to land, and handling central communication to tackle any issues.
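The self-healing behaviour described above comes down to a control loop that keeps comparing the desired state with what is actually running. The short Python sketch below illustrates that idea only; it is a toy simulation, not the Kubernetes API, and names such as reconcile and web-1 are hypothetical.

```python
import itertools

def reconcile(desired_replicas, running, new_name):
    """Return a list of running containers that matches the desired count.

    running:  names of containers that are currently healthy
    new_name: callable that produces a name for each replacement container
    """
    running = list(running)
    # Start replacements until the observed state matches the desired state.
    while len(running) < desired_replicas:
        running.append(new_name())
    # Drop surplus containers if there are too many.
    return running[:desired_replicas]

# Hypothetical naming scheme for illustration: web-1, web-2, ...
counter = itertools.count(1)

def next_name():
    return f"web-{next(counter)}"

state = reconcile(3, [], next_name)      # initial rollout of three replicas
state.remove("web-2")                    # simulate one container crashing
state = reconcile(3, state, next_name)   # the loop notices and replaces it
print(state)
```

Nobody has to work out which server lost the container: the loop only cares that three replicas should exist and starts a replacement when it counts fewer.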
What Other Technologies Should You Use with Your Open Source Project?
The incredible growth of Linux has sparked all kinds of software to support every process and task you can imagine. Here are some of our top recommendations to help you steam ahead with agile BI projects in your organisation.
What Is Helm and Why Would You Use It?
Helm is a tool that streamlines installing and managing Kubernetes applications. In Helm 2, it’s made up of two parts: a client (helm) and a server (tiller).
In short, Helm Charts assist you as you define, install, and upgrade Kubernetes applications, no matter how complex these might be. Tiller then runs inside each Kubernetes cluster and manages releases (or installations) of your charts. (Helm 3 removed Tiller, so the client now talks to the Kubernetes API directly.)
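To make the idea of a chart more concrete, here is a rough Python sketch of what "define, install, and upgrade" amounts to: a template plus default values, with overrides merged in at install or upgrade time. This is an analogy only. Real charts use Go templates and YAML, and the names below (MANIFEST_TEMPLATE, DEFAULT_VALUES, render, the dashboard image) are made up for illustration.

```python
from string import Template

# A stand-in for a chart template. Real Helm charts use Go templates,
# not Python's string.Template; this only illustrates the idea.
MANIFEST_TEMPLATE = Template(
    "kind: Deployment\n"
    "name: $name\n"
    "replicas: $replicas\n"
    "image: $image\n"
)

# Default values shipped with the (hypothetical) chart.
DEFAULT_VALUES = {
    "name": "dashboard",
    "replicas": 1,
    "image": "example/dashboard:1.0",
}

def render(overrides=None):
    """Merge user overrides into the defaults and render the manifest."""
    values = {**DEFAULT_VALUES, **(overrides or {})}
    return MANIFEST_TEMPLATE.substitute(values)

install = render()                                                     # first release
upgrade = render({"replicas": 3, "image": "example/dashboard:1.1"})   # later upgrade
print(upgrade)
```

The same template serves every environment; only the values change between an install and an upgrade.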
Grafana and Prometheus Explained
These are two popular programs that each play an important role in BI processes. Prometheus is used to provide backend storage and collect metrics from any targets you are monitoring, while Grafana is an interface for analysis and a visualization layer.
When you bring them together, they become a monitoring stack that’s used to store and visualize time series data. It’s a super effective software stack that’s rapidly gaining in popularity among DevOps teams.
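The split of responsibilities can be sketched in a few lines of Python: one part periodically collects and stores samples, and another asks for a time range to plot. This is a simplified in-memory model, not the Prometheus or Grafana API; MetricStore, scrape, query_range and the metric name are assumptions chosen for the example.

```python
from collections import defaultdict

class MetricStore:
    """Toy time series store: metric name -> list of (timestamp, value)."""

    def __init__(self):
        self.series = defaultdict(list)

    def scrape(self, timestamp, target):
        """Collect every metric the target exposes (the collection and storage role)."""
        for name, value in target().items():
            self.series[name].append((timestamp, value))

    def query_range(self, name, start, end):
        """Return samples within [start, end], as a dashboard panel would request."""
        return [(t, v) for t, v in self.series[name] if start <= t <= end]

# Hypothetical monitored application that reports a request counter.
counters = {"http_requests_total": 0}

def app_metrics():
    return dict(counters)

store = MetricStore()
for second in range(0, 60, 15):            # scrape every 15 seconds
    counters["http_requests_total"] += 10  # the app serves ten more requests
    store.scrape(second, app_metrics)

points = store.query_range("http_requests_total", 15, 45)
print(points)
```

Each stored point is a timestamp paired with a value, which is the shape of data a visualization layer needs to draw a line chart.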
What Is GlusterFS?
Gluster File System, shortened to GlusterFS, is an open source, distributed file system. You would use it for cloud computing, streaming media and content delivery. It’s also based on Linux.
What’s more, it’s easy on your hardware. That’s because it scales out in a building block style, so users can store petabytes of data while still working with low-cost commodity computers.
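One way to picture building-block scale-out is hash-based placement, where the file name alone decides which server holds the file. The Python sketch below is a simplified stand-in for that idea, not GlusterFS’s actual algorithm; pick_brick and the server names are hypothetical.

```python
import hashlib
from collections import Counter

def pick_brick(filename, bricks):
    """Map a file name to a storage server by hashing the name.

    Placement is computed from the name, so no central lookup table
    is needed to find a file again later.
    """
    digest = hashlib.sha256(filename.encode("utf-8")).hexdigest()
    return bricks[int(digest, 16) % len(bricks)]

# Three hypothetical commodity servers acting as storage building blocks.
bricks = ["server-a", "server-b", "server-c"]
files = [f"report-{i}.csv" for i in range(1000)]

placement = Counter(pick_brick(name, bricks) for name in files)
print(placement)
```

Plain modulo hashing like this moves many files whenever a server is added, so a production system needs a more careful scheme; the sketch only shows how data can spread across inexpensive machines.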
What Is FluentD?
FluentD is an open source data collector that’s used to build the unified logging layer. You simply install it on a server and let it run in the background, where it continues to collect, inspect, transform, analyze and store all different types of data.
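A unified logging layer means that logs arriving in different formats end up as events of one common shape, which can then be filtered and routed. The Python sketch below imitates that flow with events made of a tag, a time and a record. It does not use FluentD itself, and the parsers, tags and sample lines are invented for illustration.

```python
import json

def from_json_line(line):
    """Parse an application log line that is already JSON."""
    return json.loads(line)

def from_access_log(line):
    """Parse a simplified 'METHOD PATH STATUS' access log line."""
    method, path, status = line.split()
    return {"method": method, "path": path, "status": int(status)}

# Hypothetical mapping from source tag to the parser for that source.
PARSERS = {"app": from_json_line, "web.access": from_access_log}

def collect(tag, timestamp, line):
    """Turn a raw line into a unified event with a tag, a time and a record."""
    return {"tag": tag, "time": timestamp, "record": PARSERS[tag](line)}

events = [
    collect("app", 1700000000, '{"level": "error", "msg": "query failed"}'),
    collect("web.access", 1700000001, "GET /dashboards 200"),
]

# Route: keep only the events that signal a problem.
problems = [
    e for e in events
    if e["record"].get("level") == "error" or e["record"].get("status", 0) >= 500
]
print(problems)
```

Once every source produces the same event shape, one filtering rule can cover all of them.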
What Is MongoDB?
This is an extremely popular open source, NoSQL database management system.
MongoDB supports all different forms of unstructured data through its document-oriented database model. It does this by using collections and documents instead of traditional, relational-database rows and tables. That makes it great for use with Big Data, as well as many other types of data that don’t fit comfortably within a relational database.
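The difference from rows and tables is easiest to see with data. In the Python sketch below, plain dictionaries stand in for documents and a list stands in for a collection; the find helper is a hypothetical, much simplified imitation of a query and does not connect to MongoDB.

```python
# Two documents in the same collection with different fields and nesting.
# A relational table would force both into one fixed set of columns.
collection = [
    {"_id": 1, "customer": "Acme", "orders": [{"sku": "A1", "qty": 2}]},
    {"_id": 2, "customer": "Globex", "region": "EMEA", "tags": ["priority"]},
]

def find(docs, query):
    """Return documents whose top-level fields equal every key/value in query."""
    return [d for d in docs if all(d.get(k) == v for k, v in query.items())]

emea = find(collection, {"region": "EMEA"})
print(emea)
```

With a real client library such as PyMongo, the equivalent query would be passed to the collection’s find method as a filter document of the same shape.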
What Is Zookeeper?
This is an open source Apache project for use with large cluster environments.
Zookeeper works as a replicated synchronization service with data distributed between multiple nodes. It also maintains common objects that are needed to coordinate distributed processing across the cluster, such as configuration information and hierarchical naming spaces.
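A hierarchical namespace holding shared configuration looks much like a small file system. The Python sketch below models only that structure on a single machine, with no replication or synchronization; the Namespace class and the paths are hypothetical, not the Zookeeper client API.

```python
class Namespace:
    """Toy hierarchical namespace: slash-separated paths mapped to small values."""

    def __init__(self):
        self.nodes = {"/": b""}

    def create(self, path, data=b""):
        """Add a node; its parent must already exist, as in a file system."""
        parent = path.rsplit("/", 1)[0] or "/"
        if parent not in self.nodes:
            raise KeyError(f"parent {parent} does not exist")
        self.nodes[path] = data

    def get(self, path):
        return self.nodes[path]

    def children(self, path):
        """List the names of the direct children of a node."""
        prefix = path.rstrip("/") + "/"
        return sorted(
            p[len(prefix):]
            for p in self.nodes
            if p != "/" and p.startswith(prefix) and "/" not in p[len(prefix):]
        )

ns = Namespace()
ns.create("/config")
ns.create("/config/db_url", b"mongodb://db.internal:27017")  # made-up value
ns.create("/config/cache_ttl", b"300")
print(ns.children("/config"))
```

In a real cluster every node would read the same tree, which is what lets separate processes agree on shared settings.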
What Is RabbitMQ?
RabbitMQ provides a way for different parts of the system to talk to each other. Basically, it’s a messaging broker that acts as an intermediary for your applications, so that they have a shared platform for sending and receiving messages between themselves. Types of messaging it works with include push notifications, data delivery, work queues and so on.
All messages are stored safely on the platform until received, which allows software applications to connect to each other as if they were components within a bigger application. It also gives them a way to communicate with user devices and data. In doing so, it helps you to scale faster.
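The key property here is that the sender and the receiver never have to be running at the same moment. The Python sketch below shows this with an in-process queue standing in for the broker; it is not the RabbitMQ client API, and publish, consume_all and the task names are hypothetical.

```python
import queue

# Stands in for a queue managed by the broker (in-process only here).
broker = queue.Queue()

def publish(message):
    """Producer side: hand the message to the broker and move on."""
    broker.put(message)

def consume_all(handler):
    """Consumer side: process every waiting message in arrival order."""
    results = []
    while not broker.empty():
        message = broker.get()
        results.append(handler(message))
        broker.task_done()  # signal that the message has been handled
    return results

# The producer publishes while no consumer is connected yet.
publish({"task": "refresh_dashboard", "id": 7})
publish({"task": "send_report", "id": 8})

# A consumer connects later and finds both messages waiting.
done = consume_all(lambda m: f"{m['task']}#{m['id']} done")
print(done)
```

Because the queue holds the work in between, either side can be scaled or restarted without the other needing to know.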
To Sum Up…
If you want to meet the expectations of today’s customers, it’s absolutely vital that you embrace Agile working practices that facilitate responsiveness, allow you to implement improvements rapidly, and run continual app development in line with your customers’ demands.
As we’ve seen, businesses are dealing with increasing pressure to get highly scalable, resilient software applications to market quickly. Not only that, to stay competitive they need to offer technology that does this at a low TCO - and with the promise of swift ROI.
This has given rise to modern cloud-native architecture that enables large, complex deployments that scale, while supporting continuous delivery and agile DevOps processes.
That’s the broader picture for today’s technology. Most BI tools in the industry, however, lag far behind. The majority are built with traditional technology stacks, while the rest are rigid SaaS offerings that do not allow users to adapt and customize the deployment to meet their specific business needs.
Time is up for these dinosaurs. Customers simply won’t put up with it.
The future of BI will be cloud-native, Linux-based platforms that are built to accommodate Agile DevOps. It’s inevitable. By shifting to BI tools that are based on Linux, developers will rapidly be able to build and deploy custom-made, data-driven applications at scale, all at the lowest possible TCO.
Final Thoughts: Deploying Sisense on Linux
To date, Cloud-Native Sisense on Linux is the only cloud-native BI analytics platform that has been purpose-built from the ground up using best-of-breed technologies like Linux, Docker Containers and Kubernetes Orchestration. It’s been created to support highly scalable, flexible and robust custom analytic applications.
Sisense’s cloud-native architecture fits seamlessly into modern cloud environments. It enables DevOps teams and platform engineers to integrate analytics into their deployment workflows, scale and manage their BI with ease, and integrate this with their delivery and application lifecycle.
Not only that, it integrates seamlessly with cloud services like AWS. There’s automation to support rapid deployment and roll out upgrades quickly and easily. In-app dashboards provide on-premise, detailed monitoring from the server down to the data instance level. The system offers distributed, reliable and highly scalable shared data storage. It takes the pressure off your system, too - you can reuse hardware through tenant isolation on the same cluster or server.
What’s more, if you’re looking to move across from Windows to Linux, Sisense can provide a full migration kit.
With nothing holding you back, why fight it? The future is agile - and it’s right here at your fingertips.