At Sisense, our mission is to empower users of all kinds with deep insights from even the most complex data. This requires a focus on many different areas of analytics, including our ability to work with large, complex, disparate data; our ability to provide an easy-to-use interface for both working with data and building analytics; and our ability to go beyond dashboards with capabilities such as IoT integration and powerful APIs to support application developers. In addition, providing a world-class analytics platform requires a deep understanding of how to best leverage AI/ML to support the needs of all users, from the novice to the most technical.
Yesterday, during Eureka!, our annual client conference, I gave a presentation that took a deep dive into artificial intelligence and its related fields, including ML and statistics. I spoke about developing a comprehensive and impactful AI strategy and our AI roadmap for the coming year. If you were not in attendance at Eureka! or you want a recap of what I spoke about, this is for you.
Last September, Sisense announced that we would be making significant investments in AI throughout our platform. Soon after, we announced the release of Sisense Hunch, which provides the ability to transform even the most massive data sets into a deep neural net that can be placed anywhere, even on an IoT device. Our investment in AI is paying off in spades. At Eureka! we gave attendees a sneak preview of AI Exploration Paths, which provide an AI-guided experience for business users to dig deeper and explore their analytics. This is just the first of a long series of AI-driven capabilities that we will see throughout the product in the coming months. (Stay tuned for more blog posts from me as we roll these capabilities out!)
Back to my session. The goal of the session was to discuss a holistic view of what is required to support a comprehensive AI strategy.
Living in a World of Big Data
It all starts with the data. More than a decade ago, when the term Big Data emerged, companies started to invest heavily in the infrastructure to gather and store their data, realizing the potential future value of their data. Some companies went further and defined how they’d monetize this data. In most cases, these plans turned into dark lakes of ever-accumulating files that went unused due to the lack of skills, tools, or resources for analytics implementation.
There is still much work to be done, but AI can both remove much of the tedium of figuring out how to connect all of this data and help data professionals understand the data well enough to determine how it should best be leveraged. This includes AI-powered data cleansing and modeling as well as general statistical analysis of the underlying data itself.
When it comes to automating insights for the business users, on the other hand, there’s been unprecedented progress. In recent years (the new AI era), all areas of our lives as consumers are being facilitated by some kind of ML-driven solution and enterprises are busy trying to redefine their AI/ML strategies.
Unfortunately, even in the most mature enterprises, there is an invisible moat of data literacy between business people with the goal in their minds, the domain experts who can translate those goals into requirements, and the technical people who implement the solutions. Along with AI Exploration Paths, mentioned above, we see a large number of assistive technologies that surface insights to end users. Last year, Sisense released both Sisense Narratives, which creates descriptive text to help end users understand their visualizations, and Sisense Insight Miner, which looks for unusual relationships in the data and surfaces them for further analysis.
Why AI Now?
There are three major drivers for the growing acceleration in this space: more sophisticated and cheaper proprietary hardware (GPUs, FPGAs, ASICs), bigger and more detailed datasets to train on, and more sophisticated and faster algorithms and methods. The improved hardware came from an unexpected domain: graphics-rich gaming. These systems were repurposed for tasks like matrix multiplication, which is the basis of ML computations. Prices went down for these usually expensive computers, and Moore’s Law, which predicts exponential growth in computational power, was broken. Larger datasets were readily available because many private companies and public institutions decided to make their data open for everyone: text, video, speech, you name it. Academia couldn’t stand to fall behind, and so new generations of increasingly sophisticated algorithms were published, resulting in a boom of new architectures inspired by how the human brain functions, called Deep Neural Nets.
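To make the matrix multiplication point concrete, here is a minimal sketch of a single fully connected neural-net layer written in plain Python. The function names, weights, and inputs are made up for illustration; a real system would hand the same arithmetic to a GPU-backed library rather than loop in Python.

```python
from typing import List

Matrix = List[List[float]]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Multiply an (n x k) matrix by a (k x m) matrix."""
    if not a or not b or len(a[0]) != len(b):
        raise ValueError("inner dimensions must match")
    columns_of_b = list(zip(*b))  # transpose b so we can iterate over its columns
    return [
        [sum(x * y for x, y in zip(row, col)) for col in columns_of_b]
        for row in a
    ]


def dense_layer(inputs: Matrix, weights: Matrix, bias: List[float]) -> Matrix:
    """One fully connected layer: ReLU(inputs @ weights + bias)."""
    pre_activation = matmul(inputs, weights)
    return [[max(0.0, v + b) for v, b in zip(row, bias)] for row in pre_activation]


# Hypothetical batch of two examples with two features each.
batch = [[1.0, 2.0], [0.5, -1.0]]
# Hypothetical weights mapping 2 input features to 3 hidden units.
weights = [[0.5, -1.0, 0.0], [0.25, 0.5, 1.0]]
bias = [0.0, 0.1, -0.5]

activations = dense_layer(batch, weights, bias)
print(activations)
```

Almost all of the work happens inside `matmul`, and every output cell can be computed independently of the others. That independence is what makes the operation such a good fit for hardware originally built to shade many pixels in parallel.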
The hardware is there, the data is accumulating fast. It would be intuitive to say it’s easier and faster to implement data-driven application development than ever before. Data has gone mainstream. New professions, such as data engineer, have evolved. Everyone is adding “Data” to their title. But it doesn’t mean we are quite there yet.
From Data to Data-Powered Apps
It isn’t enough for an analytics platform to be infused with AI. To support a comprehensive enterprise AI strategy, we need to support the operationalization of off-the-shelf libraries (AutoML) as well as proprietary AI algorithms. For all the people in the data, or should I say “Python” business, it’s a long way from the development environment (e.g. Jupyter Notebook) to scalable productized application deployment. The operational data science pipeline should be able to ingest new data hand in hand with continuous model improvement, all while keeping the production system stable. It also has to deal with the volume of training data, model ensemble performance, and dependencies on external libraries and toolkits. It’s hard to productize ML due to the variety of environments and tools that need to be supported (API, Web, Mobile).
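One common way to reconcile continuous model improvement with a stable production system is a promotion gate: a model retrained on new data replaces the live one only if it does measurably better on held-out data. The sketch below illustrates that idea under simplifying assumptions. The toy threshold "model", the function names, and the `min_gain` margin are all hypothetical, and this is not a description of how Sisense implements its pipeline.

```python
from typing import Callable, Sequence, Tuple

Model = Callable[[float], int]   # maps one numeric feature to a 0/1 label
Example = Tuple[float, int]      # (feature, label)


def accuracy(model: Model, holdout: Sequence[Example]) -> float:
    """Fraction of held-out examples the model labels correctly."""
    if not holdout:
        raise ValueError("holdout set must not be empty")
    correct = sum(1 for x, label in holdout if model(x) == label)
    return correct / len(holdout)


def fit_threshold(train: Sequence[Example]) -> Model:
    """Toy 'training': pick the cut-off that best separates the labels."""
    candidates = sorted({x for x, _ in train})
    best = max(
        candidates,
        key=lambda t: sum((x >= t) == bool(y) for x, y in train),
    )
    return lambda x, t=best: int(x >= t)


def maybe_promote(
    champion: Model,
    challenger: Model,
    holdout: Sequence[Example],
    min_gain: float = 0.01,
) -> Tuple[Model, bool]:
    """Keep the live model unless the challenger beats it by at least min_gain."""
    if accuracy(challenger, holdout) >= accuracy(champion, holdout) + min_gain:
        return challenger, True
    return champion, False


# The model currently in production (hypothetical cut-off of 5.0).
champion: Model = lambda x: int(x >= 5.0)

# Newly ingested data suggests the real boundary has drifted lower.
new_training_data = [(1.0, 0), (2.0, 0), (3.0, 1), (4.0, 1), (6.0, 1)]
holdout = [(0.5, 0), (2.5, 0), (3.5, 1), (4.5, 1), (7.0, 1)]

challenger = fit_threshold(new_training_data)
live_model, promoted = maybe_promote(champion, challenger, holdout)
print("promoted:", promoted, "live accuracy:", accuracy(live_model, holdout))
```

In a real deployment the gate would also pin library versions and log every decision, since dependencies and auditability are part of what makes productizing ML hard.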
Volume, performance, and dependencies introduce big challenges in the process of turning data to algorithms to features:
The tooling for the machine learning model lifecycle could look like this:
With data ingestion and preparation starting on the left side and visualization/application wrapping up on the right side, we see the full productization cycle for the machine learning data-driven application:
The Sisense offering creates a comprehensive data analytics platform that caters to every data user at a company, from data engineer to data scientist to executive.
Even with data preparation, model development, and application deployment covered, there’s still a data skills gap among the developers who will operate these systems. Data literacy and data skills, whose absence created the forgotten dark data lakes in the first place, are still scarce. For proof, just look at the skyrocketing salaries of data professionals (mainly data engineers and data scientists).
Building a Better Tomorrow
At Sisense, our mission is to provide the premier data analytics platform for all of our customers, regardless of their technical skills. This is a tall order.
For the most novice users, we are working on providing a natural language query (NLQ) interface which will allow users to simply ask a question and get an answer (you can already see components of this in our ability to support developers who want to integrate with Alexa).
As users become a bit savvier, they interact with dashboards, making selections and applying filters. Here, our soon-to-be-released AI Exploration Paths helps novice users understand which question they should be asking next by evaluating the behavior of all users on the platform. As users become more skilled, they may even create a dashboard of their own with our easy-to-use drag-and-drop UI (even here AI plays a role, as it suggests fields and widget types for the user).
“All users” also includes data professionals, and it’s here that the new tools fill the gap. AI-driven analysis of the data provides the ability to see the data distribution, accept suggestions for data normalization and cleansing, and play with predefined clustering and segmentation. Then, once you understand your data completely, you can bring in more of it from different sources, performing transformations on ingestion. Sharpen your SQL/Python/R skills and jump into the editor that will come from our recent merger with Periscope Data, which allows you not only to manage your own custom code but also to tap into AutoML libraries.
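For readers who want to see what those steps look like in code, here is a small, hedged sketch using only the Python standard library: it summarizes a distribution, applies z-score normalization, and runs a bare-bones one-dimensional k-means segmentation. The sample values and function names are invented for illustration, and this is not the algorithm behind any Sisense feature.

```python
import statistics
from typing import List, Sequence


def zscores(values: Sequence[float]) -> List[float]:
    """Normalize values to zero mean and unit (population) standard deviation."""
    mean = statistics.fmean(values)
    stdev = statistics.pstdev(values)
    if stdev == 0:
        return [0.0 for _ in values]
    return [(v - mean) / stdev for v in values]


def nearest(value: float, centers: Sequence[float]) -> int:
    """Index of the center closest to value."""
    return min(range(len(centers)), key=lambda c: abs(value - centers[c]))


def kmeans_1d(values: Sequence[float], k: int = 2, iterations: int = 20) -> List[int]:
    """Tiny 1-D k-means with deterministic, evenly spread initial centers."""
    if k < 1 or len(values) < k:
        raise ValueError("need at least k values and k >= 1")
    ordered = sorted(values)
    centers = [ordered[(i * (len(ordered) - 1)) // max(k - 1, 1)] for i in range(k)]
    for _ in range(iterations):
        labels = [nearest(v, centers) for v in values]
        updated = []
        for c in range(k):
            members = [v for v, label in zip(values, labels) if label == c]
            updated.append(statistics.fmean(members) if members else centers[c])
        if updated == centers:  # converged: assignments can no longer change
            break
        centers = updated
    return [nearest(v, centers) for v in values]


# Hypothetical order values with two obvious customer segments.
order_values = [12.0, 15.0, 14.0, 13.0, 210.0, 190.0, 205.0]

print("median:", statistics.median(order_values))
normalized = zscores(order_values)
segments = kmeans_1d(normalized, k=2)
print("segments:", segments)
```

In practice you would reach for a library such as scikit-learn for this, but the moving parts are the same: look at the distribution, put the features on a common scale, then let a clustering routine propose segments for a human to review.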
Accessibility Is the Guiding Light of AI
Data today is Big and it’s only getting bigger. Making that data accessible for all will require a new generation of applications built with AI, ML, and advanced analytics. Builders like data scientists and engineers need powerful tools to harness these features and make the data easily used by customers and business users. Customers and business users need simple UIs that let them build dashboards and even analytic apps and widgets of their own.
Whatever you intend to build on top of your data, the right analytics solution will be instrumental in getting it done.