The Business Intelligence (BI), Big Data and Analytics worlds have a case of in-memory fever. Emancipating implementers and users from the perils of spinning magnetic disk storage has become an obsession. But optimizing database engines for in-processor, rather than in-memory, operations turns out to be a much better strategy.
In-memory looks like nirvana, so how can a different approach be superior? Perhaps an analogy can help. Even if you don’t drink, you know that beer and data both take some work to craft, and are available from different sources, of varying convenience. Just as getting data from Random Access Memory (RAM) is faster than getting it from disk, getting beer from a local supermarket is a lot faster than going to a more distant commercial brewery. But getting beer from a nearby convenience store is even quicker than getting it from the supermarket, and grabbing a brew from your fridge is faster still.
Today’s central processing units (CPUs), be they in big servers or your own laptop, have special, high-speed storage areas on-board, called caches, that are a bit like your fridge. Most databases tend to make use of CPU cache storage in only an incidental – one might even say accidental – way. What the industry needs are database products that place data in a CPU’s cache intentionally, rather than haphazardly.
If a distributor delivered certain beers (which it already knew you liked) directly to your home and, as an added service, had a deliveryman place the bottles straight into your fridge, wouldn’t you sign up? If the distributor made sure other craft brews you liked were delivered to your local convenience store, wouldn’t that clinch the deal?
We think so. Prism 10X provides this kind of personalized, high-efficiency delivery; it’s just that Prism does it with data, instead of beer.
Prism pre-fetches data into cache, making intelligent decisions about which data should go into L1, the smallest but fastest cache area, which is akin to your fridge. SiSense also utilizes the L2 and L3 caches, which are more spacious but slower, like the convenience store.
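For readers who want to see what keeping data "in the fridge" can look like in code, here is a minimal sketch in C. It is an illustration of cache-conscious processing in general, not SiSense Prism's actual implementation. The function name `sum_matching`, the column names, and the block size `BLOCK` are all assumptions made up for this example. The idea is to work through a column in small blocks, so that the intermediate result of the filter step is still sitting in cache when the aggregation step needs it.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical block size: 4096 four-byte values is 16 KiB. Together with
   the 4 KiB slice of `region` and the 4 KiB mask, one block is meant to fit
   in a typical 32 KiB L1 data cache. Real cache sizes vary by CPU. */
#define BLOCK 4096

/* Sum the entries of `amount` whose matching `region` code equals `wanted`.
   Both columns hold `n` entries. */
int64_t sum_matching(const int32_t *amount, const uint8_t *region,
                     size_t n, uint8_t wanted)
{
    uint8_t mask[BLOCK];   /* small scratch buffer, reused for every block */
    int64_t total = 0;

    for (size_t start = 0; start < n; start += BLOCK) {
        size_t len = (n - start < BLOCK) ? (n - start) : BLOCK;

        /* Pass 1: filter this block, recording matches in the mask. */
        for (size_t i = 0; i < len; i++)
            mask[i] = (uint8_t)(region[start + i] == wanted);

        /* Pass 2: aggregate this block. The mask was written moments ago,
           so it is likely still in cache rather than in main memory. */
        for (size_t i = 0; i < len; i++)
            total += mask[i] ? amount[start + i] : 0;
    }
    return total;
}
```

Running both passes over an entire large column, one after the other, would push the intermediate mask out to RAM and read it back again. Working block by block keeps that round trip short, which is the beer-in-the-fridge effect described above.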
What if you had a bunch of beer aficionado friends over to your home, and you knew what kind of beer they liked? Grabbing their bottles out of the fridge all at once would be better than making separate trips to the kitchen, to get one bottle at a time. SiSense helps in a similar manner, by utilizing special processor instructions that handle multiple pieces of data together. Using these “single-instruction, multiple-data” (SIMD) capabilities of the CPU makes a database not just efficient at procuring data, but super-fast in processing it – just like the gracious host who knows what kind of beer his guests like, and serves them almost as soon as they walk through the door.
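The "several bottles per trip" idea can be sketched in C as well. The example below is written in portable C rather than with processor-specific SIMD intrinsics, so it only shows the shape of the technique: the loop handles a fixed group of values on each pass, using independent accumulators, which is a layout that compilers can often translate into SIMD instructions. Whether that happens depends on the compiler and its settings. The name `sum_column` and the width `LANES` are hypothetical and chosen for illustration; this is not SiSense Prism's code.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical lane count. Real SIMD units process 4, 8, 16 or more
   values per instruction, depending on the CPU and the data type. */
#define LANES 4

/* Sum a column of 32-bit integers, LANES values per loop iteration. */
int64_t sum_column(const int32_t *values, size_t n)
{
    int64_t acc[LANES] = {0, 0, 0, 0};   /* one running total per lane */
    size_t i = 0;

    /* Main loop: each iteration is one "trip" that handles LANES values.
       The lanes do not depend on each other, so they can run together. */
    for (; i + LANES <= n; i += LANES)
        for (int lane = 0; lane < LANES; lane++)
            acc[lane] += values[i + lane];

    /* Combine the per-lane totals into a single result. */
    int64_t total = acc[0] + acc[1] + acc[2] + acc[3];

    /* Tail: handle the leftover values that do not fill a full group. */
    for (; i < n; i++)
        total += values[i];

    return total;
}
```

The one-bottle-at-a-time version would be a single loop adding each value to one running total. Splitting the work into independent lanes is what lets the processor serve several guests at once.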
This approach isn’t just smart, it’s frugal. Going all-in on in-memory can be expensive, because RAM costs far more per gigabyte than disk. By making deliberate use of cache and SIMD operations, the CPU-optimized approach lets you keep more data on disk and still analyze it as quickly as, or even more quickly than, databases that make you put everything in RAM.
Many database products on the market today stick with simply optimizing old approaches. SiSense Prism employs a modern approach by exploiting today’s CPU architectures. In-CPU analytics is where customers need to be now, but most vendors are still coming up the in-memory ramp.
Below is a SlideShare presentation that provides a quick overview of the technology and how SiSense plays in the space.
In the future, the industry excitement over in-memory will be just that – a memory. Meanwhile, we think SiSense Prism has reached that future position already, and it’s not looking back.
Don’t believe it? Most people don’t until they actually try it. That’s why we make the full version of our software available as a free trial download here.