The Hidden Gem that is StorPool

Chris Evans

All-Flash Storage, Enterprise, NVMe, Storage

Sometimes you see a company or a piece of technology and recognise what can only be called a hidden gem.  This was exemplified in the presentations made by StorPool at a recent Tech Field Day event.  With such great technology in place, why is the company not better known?

Let’s have a quick look at what’s on offer.

StorPool 101

The StorPool platform is a high-performance scale-out block storage solution that runs on the Linux operating system.  StorPool instances are deployed on commodity servers with a mix of flash and HDD and use standard high-performance networking (either InfiniBand or Ethernet). 

A cluster of StorPool servers acts as a pool of storage that can be presented either as a dedicated storage platform or in a hyper-converged solution.  Linux servers consume storage using a local block device driver, with iSCSI support for other non-Linux operating systems and hypervisors. 

StorPool claims performance numbers of greater than 1 million IOPS per server and latency figures of less than 100µs (or just above the native capability of NAND flash).  These are impressive figures and we’ll discuss a specific use-case configuration later.
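To put those two figures side by side, here is a rough sketch using Little's law (average I/Os in flight = throughput × latency).  The function names are illustrative only, and the inputs are the claimed bounds from the text ("greater than" 1 million IOPS, "less than" 100µs) treated as round numbers rather than measured values.

```python
# Little's law: average I/Os in flight = throughput (IOPS) x average latency (s).
# Function names are hypothetical; inputs are the round figures quoted in the text.

def outstanding_ios(iops: float, latency_us: float) -> float:
    """Average number of I/Os in flight needed to sustain `iops` at `latency_us`."""
    return iops * latency_us / 1_000_000


def iops_ceiling(queue_depth: int, latency_us: float) -> float:
    """Upper bound on IOPS with `queue_depth` I/Os in flight at `latency_us`."""
    return queue_depth * 1_000_000 / latency_us


if __name__ == "__main__":
    # Claimed per-server figures: 1 million IOPS at 100 microseconds.
    print(outstanding_ios(1_000_000, 100))
    # A single synchronous stream (queue depth 1) at the same latency.
    print(iops_ceiling(1, 100))
```

In other words, a server only reaches the million-IOPS mark at that latency when around a hundred requests are in flight at once; a single application issuing one I/O at a time would top out at roughly 10,000 IOPS regardless of how fast the backend is.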

Architecture

The StorPool architecture uses self-developed on-disk formats, inter-node protocols and cluster management features.  High performance is achieved by implementing direct access to NVMe drives and network cards (SR-IOV) that bypasses the Linux kernel.  The result is a highly efficient storage platform.

StorPool can be driven by API, CLI or GUI.  This provides integration with standard (typically open source) platforms like OpenStack and OpenNebula.  Performance metrics are tracked in the GUI, which is a SaaS-based solution. 

However, such low-level integration is not without its challenges.  StorPool solutions are still deployed and supported remotely by the company.  Component support is based on a published hardware compatibility list.

You can learn more on the architecture in this Storage Field Day 18 video.

Architecture & Demonstration, Boyan Krosnov

Performance Proof Points

As you can see from the Tech Field Day presentations, the performance of the StorPool platform isn’t in doubt.  As a proof point, the StorPool team presented a comparison with a Microsoft HCI performance benchmark “record” of 13.8 million IOPS using Windows Server 2019, Intel Optane DC NVDIMM and Intel P4510 NVMe drives.  The full Microsoft specification and an equivalent StorPool configuration are shown in the following table.

Microsoft HCI Specification                          | StorPool Specification
-----------------------------------------------------|--------------------------------------------
12 nodes, each with:                                 | 12 nodes, each with:
2x Intel Xeon Scalable processors                    | 2x Intel Xeon Scalable processors
384GB DDR4 DRAM                                      | 384GB DDR4 DRAM
1.5TB Optane DC NVDIMM                               | (none)
32TB NVMe (4x 8TB Intel DC P4510)                    | 32TB NVMe (4x 8TB Intel DC P4510)
2x Mellanox ConnectX-4 25Gb RDMA                     | Intel XXV710-DA2 dual-port 25Gb/s Ethernet
Storage Spaces Direct, Hyper-V, Windows Server 2019  | StorPool, KVM, CentOS 7
312 VMs @ 10GB storage each & 4GB DRAM               | 96 VMs @ 500GB storage each & 3.1GB DRAM

Now, we can see some obvious differences.  The StorPool configuration drops the need for the NVDIMM cache that is used by Storage Spaces Direct in the Microsoft test.  Also, networking is different (plain Ethernet rather than RDMA).

Obviously, the platforms here are also different, with the Microsoft O/S, hypervisor and storage layer replaced by CentOS 7, KVM and StorPool.  Where Microsoft achieved just short of 13.8 million IOPS, the StorPool configuration reached the same level of performance with fewer VMs and without the expensive Intel Optane DC NVDIMM. 

Whilst a lot of the hardware is similar, the two configurations differ in other components, so we can’t directly attribute all the performance gains to StorPool.  However, what we can see is that an overall solution using open source components and StorPool could easily out-perform the configuration used in the latest Microsoft HCI test.
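As a rough way to normalise the comparison, the sketch below divides the headline figure by node and VM counts.  It assumes both clusters deliver approximately 13.8 million IOPS (the text says Microsoft fell just short and StorPool matched it) and that the VM counts are cluster-wide totals; the class and field names are made up for illustration.

```python
from dataclasses import dataclass

# Headline result quoted in the text; treated as the same for both clusters (assumption).
TOTAL_IOPS = 13_800_000


@dataclass(frozen=True)
class Cluster:
    name: str
    nodes: int
    vms: int               # assumed to be a cluster-wide total
    vm_storage_gb: float   # storage provisioned per VM
    vm_dram_gb: float      # DRAM assigned per VM

    def iops_per_node(self, total_iops: float = TOTAL_IOPS) -> float:
        return total_iops / self.nodes

    def iops_per_vm(self, total_iops: float = TOTAL_IOPS) -> float:
        return total_iops / self.vms

    def provisioned_storage_gb(self) -> float:
        return self.vms * self.vm_storage_gb

    def vm_dram_total_gb(self) -> float:
        return self.vms * self.vm_dram_gb


MICROSOFT = Cluster("Microsoft HCI", nodes=12, vms=312, vm_storage_gb=10, vm_dram_gb=4)
STORPOOL = Cluster("StorPool", nodes=12, vms=96, vm_storage_gb=500, vm_dram_gb=3.1)

if __name__ == "__main__":
    for c in (MICROSOFT, STORPOOL):
        print(
            f"{c.name}: {c.iops_per_node():,.0f} IOPS/node, "
            f"{c.iops_per_vm():,.0f} IOPS/VM, "
            f"{c.provisioned_storage_gb():,.0f} GB provisioned, "
            f"{c.vm_dram_total_gb():,.1f} GB VM DRAM"
        )
```

On these assumptions each node delivers about 1.15 million IOPS in both cases, which lines up with StorPool’s per-server claim.  The per-VM picture differs more: roughly 44,000 IOPS per VM across 3.12TB of provisioned storage for Microsoft, against roughly 144,000 IOPS per VM across 48TB for StorPool.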

More details on performance can be found in this presentation, also from SFD18.

StorPool Performance Demo

SDS or Not?

So why isn’t StorPool more widely deployed?  The first issue is perhaps one of positioning.  Although StorPool is effectively a software-defined solution, the hardware needed to gain real performance is relatively specific.  The solution also needs to be installed and supported by the company.  You can’t simply download and try out StorPool on your own servers or in virtual machines.

Second, there is perhaps a question of market fit.  StorPool spans the world of service providers and enterprises.  Whereas service providers want cheap/free and have the Linux sysadmin skills needed, storage teams in the enterprise want solutions and platforms and usually don’t have the Linux skills. 

Third, there is possibly a question of funding and strategy.  StorPool claims to be cashflow positive, which is great.  However, companies like Nutanix, Pure Storage and the next generation of platform vendors like Datrium and Datera have achieved awareness by taking big money and going big on marketing.

The Architect’s View

As a pure technology play, I think StorPool has a lot to offer.  There are some technical gaps, notably storage overhead (which could be addressed by moving to more of an erasure coding model), some missing enterprise-class features and the need for such a hands-on approach. 
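To illustrate why storage overhead matters, here is a minimal sketch comparing usable capacity under replication and erasure coding.  The raw figure comes from the test configuration (12 nodes with 32TB of NVMe each); the three-copy replication factor and the 8+2 erasure coding layout are purely hypothetical parameters chosen for illustration, not StorPool settings.

```python
# Usable capacity under two protection schemes.
# The copy count and the 8+2 layout are illustrative assumptions, not StorPool's settings.

def usable_with_replication(raw_tb: float, copies: int) -> float:
    """Every block is stored `copies` times, so usable = raw / copies."""
    return raw_tb / copies


def usable_with_erasure_coding(raw_tb: float, data: int, parity: int) -> float:
    """Each stripe holds `data` data chunks plus `parity` parity chunks."""
    return raw_tb * data / (data + parity)


if __name__ == "__main__":
    raw_tb = 12 * 32  # 12 nodes x 32TB NVMe, from the test configuration
    print(usable_with_replication(raw_tb, copies=3))
    print(usable_with_erasure_coding(raw_tb, data=8, parity=2))
```

With these example parameters, 384TB of raw flash yields 128TB usable under three-way replication and a little over 300TB under 8+2 erasure coding, at the cost of extra computation and more complex rebuilds.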

The company would benefit from a close partnership with the likes of Dell, Cisco or even HPE (although HPE has other partners in this area).  This would drive the go-to-market (GTM) effort and expose the technology to a wider audience.

However, all that being said, I think StorPool makes for a very interesting solution for the right customer. 

Disclaimer: I was personally invited to attend Storage Field Day 18, with the event teams covering my travel and accommodation costs.  However, I was not compensated for my time.  I am not required to blog on any content; blog posts are not edited or reviewed by the presenters or the respective companies prior to publication. 

No reproduction without permission. Post #E760.