Intelligent computing and intelligent storage are the future. Intelligent computing is the ability to automatically adjust computing resources based on data and application needs, such as the acceleration of an analytics workload.
Intelligent storage ensures that applications can always access their data by transparently managing multiple disk tiers.
Intelligent computing and intelligent storage are both part of an approach called Intelligent Data Management (IDM).
The 41st IT Press Tour had the opportunity to meet with Albert Chen, founder of Kalista. Albert was formerly a program manager at Microsoft. While at Western Digital, he developed software for emerging storage devices and created one of the first storage system solutions for host-managed shingled magnetic recording (HM-SMR) drives. HM-SMR is an implementation in which the host is responsible for everything from managing data streams to read/write operations and zone management.
From Albert’s perspective, storage devices have compatibility and performance problems. New storage devices are not compatible with current applications and operating systems because of technologies such as shingled magnetic recording (SMR), energy-assisted magnetic recording, multi-actuator designs, variable capacity, and large sectors.
SMR is a technology that allows vendors to eke out higher storage densities, netting more TB of capacity on the same number of platters, or the same number of TB on fewer platters. As such, the drive is cheaper per TB. However, it significantly slows down long sequential writes, getting bogged down during the large writes typical of big data ingestion.
Another problem is that hard drive capacity is increasing, but performance is not. Maintaining consistent performance is hard due to declining IO density (IOPS/GB), contention, and long-tail latency. This leads to a higher TCO. Money will solve the problem if you want to over-provision everything; however, this is a short-sighted and costly business decision.
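To make the IO density point concrete, here is a minimal arithmetic sketch. The figures are hypothetical and are not Kalista's: it assumes a drive that delivers the same 200 random IOPS no matter how large it gets, and shows how IOPS/GB falls as capacity rises.

```python
def io_density(iops: float, capacity_gb: float) -> float:
    """Return IO density in IOPS per GB."""
    if capacity_gb <= 0:
        raise ValueError("capacity_gb must be positive")
    return iops / capacity_gb


# Assumption for illustration only: IOPS stays flat while capacity grows.
ASSUMED_IOPS = 200

for capacity_tb in (4, 10, 20):
    density = io_density(ASSUMED_IOPS, capacity_tb * 1000)
    print(f"{capacity_tb:>2} TB drive: {density:.3f} IOPS/GB")
```

Under these assumptions, a fivefold increase in capacity cuts IO density to one fifth, which is why simply buying bigger drives does not keep performance consistent.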
Kalista’s Phalanx Storage System enables applications to use next-gen storage devices without modification, through device-friendly commands that enable consistent, predictable performance at every scale.
Kalista sits in between data and applications to take advantage of system growth and efficiency.
How It Works
The log-structured data layout supports SMR natively and minimizes the number of seeks and the amount of contention, thereby increasing IO density. It evenly distributes wear across available capacity, preventing hotspots and reducing tail latency. SMR provides 30% more capacity because its tracks partially overlap and are therefore packed closer together.
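The general idea of a log-structured layout can be sketched as follows. This is a toy model and not Kalista's implementation: the class `LogStructuredStore`, its zone sizes, and its in-memory index are all hypothetical. It shows how writes to arbitrary logical blocks are always appended at the current write pointer, which is the sequential pattern SMR zones require.

```python
from dataclasses import dataclass, field


@dataclass
class LogStructuredStore:
    """Toy log-structured layout over append-only zones (hypothetical model)."""
    zone_count: int
    blocks_per_zone: int
    zones: list = field(init=False)
    write_zone: int = field(default=0, init=False)
    # Maps a logical block number to its latest (zone, offset) location.
    index: dict = field(default_factory=dict, init=False)

    def __post_init__(self):
        self.zones = [[] for _ in range(self.zone_count)]

    def write(self, logical_block: int, data: bytes) -> tuple:
        # Move to the next zone once the current one is full.
        if len(self.zones[self.write_zone]) == self.blocks_per_zone:
            if self.write_zone + 1 == self.zone_count:
                # A real system would reclaim zones holding stale blocks.
                raise RuntimeError("log is full")
            self.write_zone += 1
        zone = self.zones[self.write_zone]
        zone.append(data)  # always append at the zone's write pointer
        location = (self.write_zone, len(zone) - 1)
        self.index[logical_block] = location  # any older copy is now stale
        return location

    def read(self, logical_block: int) -> bytes:
        zone, offset = self.index[logical_block]
        return self.zones[zone][offset]


store = LogStructuredStore(zone_count=2, blocks_per_zone=2)
# Scattered logical blocks still land at consecutive physical locations.
print([store.write(block, b"v1") for block in (907, 3, 512)])
# → [(0, 0), (0, 1), (1, 0)]
```

Overwriting a block appends a new copy and repoints the index rather than rewriting in place, so a production design also needs a way to reclaim stale blocks, which this sketch leaves out.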
Kalista evenly distributes workloads across available devices (for system-wide wear leveling). It supports variable capacity devices natively and reduces read contention. As more devices are added, increased capacity reduces contention and increases concurrency.
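One simple way to spread writes evenly across devices of different sizes is to always pick the device with the lowest fill fraction. The sketch below illustrates that policy only; it is an assumption for illustration, not a description of how Phalanx places data, and the function name `place_writes` and the device names are hypothetical.

```python
def place_writes(capacities_gb: dict, write_sizes_gb: list):
    """Assign each write to the device with the lowest fill fraction."""
    used_gb = {name: 0 for name in capacities_gb}
    placements = []
    for size in write_sizes_gb:
        # Only consider devices that still have room for this write.
        candidates = [
            name for name in capacities_gb
            if used_gb[name] + size <= capacities_gb[name]
        ]
        if not candidates:
            raise RuntimeError("no device has room for this write")
        # Lowest used/capacity ratio wins; ties break by device name.
        target = min(
            candidates,
            key=lambda name: (used_gb[name] / capacities_gb[name], name),
        )
        used_gb[target] += size
        placements.append(target)
    return used_gb, placements


# Hypothetical devices: one 10 GB and one 20 GB, fifteen 1 GB writes.
used, _ = place_writes({"sda": 10, "sdb": 20}, [1] * 15)
print(used)  # → {'sda': 5, 'sdb': 10}
```

Both devices end up 50% full, so the larger device absorbs proportionally more of the workload and neither becomes a hotspot.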
Users are able to scale performance with capacity. Phalanx keeps devices level-headed. Long-tail latency with a legacy stack (e.g., 260,000 usec) is curtailed with Phalanx (e.g., 50,000 usec): 4.8x lower latency at the 99.99th percentile.
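For readers unfamiliar with tail-latency figures, the sketch below shows how a 99.99th-percentile value can be computed with the nearest-rank method. The sample data is invented for illustration and is not a Kalista measurement; `percentile_nearest_rank` is a hypothetical helper name.

```python
import math
import statistics


def percentile_nearest_rank(samples, pct: float):
    """Return the pct-th percentile of samples using the nearest-rank method."""
    if not samples:
        raise ValueError("samples must not be empty")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in (0, 100]")
    ordered = sorted(samples)
    # Rounding guards against floating-point noise before taking the ceiling.
    rank = math.ceil(round(pct * len(ordered) / 100, 6))
    return ordered[rank - 1]


# Invented data: 100,000 IOs, of which only 20 are very slow.
latencies_usec = [500] * 99_980 + [200_000] * 20

print("median:", percentile_nearest_rank(latencies_usec, 50), "usec")
print("mean:  ", round(statistics.mean(latencies_usec), 1), "usec")
print("p99.99:", percentile_nearest_rank(latencies_usec, 99.99), "usec")
```

The median and mean barely register the slow IOs, while the 99.99th percentile exposes them, which is why tail latency is reported separately from averages.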
It supports all existing interfaces – user applications, access interfaces, and storage devices.
Current SMR compatibility solutions have dependencies and limitations. Phalanx is kernel agnostic and requires no application changes.
One Command Line to SMR:
- docker run --privileged -v /mnt:/mnt:rshared -v /md:/md:shared phalanx -d /dev/sdc -bm
Phalanx is easy to deploy, simple to operate, and runs everywhere. It can be deployed and operated using existing orchestration and provisioning frameworks, including Kubernetes and vSphere. It is designed to fit within existing workflows and environments.
There are no storage silos. Phalanx supports both conventional and zoned devices. Users do not have to worry about mixing and matching devices.
Phalanx enables HM-SMR devices to work with applications and environments beyond cold archival storage. Kalista can expand the market for HM-SMR by lowering the barrier to entry for SMR, which has typically been limited to hyperscalers with cold data.
Phalanx + SMR is going beyond blob storage with the ability to run Docker, MongoDB, GitLab, Kubernetes, and Apache. MongoDB’s performance with Phalanx versus an SMR drive is improved by up to 10%. In MinIO S3-bench, average write IOPS are up 60% while average read IOPS are up 35%.
Phalanx + SMR + compute enables innovation with Ceph, Hadoop, Java, and C. Software compilation is 3.7 seconds faster for NGINX, 45 seconds faster for Linux, and 4 minutes faster for Hadoop. Other results include 16X more IOPS with fio random write; 19% faster throughput with Hadoop TestDFSIO read; 10X better performance consistency with the Ceph RADOS write bench; and 58% higher IOPS with the Ceph RADOS write bench.
Phalanx + SMR + crypto = HODLing your XCH/FIL. Phalanx provides all SMR all the time. A single storage system for both plotting and farming cryptocurrencies such as Chia (XCH) and Filecoin (FIL) results in 10% to 30% more plots/sectors compared to a conventional system. Users are able to do more with less. Using fewer drives means fewer component failures and lower maintenance costs. Fewer drives also reduce the amount of space and energy consumed during farming/mining.
Intelligent storage will take manual intervention out of everything including:
- Self-optimizing data placement and IO prioritization.
- Proactive management of device health and performance.
- Data services that automatically index and tag stored data.