Introducing Cloud Block Store

Panzura : May 4, 2021

What happens when you point some of the smartest IT minds in your business at innovation, instead of operation?

It’s something we often ask our customers. Recently, we asked ourselves. The answer – this time around – is a product that is completely different, and yet completely in sync with Panzura’s focus on empowering organizations to do amazing things with unstructured data.

Cloud Block Store is a spin-off of the lightning-fast technology that powers Panzura Data Services, which allows billions of files from Panzura and other file shares to be ingested, searched, analyzed, audited and monitored in near real time. It provides a hyper-converged, cloud-native storage platform that can be deployed on demand and scaled as needed.

Available in the Google Cloud Platform (GCP) Marketplace for Google Kubernetes Engine (GKE) clusters, Panzura Cloud Block Store (CBS) is a Kubernetes web-scale persistent storage platform for containerized applications. Cloud Block Store can scale up when you need more resources and scale down when fewer are required. With no scaling limits, CBS presents a scalable distributed read cache to containerized applications, optimizing Kubernetes cluster resources for high performance workloads.

Why is Cloud Block Store important for Kubernetes?

Containers are now used in organizations from small startups to large enterprises, and those organizations need different levels of data persistence for their containerized applications. Many Kubernetes applications are designed to use volumes that follow the container at the pod level, meaning the volumes are created and deleted along with the pods. These applications are known as stateless. Many other container applications require volume storage that retains the information written while the container is in use, so that it remains available if the container or pod is deleted. When the pod or container restarts, it must resolve any data changes that have occurred.

In other words, the volumes behave more like a database. These applications are known as stateful.

Panzura Cloud Block Store provides the persistent storage volumes required for both stateful and stateless applications by creating a scalable distributed read cache. This read cache is a cluster of scalable GKE nodes, built for high data availability and easy integration with Kubernetes applications.

Optimization of Cloud Block Store focuses on performance and reliability. To accelerate ingesting blocks and to guard against their loss in the case of a node failure, Cloud Block Store deploys multiple shared, redundant cache services for easy access to data. This design supports both goals: if a read cache node fails, another node can still access the data from the cache services. All data is eventually stored in Google Cloud Storage for long-term durability.

Cloud Block Store is implemented as a collection of pods and containers managed by Kubernetes. Kubernetes deploys the optimal number of each container type to maintain the desired level of service.

Cloud Block Store scales out automatically when read cache hit and miss figures cross an upper threshold. It scales down when those figures fall below a lower limit, or when the bandwidth rate drops to 1 MB/sec or lower, meaning fewer resources are being used.
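
The scaling rules can be sketched as a small decision function. This is a minimal illustration, not the CBS implementation: it assumes the signal is the cache miss ratio, and the two ratio thresholds and every name in the code are hypothetical. Only the 1 MB/sec bandwidth figure comes from the description above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScalingPolicy:
    # Hypothetical thresholds: the real CBS values are not published here.
    scale_out_miss_ratio: float = 0.30
    scale_down_miss_ratio: float = 0.05
    # Idle bandwidth limit quoted in the text (1 MB/sec or lower).
    idle_bandwidth_mb_s: float = 1.0


def scaling_decision(hits: int, misses: int, bandwidth_mb_s: float,
                     policy: ScalingPolicy = ScalingPolicy()) -> str:
    """Return 'scale_out', 'scale_down' or 'hold' for one sampling window."""
    total = hits + misses
    miss_ratio = misses / total if total else 0.0

    # Too many misses: the cache is too small for the working set.
    if miss_ratio > policy.scale_out_miss_ratio:
        return "scale_out"
    # Very few misses, or almost no traffic: capacity is going unused.
    if (miss_ratio < policy.scale_down_miss_ratio
            or bandwidth_mb_s <= policy.idle_bandwidth_mb_s):
        return "scale_down"
    return "hold"


print(scaling_decision(hits=50, misses=50, bandwidth_mb_s=100.0))
print(scaling_decision(hits=80, misses=20, bandwidth_mb_s=0.5))
```

The first call reports a scale-out (half of all reads miss the cache), and the second reports a scale-down because bandwidth is below the idle limit even though the miss ratio sits between the two thresholds.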

Scaling down lowers the cost of Cloud Block Store capacity usage, as customers pay only $0.0003 per gigabyte per day of Google Cloud Storage. Google Kubernetes Engine cluster costs also apply and are separate from CBS. Below are some key highlights of the CBS features and architecture.

Features:

- Thin Provisioned, global deduplication & compression
- AES256 Encryption
- CSI Driver Compliant
- Unlimited Mountable Snapshots
- Simple to implement via CLI and Automation
- Real-time stats and reporting
- Support for Intel Optane in AppDirect mode (16TB cache)
- High Performance POSIX Volume Interface

Web-scale Architecture:

- Up to 1PB Volume Namespace
- 100TB or greater Distributed Read Cache on local & persistent SSD
- Auto Scale out and down, based on Read Cache hits (and cost)
- Backup, Archival and Analytics workloads I/O Performance Optimized
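
To put the capacity pricing in concrete terms, the sketch below multiplies the $0.0003 per gigabyte/day rate quoted earlier by a stored capacity and a number of days. The function name, the 30-day period and the 10,000 GB figure are illustrative assumptions, and GKE cluster costs are not included.

```python
# Rate quoted in the article: USD per gigabyte per day of Google Cloud Storage.
RATE_USD_PER_GB_DAY = 0.0003


def storage_cost_usd(stored_gb: float, days: int = 30) -> float:
    """Estimate the CBS capacity charge (GKE cluster costs are separate)."""
    return stored_gb * days * RATE_USD_PER_GB_DAY


# Assumed example: 10,000 GB held for 30 days.
print(f"${storage_cost_usd(10_000):.2f}")  # prints $90.00
```

Because the charge is proportional to stored gigabytes, deduplication and compression reduce it directly.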

Figure: Basic flow of data from client containers to Cloud Block Store through Kubernetes

Figure: Sample diagram of Cloud Block Store basic architecture

The Data Processing Layer (DPL) accepts and processes client requests from a block device or an S3 service. Global deduplication, compression and conversion to a 4K block size are done in the DPL, which makes data storage efficient and data operations performant.
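
The idea behind block-level deduplication and compression can be shown in a few lines. This is a minimal sketch rather than the DPL itself: SHA-256 fingerprints, zlib compression, an in-memory dictionary and the function names are all assumptions made for illustration. Only the 4K block size comes from the text.

```python
import hashlib
import zlib
from typing import Dict, List

BLOCK_SIZE = 4096  # 4K block size, as stated in the text


def ingest(data: bytes, store: Dict[bytes, bytes]) -> List[bytes]:
    """Split data into 4K blocks and store each unique block once, compressed.

    Returns the ordered list of block fingerprints needed to rebuild the data.
    """
    recipe = []
    for offset in range(0, len(data), BLOCK_SIZE):
        block = data[offset:offset + BLOCK_SIZE]
        fingerprint = hashlib.sha256(block).digest()
        if fingerprint not in store:  # deduplication: skip blocks already held
            store[fingerprint] = zlib.compress(block)
        recipe.append(fingerprint)
    return recipe


def read_back(recipe: List[bytes], store: Dict[bytes, bytes]) -> bytes:
    """Rebuild the original data from its block fingerprints."""
    return b"".join(zlib.decompress(store[fp]) for fp in recipe)


store: Dict[bytes, bytes] = {}
payload = b"A" * 8192 + b"B" * 4096  # three blocks, two of them identical
recipe = ingest(payload, store)
print(len(recipe), len(store))  # prints 3 2
assert read_back(recipe, store) == payload
```

Three blocks are written but only two are stored, because the two identical blocks share one fingerprint.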

Cloud Block Store is presented to client applications as a distributed read cache, through a high performance POSIX interface. The journal cache service guarantees that data not located in the distributed read cache is available in the journal cache, while metadata from the other cache services is located in the Cassandra backend service. All cache services work together to keep data available from any node in the GKE cluster for client requests.

Key highlights

- Cloud Block Store is a scale-out block-based cache that is shared by all compute instances within a GKE cluster.

- Kubernetes applications that employ a block device interface can benefit from Cloud Block Store.

- Cloud Block Store is presented to Kubernetes applications as a persistent volume that uses a high performance POSIX interface and appears as a mountable directory in a container or pod.

- The Distributed Read Cache service scales by adding nodes with additional local persistent SSDs to grow capacity.

- All blocks ingested into Cloud Block Store are eventually uploaded to the S3 backend cloud for durable storage.

- Only the most recently and frequently accessed data is kept in the Cloud Block Store distributed read cache (on the nodes’ local SSDs).

- When a user requests a block, any node in the distributed cache can respond, no matter which node stored the block in the cache. This is possible because blocks are globally deduplicated at ingest. Any block can be read at any time.

- If a block is not cached, it is retrieved from S3 (GCP storage) and stored back into the distributed read cache.
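
The read path in the last few highlights follows a read-through cache pattern, sketched below. This is an illustration under stated assumptions: least-recently-used eviction stands in for the "most recently and frequently accessed" behavior (the actual policy is not named), a plain dictionary stands in for the Google Cloud Storage backend, and the class and method names are hypothetical.

```python
from collections import OrderedDict
from typing import Callable


class ReadThroughCache:
    """Serve blocks from a bounded cache, falling back to a durable backend."""

    def __init__(self, capacity: int, fetch_from_backend: Callable[[str], bytes]):
        self.capacity = capacity
        self.fetch_from_backend = fetch_from_backend
        self.blocks = OrderedDict()  # block id -> data, oldest entry first
        self.hits = 0
        self.misses = 0

    def read(self, block_id: str) -> bytes:
        if block_id in self.blocks:
            self.hits += 1
            self.blocks.move_to_end(block_id)  # mark as most recently used
            return self.blocks[block_id]

        # Cache miss: fetch from the backend and store the block back.
        self.misses += 1
        data = self.fetch_from_backend(block_id)
        self.blocks[block_id] = data
        if len(self.blocks) > self.capacity:
            self.blocks.popitem(last=False)  # evict the least recently used
        return data


backend = {"a": b"1", "b": b"2", "c": b"3"}  # stand-in for object storage
cache = ReadThroughCache(capacity=2, fetch_from_backend=backend.__getitem__)
for block_id in ["a", "b", "a", "c"]:
    cache.read(block_id)
print(list(cache.blocks), cache.hits, cache.misses)  # prints ['a', 'c'] 1 3
```

Block "b" is evicted when "c" arrives, because "a" was read more recently. A later read of "b" would be a miss, served from the backend again.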

In summary, Panzura Cloud Block Store is at the leading edge of persistent container storage for cloud-native architectures and enterprise containerized applications. Using a variety of novel approaches to keep costs low, Cloud Block Store provides a clear ROI by allowing you to leverage high performance persistent Kubernetes container storage. Installation is easy from the Google Cloud Platform Marketplace. A Google Kubernetes Engine cluster is a prerequisite. The documentation available at the marketplace site includes recommended cluster specifications and sample gcloud commands for installation, and covers the CBS management interface used to execute API calls for valuable insights from Cloud Block Store.