TECHNOLOGY DEEP DIVE
Global Deduplication & Compression
Reduce your data footprint by up to 80% and slow down your data growth with the most granular global deduplication in the industry.

(Massively) Reduce Storage Costs
Eliminate redundant data across all sites to drastically reduce the amount of storage you need.
Reduce Bandwidth Demands
Send only unique data blocks to the object store, minimizing network traffic while boosting speed.

Optimize Data Operations
Move the smallest amount of data over the shortest distance for immediate global file sync and data recovery.
File data typically contains a LOT of duplication. Users make endless copies, so you end up storing similar or even identical files multiple times. Backup and disaster recovery processes make even more copies. That creates storage bloat: a win for your storage vendor, and a painfully expensive (and continually growing) storage bill for you.
The CloudFS hybrid cloud file platform takes a radically different approach. It's laser-focused on reducing the amount of data stored by squeezing out every last redundant byte of data, everywhere. Even at enterprise scale, that can be up to 80% of the total volume of data you're currently paying to store and back up.
Global Scope: Other deduplication solutions operate only within a single site or storage volume. That's better than nothing but it's far from as efficient as it could be. With CloudFS, deduplication is global. This means it identifies and eliminates duplicate data across all locations and in the central object storage.
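To make the difference in scope concrete, here's a toy Python comparison. It's a sketch under simplifying assumptions (SHA-256 fingerprints, in-memory tables, one block per file), not CloudFS internals: with per-site deduplication each location keeps its own table, so a block that exists at two sites is stored twice; with one global table, it's stored once.

    import hashlib

    def dedupe(blocks, table):
        """Store a block only if its fingerprint isn't already in the given table."""
        stored = 0
        for block in blocks:
            fp = hashlib.sha256(block).hexdigest()
            if fp not in table:
                table[fp] = True
                stored += 1
        return stored

    site_a = [b"Q3-report", b"logo"]
    site_b = [b"Q3-report", b"video"]   # the same report also lives at site B

    # Per-site deduplication: separate tables, so the shared block is stored at both sites.
    per_site_total = dedupe(site_a, {}) + dedupe(site_b, {})                      # 4 blocks stored

    # Global deduplication: one table spans every site, so the shared block is stored once.
    global_table = {}
    global_total = dedupe(site_a, global_table) + dedupe(site_b, global_table)    # 3 blocks stored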
Inline Deduplication: We like to say that only immediate is fast enough, and that applies to everything we do with data. CloudFS performs deduplication inline — as data is being written and changed, it's compared to data that already exists. This is more efficient than post-process deduplication, which processes data after it's been written, because it prevents redundant data from ever being stored in the first place.
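To illustrate the timing difference, the sketch below (hypothetical, not CloudFS code) contrasts an inline write path, which checks a block's fingerprint before anything is persisted, with a post-process approach that lands everything on storage first and reclaims duplicates later.

    import hashlib

    reference_table = {}   # fingerprint -> block location (simplified, in-memory stand-in)
    block_store = {}       # location -> block bytes

    def write_inline(block: bytes) -> str:
        """Inline dedup: compare the fingerprint *before* anything is stored."""
        fp = hashlib.sha256(block).hexdigest()
        if fp not in reference_table:
            location = f"block/{fp}"
            block_store[location] = block        # only unique data is ever written
            reference_table[fp] = location
        return reference_table[fp]               # duplicates just get a pointer

    def write_post_process(block: bytes, staging: list) -> None:
        """Post-process dedup: store first, then scan and reclaim duplicates later."""
        staging.append(block)                    # redundant copies hit storage first
        # ...a background job would later hash, compare, and delete the duplicates.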
Block-Level Deduplication: CloudFS doesn't just look for duplicate files. It translates files into data blocks that are a tiny 128 KB in size (that's as small as it gets) and creates metadata pointers that record which blocks make up the file at that moment. It then compares these individual blocks. If an identical block already exists anywhere within the global file system, it creates and stores a metadata pointer to that block. Instead of storing redundant megabytes, you're adding lightweight metadata.
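As a rough illustration of how block-level pointers work, the hypothetical sketch below chunks a file into 128 KB blocks, fingerprints each one, and records the file as a list of pointers. SHA-256 and the in-memory dictionaries are assumptions made for the example, not CloudFS implementation details.

    import hashlib

    BLOCK_SIZE = 128 * 1024  # 128 KB blocks, per the granularity described above

    def chunk_file(data: bytes):
        """Split a file's bytes into 128 KB blocks."""
        for offset in range(0, len(data), BLOCK_SIZE):
            yield data[offset:offset + BLOCK_SIZE]

    def build_manifest(data: bytes, reference_table: dict, block_store: dict) -> list:
        """Return the file's metadata pointers, storing only blocks never seen before."""
        manifest = []
        for block in chunk_file(data):
            fp = hashlib.sha256(block).hexdigest()
            if fp not in reference_table:            # not seen anywhere in the file system
                reference_table[fp] = f"block/{fp}"
                block_store[reference_table[fp]] = block
            manifest.append(fp)                      # lightweight pointer, not the data itself
        return manifest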
Shared Metadata and Deduplication Reference Table: We do really smart things with metadata, and this is the piece that lets you squeeze out every last redundant byte across your entire file system. Our deduplication reference table, which tracks unique data blocks and their locations, is embedded in the metadata that is instantly shared among all Panzura nodes. This means that every location has an up-to-date view of all unique data blocks stored globally.
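Conceptually, that reference table behaves like a dictionary of fingerprints to block locations that every node converges on. The sketch below simulates the effect with plain Python dictionaries and made-up fingerprints; the actual metadata sharing mechanism is, of course, more involved than a dictionary merge.

    # Hypothetical per-node views of the global deduplication reference table.
    # Each entry maps a block fingerprint to where that block lives in the object store.
    node_a_table = {"3f7a": "blocks/3f7a"}
    node_b_table = {"91cc": "blocks/91cc"}

    def merge_reference_tables(local: dict, incoming: dict) -> dict:
        """Simulate the metadata sync: after a merge, a node sees every unique block."""
        merged = dict(local)
        merged.update(incoming)    # fingerprints first recorded elsewhere become visible locally
        return merged

    # After the (simulated) sync, node A knows about the block first written at node B,
    # so it will never re-upload it.
    node_a_table = merge_reference_tables(node_a_table, node_b_table)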
Object Store as the Single Source of Truth: Panzura consolidates all data into a single, authoritative dataset in your chosen object storage: cloud services such as AWS S3, Azure Blob Storage, or Google Cloud Storage, or on-premises object storage such as Nutanix Objects or Cloudian. This object storage then acts as the central repository for the deduplicated data.
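One way to picture this is a content-addressed bucket, where a block's fingerprint doubles as its object key, so any given block is uploaded exactly once. The boto3 sketch below assumes a hypothetical S3 bucket named cloudfs-data and a blocks/<fingerprint> key scheme; it illustrates the idea, not CloudFS's actual object layout.

    import hashlib
    import boto3

    s3 = boto3.client("s3")
    BUCKET = "cloudfs-data"   # hypothetical bucket name for the example

    def upload_if_unique(block: bytes, reference_table: dict) -> str:
        """Content-addressed write: the block's fingerprint doubles as its object key."""
        fp = hashlib.sha256(block).hexdigest()
        if fp not in reference_table:
            s3.put_object(Bucket=BUCKET, Key=f"blocks/{fp}", Body=block)
            reference_table[fp] = f"blocks/{fp}"
        return reference_table[fp]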
The deduplication data flow
Segmentation: When a file is created or modified, CloudFS breaks it into variable-length data blocks as small as 128 KB (the full flow is sketched after these steps).
Hashing and Comparison: Each data block is put through a hash algorithm to generate a unique fingerprint. That's compared against the global deduplication reference table.
Decision: If the block is unique, it's compressed, encrypted, and sent to the object store. The global deduplication reference table is updated with information about this new block. If it's a duplicate, CloudFS simply stores a metadata pointer to the existing block.
Global Updates: The deduplication reference table is instantly shared as part of the metadata across all Panzura nodes, so every location immediately benefits from data deduplicated by any other node in the network. That's why only unique data ever makes it to the object store.
Local Caching: Intelligent local caching at the edge provides local-feeling performance for users, even though the authoritative data resides in the object store. The cache is also aware of the global deduplication, further optimizing performance and reducing cloud egress costs by serving cached, deduplicated data locally.
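Putting the five steps together, here is a compact, hypothetical end-to-end sketch in Python. Compression uses zlib, encryption is omitted for brevity, and the reference table, object store, and cache are in-memory stand-ins; none of this reflects CloudFS's real data structures.

    import hashlib
    import zlib

    BLOCK_SIZE = 128 * 1024

    reference_table = {}    # fingerprint -> object key (shared via metadata in CloudFS)
    object_store = {}       # object key -> compressed block (the single source of truth)
    local_cache = {}        # fingerprint -> raw block bytes (node-local edge cache)
    file_metadata = {}      # path -> list of fingerprints (the file as pointers)

    def write_file(path: str, data: bytes) -> None:
        pointers = []
        for offset in range(0, len(data), BLOCK_SIZE):              # 1. segmentation
            block = data[offset:offset + BLOCK_SIZE]
            fp = hashlib.sha256(block).hexdigest()                  # 2. hashing and comparison
            if fp not in reference_table:                           # 3. decision: unique block
                key = f"blocks/{fp}"
                object_store[key] = zlib.compress(block)            #    compress (encryption omitted)
                reference_table[fp] = key                           # 4. update the shared table
            local_cache[fp] = block                                 # 5. keep a hot copy at the edge
            pointers.append(fp)                                     #    duplicates become pointers only
        file_metadata[path] = pointers

    def read_file(path: str) -> bytes:
        blocks = []
        for fp in file_metadata[path]:
            if fp in local_cache:                                   # served locally, no egress
                blocks.append(local_cache[fp])
            else:                                                   # fetch, decompress, then cache
                block = zlib.decompress(object_store[reference_table[fp]])
                local_cache[fp] = block
                blocks.append(block)
        return b"".join(blocks)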
Why Global Deduplication and Compression Matter
Leading Video Game Developer
Problem
- 1.5 PB of build files across 30 offices
- Spending millions of dollars on enterprise NAS systems, mirroring, and backups
Results
- Consolidated file data down to 45 TB of storage (99% reduction)
- Cloud economics enabled them to pay $4000 a month for all tiers of storage
Accelerate digital transformation with a powerful hybrid cloud file platform that:
Modernizes Storage Architecture
Allows organizations to significantly reduce storage costs by consolidating dispersed file data into a unified, deduplicated, compressed, and secured data set in the cloud or on premises.
Unlocks Organizational Productivity
Enables file data to look and feel local to users and processes everywhere. It uniquely empowers users to harness their collective skills by working collaboratively, regardless of location.

Delivers Seamless Cloud File Services
Turns public or private cloud storage into a high-performance, immutable global file system that flawlessly delivers file data to people, processes, and AI, and makes it resilient to damage.
Reduces Operational Complexity
Lets IT teams turn their attention to innovation with less file data and infrastructure to manage and protect, fewer storage refreshes to plan for, and less worry about recovering from file damage.