What is Metadata?

Metadata is the card catalog for your storage. It contains key information about each file: Where is it? When was it created? Who created it? When was it last modified, who modified it and who has access to it? If it’s deduplicated or compressed, what reference blocks need to be accessed? Like the physical card catalogs that we once used to find books in a library, metadata makes it much easier to find and work with large sets of data.
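To make the card catalog idea concrete, here is a minimal Python sketch that reads the metadata an ordinary operating system keeps for a single file. It uses only the standard library. The `describe` helper and the throwaway demo file are illustrative assumptions, not part of Panzura’s file system, and deduplication or compression references are not exposed at this level.

```python
import os
import tempfile
from datetime import datetime, timezone

def describe(path):
    """Return a few of the metadata fields the OS keeps about a file (hypothetical helper)."""
    st = os.stat(path)
    return {
        "path": os.path.abspath(path),   # where is it?
        "size_bytes": st.st_size,        # how big is it?
        "modified": datetime.fromtimestamp(st.st_mtime, tz=timezone.utc),  # last modified
        "owner_uid": st.st_uid,          # numeric owner id (0 on Windows)
        "mode": oct(st.st_mode & 0o777), # permission bits: who has access
    }

# Demo on a throwaway file so the sketch is self-contained.
with tempfile.NamedTemporaryFile(delete=False) as f:
    f.write(b"hello")
    demo_path = f.name

info = describe(demo_path)
os.remove(demo_path)
print(info)
```

None of these fields required opening the file’s contents; they all came from the metadata record.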

But data no longer lives in just one place. It’s shared and accessed by users in offices all around the globe. Traditional file systems were designed to manage metadata locally. For a distributed file system to work effectively across thousands of miles, metadata has to meet three requirements:

  • It has to be shared globally.
  • It has to be extremely fast.
  • It has to have a single authoritative source.

How Panzura shares and manages metadata is a core capability of our file system.

Global Metadata

Within an office or datacenter, a traditional NAS or SAN typically handles file metadata and keeps it consistent across a cluster of controllers. But what happens if a file needs to be shared with users in another office? Offices are effectively islands of storage and don’t share a common file system.

So the file is usually copied, but now there are two copies of the file, each with its own distinct metadata. You have to keep track of the different versions and reconcile the differences manually to avoid losing data.
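The divergence is easy to reproduce on any machine. The sketch below, with hypothetical file names standing in for two offices, copies a file and then edits the copy. From that point on the two files carry different metadata, and nothing in either file system records how they relate.

```python
import os
import shutil
import tempfile

workdir = tempfile.mkdtemp()
site_a = os.path.join(workdir, "plan_site_a.txt")  # hypothetical "office A" copy
site_b = os.path.join(workdir, "plan_site_b.txt")  # hypothetical "office B" copy

with open(site_a, "w") as f:
    f.write("v1")

# shutil.copy copies contents and permission bits, but not timestamps.
shutil.copy(site_a, site_b)

# Someone at site B keeps working on their copy.
with open(site_b, "a") as f:
    f.write(" + edits from site B")

def summary(path):
    """Size and modification time: two metadata fields that now disagree."""
    st = os.stat(path)
    return (st.st_size, st.st_mtime_ns)

diverged = summary(site_a) != summary(site_b)
shutil.rmtree(workdir)
print("copies diverged:", diverged)
```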

To support collaboration in a distributed organization, metadata has to be shared across sites. You need one source of truth about all your files. With a single file system like Panzura’s, there is no need to duplicate or mirror files between sites — there’s only one file system and one copy of the files.

For the Best User Experience, Keep Metadata in Flash

When you browse folders and files on your PC, the system is accessing file metadata. If you’re accessing a network folder, it’s often slightly slower – but not necessarily because your network is slow. Metadata operations on a file server are a stream of random requests from different users, so your experience depends on the system’s ability to respond to thousands of small random read requests.
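As a rough illustration, here is what a file browser does to draw one folder listing, sketched with Python’s standard library. The `list_folder` helper and the demo files are assumptions made for the example. Each entry shown costs a small metadata lookup, and no file contents are read.

```python
import os
import tempfile

def list_folder(path):
    """Collect (name, size, mtime) for every entry, as a file browser would."""
    rows = []
    with os.scandir(path) as entries:
        for entry in entries:
            st = entry.stat()  # a metadata lookup, not a read of the file's data
            rows.append((entry.name, st.st_size, st.st_mtime))
    return sorted(rows)

# Build a small throwaway folder so the sketch runs anywhere.
with tempfile.TemporaryDirectory() as demo_dir:
    for name in ("a.txt", "b.txt", "c.txt"):
        with open(os.path.join(demo_dir, name), "w") as f:
            f.write(name)
    listing = list_folder(demo_dir)

for name, size, _mtime in listing:
    print(name, size)
```

Multiply that by every user browsing every folder and the file server sees exactly the stream of small random requests described above.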

In a distributed cloud file system, there are two aspects to speed: local access speed and global synchronization speed.

Fast local access

To make sure local metadata access is fast, the system has to be able to handle random IOPS workloads extremely quickly. It also needs to respond efficiently to sequential read requests when users open files. High IOPS and low latency are essential to making a file system work for you.

Solid State Drives (SSDs), or flash storage, are about 100 times faster than Hard Disk Drives (HDDs), and provide up to 400 times more IOPS. So even on a traditional array, it makes sense to put metadata on flash, where user requests will be served very quickly. All of the major enterprise storage solutions now use flash for metadata and caching.
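A back-of-the-envelope calculation shows what that IOPS gap means for metadata. The HDD figure of 200 IOPS and the burst of 10,000 requests below are assumptions chosen for illustration, and the 400x multiplier is the upper bound quoted above. The model also ignores queueing, caching and parallelism, so treat the result as an order-of-magnitude sketch rather than a benchmark.

```python
def seconds_to_serve(requests, iops):
    """Time to serve a burst of small random reads at a given IOPS rate."""
    return requests / iops

HDD_IOPS = 200               # assumed figure for a single hard disk
FLASH_IOPS = HDD_IOPS * 400  # "up to 400 times more IOPS", taken at its upper bound
REQUESTS = 10_000            # assumed burst of metadata lookups

hdd_seconds = seconds_to_serve(REQUESTS, HDD_IOPS)
flash_seconds = seconds_to_serve(REQUESTS, FLASH_IOPS)

print(f"HDD:   {hdd_seconds} s")    # 50.0 s
print(f"Flash: {flash_seconds} s")  # 0.125 s
```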

Fast global synchronization

Flash storage speeds up the sharing of global metadata updates as well. If a user changes a file in Hong Kong, the metadata needs to update immediately in New York so that users there see the change. With metadata on flash, each site can apply and serve those updates much faster.
