Summit Live Blog: Building Exascale Active Archives with Red Hat Ceph Storage

June 29, 2016

Software-defined storage (SDS) is a leading technology in our industry, with more and more platforms and enterprises using it to store unstructured data. Today, object storage is a primary workload on SDS as organizations look to implement active archives for enhanced access and long-term storage. Red Hat's Steve Bohac and Neil Levine covered the bright future for SDS: object storage, active archives, and how Red Hat Ceph provides a solid foundation for all of your storage needs.

First, some vocabulary:

An object is a piece of data together with its associated metadata. This is not a file!

Object storage does not include a hierarchical naming system like file systems do. Instead, objects are stored in a flat structure and grouped into pools. Access is provided only by a specific object ID.
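To make the flat, pool-based model concrete, here is a minimal in-memory sketch. It is an illustration only, not Ceph's API: the class name `FlatObjectStore`, its `put` and `get` methods, and the pool name `videos` are all hypothetical.

```python
import uuid


class FlatObjectStore:
    """Toy in-memory model of a flat, pool-based object store (not Ceph's API)."""

    def __init__(self):
        # pool name -> {object ID -> (data, metadata)}; there are no directories
        self._pools = {}

    def put(self, pool, data, metadata=None):
        """Store data plus metadata in a pool and return the new object ID."""
        object_id = uuid.uuid4().hex
        self._pools.setdefault(pool, {})[object_id] = (data, dict(metadata or {}))
        return object_id

    def get(self, pool, object_id):
        """Fetch (data, metadata) by exact object ID; raises KeyError if absent."""
        return self._pools[pool][object_id]


store = FlatObjectStore()
oid = store.put("videos", b"raw bytes", {"content-type": "video/mp4"})
data, meta = store.get("videos", oid)
```

Note that there is no path to walk: the only way back to the data is the pool name plus the ID returned by `put`.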

An active archive is an archive that is:

always online
write once, read infrequently, never modified
heterogeneous media types
unstructured data
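The "write once, never modified" property in the list above can be sketched as a store that rejects any second write to the same object ID. This is a hypothetical illustration (the `WriteOnceArchive` class and the ID `report-2016` are made up for the example), not how Ceph enforces immutability.

```python
class WriteOnceArchive:
    """Toy archive: an object may be written once and is never modified."""

    def __init__(self):
        self._objects = {}

    def write(self, object_id, data):
        # Refuse to overwrite: archived objects are immutable
        if object_id in self._objects:
            raise PermissionError(f"object {object_id!r} is already archived")
        self._objects[object_id] = bytes(data)

    def read(self, object_id):
        # Reads may be infrequent, but the data stays online and available
        return self._objects[object_id]


archive = WriteOnceArchive()
archive.write("report-2016", b"final figures")
try:
    archive.write("report-2016", b"revised figures")
    overwritten = True
except PermissionError:
    overwritten = False
```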

Active Archives, Object Storage, and Red Hat Ceph

Active archives are driven by two trends: data capacity needs are continually growing, and the price per byte of storage is declining. This pushes enterprises to be aggressive about storing data, and active archives help organizations get the most utility out of it.

OK cool, how do I use active archives to enable my business? Mobile and web apps that perform at scale use active archives to serve dynamic content to millions of users, digital libraries can use them to store and retrieve multimedia content, and historical big data projects need someplace to store all that data. Object storage is a great mechanism for storing active archives because the functionality it provides (flat structure, metadata awareness, distributed scale) matches the requirements of an active archive system.

So you know you want active archives, and object storage is the way to go. How do we get there? Red Hat Ceph is well suited for storing large data sets because of its underlying RADOS layer, which scales the object store peer-to-peer. It also has an efficient access mechanism, the RADOS Gateway (RGW), and can run on a variety of hardware. All of this is built on an open source framework, which makes it a reliable storage solution that can scale to meet the needs of the most extreme workloads.

Best practices

The biggest hardware issue is the ratio of storage to CPU power in a single machine, which varies by workload: some deployments are IOPS-optimized and others are capacity-optimized. Neil covered several metrics the Ceph team has developed that can help you estimate your hardware needs from your requirements.

Also, make sure to use a load balancer (like HAProxy) to help distribute calls to the underlying RADOS layer. You can even distribute calls based on client metadata, for example, routing all of your priority callers to SSD storage and everyone else to tape-backed storage.
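The routing idea can be sketched in a few lines. This is a rough illustration under stated assumptions rather than an HAProxy configuration: the `MetadataRouter` class, the `priority` metadata key, the tier names, and the gateway host names are all hypothetical.

```python
import itertools


class MetadataRouter:
    """Toy router: pick a storage tier from client metadata, then round-robin."""

    def __init__(self, backends):
        # tier name -> endless round-robin iterator over that tier's gateways
        self._cycles = {
            tier: itertools.cycle(hosts) for tier, hosts in backends.items()
        }

    def route(self, request_metadata):
        # Priority callers go to the SSD tier; everyone else goes to tape
        tier = "ssd" if request_metadata.get("priority") == "high" else "tape"
        return next(self._cycles[tier])


router = MetadataRouter({
    "ssd": ["gw-ssd-1", "gw-ssd-2"],
    "tape": ["gw-tape-1"],
})
first = router.route({"priority": "high"})
second = router.route({"priority": "high"})
third = router.route({})
```

Here the two priority requests alternate between the SSD gateways, while the request without a priority flag lands on the tape-backed gateway.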

Roadmap

Look to the Ceph project to provide:

data tiering (move or maintain data in disparate sources like AWS, tape, RADOS, etc.)
metadata searching capabilities
enhanced security (soon to include cloud and server-side encryption on an object-by-object basis)

This was a great presentation, and I'm looking forward to hearing more from the Red Hat Ceph team.

Last updated: March 16, 2018
