# CEPH

## CEPH :: Overview

> In computing, Ceph (pronounced /ˈsɛf/ or /ˈkɛf/) is a free-software storage platform, implements object storage on a single distributed computer cluster, and provides interfaces for object-, block- and file-level storage. Ceph aims primarily for completely distributed operation without a single point of failure, scalable to the exabyte level, and freely available. [Wikipedia](<https://en.wikipedia.org/wiki/Ceph_(software)>)

* [CEPH Homepage](https://ceph.com/)
* [CEPH Git](https://git.ceph.com/?p=ceph.git;a=summary)
* [Bright Computing](http://www.brightcomputing.com/)
  * *Our software automates the process of building and managing Linux clusters in your data center and in the cloud*
* [The Definitive Guide: Ceph cluster On Raspberry Pi](http://bryanapperson.com/blog/the-definitive-guide-ceph-cluster-on-raspberry-pi/)

## CEPH :: Videos

* Architecture (good): [A Gentle Introduction to Ceph](https://www.youtube.com/watch?v=5xoYFGkFTkM)
* Implementation (good): [Ceph OSD Hardware - A Pragmatic Guide](https://www.youtube.com/watch?v=kc7GIHyk57M)

## CEPH :: Philosophy

* Open Source
* Community Focused

## CEPH :: Features

* Software Defined Storage Solution
* Distributed object storage
* Redundancy
  * Replication
  * Erasure coding
  * Cache Tiering
* Efficient scale out
* Built on commodity hardware
* Most popular choice of distributed storage for OpenStack: Nova (VM virtual disks), Glance (images), Cinder (block storage), RadosGW
* Copy-on-write cloning (Glance image to Nova/Cinder); see the sketch below
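
A quick sketch of that copy-on-write path using the `rbd`/`rados` Python bindings. The pool names (`images`, `vms`) and image names are hypothetical stand-ins for what Glance and Nova/Cinder would actually use; the base image is assumed to exist in format 2 with layering enabled:

```python
import rados
import rbd

# Connect with a standard client configuration and keyring.
cluster = rados.Rados(conffile='/etc/ceph/ceph.conf')
cluster.connect()
try:
    images = cluster.open_ioctx('images')  # hypothetical Glance pool
    vms = cluster.open_ioctx('vms')        # hypothetical Nova/Cinder pool

    # Clones hang off a protected snapshot of the parent image.
    with rbd.Image(images, 'base-image') as img:
        img.create_snap('clone-base')
        img.protect_snap('clone-base')

    # Copy-on-write clone: near-instant, shares unmodified blocks
    # with the parent instead of copying the whole image.
    rbd.RBD().clone(images, 'base-image', 'clone-base', vms, 'vm-disk-0001')

    images.close()
    vms.close()
finally:
    cluster.shutdown()
```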

## CEPH :: Storage Cluster

* Self healing
* Self managed
* No bottlenecks

## CEPH :: 3 Interfaces

* Object Access (like Amazon S3; see the boto3 sketch below)
* Block Access
* Distributed File System (CephFS)
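
Because radosgw speaks the S3 API, any ordinary S3 client exercises the object interface. A minimal sketch with `boto3`; the endpoint and credentials are placeholders (real ones come from `radosgw-admin user create` on the gateway host):

```python
import boto3

# Hypothetical gateway endpoint and keys; substitute your own.
s3 = boto3.client(
    's3',
    endpoint_url='http://rgw.example.com:7480',
    aws_access_key_id='ACCESS_KEY',
    aws_secret_access_key='SECRET_KEY',
)

s3.create_bucket(Bucket='demo-bucket')
s3.put_object(Bucket='demo-bucket', Key='hello.txt', Body=b'stored in RADOS')
obj = s3.get_object(Bucket='demo-bucket', Key='hello.txt')
print(obj['Body'].read())  # b'stored in RADOS'
```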

## CEPH :: Architecture

* [Ceph Intro & Architectural Overview](https://www.youtube.com/watch?v=7I9uxoEhUdY)
* RADOS (Reliable Autonomic Distributed Object Store); documentation: Ceph Storage Cluster (see the librados sketch below)
  * radosgw (object storage); documentation: Ceph Object Storage
    * RESTful Interface
    * S3 and Swift APIs
  * rbd (block device); documentation: Ceph Block Device
    * Block devices
    * Up to 16 [EiB](https://en.wikipedia.org/wiki/Exbibyte)
    * [Thin Provisioning](https://en.wikipedia.org/wiki/Thin_provisioning)
    * Snapshots
  * CephFS (File System); documentation: Ceph Filesystem
    * POSIX Compliant
    * Separate Data and Metadata
    * For use e.g. with Hadoop
    * We recommend using XFS
      * [XFS: the filesystem of the future?](https://lwn.net/Articles/476263/)
* Head Node (Controller)
  * SQL Database
  * CMDaemon [1](https://slurm.schedmd.com/slurm_ug_2011/Bright_Computing_SLURM_integration.pdf)
    * Cluster Management GUI (JSON + SSL)
    * Cluster Management Shell
    * Web Based User Portal
  * Third Party Applications
  * Node-005 (Nova App)
  * Node-004 (CEPH OSD)
  * Node-003 (CEPH OSD)
  * Node-002 (Nova Compute)
  * Node-001 (Nova Compute)
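
All three access methods above sit on RADOS, and a client can also talk to it directly through librados. A minimal sketch with the `rados` Python bindings, assuming `/etc/ceph/ceph.conf` is readable and a pool named `rbd` already exists:

```python
import rados

cluster = rados.Rados(conffile='/etc/ceph/ceph.conf')
cluster.connect()
try:
    ioctx = cluster.open_ioctx('rbd')  # any existing pool works
    # Objects are flat name -> bytes blobs; placement is computed
    # client-side via CRUSH, so no lookup service sits in the data path.
    ioctx.write_full('greeting', b'hello rados')
    print(ioctx.read('greeting'))  # b'hello rados'
    ioctx.remove_object('greeting')
    ioctx.close()
finally:
    cluster.shutdown()
```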

## CEPH :: Components

* CEPH OSD Server Node
  * Type
    * Fat Node
      * Many cores/sockets, 20+ HDDs, 1+ journal SSDs
    * Thin Node
      * Faster recovery
      * 1 socket is enough
  * Physical Disk
  * SSD journals (fast) vs. HDDs (slow)
  * File System (btrfs, xfs)
  * One Object Storage Daemon (OSD) per disk
    * OSDs serve object storage to clients
    * OSDs peer with one another to perform replication and recovery
* CEPH Monitor Server (see the status query sketch below)
  * Stores the cluster map (run at least 3 monitors for quorum)
  * Brain of the cluster
  * Does not serve stored objects to clients
* CEPH Metadata Server (for CephFS)
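
Monitors hold the authoritative cluster map but stay out of the data path; clients contact them only to fetch maps and run management commands. A sketch that asks a monitor for cluster status through the `rados` bindings (the JSON layout varies slightly across Ceph releases):

```python
import json
import rados

cluster = rados.Rados(conffile='/etc/ceph/ceph.conf')
cluster.connect()
try:
    # Equivalent of `ceph status --format json` on the CLI.
    ret, outbuf, errs = cluster.mon_command(
        json.dumps({'prefix': 'status', 'format': 'json'}), b'')
    status = json.loads(outbuf)
    print(status['health']['status'])  # e.g. 'HEALTH_OK'
    print(status['quorum_names'])      # monitors currently in quorum
finally:
    cluster.shutdown()
```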

## CEPH :: Conceptual Components

* Pool
  * Logical container for storage objects
  * Parameters
    * Name, ID
    * Replicas
    * CRUSH rules
  * Operations
    * Create / Read / Write Objects
* Placement Groups (PGs)
  * Balance data across OSDs
  * 1 PG spans several OSDs
  * 1 OSD serves many PGs
  * Tunable (50-100 per OSD; see the sizing sketch below)
* CRUSH (Controlled Replication Under Scalable Hashing)
  * Monitors maintain the CRUSH map
  * Clients understand CRUSH and compute object placement themselves
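
The 50-100 PGs per OSD guideline translates into the usual sizing rule of thumb: total PGs per pool is roughly (OSDs × target PGs per OSD) / replicas, rounded up to a power of two. A quick sketch:

```python
def pg_count(num_osds: int, replicas: int = 3, target_per_osd: int = 100) -> int:
    """Rule-of-thumb placement group count for one pool."""
    raw = num_osds * target_per_osd / replicas
    power = 1
    while power < raw:  # round up to the next power of two
        power *= 2
    return power

print(pg_count(12))  # 12 OSDs, 3 replicas -> 512 PGs
print(pg_count(7))   # 7 OSDs, 3 replicas -> 256 PGs
```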

## CEPH :: Playground

* Standalone Storage System
* Back End for OpenStack Block Storage

## CEPH :: Implementation

* Implementation (good): [Ceph OSD Hardware - A Pragmatic Guide](https://www.youtube.com/watch?v=kc7GIHyk57M)
* [Designing for High Performance Ceph at Scale](https://www.youtube.com/results?search_query=designing+for+high+performance+ceph+at+scale)
* [Ceph at CERN: A Year in the Life of a Petabyte-Scale Block Storage Service](https://www.youtube.com/results?search_query=ceph+at+cern)
* Storage Nodes
  * CPU: 1.5 GHz per OSD
  * Memory: 1-2 GB per terabyte of storage (e.g., 16 GB)
  * Storage Controller
  * SSDs for OSD Journal
  * HDDs
* Krusty The Cloud
  * 17 Hypervisor Nodes
  * 400 VMs
  * 7 CEPH OSDs (Extending to 10)

Considerations

* Network bandwidth determines the number of SSDs
* The number of SSDs determines the number of HDDs
* The number of HDDs determines the CPU core count
* The size and count of HDDs determine the amount of memory needed (see the sizing sketch after this list)
* Network
  * Single Fabric
    * Single Switch, VLANs
    * Problems: One broadcast domain, bandwidth
  * Multiple Fabric
    * Fabric for VLAN/VXLAN
    * CEPH Access (ceph-public)
    * CEPH Cluster (ceph-cluster)
  * NICs
    * 1 GigE, 10 GigE
  * MTUs
    * 1500 Vs 9000
* Disks
  * SSD Journals
  * Endurance: the amount of data that can be written before failure
  * 1 GigE good for SATA SSD
  * 10 GigE good for PCIe SSD
* Hard Disk
  * [Intel® Solid-State Drive DC S3700 Series: Specification](https://www.intel.com/content/www/us/en/solid-state-drives/ssd-dc-s3700-spec.html)
  * 5 HDDs per 1 SSD (from 4 to 8 is common)
  * 3 SSDs per OSD node (on 10 GigE)
  * 5 OSD daemons per node
* Processor
  * 1 socket, but how many cores?
    * Depends on SSDs and networking
      * 1 CPU core per OSD daemon (disk)
      * 1 SATA SSD journal per \~4-6 HDDs
      * 1 PCIe SSD journal per \~6-20 HDDs
      * Example: 2 SATA SSDs could handle 12 OSDs, which would require a 12-core CPU
  * Hyper Threading Cores Vs Physical Cores
    * HT enabled
* Memory
  * 0.5 GB - 1 GB per TB per Daemon
  * More is better (Linux VFS caching)
  * OSD node with 4 x 2 TB Disks (4 Daemons) -> 8 GB of RAM
  * OSD node with 16 x 2 TB Disks (16 Daemons) -> 32 GB of RAM
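
A sketch that strings the rules of thumb from this section together: one OSD daemon per HDD, roughly one core per daemon, one SATA SSD journal per five HDDs or so, and 0.5-1 GB of RAM per TB. The ratios are this section's guidance, not hard limits:

```python
def size_osd_node(num_hdds: int, hdd_tb: int, hdds_per_ssd: int = 5,
                  gb_ram_per_tb: float = 1.0) -> dict:
    """Apply this section's sizing rules of thumb to one OSD node."""
    return {
        'osd_daemons': num_hdds,                       # one daemon per HDD
        'journal_ssds': -(-num_hdds // hdds_per_ssd),  # ceil(HDDs / ratio)
        'cpu_cores': num_hdds,                         # ~1 core per daemon
        'ram_gb': num_hdds * hdd_tb * gb_ram_per_tb,   # 0.5-1 GB per TB
    }

# The two memory examples above:
print(size_osd_node(4, 2))   # 4 daemons, 1 SSD, 4 cores, 8.0 GB RAM
print(size_osd_node(16, 2))  # 16 daemons, 4 SSDs, 16 cores, 32.0 GB RAM
```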

## CEPH :: Integration

* Unlikely: [How to integrate Ceph with OpenStack](http://superuser.openstack.org/articles/ceph-as-storage-for-openstack/)
* Likely: [How to build a Ceph Distributed Storage Cluster on CentOS 7](https://www.howtoforge.com/tutorial/how-to-build-a-ceph-cluster-on-centos-7/)

