iSt0ne's Notes

Deploying a Ceph Cluster with ceph-deploy

Preparing the environment

Create four virtual machines on OpenStack, as follows:

OpenStack VM

ceph-admin: cluster deployment (admin) node
ceph-node1: monitor node; an OSD is added to it later
ceph-node2: OSD node; a monitor is added to it later
ceph-node3: OSD node; a monitor is added to it later

Create a deployment account and set up passwordless login

Add a ceph-admin account on all four VMs and grant it passwordless sudo privileges.

The ceph-deploy tool must log in to the Ceph nodes as a regular user, and that user must be able to run sudo without a password, because ceph-deploy installs packages and writes configuration files and must not be interrupted by password prompts.

Newer versions of ceph-deploy accept a --username option naming a user with passwordless sudo (including root, although that is not recommended). When you run ceph-deploy --username {username}, the specified user must be reachable over passwordless SSH, because ceph-deploy never prompts for a password mid-run.

[root@ceph-admin ~]# useradd -d /home/ceph-admin -m ceph-admin
[root@ceph-admin ~]# echo "ceph-admin ALL = (root) NOPASSWD:ALL" | tee /etc/sudoers.d/ceph-admin
ceph-admin ALL = (root) NOPASSWD:ALL
[root@ceph-admin ~]# chmod 0440 /etc/sudoers.d/ceph-admin
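
On some CentOS/RHEL builds, /etc/sudoers ships with a "Defaults requiretty" line, which makes sudo fail over the non-interactive SSH sessions ceph-deploy opens. If that line is present, it can be relaxed for this user via visudo, for example with:

Defaults:ceph-admin !requiretty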

Generate an SSH key pair on ceph-admin

[root@ceph-admin ~]# su - ceph-admin
[ceph-admin@ceph-admin ~]$ ssh-keygen
Generating public/private rsa key pair.
Enter file in which to save the key (/home/ceph-admin/.ssh/id_rsa):
Created directory '/home/ceph-admin/.ssh'.
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /home/ceph-admin/.ssh/id_rsa.
Your public key has been saved in /home/ceph-admin/.ssh/id_rsa.pub.
The key fingerprint is:
4c:1a:71:72:ef:f8:8e:d9:40:06:87:9e:24:75:79:6b ceph-admin@ceph-admin
The key's randomart image is:
+--[ RSA 2048]----+
|      + +.       |
|     . B...      |
|    . = o...     |
|     + O oE      |
|      + S..      |
|       o .       |
|        . .      |
|         *       |
|        o o      |
+-----------------+
[ceph-admin@ceph-admin ~]$ cat /home/ceph-admin/.ssh/id_rsa.pub
ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQDKx5MnUMkxigPr14cZAfSr10ClvQgMJaVwxum99pZL7FhQu8aU2KIK1EOhV5Bf2Hkb23y6pLNxce3aMyZqi8okh4wCGH4LTFh/Rs95GSdlaS1jrfyO908cQz9MYbVACbTWX6jcQ6CgJidPAh3s7w7RhyazXlgBweo2L/sV8MoyCsP8feL5HTSAYN9QLmdxI/gme2XDY1jhPVyVRYrE6RdtC4I8qQEYwJdlKO0h7xqLFdLnDH/h9Pr4EJLKlS3uMFDs4lCBonUOJojw193G4YSMe+ZQUofhm4SC1W735/X1iOzbGUGo53OrmZ83hMm1TOqqbM6jax6umW2JQ6ddOBY5 ceph-admin@ceph-admin

Append the generated public key to the ceph-admin account on ceph-node1, ceph-node2, and ceph-node3.

[root@ceph-node1 ~]# su - ceph-admin
[ceph-admin@ceph-node1 ~]$ mkdir -p /home/ceph-admin/.ssh && chmod 700 /home/ceph-admin/.ssh
[ceph-admin@ceph-node1 ~]$ echo 'ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQDKx5MnUMkxigPr14cZAfSr10ClvQgMJaVwxum99pZL7FhQu8aU2KIK1EOhV5Bf2Hkb23y6pLNxce3aMyZqi8okh4wCGH4LTFh/Rs95GSdlaS1jrfyO908cQz9MYbVACbTWX6jcQ6CgJidPAh3s7w7RhyazXlgBweo2L/sV8MoyCsP8feL5HTSAYN9QLmdxI/gme2XDY1jhPVyVRYrE6RdtC4I8qQEYwJdlKO0h7xqLFdLnDH/h9Pr4EJLKlS3uMFDs4lCBonUOJojw193G4YSMe+ZQUofhm4SC1W735/X1iOzbGUGo53OrmZ83hMm1TOqqbM6jax6umW2JQ6ddOBY5 ceph-admin@ceph-admin' >> /home/ceph-admin/.ssh/authorized_keys
[ceph-admin@ceph-node1 ~]$ chmod 600 /home/ceph-admin/.ssh/authorized_keys
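
Instead of pasting the key by hand, ssh-copy-id can push it from ceph-admin (assuming password authentication is still enabled on the nodes). Adding a ~/.ssh/config entry on ceph-admin also saves passing --username to every ceph-deploy invocation:

[ceph-admin@ceph-admin ~]$ ssh-copy-id ceph-admin@ceph-node1
[ceph-admin@ceph-admin ~]$ cat >> ~/.ssh/config <<EOF
Host ceph-node1 ceph-node2 ceph-node3
    User ceph-admin
EOF
[ceph-admin@ceph-admin ~]$ chmod 600 ~/.ssh/config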

Add the following entries to /etc/hosts on all four VMs:

192.168.101.167 ceph-admin
192.168.101.173 ceph-node1
192.168.101.174 ceph-node2
192.168.101.176 ceph-node3

Disable SELinux on all four VMs

# To disable it permanently, set SELINUX=disabled in /etc/selinux/config and reboot the server
[root@ceph-node1 ~]# setenforce 0
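
For the permanent change mentioned in the comment above, a one-liner such as the following can be used on each node (a sketch; check /etc/selinux/config afterwards):

[root@ceph-node1 ~]# sed -i 's/^SELINUX=.*/SELINUX=disabled/' /etc/selinux/config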

Install the yum priorities plugin on all four VMs and make sure it is enabled

[root@ceph-node1 ~]# yum install yum-plugin-priorities

Install the NTP service on all four VMs

[root@ceph-admin ~]# yum install ntp ntpdate ntp-doc
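
On CentOS 7 (which the el7 packages used below imply), the NTP daemon also needs to be enabled and started on each node, for example:

[root@ceph-admin ~]# systemctl enable ntpd
[root@ceph-admin ~]# systemctl start ntpd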

Install ceph-deploy on ceph-admin

[root@ceph-admin ~]# yum install -y yum-utils && yum-config-manager --add-repo https://dl.fedoraproject.org/pub/epel/7/x86_64/ && yum install --nogpgcheck -y epel-release && rpm --import /etc/pki/rpm-gpg/RPM-GPG-KEY-EPEL-7 && rm /etc/yum.repos.d/dl.fedoraproject.org*
[root@ceph-admin ~]# rpm -ivh https://download.ceph.com/rpm-jewel/el7/noarch/ceph-release-1-1.el7.noarch.rpm
[root@ceph-admin ~]# yum update && yum install -y ceph-deploy

Creating the cluster

First, create a directory on the admin node to hold the configuration files and keys that ceph-deploy generates. ceph-deploy writes its output to the current directory, so make sure you run ceph-deploy from inside this directory.

[root@ceph-admin ~]# su - ceph-admin
[ceph-admin@ceph-admin ~]$ mkdir ceph-cluster
[ceph-admin@ceph-admin ~]$ cd ceph-cluster/

Create the cluster

[ceph-admin@ceph-admin ceph-cluster]$ ceph-deploy --username ceph-admin new ceph-node1

Inspect the ceph-deploy output in the current directory with ls and cat; you should see a Ceph configuration file, a monitor keyring, and a log file.

Change the default number of replicas in the Ceph configuration file (ceph.conf) from 3 to 2 so the cluster can reach an active + clean state with only two OSDs. Add the following line to the [global] section:

osd pool default size = 2

If you have more than one network interface, also add the public network setting to the [global] section of the Ceph configuration file:

public network = 192.168.101.0/24
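
For reference, after these two edits the [global] section of ceph.conf in this environment should look roughly like this (the fsid and the cephx auth lines are generated by ceph-deploy new; your values will differ):

[global]
fsid = 350ce444-6987-49ab-b821-4416a4f53e1d
mon_initial_members = ceph-node1
mon_host = 192.168.101.173
auth_cluster_required = cephx
auth_service_required = cephx
auth_client_required = cephx
osd pool default size = 2
public network = 192.168.101.0/24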

Install Ceph

[ceph-admin@ceph-admin ceph-cluster]$ ceph-deploy install ceph-admin ceph-node1 ceph-node2 ceph-node3

Deploy the initial monitor(s) and gather the keys

[ceph-admin@ceph-admin ceph-cluster]$ ceph-deploy mon create-initial

When this completes, the current directory should contain these keyrings:

ceph.client.admin.keyring
ceph.bootstrap-osd.keyring
ceph.bootstrap-mds.keyring
ceph.bootstrap-rgw.keyring

Note: the bootstrap-rgw keyring is only created when installing Hammer or a newer release.

Add two OSDs. To keep this quick start fast, directories rather than whole disks are used for the OSD daemons. Log in to each Ceph node and create a directory for its OSD daemon (the directories live under /var/local, so sudo is needed).

[ceph-admin@ceph-admin ceph-cluster]$ ssh ceph-node2
[ceph-admin@ceph-node2 ~]$ sudo mkdir /var/local/osd0
[ceph-admin@ceph-node2 ~]$ sudo chown -R ceph.ceph /var/local/osd0
[ceph-admin@ceph-node2 ~]$ exit


[ceph-admin@ceph-admin ceph-cluster]$ ssh ceph-node3
[ceph-admin@ceph-node3 ~]$ sudo mkdir /var/local/osd1
[ceph-admin@ceph-node3 ~]$ sudo chown -R ceph.ceph /var/local/osd1
[ceph-admin@ceph-node3 ~]$ exit

Then, from the admin node, run ceph-deploy to prepare the OSDs:

[ceph-admin@ceph-admin ceph-cluster]$ ceph-deploy osd prepare ceph-node2:/var/local/osd0 ceph-node3:/var/local/osd1

Finally, activate the OSDs:

[ceph-admin@ceph-admin ceph-cluster]$ ceph-deploy osd activate ceph-node2:/var/local/osd0 ceph-node3:/var/local/osd1

Use ceph-deploy to copy the configuration file and admin key to the admin node and the Ceph nodes, so you do not have to specify the monitor address and ceph.client.admin.keyring every time you run a Ceph command.

[ceph-admin@ceph-admin ceph-cluster]$ ceph-deploy admin ceph-admin ceph-node1 ceph-node2 ceph-node3

Make sure you have read permission on ceph.client.admin.keyring:

[ceph-admin@ceph-admin ceph-cluster]$ sudo chmod +r /etc/ceph/ceph.client.admin.keyring

Check the cluster's health; once peering completes, the cluster should reach the active + clean state:

[ceph-admin@ceph-admin ceph-cluster]$ ceph health
HEALTH_OK
[ceph-admin@ceph-admin ceph-cluster]$ ceph -w
    cluster 350ce444-6987-49ab-b821-4416a4f53e1d
     health HEALTH_OK
     monmap e1: 1 mons at {ceph-node1=192.168.101.173:6789/0}
            election epoch 3, quorum 0 ceph-node1
     osdmap e10: 2 osds: 2 up, 2 in
            flags sortbitwise,require_jewel_osds
      pgmap v646: 64 pgs, 1 pools, 0 bytes data, 0 objects
            12760 MB used, 89616 MB / 102376 MB avail
                  64 active+clean

2017-03-16 02:02:04.771106 mon.0 [INF] pgmap v646: 64 pgs: 64 active+clean; 0 bytes data, 12760 MB used, 89616 MB / 102376 MB avail

Expanding the cluster

Once a basic cluster is up and running, the next step is to expand it. Add an OSD daemon and a metadata server on ceph-node1, then add Ceph Monitors on ceph-node2 and ceph-node3 to form a monitor quorum.

Add an OSD

[ceph-admin@ceph-admin ceph-cluster]$ ssh ceph-node1
[ceph-admin@ceph-node1 ~]$ sudo mkdir /var/local/osd2
[ceph-admin@ceph-node1 ~]$ sudo chown -R ceph.ceph /var/local/osd2
[ceph-admin@ceph-node1 ~]$ exit

Then prepare the OSD from the ceph-deploy node:

[ceph-admin@ceph-admin ceph-cluster]$ ceph-deploy osd prepare ceph-node1:/var/local/osd2

Finally, activate the OSD:

[ceph-admin@ceph-admin ceph-cluster]$ ceph-deploy osd activate ceph-node1:/var/local/osd2

As soon as the new OSD is added, the cluster starts rebalancing, migrating placement groups to the new OSD:

[ceph-admin@ceph-admin ceph-cluster]$ ceph -w
    cluster 350ce444-6987-49ab-b821-4416a4f53e1d
     health HEALTH_OK
     monmap e1: 1 mons at {ceph-node1=192.168.101.173:6789/0}
            election epoch 3, quorum 0 ceph-node1
     osdmap e15: 3 osds: 3 up, 3 in
            flags sortbitwise,require_jewel_osds
      pgmap v659: 64 pgs, 1 pools, 0 bytes data, 0 objects
            19155 MB used, 131 GB / 149 GB avail
                  64 active+clean

2017-03-16 02:10:25.182296 mon.0 [INF] pgmap v659: 64 pgs: 64 active+clean; 0 bytes data, 19155 MB used, 131 GB / 149 GB avail
2017-03-16 02:10:29.858625 mon.0 [INF] pgmap v660: 64 pgs: 64 active+clean; 0 bytes data, 19155 MB used, 131 GB / 149 GB avail

Add a metadata server

[ceph-admin@ceph-admin ceph-cluster]$ ceph-deploy mds create ceph-node1
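
The new MDS registers with the cluster but stays in standby until a CephFS filesystem is created. Its state can be checked from any node that has the admin keyring:

[ceph-admin@ceph-admin ceph-cluster]$ ceph mds stat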

Add an RGW instance. To use the Ceph Object Gateway component, you must deploy an RGW instance; it listens on port 7480 by default.

[ceph-admin@ceph-admin ceph-cluster]$ ceph-deploy rgw create ceph-node1
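
To confirm the gateway is responding, an unauthenticated request to port 7480 should return a small ListAllMyBucketsResult XML document for the anonymous user, e.g.:

[ceph-admin@ceph-admin ceph-cluster]$ curl http://ceph-node1:7480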

Add monitors

A Ceph storage cluster needs at least one monitor to run. For high availability, a typical Ceph storage cluster runs multiple monitors, so the failure of a single monitor does not bring down the cluster. Ceph uses the Paxos algorithm, which requires a majority of the monitors (i.e. 1; 2 out of 3; 3 out of 4; 3 out of 5; 4 out of 6; and so on) to form a quorum.

Add two more monitors to the Ceph cluster.

Modify ceph.conf as follows:

mon_initial_members = ceph-node1,ceph-node2,ceph-node3
mon_host = 192.168.101.173,192.168.101.174,192.168.101.176
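
With ceph.conf updated, the new copy can be pushed to every node before adding the monitors; --overwrite-conf tells ceph-deploy to replace each node's existing configuration (a common extra step, not strictly required if the files have not diverged):

[ceph-admin@ceph-admin ceph-cluster]$ ceph-deploy --overwrite-conf config push ceph-admin ceph-node1 ceph-node2 ceph-node3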

Deploy the monitors

[ceph-admin@ceph-admin ceph-cluster]$ ceph-deploy mon add ceph-node2
[ceph-admin@ceph-admin ceph-cluster]$ ceph-deploy mon add ceph-node3

[ceph-admin@ceph-admin ceph-cluster]$ ceph -w
    cluster 350ce444-6987-49ab-b821-4416a4f53e1d
     health HEALTH_OK
     monmap e3: 3 mons at {ceph-node1=192.168.101.173:6789/0,ceph-node2=192.168.101.174:6789/0,ceph-node3=192.168.101.176:6789/0}
            election epoch 6, quorum 0,1,2 ceph-node1,ceph-node2,ceph-node3
     osdmap e23: 3 osds: 3 up, 3 in
            flags sortbitwise,require_jewel_osds
      pgmap v1508: 104 pgs, 6 pools, 1588 bytes data, 171 objects
            19198 MB used, 131 GB / 149 GB avail
                 104 active+clean

2017-03-16 03:18:25.064298 mon.0 [INF] pgmap v1507: 104 pgs: 104 active+clean; 1588 bytes data, 19197 MB used, 131 GB / 149 GB avail
2017-03-16 03:18:26.110870 mon.0 [INF] pgmap v1508: 104 pgs: 104 active+clean; 1588 bytes data, 19198 MB used, 131 GB / 149 GB avail

Storing and retrieving object data

First create an object. Use the rados put command with an object name, the path to a test file containing some data, and a pool name.

[ceph-admin@ceph-admin ceph-cluster]$ echo hello ceph > testfile.txt

Create an OSD pool

[ceph-admin@ceph-admin ceph-cluster]$ ceph osd pool create data 128
pool 'data' created
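
The pg_num of 128 follows the usual rule of thumb for clusters with fewer than five OSDs. Since osd pool default size = 2 was set earlier, the new pool should carry two replicas, which can be verified with:

[ceph-admin@ceph-admin ceph-cluster]$ ceph osd pool get data size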

Store the object

[ceph-admin@ceph-admin ceph-cluster]$ rados put test-object-1 testfile.txt --pool=data

To confirm that the Ceph storage cluster stored the object, run:

[ceph-admin@ceph-admin ceph-cluster]$ rados -p data ls
test-object-1

Locate the object

[ceph-admin@ceph-admin ceph-cluster]$ ceph osd map data test-object-1
osdmap e25 pool 'data' (6) object 'test-object-1' -> pg 6.74dc35e2 (6.62) -> up ([0,2], p0) acting ([0,2], p0)

Retrieve the object

[ceph-admin@ceph-admin ceph-cluster]$ rados get test-object-1 testoutfile.txt --pool=data
[ceph-admin@ceph-admin ceph-cluster]$ cat testoutfile.txt
hello ceph

Delete the object

[ceph-admin@ceph-admin ceph-cluster]$ rados rm test-object-1 --pool=data
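
If the test pool is no longer needed, it can be deleted as well; Ceph requires the pool name twice plus a confirmation flag (and, on some releases, mon_allow_pool_delete must also be enabled):

[ceph-admin@ceph-admin ceph-cluster]$ ceph osd pool delete data data --yes-i-really-really-mean-it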