PG Overview
- ReplicatedPG::do_request
  |- ReplicatedPG::do_op  // only requests of type "CEPH_MSG_OSD_OP" are analyzed here
    |- ReplicatedPG::find_object_context
    |- ReplicatedPG::execute_ctx
    |- ReplicatedPG::get_object_context
    |- ReplicatedPG::prepare_transaction
    |- ReplicatedPG::complete_read_ctx
    |- ReplicatedPG::start_async_reads
    |- ReplicatedPG::calc_trim_to
    |- ReplicatedPG::issue_repop  // send the replication op to the replicas
    |- ReplicatedPG::eval_repop  // check whether the replication ops sent to each replica have been acknowledged
ReplicatedPG::issue_repop
|- ReplicatedBackend::submit_transaction
  |- ReplicatedBackend::issue_op
  |- ReplicatedBackend::parent_transactions
  |- OSDService::send_message_osd_cluster
  |- ReplicatedPG::queue_transactions
  |- FileStore
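The issue_repop / eval_repop pair in the call flow above can be pictured as simple bookkeeping on the primary: remember which replicas the op was sent to, and treat the op as complete only once every one of them has replied. The following is a minimal sketch of that idea, not Ceph's actual code; `RepOpSketch`, `issue_repop_sketch`, `on_replica_reply`, and `eval_repop_sketch` are hypothetical names introduced for illustration.

```cpp
#include <cassert>
#include <set>

// Hypothetical stand-in for the in-flight replicated op tracked by the primary.
struct RepOpSketch {
    std::set<int> sent_to;      // replica OSD ids the op was sent to
    std::set<int> waiting_for;  // replicas that have not replied yet
};

// Roughly the role of issue_repop: record every replica the op is sent to.
inline RepOpSketch issue_repop_sketch(const std::set<int>& replicas) {
    RepOpSketch op;
    op.sent_to = replicas;
    op.waiting_for = replicas;
    return op;
}

// A reply from a replica arrives: stop waiting for that replica.
inline void on_replica_reply(RepOpSketch& op, int osd) {
    // Sketch assumption: a reply from an OSD we never sent to is a bug.
    assert(op.sent_to.count(osd) == 1);
    op.waiting_for.erase(osd);
}

// Roughly the role of eval_repop: the op is done once all replicas replied.
inline bool eval_repop_sketch(const RepOpSketch& op) {
    return op.waiting_for.empty();
}
```

With replicas {1, 2}, `eval_repop_sketch` stays false after the first reply and becomes true only after the second.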
acting set
The ordered list of OSDs that hold the PG's replicas; the first OSD in the list is the primary. Under normal conditions the up set and the acting set are identical.
up set
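Suppose the acting set is [0, 1, 2] and osd.0 fails, so the monitor remaps the PG and its up set becomes [3, 1, 2]. osd.3 does not hold the PG's data yet and cannot serve its read IO, so it asks the monitor for a temporary mapping that makes osd.1 the primary for reads and writes. At this point the up set is [3, 1, 2] and the acting set is [1, 3, 2]; the two sets differ.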
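The temporary-primary mechanism in this example can be sketched as a small rule: the acting set equals the up set unless a temporary mapping has been installed for the PG, in which case the temporary mapping wins. This is a simplified illustration under that assumption; `acting_from` and `primary_of` are hypothetical helpers, not Ceph functions.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper: the acting set is the up set unless a temporary
// mapping (passed here as pg_temp, empty when none is installed) overrides it.
inline std::vector<int> acting_from(const std::vector<int>& up,
                                    const std::vector<int>& pg_temp) {
    return pg_temp.empty() ? up : pg_temp;
}

// The primary is the first OSD in the ordered list.
inline int primary_of(const std::vector<int>& osds) {
    assert(!osds.empty());  // sketch assumption: the list is never empty here
    return osds.front();
}
```

With an up set of [3, 1, 2] and a temporary mapping of [1, 3, 2], the primary is osd.1; once the temporary mapping is removed, the acting set falls back to the up set and the primary is osd.3.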
Once osd.3 finishes backfill, the up set and the acting set are both [3, 1, 2].
current interval && past_interval
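Within an interval, the PG's acting set and up set do not change. The current interval is the one in progress; a past interval is the one from the previous stage.
- last_epoch_started: the epoch at which PG peering completed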
- last_epoch_clean: the epoch at which PG recovery completed and the PG became clean
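The interval definition above (a run of epochs during which neither the up set nor the acting set changes) can be sketched as a fold over per-epoch mappings. This is a simplification that considers only those two sets; `MappingSketch`, `IntervalSketch`, `is_new_interval`, and `split_intervals` are hypothetical names used for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical per-epoch view of one PG's mapping.
struct MappingSketch {
    std::vector<int> up;
    std::vector<int> acting;
};

// A run of consecutive epochs [first, last] sharing the same mapping.
struct IntervalSketch {
    unsigned first;
    unsigned last;
    MappingSketch mapping;
};

// Simplified rule: a new interval starts when up or acting changes.
inline bool is_new_interval(const MappingSketch& prev, const MappingSketch& cur) {
    return prev.up != cur.up || prev.acting != cur.acting;
}

// Group consecutive epochs, starting at first_epoch, into intervals.
inline std::vector<IntervalSketch> split_intervals(
        unsigned first_epoch, const std::vector<MappingSketch>& per_epoch) {
    std::vector<IntervalSketch> out;
    for (std::size_t i = 0; i < per_epoch.size(); ++i) {
        const unsigned e = first_epoch + static_cast<unsigned>(i);
        if (out.empty() || is_new_interval(out.back().mapping, per_epoch[i])) {
            out.push_back({e, e, per_epoch[i]});
        } else {
            assert(out.back().last + 1 == e);  // epochs are contiguous
            out.back().last = e;
        }
    }
    return out;
}
```

In this sketch the last element of the result plays the role of the current interval, and every element before it is a past interval.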
PGBackend
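PGBackend defines, at the logical level, how IO and replicas are handled:
- handling client operations
- handling object recovery
- handling object access
- handling scrub, deep-scrub, and repair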
// osd/PGBackend.h
class PGBackend {
protected:
    ObjectStore *store;
    const coll_t coll;
    ObjectStore::CollectionHandle &ch;

    // PGBackend callback interface
public:
    class Listener {
    public:
        // Recovery
        ......
        struct RecoveryHandle {
            .....
        };
    };
};

struct PG_SendMessageOnConn : public Context {
    PGBackend::Listener *pg;
    ...
};

struct PG_RecoveryQueueAsync : public Context {
    PGBackend::Listener *pg;
    ...
};
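The shape of the declarations above, where the backend and its helper contexts hold a `PGBackend::Listener *` and call back into the PG through it, can be illustrated with a reduced sketch. Everything here is hypothetical and much smaller than the real interface: `ListenerSketch`, `BackendSketch`, `PGSketch`, and the `on_object_recovered` callback are names invented for this example.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical, much-reduced stand-in for the Listener callback interface.
class ListenerSketch {
public:
    virtual ~ListenerSketch() = default;
    // Invented callback: the backend reports that one object was recovered.
    virtual void on_object_recovered(const std::string& oid) = 0;
};

// Hypothetical backend: it knows nothing about the PG except the listener.
class BackendSketch {
public:
    explicit BackendSketch(ListenerSketch* l) : parent(l) {
        assert(parent != nullptr);
    }

    void recover_object(const std::string& oid) {
        // A real backend would move object data here; the sketch only
        // notifies the owner through the callback interface.
        parent->on_object_recovered(oid);
    }

private:
    ListenerSketch* parent;
};

// Hypothetical PG: implements the listener and records what it was told.
class PGSketch : public ListenerSketch {
public:
    void on_object_recovered(const std::string& oid) override {
        recovered.push_back(oid);
    }

    std::vector<std::string> recovered;
};
```

Keeping the backend behind an interface like this means a different backend implementation could be swapped in without changing the PG-side code, since the two only meet at the listener.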
ReplicatedBackend (multi-replica backend)
// osd/ReplicatedBackend.h
// osd/ReplicatedPG.h
struct C_OnMapCommit : public Context{
ObjectStore