Ceph error: HEALTH_ERR 1 scrub errors; Possible data damage: 1 pg inconsistent
Every 1.0s: ceph -s Wed Dec 8 10:55:43 2021
cluster:
id: 48ff8b6e-1203-4dc8-b16e-d1e89f66e28f
health: HEALTH_ERR
1 scrub errors
Possible data damage: 1 pg inconsistent
services:
mon: 3 daemons, quorum ceph-node-1,ceph-node-2,ceph-node-3 (age 12h)
mgr: ceph-node-2(active, since 4d), standbys: ceph-node-1, ceph-node-3
osd: 20 osds: 19 up (since 16h), 19 in (since 16h)
data:
pools: 2 pools, 513 pgs
objects: 2.19M objects, 8.1 TiB
usage: 24 TiB used, 45 TiB / 70 TiB avail
pgs: 512 active+clean
1 active+clean+inconsistent
io:
client: 3.6 MiB/s rd, 14 MiB/s wr, 896 op/s rd, 1.41k op/s wr
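A Ceph error report came in: one scrub error. Ceph scrubs all PGs periodically, on a configured schedule, to check that the replicas of the data are consistent. When they are not, and Ceph cannot decide on its own how to repair them, it reports an error. The usual repair procedure: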
[root@ceph-node-4 ~]#
[root@ceph-node-4 ~]# ceph health detail
HEALTH_ERR 1 scrub errors; Possible data damage: 1 pg inconsistent
[ERR] OSD_SCRUB_ERRORS: 1 scrub errors
[ERR] PG_DAMAGED: Possible data damage: 1 pg inconsistent
pg 38.47 is active+clean+inconsistent, acting [16,19,5]
[root@ceph-node-4 ~]#
[root@ceph-node-4 ~]# ceph pg deep-scrub 38.47
instructing pg 38.47 on osd.16 to deep-scrub
[root@ceph-node-4 ~]#
[root@ceph-node-4 ~]# ceph pg repair 38.47
instructing pg 38.47 on osd.16 to repair
[root@ceph-node-4 ~]#
[root@ceph-node-4 ~]# rados list-inconsistent-obj 38.47 --format=json-pretty
{
"epoch": 175475,
"inconsistents": [
{
"object": {
"name": "rbd_data.f17676f0b5abab.00000000000010bb",
"nspace": "",
"locator": "",
"snap": "head",
"version": 21131043
},
"errors": [
"data_digest_mismatch"
],
"union_shard_errors": [],
"selected_object_info": {
"oid": {
"oid": "rbd_data.f17676f0b5abab.00000000000010bb",
"key": "",
"snapid": -2,
"hash": 55223367,
"max": 0,
"pool": 38,
"namespace": ""
},
"version": "175211'21131043",
"prior_version": "175211'21131025",
"last_reqid": "client.18983740.0:3304642",
"user_version": 21131043,
"size": 4194304,
"mtime": "2021-12-20T03:08:03.347890+0800",
"local_mtime": "2021-12-20T03:08:03.372374+0800",
"lost": 0,
"flags": [
"dirty"
],
"truncate_seq": 0,
"truncate_size": 0,
"data_digest": "0xffffffff",
"omap_digest": "0xffffffff",
"expected_object_size": 4194304,
"expected_write_size": 4194304,
"alloc_hint_flags": 0,
"manifest": {
"type": 0
},
"watchers": {}
},
"shards": [
{
"osd": 5,
"primary": false,
"errors": [],
"size": 4194304,
"omap_digest": "0xffffffff",
"data_digest": "0xace1929a"
},
{
"osd": 16,
"primary": true,
"errors": [],
"size": 4194304,
"omap_digest": "0xffffffff",
"data_digest": "0xace1929a"
},
{
"osd": 19,
"primary": false,
"errors": [],
"size": 4194304,
"omap_digest": "0xffffffff",
"data_digest": "0xa2918d4b"
}
]
}
]
}
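Reading the shard list by eye works for one object, but comparing digests can be scripted. The following is a minimal sketch, not part of the original procedure: the function name odd_shards is hypothetical, it uses only awk, and it assumes the json-pretty layout shown above with a single inconsistent object in the PG. If no digest has a majority, its answer is arbitrary.

```shell
#!/bin/bash
# Hypothetical helper (not a Ceph tool): reads the output of
#   rados list-inconsistent-obj <pgid> --format=json-pretty
# on stdin and prints "osd.<id> <data_digest>" for every shard whose
# data_digest differs from the most common digest among the shards.
# Assumption: the report contains one inconsistent object.
odd_shards() {
    awk '
        /"osd":/ {
            gsub(/[^0-9]/, "", $2)      # keep only the numeric OSD id
            osd = $2
        }
        # Only digests that follow an "osd" line belong to a shard; the
        # data_digest inside selected_object_info is skipped this way.
        /"data_digest":/ && osd != "" {
            gsub(/[",]/, "", $2)        # strip quotes and commas
            digest[osd] = $2
            count[$2]++
            osd = ""
        }
        END {
            best = 0
            for (d in count) if (count[d] > best) { best = count[d]; major = d }
            for (o in digest) if (digest[o] != major) print "osd." o, digest[o]
        }
    '
}

# Usage (requires a live cluster):
#   rados list-inconsistent-obj 38.47 --format=json-pretty | odd_shards
```

For the report above, two of the three shards (osd.5 and the primary osd.16) share one digest, so the shard on osd.19 is the one that stands out.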
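The regular ceph pg repair could not complete the repair here. ceph pg repair can be used directly only when the primary replica is good and a secondary replica is bad. An inconsistency caused by a damaged primary replica has to be repaired by other means.
Repair by running a script: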
#!/bin/bash
# Repair every inconsistent PG, temporarily raising osd_max_scrubs on its OSDs.
for PG in $(ceph pg ls inconsistent -f json | jq -r '.pg_stats[].pgid')
do
    echo "Checking inconsistent PG $PG"
    if ceph pg ls repair | grep -qwF "${PG}"
    then
        echo "PG $PG is already repairing, skipping"
        continue
    fi
    # disable other scrubs
    ceph osd set nodeep-scrub
    ceph osd set noscrub
    # bump up osd_max_scrubs on every OSD in the acting set
    ACTING=$(ceph pg "$PG" query | jq -r '.acting[]')
    for OSD in $ACTING
    do
        ceph tell "osd.${OSD}" injectargs -- --osd_max_scrubs=3 --osd_scrub_during_recovery=true
    done
    ceph pg repair "$PG"
    sleep 10
    # restore the original scrub limits
    for OSD in $ACTING
    do
        ceph tell "osd.${OSD}" injectargs -- --osd_max_scrubs=1 --osd_scrub_during_recovery=false
    done
    # re-enable other scrubs
    ceph osd unset nodeep-scrub
    ceph osd unset noscrub
done
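The script first raises osd_max_scrubs on the OSDs of all three replicas, issues the repair, and sets the value back to 1 after a 10-second wait.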
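The script only sleeps for 10 seconds; it does not check that the repair has actually finished. A possible follow-up check is sketched below. The function names wait_for_repair and get_pg_state are hypothetical, and the sketch assumes that ceph pg <pgid> query reports the PG state in a top-level "state" field and that jq is installed, as the script above already requires.

```shell
#!/bin/bash
# Hypothetical helper: print the state string of a PG, e.g. "active+clean".
# Assumes the query output has a top-level "state" field.
get_pg_state() {
    ceph pg "$1" query | jq -r '.state'
}

# Hypothetical helper: poll until the PG state no longer contains
# "repair" or "inconsistent".
#   $1 = PG id, $2 = max polls (default 60), $3 = seconds between polls (default 10)
# Prints the final state and returns 0 on success; returns 1 on timeout.
wait_for_repair() {
    local pg=$1 tries=${2:-60} delay=${3:-10} state i
    for ((i = 0; i < tries; i++)); do
        state=$(get_pg_state "$pg")
        case $state in
            *repair*|*inconsistent*) sleep "$delay" ;;
            *) echo "$state"; return 0 ;;
        esac
    done
    return 1
}

# Usage (requires a live cluster):
#   wait_for_repair 38.47 && echo "PG 38.47 is clean again"
```

Keeping the state lookup in its own function makes it easy to swap in another source, such as ceph pg ls output, if the query format differs on a given release.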
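Notes:
ceph pg repair first scrubs the PG to find the inconsistent objects in it, and then runs recovery on them.
During a PG scrub, both the primary and the secondary replicas reserve scrub resources. The scrub, and therefore the repair, can proceed only when scrubs_pending + scrubs_active < _conf->osd_max_scrubs; otherwise the repair does not take effect.
scrubs_pending: the number of PGs on this OSD that have been reserved successfully and are about to be scrubbed.
scrubs_active: the number of PGs on this OSD that are currently being scrubbed.
osd_max_scrubs: the maximum number of concurrent scrubs per OSD. With the default of 1, an OSD scrubs only one PG at a time.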
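The reservation condition can be modeled in a few lines. This is only an illustration of the inequality quoted above, not Ceph's own code; the function name can_reserve_scrub is hypothetical.

```shell
#!/bin/bash
# Illustrative model of the check
#   scrubs_pending + scrubs_active < osd_max_scrubs
# Returns 0 (success) when another scrub could be reserved on the OSD.
can_reserve_scrub() {
    local pending=$1 active=$2 max=$3
    (( pending + active < max ))
}

# With osd_max_scrubs=1, an OSD that is already scrubbing one PG refuses
# the reservation, so a repair of another PG on that OSD cannot start:
#   can_reserve_scrub 0 1 1    # fails
# After raising osd_max_scrubs to 3, the same OSD accepts it:
#   can_reserve_scrub 0 1 3    # succeeds
```

This is why the script raises osd_max_scrubs to 3 on every OSD in the acting set before issuing the repair.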
Script excerpted from: https://github.com/cernceph/ceph-scripts/blob/master/tools/scrubbing/autorepair.sh
There is another situation:
[root@k8snode001 ~]# ceph pg 2.2b query
{
"......
"recovery_state": [
{
"name": "Started/Primary/Active",
"enter_time": "2020-07-21 14:17:05.855923",
"might_have_unfound": [],
"recovery_progress": {
"backfill_targets": [],
"waiting_on_backfill": [],
"last_backfill_started": "MIN",
"backfill_info": {
"begin": "MIN",
"end": "MIN",
"objects": []
},
"peer_backfill_info": [],
"backfills_in_flight": [],
"recovering": [],
"pg_backend": {
"pull_from_peer": [],
"pushing": []
}
},
"scrub": {
"scrubber.epoch_start": "10370",
"scrubber.active": false,
"scrubber.state": "INACTIVE",
"scrubber.start": "MIN",
"scrubber.end": "MIN",
"scrubber.max_end": "MIN",
"scrubber.subset_last_update": "0'0",
"scrubber.deep": false,
"scrubber.waiting_on_whom": []
}
},
{
"name": "Started",
"enter_time": "2020-07-21 14:17:04.814061"
}
],
"agent_state": {}
}
In this case you can roll back to the previous version:
[root@k8snode001 ~]# ceph pg 2.2b mark_unfound_lost revert
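If that still does not solve the problem and you are ready to give up on the data, you can also delete it outright (use with great caution):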
[root@k8snode001 ~]# ceph pg 2.2b mark_unfound_lost delete
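Because both commands discard data, it can help to put a small guard in front of them. The wrapper below is a sketch under assumptions: the function name mark_unfound and the CONFIRM variable are hypothetical, and the PG id check assumes the usual <pool number>.<hex> form seen in this article (38.47, 2.2b).

```shell
#!/bin/bash
# Hypothetical safety wrapper around "ceph pg <pgid> mark_unfound_lost".
# It only prints the command unless CONFIRM=yes is set in the environment.
mark_unfound() {
    local pg=$1 mode=$2
    case $mode in
        revert|delete) ;;
        *) echo "mode must be revert or delete" >&2; return 2 ;;
    esac
    # Assumed PG id format: decimal pool id, a dot, then a hex value.
    if [[ ! $pg =~ ^[0-9]+\.[0-9a-f]+$ ]]; then
        echo "unexpected PG id: $pg" >&2
        return 2
    fi
    if [[ ${CONFIRM:-no} == yes ]]; then
        ceph pg "$pg" mark_unfound_lost "$mode"
    else
        echo "dry run: ceph pg $pg mark_unfound_lost $mode"
    fi
}

# Usage:
#   mark_unfound 2.2b revert                # prints the command only
#   CONFIRM=yes mark_unfound 2.2b revert    # actually runs it
```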
Excerpted and learned from:
https://www.jianshu.com/p/2ebad7f89731
https://www.codetd.com/en/article/12326328