Troubleshooting when replacing some of the servers in a ceph farm

While installing to rebuild an OSD in a ceph cluster, errors like the following can occur.

These error messages can appear when the RAID configuration of a previously used OSD has been wiped and the RAID is then rebuilt.

To state the conclusion first: the OSD server has to be rebooted.

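The cause appears to be that disk information left over from the OSD as it was before reconfiguration conflicts with the newly rebuilt OSD disk in the ceph farm. (Part of the error output reads "You should reboot now before making further changes.")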

—————————————————————————————————-

Caution: invalid backup GPT header, but valid main header; regenerating
[osd-no2][WARNIN] backup header from main header.
[osd-no2][WARNIN]
[osd-no2][WARNIN] Warning! Main and backup partition tables differ! Use the 'c' and 'e' options
[osd-no2][WARNIN] on the recovery & transformation menu to examine the two tables.
[osd-no2][WARNIN]
[osd-no2][WARNIN] Warning! One or more CRCs don't match. You should repair the disk!

—————————————————————————————————-

[osd-no2][WARNIN] Caution: invalid backup GPT header, but valid main header; regenerating
[osd-no2][WARNIN] backup header from main header.
[osd-no2][WARNIN]
[osd-no2][WARNIN] Invalid partition data!
[osd-no2][DEBUG ] GPT data structures destroyed! You may now partition the disk using fdisk or
[osd-no2][DEBUG ] other utilities.
[osd-no2][WARNIN] ceph-disk: Error: Command '['/sbin/sgdisk', '--zap-all', '--', '/dev/sdb']' returned non-zero exit status 2
[osd-no2][ERROR ] RuntimeError: command returned non-zero exit status: 1
[ceph_deploy][ERROR ] RuntimeError: Failed to execute command: /usr/sbin/ceph-disk zap /dev/sdb

—————————————————————————————————-

[osd-no2][WARNIN] No data was received after 300 seconds, disconnecting…
[ceph_deploy.osd][DEBUG ] Calling partprobe on zapped device /dev/sda
[osd-no2][INFO ] Running command: partprobe /dev/sda
[osd-no2][WARNIN] Error: Partition(s) 1 on /dev/sda have been written, but we have been unable to inform the kernel of the change, probably because it/they are in use. As a result, the old partition(s) will remain in use. You should reboot now before making further changes.
[osd-no2][ERROR ] RuntimeError: command returned non-zero exit status: 1
[ceph_deploy][ERROR ] RuntimeError: Failed to execute command: partprobe /dev/sda
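
The partprobe message says the kernel could not be told about the new partition table. One way to check this on the OSD host is to look at /proc/partitions, which shows the partitions the running kernel currently knows about. The following is a minimal sketch only: kernel_partitions is a hypothetical helper name (not part of ceph or ceph-deploy), and its optional second argument exists just so the function can be tried against a saved copy of the table.

```shell
#!/bin/sh
# Hypothetical helper: list the partitions the running kernel still knows
# for one device, read from /proc/partitions (columns: major, minor,
# #blocks, name). Not part of ceph-deploy.
kernel_partitions() {
    dev=$(basename "$1")              # /dev/sda -> sda
    table=${2:-/proc/partitions}      # second argument is only for testing
    # Print names such as sda1 (or nvme0n1p1) that belong to the device.
    awk -v d="$dev" '$4 ~ ("^" d "p?[0-9]+$") { print $4 }' "$table"
}

# Example: any output after a zap means the kernel still holds old partitions.
[ -r /proc/partitions ] && kernel_partitions /dev/sda
```

If this still prints partition names after the zap, the kernel has not picked up the change, which matches the "You should reboot now" advice in the log.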

After rebooting that OSD server, run the OSD-addition step ceph-deploy disk zap [hostname]:[device] once more.
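
The reboot-and-retry procedure could be scripted from the admin node roughly as follows. This is a sketch under assumptions, not a tested procedure. The host osd-no2 and device /dev/sdb are taken from the logs above. The run and reboot_and_rezap helpers, the DRY_RUN switch, and the 60-second wait are made up for illustration. It also assumes the admin node can ssh to the OSD host with passwordless sudo, and that the installed ceph-deploy accepts the [hostname]:[device] form used in this post. With DRY_RUN=1 (the default here) the script only prints the commands.

```shell
#!/bin/sh
# Sketch: reboot an OSD host, wait for it to come back, then retry the zap.
# DRY_RUN=1 (default) prints the commands instead of executing them.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

reboot_and_rezap() {
    host=$1
    dev=$2
    # 1. Reboot so the kernel drops the old partition table.
    run ssh "$host" sudo reboot
    # 2. Give the host time to go down, then poll until ssh answers again.
    run sleep 60
    until run ssh -o ConnectTimeout=5 "$host" true; do
        run sleep 10
    done
    # 3. Retry the zap that failed before the reboot.
    run ceph-deploy disk zap "$host:$dev"
}

reboot_and_rezap osd-no2 /dev/sdb
```

Setting DRY_RUN=0 would execute the commands for real, so check the host and device names first: zapping destroys the partition table on the named disk.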
