OCP 4.2.2: clocksource error when repairing a system configuration item in the host standardization check

Posted by Linux运维-Friend on 2024-06-23

Applicable version:

OCP Community Edition: 4.2.2-20240315150922

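Background

  • After a host is taken over by OCP, the "set clock source" item never succeeds during host standardization.

  • The problem persists after running the automatic repair.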
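The check item concerns the kernel clock source, which Linux exposes under sysfs. As a minimal sketch for seeing what a host is using right now, the snippet below reads the active and available clock sources. The function name `show_clocksource` and the `CS_DIR` variable are illustrative and not part of OCP.

```shell
#!/usr/bin/env bash
# Directory where the kernel exposes clock source information.
CS_DIR="${CS_DIR:-/sys/devices/system/clocksource/clocksource0}"

# Print the active and the available clock sources.
# Prints "unknown" for a file that cannot be read (for example in a container).
show_clocksource() {
    local dir="${1:-$CS_DIR}"
    local name
    for name in current_clocksource available_clocksource; do
        if [ -r "$dir/$name" ]; then
            printf '%s: %s\n' "$name" "$(cat "$dir/$name")"
        else
            printf '%s: unknown\n' "$name"
        fi
    done
}

show_clocksource
```

If `tsc` does not appear in the available list, writing it to `current_clocksource` is not expected to take effect on that host.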

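Analysis

  • The official OCP 4.2 documentation has relevant information on this check item.

  • After running the commands it describes and checking the file again, tsc had still not been written.

  • Ran the check again.

  • Ran the automatic repair again and got the same error, which shows that the manual setting had not taken effect.

  • Checked the error log: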

Bash
2024-05-10 14:44:37.552 INFO 823423 --- [pool-manual-subtask-executor16,82ea1ce829564495,4c251a8e816d] c.o.o.e.internal.template.HttpTemplate : POST request to agent, url:http://10.186.61.51:62888/api/v1/system/setClockSource, request body:SetClockSourceRequest(sourceType=tsc), params:null

2024-05-10 14:44:37.565 ERROR 823423 --- [pool-manual-subtask-executor16,82ea1ce829564495,4c251a8e816d] c.o.o.c.c.i.r.methods.RepairClockSource : set clock source to tsc failed: [AgentClient]:http request is failed, response:Unexpected error: symlink /usr/lib/systemd/system/set_clocksource.service /etc/systemd/system/multi-user.target.wants/set_clocksource.service: file exists

2024-05-10 14:44:37.586 ERROR 823423 --- [pool-manual-subtask-executor16,82ea1ce829564495,4c251a8e816d] c.o.o.c.c.i.h.SystemCheckerHelperImpl : Failed to repair 277. Please see the log for details

2024-05-10 14:44:37.592 ERROR 823423 --- [pool-manual-subtask-executor16,82ea1ce829564495,4c251a8e816d] c.o.ocp.core.util.ExceptionUtils : Checked Exception: com.oceanbase.ocp.core.exception.UnexpectedException occurred with code error.common.unexpected, and args [4]

2024-05-10 14:44:37.597 ERROR 823423 --- [pool-manual-subtask-executor16,82ea1ce829564495,4c251a8e816d] c.o.o.c.t.e.c.w.subtask.SubtaskExecutor : An unknown error has occurred. Cause: 4. Error message: {1}. Contact the administrator.

com.oceanbase.ocp.core.exception.UnexpectedException: [OCP UnexpectedException]: status=500 INTERNAL_SERVER_ERROR, errorCode=COMMON_UNEXPECTED, args=4
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at com.oceanbase.ocp.core.util.ExceptionUtils.newException(ExceptionUtils.java:96)
at com.oceanbase.ocp.core.util.ExceptionUtils.throwException(ExceptionUtils.java:90)
at com.oceanbase.ocp.core.util.ExceptionUtils.unExpected(ExceptionUtils.java:71)
at com.oceanbase.ocp.compute.checker.internal.task.RepairCheckItemTask.run(RepairCheckItemTask.java:59)
at com.oceanbase.ocp.core.task.engine.runner.JavaSubtaskRunner.execute(JavaSubtaskRunner.java:64)
at com.oceanbase.ocp.core.task.engine.runner.JavaSubtaskRunner.doRun(JavaSubtaskRunner.java:32)
at com.oceanbase.ocp.core.task.engine.runner.JavaSubtaskRunner.run(JavaSubtaskRunner.java:26)
at com.oceanbase.ocp.core.task.engine.runner.RunnerFactory.doRun(RunnerFactory.java:76)
at com.oceanbase.ocp.core.task.engine.coordinator.worker.subtask.SubtaskExecutor.doRun(SubtaskExecutor.java:203)
at com.oceanbase.ocp.core.task.engine.coordinator.worker.subtask.SubtaskExecutor.redirectConsoleOutput(SubtaskExecutor.java:197)
at com.oceanbase.ocp.core.task.engine.coordinator.worker.subtask.SubtaskExecutor.lambda$submit$2(SubtaskExecutor.java:134)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:750)


Set state for subtask: 2609, operation:EXECUTE, state: DISREGARDED

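  • Checking /usr/lib/systemd/system and /etc/systemd/system/multi-user.target.wants/ shows that the symlink is already in place, which means the service has already been enabled to start at boot through systemd.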
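The symlink check described above can be sketched as a small helper. It only reads state and changes nothing. The function name `link_state` is illustrative, and the default path is the one reported in the error log.

```shell
#!/usr/bin/env bash
# Report whether a path is a symlink, some other existing file, or missing.
# $1: path to inspect (defaults to the link named in the error log).
link_state() {
    local path="${1:-/etc/systemd/system/multi-user.target.wants/set_clocksource.service}"
    if [ -L "$path" ]; then
        # -L is true even when the link target no longer exists.
        echo "symlink -> $(readlink "$path")"
    elif [ -e "$path" ]; then
        echo "exists (not a symlink)"
    else
        echo "missing"
    fi
}

link_state
```

On the affected host this would be expected to report a symlink pointing at /usr/lib/systemd/system/set_clocksource.service, matching the two paths in the log message.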

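Conclusion

Bash
When OCP takes over a host, it has already enabled set_clocksource.service by creating a symlink under /etc/systemd/system/multi-user.target.wants/. The automatic repair tries to create that same symlink again, and because it already exists the call fails with "file exists".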
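The failure mode is easy to reproduce outside OCP. The sketch below uses plain `ln -s` in a scratch directory, not the agent's own code, to show that creating a symlink at a path that is already taken fails. The function name `demo_symlink_conflict` and the directory layout are made up for the demonstration.

```shell
#!/usr/bin/env bash
# Reproduce the "file exists" failure in a scratch directory (no root needed).
demo_symlink_conflict() {
    local dir
    dir=$(mktemp -d)
    touch "$dir/set_clocksource.service"   # stand-in for the unit file
    mkdir "$dir/wants"                     # stand-in for multi-user.target.wants

    # First attempt: the link does not exist yet, so this succeeds.
    ln -s "$dir/set_clocksource.service" "$dir/wants/set_clocksource.service" \
        && echo "first: ok"

    # Second attempt: the link is already there, so ln refuses to overwrite it.
    if ln -s "$dir/set_clocksource.service" "$dir/wants/set_clocksource.service" 2>/dev/null; then
        echo "second: ok"
    else
        echo "second: failed"
    fi
    rm -rf "$dir"
}

demo_symlink_conflict
```

The second call fails for the same reason the repair does: nothing removes or replaces the existing link before the new one is created.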

Solution

Bash
# As concluded above, the service is already enabled on the host, so the symlink already exists when the automatic repair runs:
[root@localhost multi-user.target.wants]# systemctl list-unit-files | egrep set_clocksource.service
set_clocksource.service enabled
[root@localhost multi-user.target.wants]#

# Fix:
# rename /etc/systemd/system/multi-user.target.wants/set_clocksource.service
mv /etc/systemd/system/multi-user.target.wants/set_clocksource.service /etc/systemd/system/multi-user.target.wants/set_clocksource.service.bak

  • Run the repair again from the OCP web console. It creates the same symlink again, and the error is resolved.
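The rename step can be wrapped in a small guard so that it is safe to run more than once. This is a sketch under the assumption that the only obstacle is the existing link. The function name `backup_wants_link` is illustrative, and the default path is the one from the error log.

```shell
#!/usr/bin/env bash
# Move a conflicting link aside so that a later repair can recreate it.
# $1: path of the link (defaults to the path from the error log).
backup_wants_link() {
    local link="${1:-/etc/systemd/system/multi-user.target.wants/set_clocksource.service}"
    # -L also matches a dangling symlink, which -e alone would miss.
    if [ -e "$link" ] || [ -L "$link" ]; then
        mv -- "$link" "$link.bak" && echo "moved to $link.bak"
    else
        echo "nothing to move"
    fi
}
```

Run `backup_wants_link` as root on the affected host, trigger the repair from the OCP console, and then confirm the result with `systemctl is-enabled set_clocksource.service`. Once the repair has succeeded, the `.bak` copy is presumably no longer needed.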

Additional note:

Bash
# On hosts deployed with OAT, the setting is written to /etc/rc.local:
[root@10-186-57-25 ~]# cat /etc/rc.local
#!/bin/bash
# THIS FILE IS ADDED FOR COMPATIBILITY PURPOSES
#
# It is highly advisable to create own systemd services or udev rules
# to run scripts during boot instead of using this file.
#
# In contrast to previous versions due to parallel execution during boot
# this script will NOT be run after all other services.
#
# Please note that you must run 'chmod +x /etc/rc.d/rc.local' to ensure
# that this script will be executed during boot.

touch /var/lock/subsys/local
/usr/local/bin/set_deadline.sh
echo never > /sys/kernel/mm/transparent_hugepage/enabled
/usr/local/sbin/set_nic_irq_ob.sh start
echo tsc > /sys/devices/system/clocksource/clocksource0/current_clocksource
/usr/local/bin/auto_start_ob.sh >> /var/log/ob.autostart.log 2>&1 &
/usr/local/bin/set_cpufreq.sh
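To tell which mechanism a given host uses, it can help to check whether rc.local carries the clock source line. The sketch below prints any uncommented line that writes to `current_clocksource`. The function name `rc_local_clocksource` is illustrative.

```shell
#!/usr/bin/env bash
# Print the uncommented lines of an rc.local-style file that mention current_clocksource.
# $1: file to scan (defaults to /etc/rc.local).
rc_local_clocksource() {
    local file="${1:-/etc/rc.local}"
    [ -r "$file" ] || { echo "not readable: $file"; return 0; }
    # [^#]* keeps lines whose match is not preceded by a comment marker.
    grep -E '^[^#]*current_clocksource' "$file" || echo "no clocksource line in $file"
}

rc_local_clocksource
```

As the comments in the file itself point out, /etc/rc.d/rc.local must be executable for these lines to run at boot.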
