[Solved] [HiveCatalog] Kerberos GSS initiate failed, No valid credentials provided, Cannot read from System.in

Posted by 一杯半盏 on 2024-09-06

Problem description

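While deploying a Java application that connects to Hive, I ran into this Kerberos error. It took a full day to track down, so I am writing it up here.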

Symptoms

  • Kerberos GSS initiate failed
  • No valid credentials provided (Mechanism level: Attempt to obtain new INITIATE credentials failed! (null))
  • Cannot read from System.in
javax.security.sasl.SaslException: GSS initiate failed
        at com.sun.security.sasl.gsskerb.GssKrb5Client.evaluateChallenge(GssKrb5Client.java:211) ~[na:1.8.0_351]
        at org.apache.thrift.transport.TSaslClientTransport.handleSaslStartMessage(TSaslClientTransport.java:94) [hive-exec-1.1.0-cdh5.12.1-slankka.jar:1.1.0-cdh5.12.1]       
        at org.apache.thrift.transport.TSaslTransport.open(TSaslTransport.java:271) ~[hive-exec-1.1.0-cdh5.12.1-slankka.jar:1.1.0-cdh5.12.1]
        at .....
        at java.lang.Thread.run(Thread.java:750) ~[na:1.8.0_351]
Caused by: org.ietf.jgss.GSSException: No valid credentials provided (Mechanism level: Attempt to obtain new INITIATE credentials failed! (null))
        at sun.security.jgss.krb5.Krb5InitCredential.getTgt(Krb5InitCredential.java:386) ~[na:1.8.0_351]
        ... 44 common frames omitted
Caused by: javax.security.auth.login.LoginException: Cannot read from System.in
        at com.sun.security.auth.module.Krb5LoginModule.promptForName(Krb5LoginModule.java:871) ~[na:1.8.0_351]
        at com.sun.security.auth.module.Krb5LoginModule.attemptAuthentication(Krb5LoginModule.java:710) ~[na:1.8.0_351]
        at com.sun.security.auth.module.Krb5LoginModule.login(Krb5LoginModule.java:617) ~[na:1.8.0_351]

Troubleshooting

Enable Kerberos debug output:

-Dsun.security.krb5.debug=true
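
As a rough sketch of how the flag can be passed on the JVM command line: the snippet below only prints the resulting command rather than running it. The jar path is borrowed from the startup script later in this post, the helper name krb5_debug_opts is hypothetical, and sun.security.jgss.debug is an extra flag not used in the original investigation that traces the GSS-API layer as well.

```shell
#!/bin/bash
# Sketch: assemble JVM debug flags for Kerberos troubleshooting.
# APP_JAR is an assumed path; replace it with the real application jar.
APP_JAR="${APP_JAR:-/apps/springboot-app.jar}"

krb5_debug_opts() {
  # sun.security.krb5.debug is the flag used in this post;
  # sun.security.jgss.debug (an addition) also traces the GSS-API layer.
  echo "-Dsun.security.krb5.debug=true -Dsun.security.jgss.debug=true"
}

# Print the command instead of running it, so the sketch is safe to try.
echo "java $(krb5_debug_opts) -jar $APP_JAR"
```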

Key information

>>>KinitOptions cache name is /opt/userdata/krb5cache/0/krb5cc_0.jSzu-aKO
>>>KinitOptions cache name is /opt/userdata/krb5cache/0/krb5cc_0.jSzu-aKO
>>>KinitOptions cache name is /opt/userdata/krb5cache/0/krb5cc_0.jSzu-aKO
>>>KinitOptions cache name is /opt/userdata/krb5cache/0/krb5cc_0.jSzu-aKO
>>>KinitOptions cache name is /opt/userdata/krb5cache/0/krb5cc_0.jSzu-aKO
>>>KinitOptions cache name is /opt/userdata/krb5cache/0/krb5cc_0.jSzu-aKO

Web search

https://bugs.openjdk.org/browse/JDK-6832353
https://community.spiceworks.com/t/pam-keeps-setting-the-krb5ccname-env-variable/940232
https://linux.die.net/man/5/pam_krb5

Direct cause

The KRB5CCNAME environment variable had been changed, so it no longer matched the actual credential cache.
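
A quick way to confirm this kind of mismatch is to check whether the cache named by KRB5CCNAME exists in the environment the application actually starts in. The sketch below is only an illustration: check_ccache and ccache_path are hypothetical helper names, and it handles file-based caches only (a plain path or a FILE: prefix), not other cache types.

```shell
#!/bin/bash
# Sketch: report whether the credential cache named by KRB5CCNAME exists.
# Only file-based caches are handled; other cache types are reported as missing.

ccache_path() {
  # Strip an optional "FILE:" prefix from a KRB5CCNAME-style value.
  local name="$1"
  echo "${name#FILE:}"
}

check_ccache() {
  local name="$1"
  if [ -z "$name" ]; then
    echo "unset"
    return 1
  fi
  local path
  path="$(ccache_path "$name")"
  if [ -f "$path" ]; then
    echo "ok: $path"
  else
    echo "missing: $path"
    return 1
  fi
}

# Run this as the same user, and in the same way, as the application starts.
check_ccache "${KRB5CCNAME:-}" || true
```

Running it once from an interactive login and once from inside the startup script shows whether the two environments agree on the cache location.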

Finding the root cause

The startup script launches the application with something like su - hue -c "springboot-app.jar start".

I had been bitten by this before: a shell started with su - hue carries hue's environment variables, while one started with su hue does not.

Final cause

The startup script had a bug:

#!/bin/bash

CURRENT_USER=$(whoami)
COMMAND='HADOOP_HOME=/opt/cloudera/parcels/CDH/lib/hadoop /apps/springboot-app.jar start'
if [ $CURRENT_USER=='hue' ]; then
  echo "1. executing as $CURRENT_USER"
  bash -c "$COMMAND"
elif [ $CURRENT_USER=='root' ]; then
  echo "2. executing as hue... from $CURRENT_USER"
  su - hue -c "$COMMAND"
else
  echo "permission denied."
fi

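It turned out the if [ ] test expression was wrong. Without spaces around ==, [ $CURRENT_USER=='hue' ] expands to a single non-empty word, which [ always treats as true, so the first branch always runs. The output was: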

1. executing as hue

So what actually ran was bash -c "$COMMAND", not su - hue -c "$COMMAND".
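
The behaviour is easy to reproduce in isolation. The sketch below compares the unspaced test with a correctly spaced and quoted one; is_hue_broken and is_hue_fixed are hypothetical names used only for this illustration.

```shell
#!/bin/bash
# Sketch: compare the broken test with a correctly spaced one.

is_hue_broken() {
  local user="$1"
  # No spaces around ==, so [ sees one non-empty word, e.g. root==hue.
  [ $user=='hue' ]
}

is_hue_fixed() {
  local user="$1"
  # Three arguments: left operand, operator, right operand.
  [ "$user" = "hue" ]
}

for u in hue root nobody; do
  if is_hue_broken "$u"; then broken=true; else broken=false; fi
  if is_hue_fixed "$u"; then fixed=true; else fixed=false; fi
  echo "$u broken=$broken fixed=$fixed"
done
```

The broken variant reports true for every user, while the fixed variant is true only for hue.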

After the fix, it correctly printed

2. executing as hue... from root
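
The corrected script itself is not shown in this post, so the following is only one possible version, assuming the fix was to quote the variable and put spaces around the comparison operator. The launch_mode and main helpers and the start argument guard are additions for this sketch, made so that the user check can be exercised without launching anything.

```shell
#!/bin/bash
# Sketch of a corrected launcher; not the author's actual fix.

COMMAND='HADOOP_HOME=/opt/cloudera/parcels/CDH/lib/hadoop /apps/springboot-app.jar start'

launch_mode() {
  # Decide how to launch based on the user name passed in.
  local user="$1"
  if [ "$user" = "hue" ]; then
    echo "direct"
  elif [ "$user" = "root" ]; then
    echo "su"
  else
    echo "denied"
  fi
}

main() {
  local current_user
  current_user="$(whoami)"
  case "$(launch_mode "$current_user")" in
    direct)
      echo "1. executing as $current_user"
      bash -c "$COMMAND"
      ;;
    su)
      echo "2. executing as hue... from $current_user"
      # Login shell, so hue's own environment (including KRB5CCNAME) applies.
      su - hue -c "$COMMAND"
      ;;
    *)
      echo "permission denied." >&2
      return 1
      ;;
  esac
}

# Launch only when called with "start", so the helpers can be tested alone.
if [ "${1:-}" = "start" ]; then
  main
fi
```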

Conclusion

At one point I suspected that su might not support Kerberos authentication, but that turned out not to be the problem.

As long as the KRB5CCNAME variable is set correctly, there is no problem.

The nastiest trap was Linux shell syntax. Be especially careful with code copied from ChatGPT.
