Experiment environment
- minio-8.0.10 http://192.168.137.100:32000/minio/bigdata/
- spark-operator-1.1.26
- spark-history-server 3.2.2 http://192.168.137.100:32627/
Test case
Case hudi-spark-test001
apiVersion: "sparkoperator.k8s.io/v1beta2"
kind: SparkApplication
metadata:
  name: hudi-spark-test001
  namespace: spark
spec:
  type: Scala
  mode: cluster
  image: "umr/spark:3.2.2_v2"
  imagePullPolicy: IfNotPresent
  mainClass: cc.hudi.HoodieSparkQuickstart
  mainApplicationFile: "s3a://bigdatas/jars/bigdataDemo-1.0-SNAPSHOT.jar"
  sparkVersion: "3.2.2"
  timeToLiveSeconds: 259200
  restartPolicy:
    type: Never
  volumes:
    - name: "test-volume"
      hostPath:
        path: "/tmp"
        type: Directory
  driver:
    cores: 1
    coreLimit: "1200m"
    memory: "512m"
    labels:
      version: 3.2.2
    serviceAccount: spark
    volumeMounts:
      - name: "test-volume"
        mountPath: "/tmp"
  executor:
    cores: 1
    instances: 1
    memory: "512m"
    labels:
      version: 3.2.2
    volumeMounts:
      - name: "test-volume"
        mountPath: "/tmp"
  sparkConf:
    spark.ui.port: "4045"
    spark.eventLog.enabled: "true"
    spark.eventLog.dir: "s3a://sparklogs/all"
    spark.hadoop.fs.s3a.access.key: "minio"
    spark.hadoop.fs.s3a.secret.key: "minio123"
    spark.hadoop.fs.s3a.impl: "org.apache.hadoop.fs.s3a.S3AFileSystem"
    spark.hadoop.fs.s3a.endpoint: "http://10.19.64.205:32000"
    spark.hadoop.fs.s3a.connection.ssl.enabled: "false"
    spark.hadoop.fs.s3a.path.style.access: "true"
    spark.hadoop.fs.s3a.aws.credentials.provider: "org.apache.hadoop.fs.s3a.SimpleAWSCredentialsProvider"
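The case can be submitted and checked with kubectl; a minimal sketch, assuming the manifest above is saved as hudi-spark-test001.yaml:

# submit the SparkApplication and watch its state
kubectl apply -f hudi-spark-test001.yaml
kubectl -n spark get sparkapplication hudi-spark-test001 -w

# follow the driver log once the operator has created the driver pod
kubectl -n spark logs -f hudi-spark-test001-driver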
Errors:
1. Insufficient ClusterRole permissions. The ClusterRole used by the spark-operator service account is missing the persistentvolumeclaims permission:
3/07/07 06:26:54 ERROR Utils: Uncaught exception in thread main
io.fabric8.kubernetes.client.KubernetesClientException: Failure executing: GET at: https://kubernetes.default.svc/api/v1/namespaces/spark-operator/persistentvolumeclaims?labelSelector=spark-app-selector%3Dspark-a9d7e8f78bc6459c9282db57a02815d9. Message: Forbidden!Configured service account doesn't have access. Service account may have been revoked. persistentvolumeclaims is forbidden: User "system:serviceaccount:spark-operator:spark-operator" cannot list resource "persistentvolumeclaims" in API group "" in the namespace "spark-operator".
Fix the permissions:
# partial example
rules:
  - apiGroups:
      - ""
    resources:
      - pods
    verbs:
      - '*'
  - apiGroups:
      - ""
    resources:
      - persistentvolumeclaims
    verbs:
      - '*'
  - apiGroups:
      - ""
    resources:
      - configmaps
    verbs:
      - '*'
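These rules belong to the ClusterRole bound to the spark-operator service account; one way to apply the change, assuming the role is named spark-operator (as created by the spark-operator Helm chart):

# check what the role currently grants, then add the persistentvolumeclaims rule
kubectl get clusterrole spark-operator -o yaml
kubectl edit clusterrole spark-operator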
2. Case hudi-spark-test001 error:
23/07/10 01:20:41 WARN SparkSession: Cannot use org.apache.spark.sql.hudi.HoodieSparkSessionExtension to configure session extensions.
java.lang.ClassNotFoundException: org.apache.spark.sql.hudi.HoodieSparkSessionExtension
at java.base/java.net.URLClassLoader.findClass(Unknown Source)
at java.base/java.lang.ClassLoader.loadClass(Unknown Source)
at java.base/java.lang.ClassLoader.loadClass(Unknown Source)
at java.base/java.lang.Class.forName0(Native Method)
at java.base/java.lang.Class.forName(Unknown Source)
at org.apache.spark.util.Utils$.classForName(Utils.scala:216)
at org.apache.spark.sql.SparkSession$.$anonfun$applyExtensions$1(SparkSession.scala:1194)
at org.apache.spark.sql.SparkSession$.$anonfun$applyExtensions$1$adapted(SparkSession.scala:1192)
at scala.collection.mutable.ResizableArray.foreach(ResizableArray.scala:62)
at scala.collection.mutable.ResizableArray.foreach$(ResizableArray.scala:55)
at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:49)
at org.apache.spark.sql.SparkSession$.org$apache$spark$sql$SparkSession$$applyExtensions(SparkSession.scala:1192)
at org.apache.spark.sql.SparkSession$Builder.getOrCreate(SparkSession.scala:956)
at cc.utils.HoodieExampleSparkUtils.buildSparkSession(HoodieExampleSparkUtils.java:60)
at cc.utils.HoodieExampleSparkUtils.defaultSparkSession(HoodieExampleSparkUtils.java:53)
at cc.hudi.HoodieSparkQuickstart.main(HoodieSparkQuickstart.java:39)
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
at java.base/java.lang.reflect.Method.invoke(Unknown Source)
at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:955)
at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:180)
at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:203)
at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:90)
at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:1043)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:1052)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
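The ClassNotFoundException means the Hudi Spark bundle jar is neither inside the umr/spark:3.2.2_v2 image nor shaded into bigdataDemo-1.0-SNAPSHOT.jar, so the driver cannot load HoodieSparkSessionExtension. One way to fix it is to ship the bundle with the application via spec.deps; a sketch, where the bundle path and version are assumptions (the jar must be uploaded to MinIO first):

spec:
  deps:
    jars:
      # hypothetical location: the Hudi Spark 3.2 bundle uploaded next to the application jar
      - "s3a://bigdatas/jars/hudi-spark3.2-bundle_2.12-0.12.3.jar"

Alternatively, the bundle can be baked into the Spark image under $SPARK_HOME/jars so it is on the classpath for every application.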
3. Error building the Hudi source from GitHub:
[ERROR] Failed to execute goal on project hudi-utilities_2.12: Could not resolve dependencies for project org.apache.hudi:hudi-utilities_2.12:jar:0.14.0-SNAPSHOT: The following artifacts could not be resolved: io.confluent:kafka-avro-serializer:jar:5.3.4, io.confluent:common-config:jar:5.3.4, io.confluent:common-utils:jar:5.3.4, io.confluent:kafka-schema-registry-client:jar:5.3.4: io.confluent:kafka-avro-serializer:jar:5.3.4 was not found in http://10.41.31.10:9081/repository/maven-public/ during a previous attempt. This failure was cached in the local repository and resolution is not reattempted until the update interval of chinaunicom has elapsed or updates are forced -> [Help 1]
[ERROR]
[ERROR] To see the full stack trace of the errors, re-run Maven with the -e switch.
[ERROR] Re-run Maven using the -X switch to enable full debug logging.
[ERROR]
[ERROR] For more information about the errors and possible solutions, please read the following articles:
[ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/DependencyResolutionException
[ERROR]
[ERROR] After correcting the problems, you can resume the build with the command
[ERROR] mvn <args> -rf :hudi-utilities_2.12
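The io.confluent artifacts are only published to Confluent's own Maven repository, which the internal maven-public mirror does not proxy, and the failed lookup was then cached locally. A sketch for ~/.m2/settings.xml, assuming the internal mirror is the chinaunicom entry seen in the log: stop it from intercepting the confluent repository and declare that repository explicitly:

<!-- settings.xml sketch: exclude the confluent repository from the internal mirror -->
<mirrors>
  <mirror>
    <id>chinaunicom</id>
    <url>http://10.41.31.10:9081/repository/maven-public/</url>
    <mirrorOf>*,!confluent</mirrorOf>
  </mirror>
</mirrors>
<profiles>
  <profile>
    <id>confluent-repo</id>
    <repositories>
      <repository>
        <id>confluent</id>
        <url>https://packages.confluent.io/maven/</url>
      </repository>
    </repositories>
  </profile>
</profiles>
<activeProfiles>
  <activeProfile>confluent-repo</activeProfile>
</activeProfiles>

Because the failure was cached, force re-resolution with -U, e.g. mvn clean package -DskipTests -U -rf :hudi-utilities_2.12 (or have the Nexus at 10.41.31.10 proxy https://packages.confluent.io/maven/ instead).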
References:
Hudi source build guide on GitHub:
Developer Setup | Apache Hudi
Searching issues in the community:
Apache Hudi - ASF JIRA