Problem Description
In a Hadoop cluster that uses ADLS as a data source, a PUT operation (uploading a file to ADLS) fails with a 400 error: put: Operation failed: "An HTTP header that's mandatory for this request is not specified.", 400
Enabling debug output produces a detailed log.
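One common way to get this level of detail from the Hadoop shell is to raise the root logger to DEBUG for the current session. The post does not say how debug output was enabled here, so the snippet below is an assumption about the setup rather than a record of what was actually run:

```shell
# Assumption: debug logging is switched on per shell session through the
# HADOOP_ROOT_LOGGER environment variable read by the hadoop launcher scripts.
export HADOOP_ROOT_LOGGER=DEBUG,console

# Re-run the failing command to capture the detailed log.
# It needs a configured Hadoop client, so it is left commented out here:
# hadoop fs -put a.txt abfs://adsl@xxxxxxxxxxxxx.blob.core.chinacloudapi.cn/test/a.txt
```

Unsetting the variable (or opening a new shell) returns the client to its normal log level.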
Error message text:
[hdfs@hadoop001 ~]$ hadoop fs -put a.txt abfs://adsl@xxxxxxxxxxxxx.blob.core.chinacloudapi.cn/test/a.txt
22/07/13 15:46:05 DEBUG util.Shell: setsid exited with exit code 0
22/07/13 15:46:05 DEBUG conf.Configuration: parsing URL jar:file:/usr/hdp/3.1.4.0-315/hadoop/hadoop-common-3.1.1.3.1.4.0-315.jar!/core-default.xml
22/07/13 15:46:05 DEBUG conf.Configuration: parsing input stream sun.net.www.protocol.jar.JarURLConnection$JarURLInputStream@4fe3c938
22/07/13 15:46:05 DEBUG conf.Configuration: parsing URL file:/etc/hadoop/3.1.4.0-315/0/core-site.xml
22/07/13 15:46:05 DEBUG conf.Configuration: parsing input stream java.io.BufferedInputStream@467aecef
22/07/13 15:46:05 DEBUG security.SecurityUtil: Setting hadoop.security.token.service.use_ip to true
22/07/13 15:46:05 DEBUG security.Groups: Creating new Groups object
22/07/13 15:46:05 DEBUG util.NativeCodeLoader: Trying to load the custom-built native-hadoop library...
22/07/13 15:46:05 DEBUG util.NativeCodeLoader: Loaded the native-hadoop library
22/07/13 15:46:05 DEBUG security.JniBasedUnixGroupsMapping: Using JniBasedUnixGroupsMapping for Group resolution
22/07/13 15:46:05 DEBUG security.JniBasedUnixGroupsMappingWithFallback: Group mapping impl=org.apache.hadoop.security.JniBasedUnixGroupsMapping
22/07/13 15:46:05 DEBUG security.Groups: Group mapping impl=org.apache.hadoop.security.JniBasedUnixGroupsMappingWithFallback; cacheTimeout=300000; warningDeltaMs=5000
22/07/13 15:46:06 DEBUG core.Tracer: sampler.classes = ; loaded no samplers
22/07/13 15:46:06 DEBUG core.Tracer: span.receiver.classes = ; loaded no span receivers
22/07/13 15:46:06 DEBUG security.UserGroupInformation: hadoop login
22/07/13 15:46:06 DEBUG security.UserGroupInformation: hadoop login commit
22/07/13 15:46:06 DEBUG security.UserGroupInformation: using local user:UnixPrincipal: hdfs
22/07/13 15:46:06 DEBUG security.UserGroupInformation: Using user: "UnixPrincipal: hdfs" with name hdfs
22/07/13 15:46:06 DEBUG security.UserGroupInformation: User entry: "hdfs"
22/07/13 15:46:06 DEBUG security.UserGroupInformation: UGI loginUser:hdfs (auth:SIMPLE)
22/07/13 15:46:06 DEBUG core.Tracer: sampler.classes = ; loaded no samplers
22/07/13 15:46:06 DEBUG core.Tracer: span.receiver.classes = ; loaded no span receivers
22/07/13 15:46:06 DEBUG fs.FileSystem: Loading filesystems
22/07/13 15:46:06 DEBUG fs.FileSystem: file:// = class org.apache.hadoop.fs.LocalFileSystem from /usr/hdp/3.1.4.0-315/hadoop/hadoop-common-3.1.1.3.1.4.0-315.jar
22/07/13 15:46:06 DEBUG fs.FileSystem: viewfs:// = class org.apache.hadoop.fs.viewfs.ViewFileSystem from /usr/hdp/3.1.4.0-315/hadoop/hadoop-common-3.1.1.3.1.4.0-315.jar
22/07/13 15:46:06 DEBUG fs.FileSystem: har:// = class org.apache.hadoop.fs.HarFileSystem from /usr/hdp/3.1.4.0-315/hadoop/hadoop-common-3.1.1.3.1.4.0-315.jar
22/07/13 15:46:06 DEBUG fs.FileSystem: http:// = class org.apache.hadoop.fs.http.HttpFileSystem from /usr/hdp/3.1.4.0-315/hadoop/hadoop-common-3.1.1.3.1.4.0-315.jar
22/07/13 15:46:06 DEBUG fs.FileSystem: https:// = class org.apache.hadoop.fs.http.HttpsFileSystem from /usr/hdp/3.1.4.0-315/hadoop/hadoop-common-3.1.1.3.1.4.0-315.jar
22/07/13 15:46:06 DEBUG fs.FileSystem: hdfs:// = class org.apache.hadoop.hdfs.DistributedFileSystem from /usr/hdp/3.1.4.0-315/hadoop-hdfs/hadoop-hdfs-client-3.1.1.3.1.4.0-315.jar
22/07/13 15:46:06 DEBUG fs.FileSystem: webhdfs:// = class org.apache.hadoop.hdfs.web.WebHdfsFileSystem from /usr/hdp/3.1.4.0-315/hadoop-hdfs/hadoop-hdfs-client-3.1.1.3.1.4.0-315.jar
22/07/13 15:46:06 DEBUG fs.FileSystem: swebhdfs:// = class org.apache.hadoop.hdfs.web.SWebHdfsFileSystem from /usr/hdp/3.1.4.0-315/hadoop-hdfs/hadoop-hdfs-client-3.1.1.3.1.4.0-315.jar
22/07/13 15:46:06 DEBUG fs.FileSystem: gs:// = class com.google.cloud.hadoop.fs.gcs.GoogleHadoopFileSystem from /usr/hdp/3.1.4.0-315/hadoop-mapreduce/gcs-connector-1.9.10.3.1.4.0-315-shaded.jar
22/07/13 15:46:06 DEBUG fs.FileSystem: s3n:// = class org.apache.hadoop.fs.s3native.NativeS3FileSystem from /usr/hdp/3.1.4.0-315/hadoop-mapreduce/hadoop-aws-3.1.1.3.1.4.0-315.jar
22/07/13 15:46:06 DEBUG fs.FileSystem: Looking for FS supporting abfs
22/07/13 15:46:06 DEBUG fs.FileSystem: looking for configuration option fs.abfs.impl
22/07/13 15:46:06 DEBUG fs.FileSystem: Filesystem abfs defined in configuration option
22/07/13 15:46:06 DEBUG fs.FileSystem: FS for abfs is class org.apache.hadoop.fs.azurebfs.AzureBlobFileSystem
22/07/13 15:46:06 DEBUG azurebfs.AzureBlobFileSystem: Initializing AzureBlobFileSystem for abfs://adsl@xxxxxxxxxxxxx.blob.core.chinacloudapi.cn/test/a.txt
22/07/13 15:46:06 DEBUG security.Groups: GroupCacheLoader - load.
22/07/13 15:46:06 WARN utils.SSLSocketFactoryEx: Failed to load OpenSSL. Falling back to the JSSE default.
22/07/13 15:46:06 DEBUG utils.SSLSocketFactoryEx: Removed Cipher - TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384
22/07/13 15:46:06 DEBUG utils.SSLSocketFactoryEx: Removed Cipher - TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256
22/07/13 15:46:06 DEBUG utils.SSLSocketFactoryEx: Removed Cipher - TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384
22/07/13 15:46:06 DEBUG utils.SSLSocketFactoryEx: Removed Cipher - TLS_RSA_WITH_AES_256_GCM_SHA384
22/07/13 15:46:06 DEBUG utils.SSLSocketFactoryEx: Removed Cipher - TLS_ECDH_ECDSA_WITH_AES_256_GCM_SHA384
22/07/13 15:46:06 DEBUG utils.SSLSocketFactoryEx: Removed Cipher - TLS_ECDH_RSA_WITH_AES_256_GCM_SHA384
22/07/13 15:46:06 DEBUG utils.SSLSocketFactoryEx: Removed Cipher - TLS_DHE_RSA_WITH_AES_256_GCM_SHA384
22/07/13 15:46:06 DEBUG utils.SSLSocketFactoryEx: Removed Cipher - TLS_DHE_DSS_WITH_AES_256_GCM_SHA384
22/07/13 15:46:06 DEBUG utils.SSLSocketFactoryEx: Removed Cipher - TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256
22/07/13 15:46:06 DEBUG utils.SSLSocketFactoryEx: Removed Cipher - TLS_RSA_WITH_AES_128_GCM_SHA256
22/07/13 15:46:06 DEBUG utils.SSLSocketFactoryEx: Removed Cipher - TLS_ECDH_ECDSA_WITH_AES_128_GCM_SHA256
22/07/13 15:46:06 DEBUG utils.SSLSocketFactoryEx: Removed Cipher - TLS_ECDH_RSA_WITH_AES_128_GCM_SHA256
22/07/13 15:46:06 DEBUG utils.SSLSocketFactoryEx: Removed Cipher - TLS_DHE_RSA_WITH_AES_128_GCM_SHA256
22/07/13 15:46:06 DEBUG utils.SSLSocketFactoryEx: Removed Cipher - TLS_DHE_DSS_WITH_AES_128_GCM_SHA256
22/07/13 15:46:06 DEBUG services.AbfsClientThrottlingIntercept: Client-side throttling is enabled for the ABFS file system.
22/07/13 15:46:06 DEBUG azurebfs.AzureBlobFileSystem: AzureBlobFileSystem.getFileStatus path: abfs://adsl@xxxxxxxxxxxxx.blob.core.chinacloudapi.cn/test/a.txt
22/07/13 15:46:06 DEBUG azurebfs.AzureBlobFileSystem: ABFS authorizer is not initialized. No authorization check will be performed.
22/07/13 15:46:06 DEBUG azurebfs.AzureBlobFileSystemStore: Get root ACL status
22/07/13 15:46:06 DEBUG oauth2.AccessTokenProvider: AADToken: no token. Returning expiring=true
22/07/13 15:46:06 DEBUG oauth2.AccessTokenProvider: AAD Token is missing or expired: Calling refresh-token from abstract base class
22/07/13 15:46:06 DEBUG oauth2.AccessTokenProvider: AADToken: refreshing client-credential based token
22/07/13 15:46:06 DEBUG oauth2.AzureADAuthenticator: AADToken: starting to fetch token using client creds for client ID 0392543e-5eab-4de2-881b-9bd8a9fe9deb
22/07/13 15:46:06 DEBUG oauth2.AzureADAuthenticator: Requesting an OAuth token by POST to https://login.partner.microsoftonline.cn/fc54511d-de79-4bae-bfc9-3a42945d1b27/oauth2/token
22/07/13 15:46:06 DEBUG services.AbfsIoUtils: Request Headers
22/07/13 15:46:06 DEBUG services.AbfsIoUtils: Connection=close
22/07/13 15:46:06 DEBUG oauth2.AzureADAuthenticator: Response 200
22/07/13 15:46:06 DEBUG services.AbfsIoUtils: Response Headers
22/07/13 15:46:06 DEBUG services.AbfsIoUtils: HTTP Response=HTTP/1.1 200 OK
22/07/13 15:46:06 DEBUG services.AbfsIoUtils: x-ms-ests-server=2.1.13156.10 - CNN2LR1 ProdSlices
22/07/13 15:46:06 DEBUG services.AbfsIoUtils: X-Content-Type-Options=nosniff
22/07/13 15:46:06 DEBUG services.AbfsIoUtils: Connection=close
22/07/13 15:46:06 DEBUG services.AbfsIoUtils: Pragma=no-cache
22/07/13 15:46:06 DEBUG services.AbfsIoUtils: P3P=CP="DSP CUR OTPi IND OTRi ONL FIN"
22/07/13 15:46:06 DEBUG services.AbfsIoUtils: Date=Wed, 13 Jul 2022 07:46:06 GMT
22/07/13 15:46:06 DEBUG services.AbfsIoUtils: Strict-Transport-Security=max-age=31536000; includeSubDomains
22/07/13 15:46:06 DEBUG services.AbfsIoUtils: Cache-Control=no-store, no-cache
22/07/13 15:46:06 DEBUG services.AbfsIoUtils: Set-Cookie=*cookie info*
22/07/13 15:46:06 DEBUG services.AbfsIoUtils: Expires=-1
22/07/13 15:46:06 DEBUG services.AbfsIoUtils: Content-Length=1427
22/07/13 15:46:06 DEBUG services.AbfsIoUtils: X-XSS-Protection=0
22/07/13 15:46:06 DEBUG services.AbfsIoUtils: x-ms-request-id=b63779e3-ec7a-4d78-a950-fc5cd47b2f01
22/07/13 15:46:06 DEBUG services.AbfsIoUtils: Content-Type=application/json; charset=utf-8
22/07/13 15:46:06 DEBUG oauth2.AzureADAuthenticator: AADToken: fetched token with expiry Wed Jul 13 16:46:05 CST 2022
22/07/13 15:46:06 DEBUG services.AbfsClient: Authenticating request with OAuth2 access token
22/07/13 15:46:06 DEBUG services.AbfsIoUtils: Request Headers
22/07/13 15:46:06 DEBUG services.AbfsIoUtils: Accept-Charset=utf-8
22/07/13 15:46:06 DEBUG services.AbfsIoUtils: x-ms-version=2018-11-09
22/07/13 15:46:06 DEBUG services.AbfsIoUtils: Accept=application/json, application/octet-stream
22/07/13 15:46:06 DEBUG services.AbfsIoUtils: User-Agent=Azure Blob FS/3.1.1.3.1.4.0-315 (JavaJRE 1.8.0_232; Linux 3.10.0-862.el7.x86_64; SunJSSE-1.8) User-Agent: APN/1.0 Hortonworks/1.0 HDP/3.1.4.0-315
22/07/13 15:46:06 DEBUG services.AbfsIoUtils: x-ms-client-request-id=14467eed-4c13-4e36-9a5d-35603fe87d0a
22/07/13 15:46:06 DEBUG services.AbfsIoUtils: Content-Type=
22/07/13 15:46:07 DEBUG services.AbfsIoUtils: Response Headers
22/07/13 15:46:07 DEBUG services.AbfsIoUtils: HTTP Response=HTTP/1.1 200 OK
22/07/13 15:46:07 DEBUG services.AbfsIoUtils: x-ms-lease-status=unlocked
22/07/13 15:46:07 DEBUG services.AbfsIoUtils: x-ms-version=2018-11-09
22/07/13 15:46:07 DEBUG services.AbfsIoUtils: Server=Windows-Azure-Blob/1.0 Microsoft-HTTPAPI/2.0
22/07/13 15:46:07 DEBUG services.AbfsIoUtils: x-ms-lease-state=available
22/07/13 15:46:07 DEBUG services.AbfsIoUtils: Last-Modified=Mon, 11 Jul 2022 08:15:08 GMT
22/07/13 15:46:07 DEBUG services.AbfsIoUtils: Date=Wed, 13 Jul 2022 07:46:06 GMT
22/07/13 15:46:07 DEBUG services.AbfsIoUtils: x-ms-blob-type=BlockBlob
22/07/13 15:46:07 DEBUG services.AbfsIoUtils: Accept-Ranges=bytes
22/07/13 15:46:07 DEBUG services.AbfsIoUtils: x-ms-server-encrypted=true
22/07/13 15:46:07 DEBUG services.AbfsIoUtils: x-ms-access-tier-inferred=true
22/07/13 15:46:07 DEBUG services.AbfsIoUtils: x-ms-meta-hdi_isfolder=true
22/07/13 15:46:07 DEBUG services.AbfsIoUtils: x-ms-access-tier=Hot
22/07/13 15:46:07 DEBUG services.AbfsIoUtils: ETag="0x8DA631578216222"
22/07/13 15:46:07 DEBUG services.AbfsIoUtils: x-ms-creation-time=Mon, 11 Jul 2022 08:15:08 GMT
22/07/13 15:46:07 DEBUG services.AbfsIoUtils: Content-Length=0
22/07/13 15:46:07 DEBUG services.AbfsIoUtils: x-ms-request-id=4a211e55-f01e-0058-388c-9679fc000000
22/07/13 15:46:07 DEBUG services.AbfsClient: HttpRequest: 200,,cid=14467eed-4c13-4e36-9a5d-35603fe87d0a,rid=4a211e55-f01e-0058-388c-9679fc000000,sent=0,recv=0,HEAD,https://xxxxxxxxxxxxx.blob.core.chinacloudapi.cn/adsl//?upn=false&action=getAccessControl&timeout=90
22/07/13 15:46:07 DEBUG azurebfs.AzureBlobFileSystemStore: getFileStatus filesystem: adsl path: abfs://adsl@xxxxxxxxxxxxx.blob.core.chinacloudapi.cn/test/a.txt isNamespaceEnabled: true
22/07/13 15:46:07 DEBUG services.AbfsClient: Authenticating request with OAuth2 access token
22/07/13 15:46:07 DEBUG services.AbfsIoUtils: Request Headers
22/07/13 15:46:07 DEBUG services.AbfsIoUtils: Accept-Charset=utf-8
22/07/13 15:46:07 DEBUG services.AbfsIoUtils: x-ms-version=2018-11-09
22/07/13 15:46:07 DEBUG services.AbfsIoUtils: Accept=application/json, application/octet-stream
22/07/13 15:46:07 DEBUG services.AbfsIoUtils: User-Agent=Azure Blob FS/3.1.1.3.1.4.0-315 (JavaJRE 1.8.0_232; Linux 3.10.0-862.el7.x86_64; SunJSSE-1.8) User-Agent: APN/1.0 Hortonworks/1.0 HDP/3.1.4.0-315
22/07/13 15:46:07 DEBUG services.AbfsIoUtils: x-ms-client-request-id=756615aa-bed9-4487-88ae-b69f859f0b51
22/07/13 15:46:07 DEBUG services.AbfsIoUtils: Content-Type=
22/07/13 15:46:07 DEBUG services.AbfsIoUtils: Response Headers
22/07/13 15:46:07 DEBUG services.AbfsIoUtils: Transfer-Encoding=chunked
22/07/13 15:46:07 DEBUG services.AbfsIoUtils: HTTP Response=HTTP/1.1 404 The specified blob does not exist.
22/07/13 15:46:07 DEBUG services.AbfsIoUtils: x-ms-version=2018-11-09
22/07/13 15:46:07 DEBUG services.AbfsIoUtils: Server=Windows-Azure-Blob/1.0 Microsoft-HTTPAPI/2.0
22/07/13 15:46:07 DEBUG services.AbfsIoUtils: x-ms-error-code=BlobNotFound
22/07/13 15:46:07 DEBUG services.AbfsIoUtils: x-ms-request-id=4a211e7e-f01e-0058-5d8c-9679fc000000
22/07/13 15:46:07 DEBUG services.AbfsIoUtils: Date=Wed, 13 Jul 2022 07:46:06 GMT
22/07/13 15:46:07 DEBUG services.AbfsClient: HttpRequest: 404,,cid=756615aa-bed9-4487-88ae-b69f859f0b51,rid=4a211e7e-f01e-0058-5d8c-9679fc000000,sent=0,recv=0,HEAD,https://xxxxxxxxxxxxx.blob.core.chinacloudapi.cn/adsl/test/a.txt?upn=false&timeout=90
22/07/13 15:46:07 DEBUG fs.FileSystem: Looking for FS supporting file
22/07/13 15:46:07 DEBUG fs.FileSystem: looking for configuration option fs.file.impl
22/07/13 15:46:07 DEBUG fs.FileSystem: Looking in service filesystems for implementation class
22/07/13 15:46:07 DEBUG fs.FileSystem: FS for file is class org.apache.hadoop.fs.LocalFileSystem
22/07/13 15:46:07 DEBUG azurebfs.AzureBlobFileSystem: AzureBlobFileSystem.getFileStatus path: abfs://adsl@xxxxxxxxxxxxx.blob.core.chinacloudapi.cn/test
22/07/13 15:46:07 DEBUG azurebfs.AzureBlobFileSystem: ABFS authorizer is not initialized. No authorization check will be performed.
22/07/13 15:46:07 DEBUG azurebfs.AzureBlobFileSystemStore: getFileStatus filesystem: adsl path: abfs://adsl@xxxxxxxxxxxxx.blob.core.chinacloudapi.cn/test isNamespaceEnabled: true
22/07/13 15:46:07 DEBUG services.AbfsClient: Authenticating request with OAuth2 access token
22/07/13 15:46:07 DEBUG services.AbfsIoUtils: Request Headers
22/07/13 15:46:07 DEBUG services.AbfsIoUtils: Accept-Charset=utf-8
22/07/13 15:46:07 DEBUG services.AbfsIoUtils: x-ms-version=2018-11-09
22/07/13 15:46:07 DEBUG services.AbfsIoUtils: Accept=application/json, application/octet-stream
22/07/13 15:46:07 DEBUG services.AbfsIoUtils: User-Agent=Azure Blob FS/3.1.1.3.1.4.0-315 (JavaJRE 1.8.0_232; Linux 3.10.0-862.el7.x86_64; SunJSSE-1.8) User-Agent: APN/1.0 Hortonworks/1.0 HDP/3.1.4.0-315
22/07/13 15:46:07 DEBUG services.AbfsIoUtils: x-ms-client-request-id=b48f18e8-ba8e-4a44-956f-5ef889b828e5
22/07/13 15:46:07 DEBUG services.AbfsIoUtils: Content-Type=
22/07/13 15:46:07 DEBUG services.AbfsIoUtils: Response Headers
22/07/13 15:46:07 DEBUG services.AbfsIoUtils: HTTP Response=HTTP/1.1 200 OK
22/07/13 15:46:07 DEBUG services.AbfsIoUtils: x-ms-lease-status=unlocked
22/07/13 15:46:07 DEBUG services.AbfsIoUtils: x-ms-version=2018-11-09
22/07/13 15:46:07 DEBUG services.AbfsIoUtils: Server=Windows-Azure-Blob/1.0 Microsoft-HTTPAPI/2.0
22/07/13 15:46:07 DEBUG services.AbfsIoUtils: x-ms-lease-state=available
22/07/13 15:46:07 DEBUG services.AbfsIoUtils: Last-Modified=Tue, 12 Jul 2022 10:03:44 GMT
22/07/13 15:46:07 DEBUG services.AbfsIoUtils: Date=Wed, 13 Jul 2022 07:46:06 GMT
22/07/13 15:46:07 DEBUG services.AbfsIoUtils: x-ms-blob-type=BlockBlob
22/07/13 15:46:07 DEBUG services.AbfsIoUtils: Accept-Ranges=bytes
22/07/13 15:46:07 DEBUG services.AbfsIoUtils: x-ms-server-encrypted=true
22/07/13 15:46:07 DEBUG services.AbfsIoUtils: x-ms-access-tier-inferred=true
22/07/13 15:46:07 DEBUG services.AbfsIoUtils: x-ms-meta-hdi_isfolder=true
22/07/13 15:46:07 DEBUG services.AbfsIoUtils: x-ms-access-tier=Hot
22/07/13 15:46:07 DEBUG services.AbfsIoUtils: Cache-Control=max-age=0
22/07/13 15:46:07 DEBUG services.AbfsIoUtils: ETag="0x8DA63EDCE5D3F3C"
22/07/13 15:46:07 DEBUG services.AbfsIoUtils: x-ms-creation-time=Tue, 12 Jul 2022 10:03:44 GMT
22/07/13 15:46:07 DEBUG services.AbfsIoUtils: Content-Length=0
22/07/13 15:46:07 DEBUG services.AbfsIoUtils: x-ms-request-id=4a211e93-f01e-0058-718c-9679fc000000
22/07/13 15:46:07 DEBUG services.AbfsIoUtils: Content-Type=application/octet-stream
22/07/13 15:46:07 DEBUG services.AbfsClient: HttpRequest: 200,,cid=b48f18e8-ba8e-4a44-956f-5ef889b828e5,rid=4a211e93-f01e-0058-718c-9679fc000000,sent=0,recv=0,HEAD,https://xxxxxxxxxxxxx.blob.core.chinacloudapi.cn/adsl/test?upn=false&timeout=90
22/07/13 15:46:07 DEBUG azurebfs.AzureBlobFileSystem: AzureBlobFileSystem.getFileStatus path: abfs://adsl@xxxxxxxxxxxxx.blob.core.chinacloudapi.cn/test/a.txt._COPYING_
22/07/13 15:46:07 DEBUG azurebfs.AzureBlobFileSystem: ABFS authorizer is not initialized. No authorization check will be performed.
22/07/13 15:46:07 DEBUG azurebfs.AzureBlobFileSystemStore: getFileStatus filesystem: adsl path: abfs://adsl@xxxxxxxxxxxxx.blob.core.chinacloudapi.cn/test/a.txt._COPYING_ isNamespaceEnabled: true
22/07/13 15:46:07 DEBUG services.AbfsClient: Authenticating request with OAuth2 access token
22/07/13 15:46:07 DEBUG services.AbfsIoUtils: Request Headers
22/07/13 15:46:07 DEBUG services.AbfsIoUtils: Accept-Charset=utf-8
22/07/13 15:46:07 DEBUG services.AbfsIoUtils: x-ms-version=2018-11-09
22/07/13 15:46:07 DEBUG services.AbfsIoUtils: Accept=application/json, application/octet-stream
22/07/13 15:46:07 DEBUG services.AbfsIoUtils: User-Agent=Azure Blob FS/3.1.1.3.1.4.0-315 (JavaJRE 1.8.0_232; Linux 3.10.0-862.el7.x86_64; SunJSSE-1.8) User-Agent: APN/1.0 Hortonworks/1.0 HDP/3.1.4.0-315
22/07/13 15:46:07 DEBUG services.AbfsIoUtils: x-ms-client-request-id=8a6491b6-e13e-4d4e-be3b-8d183c727442
22/07/13 15:46:07 DEBUG services.AbfsIoUtils: Content-Type=
22/07/13 15:46:07 DEBUG services.AbfsIoUtils: Response Headers
22/07/13 15:46:07 DEBUG services.AbfsIoUtils: Transfer-Encoding=chunked
22/07/13 15:46:07 DEBUG services.AbfsIoUtils: HTTP Response=HTTP/1.1 404 The specified blob does not exist.
22/07/13 15:46:07 DEBUG services.AbfsIoUtils: x-ms-version=2018-11-09
22/07/13 15:46:07 DEBUG services.AbfsIoUtils: Server=Windows-Azure-Blob/1.0 Microsoft-HTTPAPI/2.0
22/07/13 15:46:07 DEBUG services.AbfsIoUtils: x-ms-error-code=BlobNotFound
22/07/13 15:46:07 DEBUG services.AbfsIoUtils: x-ms-request-id=4a211ea5-f01e-0058-7f8c-9679fc000000
22/07/13 15:46:07 DEBUG services.AbfsIoUtils: Date=Wed, 13 Jul 2022 07:46:06 GMT
22/07/13 15:46:07 DEBUG services.AbfsClient: HttpRequest: 404,,cid=8a6491b6-e13e-4d4e-be3b-8d183c727442,rid=4a211ea5-f01e-0058-7f8c-9679fc000000,sent=0,recv=0,HEAD,https://xxxxxxxxxxxxx.blob.core.chinacloudapi.cn/adsl/test/a.txt._COPYING_?upn=false&timeout=90
22/07/13 15:46:07 DEBUG azurebfs.AzureBlobFileSystem: AzureBlobFileSystem.create path: abfs://adsl@xxxxxxxxxxxxx.blob.core.chinacloudapi.cn/test/a.txt._COPYING_ permission: { masked: rw-r--r--, unmasked: rw-rw-rw- } overwrite: true bufferSize: 33554432
22/07/13 15:46:07 DEBUG azurebfs.AzureBlobFileSystem: ABFS authorizer is not initialized. No authorization check will be performed.
22/07/13 15:46:07 DEBUG azurebfs.AzureBlobFileSystemStore: createFile filesystem: adsl path: abfs://adsl@xxxxxxxxxxxxx.blob.core.chinacloudapi.cn/test/a.txt._COPYING_ overwrite: true permission: { masked: rw-r--r--, unmasked: rw-rw-rw- } umask: ----w--w- isNamespaceEnabled: true
22/07/13 15:46:07 DEBUG services.AbfsClient: Authenticating request with OAuth2 access token
22/07/13 15:46:07 DEBUG services.AbfsIoUtils: Request Headers
22/07/13 15:46:07 DEBUG services.AbfsIoUtils: x-ms-umask=0022
22/07/13 15:46:07 DEBUG services.AbfsIoUtils: Accept-Charset=utf-8
22/07/13 15:46:07 DEBUG services.AbfsIoUtils: x-ms-version=2018-11-09
22/07/13 15:46:07 DEBUG services.AbfsIoUtils: Accept=application/json, application/octet-stream
22/07/13 15:46:07 DEBUG services.AbfsIoUtils: User-Agent=Azure Blob FS/3.1.1.3.1.4.0-315 (JavaJRE 1.8.0_232; Linux 3.10.0-862.el7.x86_64; SunJSSE-1.8) User-Agent: APN/1.0 Hortonworks/1.0 HDP/3.1.4.0-315
22/07/13 15:46:07 DEBUG services.AbfsIoUtils: x-ms-permissions=0644
22/07/13 15:46:07 DEBUG services.AbfsIoUtils: x-ms-client-request-id=bf71e98f-886d-4529-b62b-7898c655fadf
22/07/13 15:46:07 DEBUG services.AbfsIoUtils: Content-Type=
22/07/13 15:46:07 DEBUG services.AbfsIoUtils: Response Headers
22/07/13 15:46:07 DEBUG services.AbfsIoUtils: HTTP Response=HTTP/1.1 400 An HTTP header that's mandatory for this request is not specified.
22/07/13 15:46:07 DEBUG services.AbfsIoUtils: x-ms-version=2018-11-09
22/07/13 15:46:07 DEBUG services.AbfsIoUtils: Server=Windows-Azure-Blob/1.0 Microsoft-HTTPAPI/2.0
22/07/13 15:46:07 DEBUG services.AbfsIoUtils: x-ms-error-code=MissingRequiredHeader
22/07/13 15:46:07 DEBUG services.AbfsIoUtils: Content-Length=301
22/07/13 15:46:07 DEBUG services.AbfsIoUtils: x-ms-request-id=4a211eae-f01e-0058-088c-9679fc000000
22/07/13 15:46:07 DEBUG services.AbfsIoUtils: Date=Wed, 13 Jul 2022 07:46:06 GMT
22/07/13 15:46:07 DEBUG services.AbfsIoUtils: Content-Type=application/xml
22/07/13 15:46:07 DEBUG services.AbfsHttpOperation: ExpectedError: org.codehaus.jackson.JsonParseException: Unexpected character ('<' (code 60)): expected a valid value (number, String, array, object, 'true', 'false' or 'null')
 at [Source: sun.net.www.protocol.http.HttpURLConnection$HttpInputStream@38145825; line: 1, column: 5]
    at org.codehaus.jackson.JsonParser._constructError(JsonParser.java:1433)
    at org.codehaus.jackson.impl.JsonParserMinimalBase._reportError(JsonParserMinimalBase.java:521)
    at org.codehaus.jackson.impl.JsonParserMinimalBase._reportUnexpectedChar(JsonParserMinimalBase.java:442)
    at org.codehaus.jackson.impl.Utf8StreamParser._handleUnexpectedValue(Utf8StreamParser.java:2090)
    at org.codehaus.jackson.impl.Utf8StreamParser._nextTokenNotInObject(Utf8StreamParser.java:606)
    at org.codehaus.jackson.impl.Utf8StreamParser.nextToken(Utf8StreamParser.java:492)
    at org.apache.hadoop.fs.azurebfs.services.AbfsHttpOperation.processStorageErrorResponse(AbfsHttpOperation.java:379)
    at org.apache.hadoop.fs.azurebfs.services.AbfsHttpOperation.processResponse(AbfsHttpOperation.java:285)
    at org.apache.hadoop.fs.azurebfs.services.AbfsRestOperation.executeHttpOperation(AbfsRestOperation.java:172)
    at org.apache.hadoop.fs.azurebfs.services.AbfsRestOperation.execute(AbfsRestOperation.java:125)
    at org.apache.hadoop.fs.azurebfs.services.AbfsClient.createPath(AbfsClient.java:254)
    at org.apache.hadoop.fs.azurebfs.AzureBlobFileSystemStore.createFile(AzureBlobFileSystemStore.java:342)
    at org.apache.hadoop.fs.azurebfs.AzureBlobFileSystem.create(AzureBlobFileSystem.java:189)
    at org.apache.hadoop.fs.FilterFileSystem.create(FilterFileSystem.java:193)
    at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:1118)
    at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:1098)
    at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:987)
    at org.apache.hadoop.fs.shell.CommandWithDestination$TargetFileSystem.create(CommandWithDestination.java:521)
    at org.apache.hadoop.fs.shell.CommandWithDestination$TargetFileSystem.writeStreamToFile(CommandWithDestination.java:485)
    at org.apache.hadoop.fs.shell.CommandWithDestination.copyStreamToTarget(CommandWithDestination.java:408)
    at org.apache.hadoop.fs.shell.CommandWithDestination.copyFileToTarget(CommandWithDestination.java:343)
    at org.apache.hadoop.fs.shell.CommandWithDestination.processPath(CommandWithDestination.java:278)
    at org.apache.hadoop.fs.shell.CommandWithDestination.processPath(CommandWithDestination.java:263)
    at org.apache.hadoop.fs.shell.Command.processPathInternal(Command.java:367)
    at org.apache.hadoop.fs.shell.Command.processPaths(Command.java:331)
    at org.apache.hadoop.fs.shell.Command.processPathArgument(Command.java:304)
    at org.apache.hadoop.fs.shell.CommandWithDestination.processPathArgument(CommandWithDestination.java:258)
    at org.apache.hadoop.fs.shell.Command.processArgument(Command.java:286)
    at org.apache.hadoop.fs.shell.Command.processArguments(Command.java:270)
    at org.apache.hadoop.fs.shell.CommandWithDestination.processArguments(CommandWithDestination.java:229)
    at org.apache.hadoop.fs.shell.CopyCommands$Put.processArguments(CopyCommands.java:295)
    at org.apache.hadoop.fs.shell.FsCommand.processRawArguments(FsCommand.java:120)
    at org.apache.hadoop.fs.shell.Command.run(Command.java:177)
    at org.apache.hadoop.fs.FsShell.run(FsShell.java:328)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:90)
    at org.apache.hadoop.fs.FsShell.main(FsShell.java:391)
22/07/13 15:46:07 DEBUG services.AbfsClient: HttpRequest: 400,,cid=bf71e98f-886d-4529-b62b-7898c655fadf,rid=4a211eae-f01e-0058-088c-9679fc000000,sent=0,recv=301,PUT,https://xxxxxxxxxxxxx.blob.core.chinacloudapi.cn/adsl/test/a.txt._COPYING_?resource=file&timeout=90
22/07/13 15:46:07 DEBUG azurebfs.AzureBlobFileSystem: AzureBlobFileSystem.getFileStatus path: abfs://adsl@xxxxxxxxxxxxx.blob.core.chinacloudapi.cn/test/a.txt._COPYING_
22/07/13 15:46:07 DEBUG azurebfs.AzureBlobFileSystem: ABFS authorizer is not initialized. No authorization check will be performed.
22/07/13 15:46:07 DEBUG azurebfs.AzureBlobFileSystemStore: getFileStatus filesystem: adsl path: abfs://adsl@xxxxxxxxxxxxx.blob.core.chinacloudapi.cn/test/a.txt._COPYING_ isNamespaceEnabled: true
22/07/13 15:46:07 DEBUG services.AbfsClient: Authenticating request with OAuth2 access token
22/07/13 15:46:07 DEBUG services.AbfsIoUtils: Request Headers
22/07/13 15:46:07 DEBUG services.AbfsIoUtils: Accept-Charset=utf-8
22/07/13 15:46:07 DEBUG services.AbfsIoUtils: x-ms-version=2018-11-09
22/07/13 15:46:07 DEBUG services.AbfsIoUtils: Accept=application/json, application/octet-stream
22/07/13 15:46:07 DEBUG services.AbfsIoUtils: User-Agent=Azure Blob FS/3.1.1.3.1.4.0-315 (JavaJRE 1.8.0_232; Linux 3.10.0-862.el7.x86_64; SunJSSE-1.8) User-Agent: APN/1.0 Hortonworks/1.0 HDP/3.1.4.0-315
22/07/13 15:46:07 DEBUG services.AbfsIoUtils: x-ms-client-request-id=4aeb3b46-587d-489c-a246-5d6d668a84a5
22/07/13 15:46:07 DEBUG services.AbfsIoUtils: Content-Type=
22/07/13 15:46:07 DEBUG services.AbfsIoUtils: Response Headers
22/07/13 15:46:07 DEBUG services.AbfsIoUtils: Transfer-Encoding=chunked
22/07/13 15:46:07 DEBUG services.AbfsIoUtils: HTTP Response=HTTP/1.1 404 The specified blob does not exist.
22/07/13 15:46:07 DEBUG services.AbfsIoUtils: x-ms-version=2018-11-09
22/07/13 15:46:07 DEBUG services.AbfsIoUtils: Server=Windows-Azure-Blob/1.0 Microsoft-HTTPAPI/2.0
22/07/13 15:46:07 DEBUG services.AbfsIoUtils: x-ms-error-code=BlobNotFound
22/07/13 15:46:07 DEBUG services.AbfsIoUtils: x-ms-request-id=4a211eb7-f01e-0058-108c-9679fc000000
22/07/13 15:46:07 DEBUG services.AbfsIoUtils: Date=Wed, 13 Jul 2022 07:46:06 GMT
22/07/13 15:46:07 DEBUG services.AbfsClient: HttpRequest: 404,,cid=4aeb3b46-587d-489c-a246-5d6d668a84a5,rid=4a211eb7-f01e-0058-108c-9679fc000000,sent=0,recv=0,HEAD,https://xxxxxxxxxxxxx.blob.core.chinacloudapi.cn/adsl/test/a.txt._COPYING_?upn=false&timeout=90
put: Operation failed: "An HTTP header that's mandatory for this request is not specified.", 400, PUT, https://xxxxxxxxxxxxx.blob.core.chinacloudapi.cn/adsl/test/a.txt._COPYING_?resource=file&timeout=90, , ""
22/07/13 15:46:07 DEBUG azurebfs.AzureBlobFileSystem: AzureBlobFileSystem.close
22/07/13 15:46:07 DEBUG util.ShutdownHookManager: Completed shutdown in 0.004 seconds; Timeouts: 0
22/07/13 15:46:07 DEBUG util.ShutdownHookManager: ShutdownHookManger completed shutdown.
Solution
The PUT command executed in Hadoop looks like this:
./hadoop fs -put a.txt abfs://yourcontainername@youradlsname.blob.core.chinacloudapi.cn/test.txt
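Under the hood, however, the command still sends REST API requests to operate on ADLS resources, so the Put Blob API documentation is the place to look: https://docs.microsoft.com/en-us/rest/api/storageservices/put-blob#request-headers-all-blob-types
The request headers listed there include x-ms-version, x-ms-blob-type, x-ms-lease-id, Authorization, x-ms-date, and Content-Length, among others (x-ms-lease-id is required only when the blob has an active lease). In the Hadoop log, however, the failing PUT request carries x-ms-version (2018-11-09) but no x-ms-blob-type.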
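The comparison can be sketched in a few lines of shell. The header names below are copied from the Request Headers entries logged for the failing PUT in the debug output (the log does not print the Authorization header); the helper name has_header is purely illustrative:

```shell
# Header names logged for the failing PUT request (taken from the debug log).
logged_headers="x-ms-umask Accept-Charset x-ms-version Accept User-Agent x-ms-permissions x-ms-client-request-id Content-Type"

# Hypothetical helper: succeeds if the given header name is in the logged list.
has_header() {
  case " $logged_headers " in
    *" $1 "*) return 0 ;;
    *) return 1 ;;
  esac
}

# Check the two Put Blob headers discussed above.
for h in x-ms-version x-ms-blob-type; do
  if has_header "$h"; then
    echo "$h: present"
  else
    echo "$h: MISSING"
  fi
done
# prints:
# x-ms-version: present
# x-ms-blob-type: MISSING
```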
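Based on this finding, we reproduced the same error with Postman.
That identifies the immediate cause, but how can it be fixed in Hadoop? Why would commands such as -put and -ls all run into a missing HTTP header? Given how the Hadoop + ADLS combination is designed, a defect this severe could not have gone unfixed.
Recall that ADLS Gen2 is designed for big data workloads and exposes its own dedicated endpoint: regular Blob operations use youradlsname.blob.core.chinacloudapi.cn, while ADLS operations use youradlsname.dfs.core.chinacloudapi.cn.
Could it be that we used the wrong endpoint in the command?
Comparing the regular Blob PUT operation with the ADLS Create File PUT operation in the REST API documentation shows that the ADLS PUT operation does not list x-ms-version or x-ms-blob-type as mandatory headers at all.
Based on this, we changed blob to dfs in the Hadoop put command and tested again. The problem was solved.
The lesson from this error: when using ADLS for big data workloads (such as Hadoop or Databricks), always use the ADLS-specific endpoint: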
xxxxxxx.dfs.core.chinacloudapi.cn
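As a minimal sketch of the fix, the snippet below rewrites the Blob endpoint in an abfs URI to the dfs endpoint before uploading. The function name to_dfs_uri is hypothetical, and yourcontainername and youradlsname are the same placeholders used in the command earlier:

```shell
# Hypothetical helper: replace the ".blob.core." part of the host in an
# abfs:// or abfss:// URI with ".dfs.core."; other URIs pass through unchanged.
to_dfs_uri() {
  printf '%s\n' "$1" | sed -E 's#^(abfss?://[^/]*)\.blob\.core\.#\1.dfs.core.#'
}

uri="abfs://yourcontainername@youradlsname.blob.core.chinacloudapi.cn/test.txt"
fixed="$(to_dfs_uri "$uri")"
echo "$fixed"
# prints: abfs://yourcontainername@youradlsname.dfs.core.chinacloudapi.cn/test.txt

# Upload with the corrected URI. This needs a configured Hadoop client,
# so it is left commented out here:
# hadoop fs -put a.txt "$fixed"
```

In practice it is simpler to type the dfs endpoint directly; the helper only makes the one-word difference between the failing and the working URI explicit.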
References
Filesystem - Create: https://docs.microsoft.com/en-us/rest/api/storageservices/datalakestoragegen2/filesystem/create
Put Blob: https://docs.microsoft.com/en-us/rest/api/storageservices/put-blob#request-headers-all-blob-types
[END]