Apache Hadoop Documentation Translation, Part 2 (HDFS Commands Guide)

Published by Kooola大資料 on 2018-09-29

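Overview

All HDFS commands are invoked by the bin/hdfs script. Running the script without any arguments prints a description of all commands.

Usage: hdfs [SHELL_OPTIONS] COMMAND [GENERIC_OPTIONS] [COMMAND_OPTIONS]

Hadoop has an option parsing framework that handles parsing generic options as well as running classes.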

COMMAND_OPTIONS	Description
--config --loglevel	The common set of shell options. These are documented on the Commands Manual page.
GENERIC_OPTIONS	The common set of options supported by multiple commands. See the Hadoop Commands Manual for more information.
COMMAND COMMAND_OPTIONS	Various commands with their options are described in the following sections. The commands have been grouped into User Commands and Administration Commands.
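
To make the ordering in the usage line concrete, here is a minimal sketch that assembles one command line from its four parts. The helper name `build_hdfs_cmd`, the property value, and the file paths are assumptions chosen for illustration; `--loglevel` is one of the shell options listed above, and `-D property=value` is a generic option described in the Hadoop Commands Manual.

```shell
# Hypothetical helper: print an hdfs command line in the documented order
#   hdfs [SHELL_OPTIONS] COMMAND [GENERIC_OPTIONS] [COMMAND_OPTIONS]
build_hdfs_cmd() {
  shell_opts=$1     # options for the hdfs script itself
  subcommand=$2     # the command to run, e.g. dfs or fsck
  generic_opts=$3   # options shared by many commands
  command_opts=$4   # options specific to the chosen command
  echo "hdfs $shell_opts $subcommand $generic_opts $command_opts"
}

# Illustrative values only (assumptions, not taken from the guide).
build_hdfs_cmd "--loglevel DEBUG" "dfs" "-D dfs.replication=2" "-put notes.txt /tmp"
```

The helper only prints the assembled line, so it can be tried without a cluster; pasting the printed line into a shell on a Hadoop host would run it.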

User Commands

Commands useful for users of a Hadoop cluster.

classpath

Usage: hdfs classpath [--glob |--jar <path> |-h |--help]

COMMAND_OPTION	Description
--glob	expand wildcards
--jar path	write classpath as manifest in jar named path
-h, --help	print help

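Prints the class path needed to get the Hadoop jar and the required libraries. If called without arguments, it prints the classpath set up by the command scripts, which is likely to contain wildcards in the classpath entries. The other options print the classpath after wildcard expansion or write the classpath into the manifest of a jar file. The latter is useful in environments where wildcards cannot be used and the expanded classpath exceeds the maximum supported command line length.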
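
As a small illustration, the sketch below prints the expanded classpath one entry per line, which tends to be easier to read than a single colon-separated string. The helper name `split_classpath` and the sample value in the fallback branch are assumptions; the fallback only exists so the sketch runs on a machine without Hadoop.

```shell
# Hypothetical helper: split a colon-separated classpath into one entry per line.
split_classpath() {
  printf '%s\n' "$1" | tr ':' '\n'
}

if command -v hdfs >/dev/null 2>&1; then
  # --glob expands wildcard entries before the classpath is printed.
  split_classpath "$(hdfs classpath --glob)"
else
  # Sample value (an assumption) used when hdfs is not installed.
  split_classpath "/opt/hadoop/etc/hadoop:/opt/hadoop/share/hadoop/common/*"
fi
```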

dfs

Usage: hdfs dfs [COMMAND [COMMAND_OPTIONS]]

Runs a filesystem command on a file system supported by Hadoop. The available COMMAND_OPTIONS are listed in the File System Shell Guide.

fetchdt

Usage: hdfs fetchdt <opts> <token_file_path>

COMMAND_OPTION	Description
--webservice NN_Url	Url to contact NN on (starts with http or https)
--renewer name	Name of the delegation token renewer
--cancel	Cancel the delegation token
--renew	Renew the delegation token. Delegation token must have been fetched using the --renewer name option.
--print	Print the delegation token
token_file_path	File path to store the token into.

Gets a delegation token from a NameNode. See fetchdt for more info.

fsck

Usage:

   hdfs fsck <path>
          [-list-corruptfileblocks |
          [-move | -delete | -openforwrite]
          [-files [-blocks [-locations | -racks | -replicaDetails | -upgradedomains]]]]
          [-includeSnapshots]
          [-storagepolicies] [-maintenance] [-blockId <blk_Id>]

COMMAND_OPTION	Description
path	Start checking from this path.
-delete	Delete corrupted files.
-files	Print out files being checked.
-files -blocks	Print out the block report
-files -blocks -locations	Print out locations for every block.
-files -blocks -racks	Print out network topology for data-node locations.
-files -blocks -replicaDetails	Print out each replica details.
-files -blocks -upgradedomains	Print out upgrade domains for every block.
-includeSnapshots	Include snapshot data if the given path indicates a snapshottable directory or there are snapshottable directories under it.
-list-corruptfileblocks	Print out list of missing blocks and files they belong to.
-move	Move corrupted files to /lost+found.
-openforwrite	Print out files opened for write.
-storagepolicies	Print out storage policy summary for the blocks.
-maintenance	Print out maintenance state node details.
-blockId	Print out information about the block.

Runs the HDFS filesystem checking utility. See fsck for more info.
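
As a rough sketch of how the report might be used in a script, the helper below looks for the status in the summary that fsck prints at the end of its report. It assumes the summary contains the phrase `is HEALTHY` for a healthy path (as in `The filesystem under path '/' is HEALTHY`); the function name `fsck_is_healthy` and the path `/user/alice` are hypothetical.

```shell
# Hypothetical helper: succeed when an fsck report read from stdin
# contains the HEALTHY status (assumed wording of the summary line).
fsck_is_healthy() {
  grep -q 'is HEALTHY'
}

if command -v hdfs >/dev/null 2>&1; then
  # List files, their blocks, and block locations under an example path.
  if hdfs fsck /user/alice -files -blocks -locations | fsck_is_healthy; then
    echo "healthy"
  else
    echo "needs attention"
  fi
fi
```

Note that this sketch discards the detailed report; redirecting the fsck output to a file first would keep it for later inspection.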

getconf

Usage:

hdfs getconf -namenodes
hdfs getconf -secondaryNameNodes
hdfs getconf -backupNodes
hdfs getconf -includeFile
hdfs getconf -excludeFile
hdfs getconf -nnRpcAddresses
hdfs getconf -confKey [key]

COMMAND_OPTION	Description
-namenodes	gets list of namenodes in the cluster.
-secondaryNameNodes	gets list of secondary namenodes in the cluster.
-backupNodes	gets list of backup nodes in the cluster.
-includeFile	gets the include file path that defines the datanodes that can join the cluster.
-excludeFile	gets the exclude file path that defines the datanodes that need to be decommissioned.
-nnRpcAddresses	gets the namenode rpc addresses
-confKey [key]	gets a specific key from the configuration

Gets configuration information from the configuration directory, after post-processing.
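
A minimal sketch of reading a single key with `-confKey`. The helper name `conf_or_default` is hypothetical, and the fallback branch is an assumption added so the sketch also runs on a machine where the hdfs command is not installed.

```shell
# Hypothetical helper: print the value of a configuration key, or the
# given fallback when the hdfs command is not on the PATH.
conf_or_default() {
  key=$1
  fallback=$2
  if command -v hdfs >/dev/null 2>&1; then
    hdfs getconf -confKey "$key"
  else
    echo "$fallback"
  fi
}

# dfs.replication is a standard HDFS property; 3 is its usual default.
conf_or_default dfs.replication 3
```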

groups

Usage: hdfs groups [username ...]

Returns the group information for one or more given users.

lsSnapshottableDir

Usage: hdfs lsSnapshottableDir [-help]

COMMAND_OPTION	Description
-help	print help

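Gets the list of snapshottable directories. When run as a super user, it returns all snapshottable directories; otherwise it returns only those owned by the current user.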

jmxget

Usage: hdfs jmxget [-localVM ConnectorURL | -port port | -server mbeanserver | -service service]

COMMAND_OPTION	Description
-help	print help
-localVM ConnectorURL	connect to the VM on the same machine
-port mbean server port	specify mbean server port, if missing it will try to connect to MBean Server in the same VM
-server	specify mbean server (localhost by default)
-service NameNode|DataNode	specify jmx service. NameNode by default.

Dumps JMX information from a service.

oev

Usage: hdfs oev [OPTIONS] -i INPUT_FILE -o OUTPUT_FILE

Required command line arguments:

COMMAND_OPTION	Description
-i,--inputFile arg	edits file to process, xml (case insensitive) extension means XML format, any other filename means binary format
-o,--outputFile arg	Name of output file. If the specified file exists, it will be overwritten, format of the file is determined by -p option

Optional command line arguments:

COMMAND_OPTION	Description
-f,--fix-txids	Renumber the transaction IDs in the input, so that there are no gaps or invalid transaction IDs.
-h,--help	Display usage information and exit
-r,--recover	When reading binary edit logs, use recovery mode. This will give you the chance to skip corrupt parts of the edit log.
-p,--processor arg	Select which type of processor to apply against image file, currently supported processors are: binary (native binary format that Hadoop uses), xml (default, XML format), stats (prints statistics about edits file)
-v,--verbose	More verbose output, prints the input and output filenames, for processors that write to a file, also output to screen. On large image files this will dramatically increase processing time (default is false).

Hadoop offline edits viewer. See the Offline Edits Viewer Guide for more info.
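
The `-i` option picks the input format from the filename alone, which is easy to misread. The sketch below mirrors that documented rule in a small function; the function name `edits_input_format` and the sample filenames are hypothetical, and the tool's own detection is what actually applies.

```shell
# Mirror the documented rule for -i: an .xml extension (any letter case)
# means XML input; any other filename is treated as binary.
edits_input_format() {
  lower=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
  case $lower in
    *.xml) echo "xml" ;;
    *)     echo "binary" ;;
  esac
}

edits_input_format "edits.XML"          # prints: xml
edits_input_format "edits_inprogress"   # prints: binary

# A typical conversion (not executed here) has the form:
#   hdfs oev -i <edits file> -o <output file>.xml
```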

oiv

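Usage: hdfs oiv [OPTIONS] -i INPUT_FILE

Required command line arguments:

COMMAND_OPTION	Description
-i,--inputFile input file	Specify the input fsimage file (or XML file) to process.

Optional command line arguments: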

COMMAND_OPTION	Description
-o,--outputFile output file	Specify the output filename, if the specified output processor generates one. If the specified file already exists, it is silently overwritten. (output to stdout by default) If the input file is an XML file, it also creates an .md5.
-p,--processor processor	Specify the image processor to apply against the image file. Currently valid options are Web (default), XML, Delimited, FileDistribution and ReverseXML.
-addr address	Specify the address(host:port) to listen. (localhost:5978 by default). This option is used with Web processor.
-maxSize size	Specify the range [0, maxSize] of file sizes to be analyzed in bytes (128GB by default). This option is used with FileDistribution processor.
-step size	Specify the granularity of the distribution in bytes (2MB by default). This option is used with FileDistribution processor.
-format	Format the output result in a human-readable fashion rather than a number of bytes. (false by default). This option is used with FileDistribution processor.
-delimiter arg	Delimiting string to use with Delimited processor.
-t,--temp temporary dir	Use temporary dir to cache intermediate result to generate Delimited outputs. If not set, Delimited processor constructs the namespace in memory before outputting text.
-h,--help	Display the tool usage and help information and exit.

Hadoop offline image viewer for image files in Hadoop 2.4 or later. See the Offline Image Viewer Guide for more info.

oiv_legacy

Usage: hdfs oiv_legacy [OPTIONS] -i INPUT_FILE -o OUTPUT_FILE

COMMAND_OPTION	Description
-i,--inputFile input file	Specify the input fsimage file to process.
-o,--outputFile output file	Specify the output filename, if the specified output processor generates one. If the specified file already exists, it is silently overwritten

Optional command line arguments:

COMMAND_OPTION	Description
-p|--processor processor	Specify the image processor to apply against the image file. Valid options are Ls (default), XML, Delimited, Indented, FileDistribution and NameDistribution.
-maxSize size	Specify the range [0, maxSize] of file sizes to be analyzed in bytes (128GB by default). This option is used with FileDistribution processor.
-step size	Specify the granularity of the distribution in bytes (2MB by default). This option is used with FileDistribution processor.
-format	Format the output result in a human-readable fashion rather than a number of bytes. (false by default). This option is used with FileDistribution processor.
-skipBlocks	Do not enumerate individual blocks within files. This may save processing time and output file space on namespaces with very large files. The Ls processor reads the blocks to correctly determine file sizes and ignores this option.
-printToScreen	Pipe output of processor to console as well as specified file. On extremely large namespaces, this may increase processing time by an order of magnitude.
-delimiter arg	When used in conjunction with the Delimited processor, replaces the default tab delimiter with the string specified by arg.
-h|--help	Display the tool usage and help information and exit.

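Hadoop offline image viewer for image files from older versions of Hadoop. See the oiv_legacy Command section of the Offline Image Viewer Guide for more info.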

version

Usage: hdfs version

Prints the version.

Administration Commands

Commands useful for administrators of a Hadoop cluster.

balancer

Usage:

hdfs balancer
          [-policy <policy>]
          [-threshold <threshold>]
          [-exclude [-f <hosts-file> | <comma-separated list of hosts>]]
          [-include [-f <hosts-file> | <comma-separated list of hosts>]]
          [-source [-f <hosts-file> | <comma-separated list of hosts>]]
          [-blockpools <comma-separated list of blockpool ids>]
          [-idleiterations <idleiterations>]
          [-runDuringUpgrade]

COMMAND_OPTION	Description
-policy	datanode (default): Cluster is balanced if each datanode is balanced. blockpool: Cluster is balanced if each block pool in each datanode is balanced.
-threshold	Percentage of disk capacity. This overwrites the default threshold.
-exclude -f <hosts-file> | <comma-separated list of hosts>	Excludes the specified datanodes from being balanced by the balancer.
-include -f <hosts-file> | <comma-separated list of hosts>	Includes only the specified datanodes to be balanced by the balancer.
-source -f <hosts-file> | <comma-separated list of hosts>	Pick only the specified datanodes as source nodes.
-blockpools <comma-separated list of blockpool ids>	The balancer will only run on blockpools included in this list.
-idleiterations <iterations>	Maximum number of idle iterations before exit. This overwrites the default idleiterations(5).
-runDuringUpgrade	Whether to run the balancer during an ongoing HDFS upgrade. This is usually not desired since it will not affect used space on over-utilized machines.
-h|--help	Display the tool usage and help information and exit.

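Runs a cluster balancing utility. An administrator can simply press Ctrl-C to stop the rebalancing process. See Balancer for more details.

Note that the blockpool policy is stricter than the datanode policy.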
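
The threshold can be read as follows: under the default datanode policy, a node counts as balanced when its utilization differs from the average cluster utilization by no more than the threshold, in percentage points. The sketch below checks that condition with integer percentages. The function name `within_threshold` and the sample numbers are assumptions for illustration, and the real balancer works with fractional utilization values rather than rounded integers.

```shell
# Hypothetical helper: succeed when a node's utilization (in percent) is
# within `threshold` percentage points of the cluster average.
within_threshold() {
  node=$1
  cluster=$2
  threshold=$3
  diff=$((node - cluster))
  if [ "$diff" -lt 0 ]; then
    diff=$((0 - diff))
  fi
  [ "$diff" -le "$threshold" ]
}

if within_threshold 62 60 5; then echo "62% vs 60%: balanced"; fi
if ! within_threshold 80 60 5; then echo "80% vs 60%: not balanced"; fi

# A matching balancer run (not executed here) could look like:
#   hdfs balancer -threshold 5
```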

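Besides the above command options, a pinning feature was introduced starting from 2.7.0 to prevent certain replicas from being moved by the balancer/mover. This pinning feature is disabled by default and can be enabled with the configuration property "dfs.datanode.block-pinning.enabled". When enabled, it only affects blocks that are written to the favored nodes specified in the create() call. This feature is useful for applications such as the HBase regionserver, where we want to maintain data locality.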

cacheadmin

Usage:

hdfs cacheadmin [-addDirective -path <path> -pool <pool-name> [-force] [-replication <replication>] [-ttl <time-to-live>]]
hdfs cacheadmin [-modifyDirective -id <id> [-path <path>] [-force] [-replication <replication>] [-pool <pool-name>] [-ttl <time-to-live>]]
hdfs cacheadmin [-listDirectives [-stats] [-path <path>] [-pool <pool>] [-id <id>]]
hdfs cacheadmin [-removeDirective <id>]
hdfs cacheadmin [-removeDirectives -path <path>]
hdfs cacheadmin [-addPool <name> [-owner <owner>] [-group <group>] [-mode <mode>] [-limit <limit>] [-maxTtl <maxTtl>]]
hdfs cacheadmin [-modifyPool <name> [-owner <owner>] [-group <group>] [-mode <mode>] [-limit <limit>] [-maxTtl <maxTtl>]]
hdfs cacheadmin [-removePool <name>]
hdfs cacheadmin [-listPools [-stats] [<name>]]
hdfs cacheadmin [-help <command-name>]

See the HDFS Cache Administration Documentation for more information.

crypto

Usage:

hdfs crypto -createZone -keyName <keyName> -path <path>
hdfs crypto -listZones
hdfs crypto -provisionTrash -path <path>
hdfs crypto -help <command-name>

See the HDFS Transparent Encryption Documentation for more information.

datanode

Usage: hdfs datanode [-regular | -rollback | -rollingupgrade rollback]

COMMAND_OPTION	Description
-regular	Normal datanode startup (default).
-rollback	Rollback the datanode to the previous version. This should be used after stopping the datanode and distributing the old hadoop version.
-rollingupgrade rollback	Rollback a rolling upgrade operation.

Runs an HDFS datanode.

dfsadmin

Usage:

    hdfs dfsadmin [-report [-live] [-dead] [-decommissioning] [-enteringmaintenance] [-inmaintenance]]
    hdfs dfsadmin [-safemode enter | leave | get | wait | forceExit]
    hdfs dfsadmin [-saveNamespace]
    hdfs dfsadmin [-rollEdits]
    hdfs dfsadmin [-restoreFailedStorage true |false |check]
    hdfs dfsadmin [-refreshNodes]
    hdfs dfsadmin [-setQuota <quota> <dirname>...<dirname>]
    hdfs dfsadmin [-clrQuota <dirname>...<dirname>]
    hdfs dfsadmin [-setSpaceQuota <quota> [-storageType <storagetype>] <dirname>...<dirname>]
    hdfs dfsadmin [-clrSpaceQuota [-storageType <storagetype>] <dirname>...<dirname>]
    hdfs dfsadmin [-finalizeUpgrade]
    hdfs dfsadmin [-rollingUpgrade [<query> |<prepare> |<finalize>]]
    hdfs dfsadmin [-refreshServiceAcl]
    hdfs dfsadmin [-refreshUserToGroupsMappings]
    hdfs dfsadmin [-refreshSuperUserGroupsConfiguration]
    hdfs dfsadmin [-refreshCallQueue]
    hdfs dfsadmin [-refresh <host:ipc_port> <key> [arg1..argn]]
    hdfs dfsadmin [-reconfig <namenode|datanode> <host:ipc_port> <start |status |properties>]
    hdfs dfsadmin [-printTopology]
    hdfs dfsadmin [-refreshNamenodes datanodehost:port]
    hdfs dfsadmin [-getVolumeReport datanodehost:port]
    hdfs dfsadmin [-deleteBlockPool datanode-host:port blockpoolId [force]]
    hdfs dfsadmin [-setBalancerBandwidth <bandwidth in bytes per second>]
    hdfs dfsadmin [-getBalancerBandwidth <datanode_host:ipc_port>]
    hdfs dfsadmin [-fetchImage <local directory>]
    hdfs dfsadmin [-allowSnapshot <snapshotDir>]
    hdfs dfsadmin [-disallowSnapshot <snapshotDir>]
    hdfs dfsadmin [-shutdownDatanode <datanode_host:ipc_port> [upgrade]]
    hdfs dfsadmin [-evictWriters <datanode_host:ipc_port>]
    hdfs dfsadmin [-getDatanodeInfo <datanode_host:ipc_port>]
    hdfs dfsadmin [-metasave filename]
    hdfs dfsadmin [-triggerBlockReport [-incremental] <datanode_host:ipc_port>]
    hdfs dfsadmin [-listOpenFiles]
    hdfs dfsadmin [-help [cmd]]

Runs an HDFS dfsadmin client.
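
A minimal sketch of scripting against two of these subcommands, `-safemode get` and `-report -live`. It assumes that `-safemode get` reports the state with the phrase `Safe mode is ON` or `Safe mode is OFF`; the helper name `safemode_is_on` is hypothetical.

```shell
# Hypothetical helper: succeed when the text on stdin reports that
# safe mode is on (assumed output wording).
safemode_is_on() {
  grep -q 'Safe mode is ON'
}

if command -v hdfs >/dev/null 2>&1; then
  if hdfs dfsadmin -safemode get | safemode_is_on; then
    echo "NameNode is in safe mode"
  else
    # Print a datanode report restricted to live nodes.
    hdfs dfsadmin -report -live
  fi
fi
```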

dfsrouter

Usage: hdfs dfsrouter

Runs the DFS router. See Router for more info.

dfsrouteradmin

Usage:

hdfs dfsrouteradmin
      [-add <source> <nameservice> <destination> [-readonly] -owner <owner> -group <group> -mode <mode>]
      [-rm <source>]
      [-ls <path>]
      [-safemode enter | leave | get]

journalnode

Usage: hdfs journalnode

This command starts a journalnode. See HDFS HA with QJM for details.

namenode

Usage:

hdfs namenode [-backup] |
          [-checkpoint] |
          [-format [-clusterid cid ] [-force] [-nonInteractive] ] |
          [-upgrade [-clusterid cid] [-renameReserved<k-v pairs>] ] |
          [-upgradeOnly [-clusterid cid] [-renameReserved<k-v pairs>] ] |
          [-rollback] |
          [-rollingUpgrade <rollback|downgrade |started> ] |
          [-finalize] |
          [-importCheckpoint] |
          [-initializeSharedEdits] |
          [-bootstrapStandby [-force] [-nonInteractive] [-skipSharedEditsCheck] ] |
          [-recover [-force] ] |
          [-metadataVersion ]
