Hadoop Capacity Scheduler Configuration
Introduction to the Capacity Scheduler
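The Capacity Scheduler supports the following features:
(1) Capacity guarantees. Multiple queues are supported, and each job is submitted to one of them. Every queue is allocated a fixed percentage of the cluster's computing resources, and all jobs submitted to a queue share that queue's resources.
(2) Flexibility. Idle resources are allocated to queues that have not yet reached their resource limit. When a queue that is below its capacity needs resources, it receives them as soon as idle resources become available.
(3) Priorities. Queues support priority-based job scheduling (the default is FIFO).
(4) Multi-tenancy. A combination of limits prevents any single job, user, or queue from monopolizing the resources of a queue or of the cluster.
(5) Resource-based scheduling. Resource-intensive jobs are supported: a job may use more resources than the default amount, so jobs with different resource requirements can be accommodated. At present, however, only memory is supported as a scheduled resource.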
Configuration steps:
1. Copy $HADOOP_HOME/contrib/capacity-scheduler/hadoop-capacity-scheduler.jar to the $HADOOP_HOME/lib directory.
2. Edit the conf/mapred-site.xml file on the namenode node:
<property>
  <name>mapred.jobtracker.taskScheduler</name>
  <value>org.apache.hadoop.mapred.CapacityTaskScheduler</value>
</property>
<property>
  <name>mapred.queue.names</name>
  <value>default,hadoop,hive</value>
</property>
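Every queue listed in mapred.queue.names is configured through properties whose names embed the queue name, following the pattern mapred.capacity-scheduler.queue.<queue-name>.property-name. The following Java sketch shows how the capacity property names follow from the queue list. The class name QueuePropertyNames and the method capacityKeys are hypothetical helpers for illustration only and are not part of Hadoop.

```java
import java.util.ArrayList;
import java.util.List;

public class QueuePropertyNames {
    // Prefix shared by all per-queue Capacity Scheduler properties in this article.
    static final String PREFIX = "mapred.capacity-scheduler.queue.";

    // Builds the ".capacity" property name for each queue in a
    // comma-separated list such as the value of mapred.queue.names.
    static List<String> capacityKeys(String queueNames) {
        List<String> keys = new ArrayList<>();
        for (String raw : queueNames.split(",")) {
            String name = raw.trim();
            if (!name.isEmpty()) {
                keys.add(PREFIX + name + ".capacity");
            }
        }
        return keys;
    }

    public static void main(String[] args) {
        // Queue list taken from the mapred-site.xml fragment above.
        for (String key : capacityKeys("default,hadoop,hive")) {
            System.out.println(key);
        }
    }
}
```

A list like this can serve as a checklist: each queue named in mapred-site.xml should have a matching capacity entry in the scheduler configuration.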
3. Edit the capacity-scheduler.xml file:
<?xml version="1.0"?>
<!-- This is the configuration file for the resource manager in Hadoop. -->
<!-- You can configure various scheduling parameters related to queues. -->
<!-- The properties for a queue follow a naming convention, such as, -->
<!-- mapred.capacity-scheduler.queue.<queue-name>.property-name. -->
<configuration>
  <!-- Capacity scheduler Job Initialization configuration parameters -->
  <property>
    <name>mapred.capacity-scheduler.init-poll-interval</name>
    <value>5000</value>
    <description>The amount of time in milliseconds which is used to poll the job queues for jobs to initialize.
    </description>
  </property>
  <property>
    <name>mapred.capacity-scheduler.init-worker-threads</name>
    <value>5</value>
    <description>Number of worker threads which would be used by
    Initialization poller to initialize jobs in a set of queue.
    If number mentioned in property is equal to number of job queues
    then a single thread would initialize jobs in a queue. If lesser
    then a thread would get a set of queues assigned. If the number
    is greater then number of threads would be equal to number of
    job queues.
    </description>
  </property>
  <property>
    <name>mapred.capacity-scheduler.maximum-system-jobs</name>
    <value>30</value>
    <description>Maximum number of jobs in the system which can be initialized,
    concurrently, by the Capacity Scheduler.
    </description>
  </property>
  <!-- hadoop queue -->
  <property>
    <name>mapred.capacity-scheduler.queue.hadoop.capacity</name>
    <value>30</value>
    <description>Percentage of the number of slots in the cluster that are to be available for jobs in this queue.
    </description>
  </property>
  <property>
    <name>mapred.capacity-scheduler.queue.hadoop.maximum-capacity</name>
    <value>-1</value>
    <description></description>
  </property>
  <property>
    <name>mapred.capacity-scheduler.queue.hadoop.supports-priority</name>
    <value>true</value>
    <description></description>
  </property>
  <property>
    <name>mapred.capacity-scheduler.queue.hadoop.minimum-user-limit-percent</name>
    <value>100</value>
    <description></description>
  </property>
  <property>
    <name>mapred.capacity-scheduler.queue.hadoop.user-limit-factor</name>
    <value>3</value>
    <description></description>
  </property>
  <property>
    <name>mapred.capacity-scheduler.queue.hadoop.maximum-initialized-active-tasks</name>
    <value>200000</value>
    <description></description>
  </property>
  <property>
    <name>mapred.capacity-scheduler.queue.hadoop.maximum-initialized-active-tasks-per-user</name>
    <value>100000</value>
    <description></description>
  </property>
  <property>
    <name>mapred.capacity-scheduler.queue.hadoop.init-accept-jobs-factor</name>
    <value>10</value>
    <description></description>
  </property>
  <property>
    <name>mapred.capacity-scheduler.default-maximum-initialized-jobs-per-user</name>
    <value>5</value>
    <description>The maximum number of jobs to be pre-initialized for a user
    of the job queue.
    </description>
  </property>
  <!-- hive -->
  <property>
    <name>mapred.capacity-scheduler.queue.hive.capacity</name>
    <value>30</value>
    <description></description>
  </property>
  <property>
    <name>mapred.capacity-scheduler.queue.hive.maximum-capacity</name>
    <value>-1</value>
    <description></description>
  </property>
  <property>
    <name>mapred.capacity-scheduler.queue.hive.supports-priority</name>
    <value>true</value>
    <description>If true, priorities of jobs will be taken into account in scheduling decisions.
    </description>
  </property>
  <property>
    <name>mapred.capacity-scheduler.queue.hive.minimum-user-limit-percent</name>
    <value>100</value>
    <description></description>
  </property>
  <property>
    <name>mapred.capacity-scheduler.queue.hive.user-limit-factor</name>
    <value>4</value>
    <description>The multiple of the queue capacity which can be configured to allow a single user to acquire more slots.
    </description>
  </property>
  <property>
    <name>mapred.capacity-scheduler.queue.hive.maximum-initialized-active-tasks</name>
    <value>200000</value>
    <description></description>
  </property>
  <property>
    <name>mapred.capacity-scheduler.queue.hive.maximum-initialized-active-tasks-per-user</name>
    <value>100000</value>
    <description></description>
  </property>
  <property>
    <name>mapred.capacity-scheduler.queue.hive.init-accept-jobs-factor</name>
    <value>10</value>
    <description></description>
  </property>
  <!-- default -->
  <property>
    <name>mapred.capacity-scheduler.queue.default.capacity</name>
    <value>40</value>
    <description></description>
  </property>
  <property>
    <name>mapred.capacity-scheduler.queue.default.maximum-capacity</name>
    <value>-1</value>
    <description></description>
  </property>
  <property>
    <name>mapred.capacity-scheduler.queue.default.supports-priority</name>
    <value>true</value>
    <description></description>
  </property>
  <property>
    <name>mapred.capacity-scheduler.queue.default.minimum-user-limit-percent</name>
    <value>100</value>
    <description></description>
  </property>
  <property>
    <name>mapred.capacity-scheduler.queue.default.user-limit-factor</name>
    <value>4</value>
    <description></description>
  </property>
  <property>
    <name>mapred.capacity-scheduler.queue.default.maximum-initialized-active-tasks</name>
    <value>200000</value>
    <description></description>
  </property>
  <property>
    <name>mapred.capacity-scheduler.queue.default.maximum-initialized-active-tasks-per-user</name>
    <value>100000</value>
    <description></description>
  </property>
  <property>
    <name>mapred.capacity-scheduler.queue.default.init-accept-jobs-factor</name>
    <value>10</value>
    <description></description>
  </property>
</configuration>
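After saving the file, restart the jobtracker.
From then on, whenever you change capacity-scheduler.xml, you only need to run the command hadoop mradmin -refreshQueues to reload the configuration.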
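Before reloading, it can help to sanity-check the numbers. The Java sketch below adds up the three queue capacities used in this article (40, 30, 30) and estimates how many slots a queue is guaranteed and how many a single user could reach through user-limit-factor. The cluster size of 100 slots is an assumed figure for illustration, the class and method names are hypothetical, and the single-user formula is a simplified reading of the user-limit-factor description, not the scheduler's actual implementation.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class CapacityCheck {
    // Queue capacities (percent) as configured in this article.
    static Map<String, Integer> exampleCapacities() {
        Map<String, Integer> capacities = new LinkedHashMap<>();
        capacities.put("default", 40);
        capacities.put("hadoop", 30);
        capacities.put("hive", 30);
        return capacities;
    }

    // Sum of the per-queue capacity percentages.
    static int totalCapacity(Map<String, Integer> capacities) {
        int total = 0;
        for (int percent : capacities.values()) {
            total += percent;
        }
        return total;
    }

    // Slots guaranteed to a queue for a given (assumed) cluster slot count.
    static int guaranteedSlots(int clusterSlots, int capacityPercent) {
        return clusterSlots * capacityPercent / 100;
    }

    // Simplified estimate: queue share times user-limit-factor,
    // capped at the cluster size (maximum-capacity is -1 here).
    static int singleUserCap(int clusterSlots, int capacityPercent, int userLimitFactor) {
        int scaled = guaranteedSlots(clusterSlots, capacityPercent) * userLimitFactor;
        return Math.min(clusterSlots, scaled);
    }

    public static void main(String[] args) {
        int clusterSlots = 100; // assumed cluster size, not from the article
        Map<String, Integer> capacities = exampleCapacities();
        System.out.println("total capacity percent: " + totalCapacity(capacities));
        for (Map.Entry<String, Integer> e : capacities.entrySet()) {
            System.out.println(e.getKey() + " guaranteed slots: "
                    + guaranteedSlots(clusterSlots, e.getValue()));
        }
        // hadoop queue uses user-limit-factor 3, hive uses 4.
        System.out.println("hadoop single-user estimate: " + singleUserCap(clusterSlots, 30, 3));
        System.out.println("hive single-user estimate: " + singleUserCap(clusterSlots, 30, 4));
    }
}
```

If the capacities do not add up to 100, some of the cluster is either left unassigned or over-committed, so this total is worth checking before each change.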
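4. Finally, how to use the queues:
MapReduce: in the job's code, set the queue the job belongs to, for example hive:
conf.setQueueName("hive");
Hive: when running a Hive task, set the queue it belongs to, for example hive:
set mapred.job.queue.name=hive;
Set the name of the job in the queue:
set mapred.job.name=hadooptest;
Set the priority of the job in the queue:
set mapred.job.priority=HIGH;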
Original article: http://blog.csdn.net/jiedushi/article/details/7920455
From the ITPUB blog, link: http://blog.itpub.net/29754888/viewspace-1247951/. Please credit the source when reposting.