How Hadoop obtains a JobID: an analysis of JobSubmitter's submitJobInternal
When a Job is submitted, the Job class calls JobSubmitter's submitJobInternal. Let's look at its source code:
JobStatus submitJobInternal(Job job, Cluster cluster)
    throws ClassNotFoundException, InterruptedException, IOException {

  // validate the jobs output specs
  checkSpecs(job);

  Configuration conf = job.getConfiguration();
  addMRFrameworkToDistributedCache(conf);

  Path jobStagingArea = JobSubmissionFiles.getStagingDir(cluster, conf);
  //*****myAdd
  LOG.info("---->jobStagingArea: " + jobStagingArea);

  // configure the command line options correctly on the submitting dfs
  InetAddress ip = InetAddress.getLocalHost();
  if (ip != null) {
    submitHostAddress = ip.getHostAddress();
    submitHostName = ip.getHostName();
    conf.set(MRJobConfig.JOB_SUBMITHOST, submitHostName);
    conf.set(MRJobConfig.JOB_SUBMITHOSTADDR, submitHostAddress);
  }

  JobID jobId = submitClient.getNewJobID();
  job.setJobID(jobId);

  Path submitJobDir = new Path(jobStagingArea, jobId.toString());
  JobStatus status = null;
  try {
    conf.set(MRJobConfig.USER_NAME,
        UserGroupInformation.getCurrentUser().getShortUserName());
    conf.set("hadoop.http.filter.initializers",
        "org.apache.hadoop.yarn.server.webproxy.amfilter.AmFilterInitializer");
    conf.set(MRJobConfig.MAPREDUCE_JOB_DIR, submitJobDir.toString());
    LOG.debug("Configuring job " + jobId + " with " + submitJobDir
        + " as the submit dir");

    // get delegation token for the dir
    TokenCache.obtainTokensForNamenodes(job.getCredentials(),
        new Path[] { submitJobDir }, conf);

    populateTokenCache(conf, job.getCredentials());

    // generate a secret to authenticate shuffle transfers
    if (TokenCache.getShuffleSecretKey(job.getCredentials()) == null) {
      KeyGenerator keyGen;
      try {
        int keyLen = CryptoUtils.isShuffleEncrypted(conf)
            ? conf.getInt(MRJobConfig.MR_ENCRYPTED_INTERMEDIATE_DATA_KEY_SIZE_BITS,
                MRJobConfig.DEFAULT_MR_ENCRYPTED_INTERMEDIATE_DATA_KEY_SIZE_BITS)
            : SHUFFLE_KEY_LENGTH;
        keyGen = KeyGenerator.getInstance(SHUFFLE_KEYGEN_ALGORITHM);
        keyGen.init(keyLen);
      } catch (NoSuchAlgorithmException e) {
        throw new IOException("Error generating shuffle secret key", e);
      }
      SecretKey shuffleKey = keyGen.generateKey();
      TokenCache.setShuffleSecretKey(shuffleKey.getEncoded(),
          job.getCredentials());
    }

    copyAndConfigureFiles(job, submitJobDir);

    Path submitJobFile = JobSubmissionFiles.getJobConfPath(submitJobDir);

    // Create the splits for the job
    LOG.debug("Creating splits at " + jtFs.makeQualified(submitJobDir));
    int maps = writeSplits(job, submitJobDir);
    conf.setInt(MRJobConfig.NUM_MAPS, maps);
    LOG.info("number of splits:" + maps);

    // write "queue admins of the queue to which job is being submitted"
    // to job file.
    String queue = conf.get(MRJobConfig.QUEUE_NAME,
        JobConf.DEFAULT_QUEUE_NAME);
    AccessControlList acl = submitClient.getQueueAdmins(queue);
    conf.set(toFullPropertyName(queue,
        QueueACL.ADMINISTER_JOBS.getAclName()), acl.getAclString());

    // removing jobtoken referrals before copying the jobconf to HDFS
    // as the tasks don't need this setting, actually they may break
    // because of it if present as the referral will point to a
    // different job.
    TokenCache.cleanUpTokenReferral(conf);

    if (conf.getBoolean(
        MRJobConfig.JOB_TOKEN_TRACKING_IDS_ENABLED,
        MRJobConfig.DEFAULT_JOB_TOKEN_TRACKING_IDS_ENABLED)) {
      // Add HDFS tracking ids
      ArrayList<String> trackingIds = new ArrayList<String>();
      for (Token<? extends TokenIdentifier> t :
          job.getCredentials().getAllTokens()) {
        trackingIds.add(t.decodeIdentifier().getTrackingId());
      }
      conf.setStrings(MRJobConfig.JOB_TOKEN_TRACKING_IDS,
          trackingIds.toArray(new String[trackingIds.size()]));
    }

    // Set reservation info if it exists
    ReservationId reservationId = job.getReservationId();
    if (reservationId != null) {
      conf.set(MRJobConfig.RESERVATION_ID, reservationId.toString());
    }

    // Write job file to submit dir
    writeConf(conf, submitJobFile);

    //
    // Now, actually submit the job (using the submit name)
    //
    printTokens(jobId, job.getCredentials());
    status = submitClient.submitJob(
        jobId, submitJobDir.toString(), job.getCredentials());
    if (status != null) {
      return status;
    } else {
      throw new IOException("Could not launch job");
    }
  } finally {
    if (status == null) {
      LOG.info("Cleaning up the staging area " + submitJobDir);
      if (jtFs != null && submitJobDir != null)
        jtFs.delete(submitJobDir, true);
    }
  }
}
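For context, user code never calls submitJobInternal() directly; it is reached from Job.submit(), typically via waitForCompletion(). A minimal driver sketch of that entry point (the class name IdentityJobDriver and the pass-through setup are illustrative, not from the original post):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class IdentityJobDriver {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    Job job = Job.getInstance(conf, "identity-pass-through");
    job.setJarByClass(IdentityJobDriver.class);
    // No mapper/reducer set: the identity Mapper/Reducer defaults are used.
    FileInputFormat.addInputPath(job, new Path(args[0]));
    FileOutputFormat.setOutputPath(job, new Path(args[1]));
    // waitForCompletion(true) calls submit(), which invokes
    // JobSubmitter.submitJobInternal() shown above.
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}

With the extra LOG.info marked //*****myAdd in place, the staging-area path should appear in this driver's console output during submission.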
The line that obtains the JobID is JobID jobId = submitClient.getNewJobID();. In a YARN deployment, submitClient is a YARNRunner, and its getNewJobID() simply delegates to resMgrDelegate.getNewJobID(). In the ResourceMgrDelegate class:
public JobID getNewJobID() throws IOException, InterruptedException {
  try {
    this.application =
        client.createApplication().getApplicationSubmissionContext();
    this.applicationId = this.application.getApplicationId();
    return TypeConverter.fromYarn(applicationId);
  } catch (YarnException e) {
    throw new IOException(e);
  }
}
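The returned value comes from TypeConverter.fromYarn(applicationId), which maps the YARN ApplicationId onto a MapReduce JobID: the two share the ResourceManager's cluster timestamp and the application sequence number, so job_<ts>_<n> and application_<ts>_<n> name the same submission. The sketch below illustrates that correspondence; toJobId() is a hand-written equivalent written for demonstration, not the actual TypeConverter source:

import org.apache.hadoop.mapreduce.JobID;
import org.apache.hadoop.yarn.api.records.ApplicationId;

public class JobIdCorrespondence {
  // Illustrative equivalent of TypeConverter.fromYarn(ApplicationId):
  // reuse the cluster timestamp as the jtIdentifier and the application
  // sequence number as the job number.
  static JobID toJobId(ApplicationId appId) {
    return new JobID(Long.toString(appId.getClusterTimestamp()), appId.getId());
  }

  public static void main(String[] args) {
    ApplicationId appId = ApplicationId.newInstance(1450000000000L, 7);
    System.out.println(appId);           // application_1450000000000_0007
    System.out.println(toJobId(appId));  // job_1450000000000_0007
  }
}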
The client field inside ResourceMgrDelegate is a YarnClientImpl. In the YarnClientImpl class, the source of createApplication() is as follows:
@Override
public YarnClientApplication createApplication()
    throws YarnException, IOException {
  ApplicationSubmissionContext context =
      Records.newRecord(ApplicationSubmissionContext.class);
  GetNewApplicationResponse newApp = getNewApplication();
  ApplicationId appId = newApp.getApplicationId();
  context.setApplicationId(appId);
  return new YarnClientApplication(newApp, context);
}
The source of getNewApplication() is as follows:
private GetNewApplicationResponse getNewApplication()
    throws YarnException, IOException {
  GetNewApplicationRequest request =
      Records.newRecord(GetNewApplicationRequest.class);
  // The interface used by clients to obtain a new ApplicationId for
  // submitting new applications. The ResourceManager responds with a new,
  // monotonically increasing ApplicationId, which the client uses to
  // submit a new application.
  return rmClient.getNewApplication(request);
}
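Because getNewApplication() is private, the supported way to observe this handshake outside MapReduce is the public YarnClient API, which YarnClientImpl implements. A standalone sketch, assuming a reachable ResourceManager configured via yarn-site.xml (the class name NewAppIdDemo is hypothetical):

import org.apache.hadoop.yarn.api.records.ApplicationId;
import org.apache.hadoop.yarn.client.api.YarnClient;
import org.apache.hadoop.yarn.client.api.YarnClientApplication;
import org.apache.hadoop.yarn.conf.YarnConfiguration;

public class NewAppIdDemo {
  public static void main(String[] args) throws Exception {
    YarnConfiguration conf = new YarnConfiguration();
    YarnClient client = YarnClient.createYarnClient();
    client.init(conf);
    client.start();
    try {
      // Each call asks the RM for a fresh, monotonically increasing id.
      YarnClientApplication app1 = client.createApplication();
      YarnClientApplication app2 = client.createApplication();
      ApplicationId id1 = app1.getApplicationSubmissionContext().getApplicationId();
      ApplicationId id2 = app2.getApplicationSubmissionContext().getApplicationId();
      System.out.println(id1); // e.g. application_<clusterTs>_0001
      System.out.println(id2); // e.g. application_<clusterTs>_0002
    } finally {
      client.stop();
    }
  }
}

Each createApplication() call performs the same GetNewApplicationRequest RPC shown above, so the two printed ids should differ only in their sequence number.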
Summary:
The overall flow for obtaining a JobID:
1. The submitting client, YARNRunner, receives the getNewJobID() call and internally forwards it to ResourceMgrDelegate.getNewJobID().
2. ResourceMgrDelegate calls createApplication() on its internal client member (actually a YarnClientImpl) to create a YarnClientApplication.
3. The YarnClientApplication is built in three steps: (1) construct an ApplicationSubmissionContext object named context; (2) construct a GetNewApplicationRequest named request; (3) call rmClient.getNewApplication(request) to obtain a GetNewApplicationResponse named newApp, which carries the ApplicationId allocated by the ResourceManager.
4. context.setApplicationId() records that ApplicationId, and ResourceMgrDelegate's application member is set to context.
5. ResourceMgrDelegate's applicationId member is set to context's ApplicationId, and TypeConverter.fromYarn(applicationId) converts it into the JobID returned to the caller.