Original article: www.jianshu.com/p/1ed50e6d4…
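Bull is a Redis-based job queue library for Node.js. It supports delayed jobs, prioritized jobs, repeatable jobs, atomic operations, and more.
This article analyzes Bull's source code by following its basic usage. Repeatable jobs, separate processes, and similar features are not covered for now.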
Bull: Premium Queue package for handling jobs and messages in NodeJS.
The version analyzed here:
- Source code: github.com/OptimalBits…
- Branch: develop
- Last commit: 4f5744a
Basic usage
Using Bull takes three steps:
- Create a queue
- Bind a job handler
- Add a job
For example:
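const Bull = require('bull');
// 1. Create the queue
const myFirstQueue = new Bull('my-first-queue');
// 2. Bind the job handler; the payload is available as job.data
myFirstQueue.process(async job => {
  return doSomething(job.data); // doSomething stands for your own logic
});
// 3. Add a job (run this inside an async function)
const job = await myFirstQueue.add({
  foo: 'bar'
});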
Creating the queue
A queue is created by calling require and then new, so the first step is to find the entry point that require resolves to. Open package.json:
{
  "name": "bull",
  "version": "3.7.0",
  "description": "Job manager",
  "main": "./index.js",
  ...
}
The entry point is index.js. Open it:
module.exports = require('./lib/queue');
module.exports.Job = require('./lib/job');
This leads to the file that contains the target function, ./lib/queue:
module.exports = Queue;
The module exports Queue, so the next step is to look at the Queue function:
const Queue = function Queue(name, url, opts) {
  ...
  // Default settings
  this.settings = _.defaults(opts.settings, {
    lockDuration: 30000,
    stalledInterval: 30000,
    maxStalledCount: 1,
    guardInterval: 5000,
    retryProcessDelay: 5000,
    drainDelay: 5, // brpoplpush timeout (seconds) while the queue is empty
    backoffStrategies: {}
  });
  ...
  // Bind these methods to avoid constant rebinding and/or creating closures
  // in processJobs etc.
  this.moveUnlockedJobsToWait = this.moveUnlockedJobsToWait.bind(this);
  this.processJob = this.processJob.bind(this);
  this.getJobFromId = Job.fromId.bind(null, this);
  ...
};
The constructor mainly initializes the settings and binds methods.
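To make the defaulting step concrete, here is a minimal sketch of how a defaults merge behaves, written without lodash. The helper name applyDefaults is hypothetical, and unlike _.defaults it copies the target instead of mutating it. The point it illustrates is that a default applies only when the caller left that key undefined.

```javascript
// Hypothetical stand-in for a defaults merge, for illustration only.
// Unlike lodash's _.defaults, it returns a copy instead of mutating the target.
function applyDefaults(target, defaults) {
  const result = Object.assign({}, target); // also copes with an undefined target
  for (const key of Object.keys(defaults)) {
    if (result[key] === undefined) {
      result[key] = defaults[key];
    }
  }
  return result;
}

const settings = applyDefaults(
  { lockDuration: 60000 }, // value supplied by the caller
  { lockDuration: 30000, stalledInterval: 30000, maxStalledCount: 1 }
);

console.log(settings.lockDuration); // 60000, the caller's value wins
console.log(settings.stalledInterval); // 30000, filled in from the defaults
```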
Binding the job handler
This step starts at myFirstQueue.process, so look at the process function first:
Queue.prototype.process = function (name, concurrency, handler) {
  ...
  this.setHandler(name, handler); // 1. Bind the handler
  return this._initProcess().then(() => {
    return this.start(concurrency); // 2. Start the queue
  });
};
This function does two things:
- Binds the handler
- Starts the queue
First, binding the handler:
Queue.prototype.setHandler = function (name, handler) {
  ...
  if (this.handlers[name]) {
    throw new Error('Cannot define the same handler twice ' + name);
  }
  ...
  if (typeof handler === 'string') {
    ...
  } else {
    handler = handler.bind(this);
    // Store the handler under its name
    if (handler.length > 1) {
      this.handlers[name] = promisify(handler);
    } else {
      this.handlers[name] = function () {
        ...
      };
    }
  }
};
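The two branches on handler.length can be sketched as follows. This is a minimal illustration, assuming promisify behaves like Node's util.promisify; the helper name wrapHandler is hypothetical, and the body of the second branch is my own guess at a reasonable wrapper because the original is elided. A handler that declares a second parameter is treated as callback style, while a one-parameter handler is treated as returning a value or a promise.

```javascript
const { promisify } = require('util');

// Hypothetical helper mirroring the two branches in setHandler.
function wrapHandler(handler) {
  if (handler.length > 1) {
    // Callback style: (job, done) => done(err, result)
    return promisify(handler);
  }
  // Promise or synchronous style: (job) => result
  return function (...args) {
    try {
      return Promise.resolve(handler(...args));
    } catch (err) {
      return Promise.reject(err); // a synchronous throw becomes a rejection
    }
  };
}

const callbackStyle = wrapHandler((job, done) => done(null, job.data.n * 2));
const promiseStyle = wrapHandler(async job => job.data.n + 1);

callbackStyle({ data: { n: 21 } }).then(result => console.log(result)); // 42
promiseStyle({ data: { n: 41 } }).then(result => console.log(result)); // 42
```

Either way the queue ends up with a function that takes a job and returns a promise, which is what the rest of the processing code relies on.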
Next, starting the queue:
Queue.prototype.start = function (concurrency) {
  return this.run(concurrency).catch(err => {
    this.emit('error', err, 'error running queue');
    throw err;
  });
};
Now the run function:
Queue.prototype.run = function (concurrency) {
  const promises = [];
  return this.isReady()
    .then(() => {
      return this.moveUnlockedJobsToWait(); // Move unlocked jobs back to the wait list
    })
    .then(() => {
      return utils.isRedisReady(this.bclient);
    })
    .then(() => {
      while (concurrency--) {
        promises.push(
          new Promise(resolve => {
            this.processJobs(concurrency, resolve); // Process jobs
          })
        );
      }
      this.startMoveUnlockedJobsToWait(); // Check for unlocked jobs on a timer
      return Promise.all(promises);
    });
};
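Unlocked job (stalled job): a job needs a lock while it runs. Normally a job acquires the lock when it becomes active (the lock expires after lockDuration and is renewed every lockRenewTime) and releases it on completion. If an active job holds no lock, the worker process was blocked or crashed and the lock expired.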
Now processJobs:
Queue.prototype.processJobs = function (index, resolve, job) {
  const processJobs = this.processJobs.bind(this, index, resolve);
  process.nextTick(() => {
    this._processJobOnNextTick(processJobs, index, resolve, job);
  });
};
Then _processJobOnNextTick:
// Key code
const gettingNextJob = job ? Promise.resolve(job) : this.getNextJob();
return (this.processing[index] = gettingNextJob
  .then(this.processJob)
  .then(processJobs, err => {
    ...
  }));
The code above does the following:
- If job is empty, call getNextJob to fetch one
- Run processJob
- Run processJobs
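The steps above can be sketched with an in-memory stand-in. Arrays replace the Redis lists and all function bodies are simplified assumptions. One deliberate difference: this sketch stops once the wait list is empty so that it terminates, whereas a real worker keeps waiting for new jobs.

```javascript
// In-memory sketch of the loop; the arrays stand in for Redis lists and
// all function bodies are simplified assumptions.
const wait = ['1', '2', '3'];
const completed = [];

// Stand-in for getNextJob: resolves with a job, or undefined when empty.
function getNextJob() {
  const id = wait.shift();
  return Promise.resolve(id ? { id } : undefined);
}

// Stand-in for processJob: runs the "handler" and records the result.
function processJob(job) {
  if (!job) return Promise.resolve(null); // nothing to do
  return Promise.resolve(`done ${job.id}`).then(result => {
    completed.push(result);
    return null; // no follow-up job is handed over in this sketch
  });
}

// Stand-in for processJobs: fetch (unless a job was handed over), process, repeat.
function processJobs(job) {
  if (!job && wait.length === 0) return Promise.resolve(completed); // stop when empty
  const gettingNextJob = job ? Promise.resolve(job) : getNextJob();
  return gettingNextJob.then(processJob).then(processJobs);
}

const finished = processJobs().then(results => {
  console.log(results); // [ 'done 1', 'done 2', 'done 3' ]
  return results;
});
```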
First, getNextJob:
if (this.drained) {
  //
  // Waiting for new jobs to arrive
  //
  console.log('bclient start get new job');
  return this.bclient
    .brpoplpush(this.keys.wait, this.keys.active, this.settings.drainDelay)
    .then(
      jobId => {
        if (jobId) {
          return this.moveToActive(jobId);
        }
      },
      err => {
        ...
      }
    );
} else {
  return this.moveToActive();
}
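When the queue is drained, it uses Redis's blocking pop-and-push command (BRPOPLPUSH) to take a job id from the wait list and put it on the active list, with drainDelay as the timeout.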
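As a rough picture of what the pop-and-push does, here is an in-memory model of the non-blocking RPOPLPUSH. Plain arrays stand in for Redis lists, and the example assumes new jobs are pushed at the head of the wait list. The blocking variant behaves the same way but waits up to the timeout when the source list is empty.

```javascript
// In-memory model of RPOPLPUSH: pop from the tail of the source list and
// push onto the head of the destination list, as one step.
// Plain arrays stand in for Redis lists; index 0 is the head.
function rpoplpush(source, destination) {
  if (source.length === 0) return null; // Redis returns nil for an empty source
  const value = source.pop();
  destination.unshift(value);
  return value;
}

const wait = ['job-3', 'job-2', 'job-1']; // job-1 was added first and sits at the tail
const active = [];

const jobId = rpoplpush(wait, active);
console.log(jobId); // job-1
console.log(wait); // [ 'job-3', 'job-2' ]
console.log(active); // [ 'job-1' ]
```

Because the move is a single command, a job id is never in neither list, even if the worker dies right after taking it.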
Next, processJob:
Queue.prototype.processJob = function (job) {
  ...
  const handleCompleted = result => {
    return job.moveToCompleted(result).then(jobData => {
      ...
      return jobData ? this.nextJobFromJobData(jobData[0], jobData[1]) : null;
    });
  };
  // Keep extending the lock while the job runs
  lockExtender();
  const handler = this.handlers[job.name] || this.handlers['*'];
  if (!handler) {
    ...
  } else {
    let jobPromise = handler(job);
    ...
    return jobPromise
      .then(handleCompleted)
      .catch(handleFailed)
      .finally(() => {
        stopTimer();
      });
  }
};
When a job succeeds, handleCompleted is called, which in turn calls the job's moveToCompleted. After a few intermediate calls, this ends up running the Lua script moveToFinished:
...
-- Try to get next job to avoid an extra roundtrip if the queue is not closing,
-- and not rate limited.
...
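The script moves the job to the completed or failed set and then fetches the next job.
Once processJob finishes, processJobs runs again, which forms a loop. This loop is the core of Bull, as the figure below shows: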
Adding a job
Go straight to the add function:
Queue.prototype.add = function (name, data, opts) {
  ...
  if (opts.repeat) {
    ...
  } else {
    return Job.create(this, name, data, opts);
  }
};
It calls the create function on Job:
Job.create = function (queue, name, data, opts) {
  const job = new Job(queue, name, data, opts); // 1. Create the job
  return queue
    .isReady()
    .then(() => {
      return addJob(queue, job); // 2. Add the job to the queue
    })
    ...
};
Following addJob further, the call eventually reaches the Lua script addJob, which stores the job in Redis according to the job's options.
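The Lua script itself is not shown here, so the following is only a simplified in-memory model of what "storing the job according to its options" might look like. It is an assumption-laden sketch: it covers plain, delayed, and LIFO jobs only, and the store layout and function body are hypothetical.

```javascript
// Simplified in-memory model of what adding a job stores. Assumption: this
// only covers plain, delayed, and LIFO jobs; priorities, paused queues, and
// custom job ids are left out. The store layout is hypothetical.
const store = { counter: 0, jobs: new Map(), wait: [], delayed: [] };

function addJob(name, data, opts = {}) {
  const id = String(++store.counter); // ids come from an incrementing counter
  const timestamp = opts.timestamp || Date.now();
  store.jobs.set(id, { name, data: JSON.stringify(data), opts, timestamp });

  if (opts.delay > 0) {
    // Delayed jobs are kept aside, ordered by the time they become due.
    store.delayed.push({ id, dueAt: timestamp + opts.delay });
    store.delayed.sort((a, b) => a.dueAt - b.dueAt);
  } else if (opts.lifo) {
    store.wait.push(id); // tail of the list: picked up first
  } else {
    store.wait.unshift(id); // head of the list: picked up after older jobs
  }
  return id;
}

addJob('my-job', { foo: 'bar' });
addJob('my-job', { foo: 'baz' }, { delay: 5000 });
addJob('my-job', { foo: 'qux' }, { lifo: true });
console.log(store.wait); // [ '1', '3' ]
console.log(store.delayed.map(entry => entry.id)); // [ '2' ]
```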
Questions
1. Why does the error "job stalled more than allowable limit" occur?
The run function calls this.startMoveUnlockedJobsToWait(). Here is that function:
Queue.prototype.startMoveUnlockedJobsToWait = function () {
  clearInterval(this.moveUnlockedJobsToWaitInterval);
  if (this.settings.stalledInterval > 0 && !this.closing) {
    this.moveUnlockedJobsToWaitInterval = setInterval(
      this.moveUnlockedJobsToWait,
      this.settings.stalledInterval
    );
  }
};
It runs the moveUnlockedJobsToWait function on a timer, every stalledInterval milliseconds:
Queue.prototype.moveUnlockedJobsToWait = function () {
  ...
  return scripts
    .moveUnlockedJobsToWait(this)
    .then(([failed, stalled]) => {
      const handleFailedJobs = failed.map(jobId => {
        return this.getJobFromId(jobId).then(job => {
          this.emit(
            'failed',
            job,
            new Error('job stalled more than allowable limit'),
            'active'
          );
          return null;
        });
      });
      ...
    })
    ...
    ;
};
Through the moveUnlockedJobsToWait function in scripts, it eventually runs the Lua script moveUnlockedJobsToWait:
...
local MAX_STALLED_JOB_COUNT = tonumber(ARGV[1])
...
if(stalledCount > MAX_STALLED_JOB_COUNT) then
  rcall("ZADD", KEYS[4], ARGV[3], jobId)
  rcall("HSET", jobKey, "failedReason", "job stalled more than allowable limit")
  table.insert(failed, jobId)
else
  -- Move the job back to the wait queue, to immediately be picked up by a waiting worker.
  rcall("RPUSH", dst, jobId)
  rcall('PUBLISH', KEYS[1] .. '@', jobId)
  table.insert(stalled, jobId)
end
...
return {failed, stalled}
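- MAX_STALLED_JOB_COUNT: defaults to 1 (the maxStalledCount setting)
The script collects the stalled jobs and returns them in two lists. Jobs that have stalled more than MAX_STALLED_JOB_COUNT times come back in the failed list, and the queue emits the error in question for each of them.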
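The branch in the Lua script can be rendered in JavaScript with in-memory data. The function name and data shapes are hypothetical, and the way the stalled counter is incremented is an assumption, because that part of the script is elided above. With a limit of 1, a job survives its first stall and fails on the second.

```javascript
// JavaScript rendering of the branch in the Lua script, using in-memory
// data. The function name and data shapes are hypothetical.
function checkStalledJobs(stalledJobIds, stalledCounters, maxStalledCount) {
  const failed = [];
  const stalled = [];
  for (const jobId of stalledJobIds) {
    // Assumption: each detection bumps a per-job stalled counter.
    const count = (stalledCounters.get(jobId) || 0) + 1;
    stalledCounters.set(jobId, count);
    if (count > maxStalledCount) {
      failed.push(jobId); // fails with "job stalled more than allowable limit"
    } else {
      stalled.push(jobId); // goes back to the wait list for another attempt
    }
  }
  return [failed, stalled];
}

const counters = new Map();
console.log(checkStalledJobs(['7'], counters, 1)); // [ [], [ '7' ] ]
console.log(checkStalledJobs(['7'], counters, 1)); // [ [ '7' ], [] ]
```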