Running command lines in batch with Python

Posted by ben犇 on 2024-03-10

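Background:

When the same interface has to be called with many different parameter settings, editing the script and rerunning it by hand each time is slow and error-prone, especially here, where the parameters are numerous and long. Until now each run was started by typing nohup python -u exe.py >> ../log/exp3.log 2>&1 & in the terminal; exe then set the parameters and called the interface preditction_uni(input_file_path, sheet_name, output_file_path, pic_path, cols_x, cols_y, n_jobs, sheet_name_res). The parameters differ on every run but follow a regular pattern, so the runs can be batched.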

Solution:

Use Python's subprocess module.

The file layout is:

|--cmd_exe.py
|--exe.py
|--exp3.py
|--interface.py

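exp3.py sets the parameters and calls the interface in interface.py; exe.py holds the main function, which calls exp3.py to start a run.

I have now changed exp3.py so that it loops over the different parameter sets, formats each one into a command string, and hands that string to subprocess, which runs cmd_exe.py with the command that previously had to be typed in the terminal. The call to preditction_uni that used to live in exp3.py moves into the new cmd_exe.py, and exp3.py gains the command loop that does the batching.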

# exp3.py (excerpt)

import subprocess
# ...
# set the parameters
# ···
n_jobs = int(5)
cols_x_str = f'"{cols_xs}"'

command = f"nohup python -u cmd_exe.py {input_file_path} {sheet_name} {output_file_path} {pic_path} {cols_x_str} {cols_y} {n_jobs} {sheet_name_res} >> ../log/exp3/{log_name}.log 2>&1 &"

print(command)  # print the command so it can be checked
subprocess.Popen(command, shell=True)

# cmd_exe.py

import sys
import exp3
import ast


if __name__ == '__main__':
    input_file_path = sys.argv[1]
    sheet_name = sys.argv[2]
    output_file_path = sys.argv[3]
    pic_path = sys.argv[4]
    cols_x = ast.literal_eval(sys.argv[5])
    cols_y = sys.argv[6]
    n_jobs = int(sys.argv[7])  # sys.argv entries are always str; convert before use
    sheet_name_res = sys.argv[8]

    exp3.preditction_uni(input_file_path, sheet_name, output_file_path, pic_path, cols_x, cols_y, n_jobs, sheet_name_res)
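
The excerpt above shows a single command, so here is a minimal sketch of what the loop over several parameter sets could look like. The experiments list, its file names and the build_command helper are made up for illustration and are not part of the original exp3.py. The sketch also uses shlex.quote instead of hand-written double quotes, so each value should reach cmd_exe.py as exactly one argument even if it contains spaces or quotes.

```python
import shlex


def build_command(script, args, log_path):
    """Return a shell command line that runs `script` in the background.

    Every argument goes through shlex.quote, so spaces, quotes and brackets
    survive the shell and arrive as exactly one sys.argv entry.
    """
    quoted = " ".join(shlex.quote(str(a)) for a in args)
    return (
        f"nohup python -u {shlex.quote(script)} {quoted} "
        f">> {shlex.quote(log_path)} 2>&1 &"
    )


# Hypothetical parameter sets; the real values come from the experiment setup.
experiments = [
    {"name": "run_a", "cols_x": ["f1", "f2"], "cols_y": "label", "n_jobs": 5},
    {"name": "run_b", "cols_x": ["f3", "f4", "f5"], "cols_y": "label", "n_jobs": 5},
]

commands = [
    build_command(
        "cmd_exe.py",
        [
            "data.xlsx",                 # input_file_path (hypothetical)
            "Sheet1",                    # sheet_name (hypothetical)
            f"out/{exp['name']}.xlsx",   # output_file_path (hypothetical)
            f"pic/{exp['name']}",        # pic_path (hypothetical)
            exp["cols_x"],
            exp["cols_y"],
            exp["n_jobs"],
            f"res_{exp['name']}",        # sheet_name_res (hypothetical)
        ],
        f"../log/exp3/{exp['name']}.log",
    )
    for exp in experiments
]

for command in commands:
    print(command)
    # To actually launch the run: subprocess.Popen(command, shell=True)
```

The launch line is left as a comment so the sketch can be run without cmd_exe.py being present; printing the commands first is also a cheap way to check the quoting before starting a long batch.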
  • Argument-passing problem 1:

    In exp3.py, the arguments I want to pass on to preditction_uni include a number and a list of several strings, for example:

    n_jobs = 5 
    cols_xs = ['引數1', '引數2','引數1', '引數2','引數1', '引數2']
    

    Running it, however, produces this error:

    Traceback (most recent call last):
      File "/home/P.py", line 11, in <module>
        cols_x = ast.literal_eval(sys.argv[5])
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
      File "/home/anaconda3/envs/python310/lib/python3.11/ast.py", line 64, in literal_eval
        node_or_string = parse(node_or_string.lstrip(" \t"), mode='eval')
                         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
      File "/home/anaconda3/envs/python310/lib/python3.11/ast.py", line 50, in parse
        return compile(source, filename, mode, flags,
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
      File "<unknown>", line 1
        [引數1,
        ^
    SyntaxError: '[' was never closed
    

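    I tried to parse the argument with ast.literal_eval(), but it failed. cols_xs is a list of several strings. When it is formatted into the command without quoting, the shell splits it at every space and strips the single quotes, so sys.argv[5] receives only the first fragment, [引數1, which is not a complete literal. To pass the whole list, it has to reach the shell as a single word, that is, with an extra pair of double quotes around it: f'"{cols_xs}"'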

  • Argument-passing problem 2:

    Traceback (most recent call last):
      ...
      File "/home/P.py", line 179, in post_run
        y_pred_test, y_pred_prob_test, b_solver, b_c = train_model_Grid(estimator, param, cv=5, X_train=X_train_std, X_test=X_test_std, y_train=y_train, n_jobs=n_jobs)
                                                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    ...
    validate_parameter_constraints
        raise InvalidParameterError(
    sklearn.utils._param_validation.InvalidParameterError: The 'n_jobs' parameter of GridSearchCV must be an instance of 'int' or None. Got '5' instead.
    

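    The argument arrives as a string, even though it represents an integer, so cmd_exe.py has to convert it, for example with n_jobs = int(sys.argv[7]). Writing n_jobs = int(5) in exp3.py does not help, because the value becomes text again as soon as it is formatted into the command string.

    Arguments passed through sys.argv are always strings (str). sys.argv is a list holding the string form of the command-line arguments: the first element is the path of the Python script being run, and the remaining elements are the arguments given on the command line.

    To use a value as another type, such as an integer or a float, convert it with the matching function (for example int() or float()).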
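
Both problems can be reproduced without starting a process. The sketch below uses shlex.split, which follows POSIX shell quoting rules, as a stand-in for the shell's word splitting; the column names are hypothetical.

```python
import ast
import shlex

cols_xs = ["alpha", "beta", "gamma"]  # hypothetical column names

# Approximate how a POSIX shell turns the command line into argv entries.
unquoted = shlex.split(f"cmd_exe.py {cols_xs}")
quoted = shlex.split(f'cmd_exe.py "{cols_xs}"')

print(unquoted)  # ['cmd_exe.py', '[alpha,', 'beta,', 'gamma]']
print(quoted)    # ['cmd_exe.py', "['alpha', 'beta', 'gamma']"]

# Only the quoted form is a complete Python literal that literal_eval can parse;
# ast.literal_eval(unquoted[1]) would raise SyntaxError, as in the traceback above.
cols_x = ast.literal_eval(quoted[1])
print(cols_x == cols_xs)  # True

# Numbers also arrive as text and need an explicit conversion.
n_jobs = int("5")
print(type(n_jobs))  # <class 'int'>
```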

Passing arguments through sys.argv

Arguments passed through sys.argv are of type str.
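
Since every entry is a str, one option is to let argparse do the conversions rather than indexing sys.argv by hand. This is only a sketch of an alternative, not what the scripts in this post do; parse_args and the argument names are chosen for illustration.

```python
import argparse
import ast


def parse_args(argv=None):
    """Parse three positional arguments and convert them to useful types."""
    parser = argparse.ArgumentParser(description="Typed command-line arguments")
    parser.add_argument("input_file_path")                # stays a str
    parser.add_argument("n_jobs", type=int)               # "6" becomes 6
    parser.add_argument("cols_x", type=ast.literal_eval)  # text becomes a list
    return parser.parse_args(argv)


if __name__ == "__main__":
    # An explicit list stands in for the real command line so the sketch runs as is.
    args = parse_args(["../python/data", "6", "['just', 'do', 'it', 'now']"])
    print(args.n_jobs, type(args.n_jobs))  # 6 <class 'int'>
    print(args.cols_x, type(args.cols_x))  # ['just', 'do', 'it', 'now'] <class 'list'>
```

Calling parse_args() with no argument reads the real command line. The list still has to arrive as a single shell word, so the quoting on the sending side is needed either way.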

Example program:

# test.py

import subprocess

nums = [1, 2, 3]

input_file_path = '../python/data'

n_jobs = 6

col_x =[
    ['test', 'is', 'just', 'a', 'demo'],
    ['tomorrow', 'never', 'wait', 'for', 'you'],
    ['just', 'do', 'it', 'now']
]

cols = zip(nums, col_x)

for num, colx in cols:
    # colx = f'"{colx}"'
    command = f"python -u test1.py {input_file_path} {n_jobs} {colx} >> ./test.log 2>&1 &"

    print(command)

    subprocess.Popen(command, shell=True)

    print(f"process {num} is running")

# test1.py
import sys

if __name__ == '__main__':

    print(f"sys.argv[0] = {sys.argv[0]} , type = {type(sys.argv[0])}")
    print(f"sys.argv[1] = {sys.argv[1]} , type = {type(sys.argv[1])}")
    print(f"sys.argv[2] = {sys.argv[2]} , type = {type(sys.argv[2])}")
    print(f"sys.argv[3] = {sys.argv[3]} , type = {type(sys.argv[3])}")

Run it with:

python test.py

Result:

python -u test1.py ../python/data 6 ['test', 'is', 'just', 'a', 'demo'] >> ./test.log 2>&1 &
process 1 is running
python -u test1.py ../python/data 6 ['tomorrow', 'never', 'wait', 'for', 'you'] >> ./test.log 2>&1 &
process 2 is running
python -u test1.py ../python/data 6 ['just', 'do', 'it', 'now'] >> ./test.log 2>&1 &
process 3 is running

# test.log
sys.argv[0] = test1.py , type = <class 'str'>
sys.argv[1] = ../python/data , type = <class 'str'>
sys.argv[2] = 6 , type = <class 'str'>
sys.argv[3] = [tomorrow, , type = <class 'str'>
sys.argv[0] = test1.py , type = <class 'str'>
sys.argv[1] = ../python/data , type = <class 'str'>
sys.argv[2] = 6 , type = <class 'str'>
sys.argv[3] = [test, , type = <class 'str'>
sys.argv[0] = test1.py , type = <class 'str'>
sys.argv[1] = ../python/data , type = <class 'str'>
sys.argv[2] = 6 , type = <class 'str'>
sys.argv[3] = [just, , type = <class 'str'>

After adding colx = f'"{colx}"' before the command line, the log becomes:

# test.log

sys.argv[1] = ../python/data , type = <class 'str'>
sys.argv[2] = 6 , type = <class 'str'>
sys.argv[3] = ['test', 'is', 'just', 'a', 'demo'] , type = <class 'str'>
sys.argv[0] = test1.py , type = <class 'str'>
sys.argv[1] = ../python/data , type = <class 'str'>
sys.argv[2] = 6 , type = <class 'str'>
sys.argv[3] = ['tomorrow', 'never', 'wait', 'for', 'you'] , type = <class 'str'>
sys.argv[0] = test1.py , type = <class 'str'>
sys.argv[1] = ../python/data , type = <class 'str'>
sys.argv[2] = 6 , type = <class 'str'>
sys.argv[3] = ['just', 'do', 'it', 'now'] , type = <class 'str'>
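
The quoting problem disappears when the shell is not involved at all. As a hedged alternative to the approach above, the sketch below passes the arguments as a list to subprocess.run and encodes the column list as JSON. CHILD is a small stand-in for test1.py so that the example is self-contained, and run_child is a hypothetical helper.

```python
import json
import subprocess
import sys

# Stand-in for test1.py: decode the arguments and report them back as JSON.
CHILD = (
    "import json, sys\n"
    "n_jobs = int(sys.argv[2])\n"
    "colx = json.loads(sys.argv[3])\n"
    "print(json.dumps({'path': sys.argv[1], 'n_jobs': n_jobs, 'colx': colx}))\n"
)


def run_child(input_file_path, n_jobs, colx):
    """Run the child without a shell and return what it decoded."""
    # With a list and no shell=True, each element becomes exactly one argv entry.
    result = subprocess.run(
        [sys.executable, "-c", CHILD, input_file_path, str(n_jobs), json.dumps(colx)],
        capture_output=True,
        text=True,
        check=True,
    )
    return json.loads(result.stdout)


if __name__ == "__main__":
    print(run_child("../python/data", 6, ["just", "do", "it", "now"]))
```

subprocess.run waits for the child to finish. For background runs like the ones in this post, subprocess.Popen accepts the same argument list, and the log file can be supplied through its stdout and stderr parameters instead of shell redirection.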
