Trying out Mistral-7B-Instruct-v0.2: running it from the raw weights and from the pytorch/safetensors weights

Posted by 沙滩炒花蛤 on 2024-03-28

https://docs.mistral.ai/models/

Mistral-7B-Instruct-v0.2 raw_weights: https://models.mistralcdn.com/mistral-7b-v0-2/Mistral-7B-v0.2-Instruct.tar
md5sum: fbae55bc038f12f010b4251326e73d39

mistral-7B-v0_2: https://models.mistralcdn.com/mistral-7b-v0-2/mistral-7B-v0.2.tar

https://github.com/mistralai-sf24/hackathon

run our 7B model and to finetune it

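"Mistral 7B v0.2 base model open-sourced: ModelScope community fine-tuning tutorial and evaluation", ModelScope小助理, 2024-03-26

Mistral 7B v0.2 is a base model and is not suited to direct inference; the instruct version is recommended instead.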

Quick start with raw_weights, hackathon

Download the raw model weights and run them

# download the model
$ wget -c https://models.mistralcdn.com/mistral-7b-v0-2/Mistral-7B-v0.2-Instruct.tar
$ md5sum Mistral-7B-v0.2-Instruct.tar

# extract to get consolidated.00.pth, params.json and tokenizer.model, then put all three into a folder named Mistral-7B-v0.2-Instruct-raw
$ tar -xf Mistral-7B-v0.2-Instruct.tar

$ git clone https://github.com/mistralai-sf24/hackathon.git
$ cd hackathon
$ pip install -r requirements_hackathon.txt
$ python -m main demo ../Mistral-7B-v0.2-Instruct-raw

$ python -m main interactive ../Mistral-7B-v0.2-Instruct-raw

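The error TypeError: ModelArgs.__init__() missing 1 required positional argument: 'sliding_window' appears because Mistral-7B-Instruct-v0.2 dropped the sliding window, so its params.json has no sliding_window entry. Commenting out sliding_window in the code fixes it, and the run then succeeds. Modified code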
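As an alternative to commenting the field out, it could be given a default so that both old and new params.json files load. The sketch below only illustrates the mechanism: `StrictArgs` and `RelaxedArgs` are hypothetical two-field stand-ins, not the real `ModelArgs` from the hackathon repo, and the `params` dict is a made-up fragment of params.json.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class StrictArgs:
    # Hypothetical stand-in: sliding_window has no default, so it is required.
    dim: int
    sliding_window: int


@dataclass
class RelaxedArgs:
    # Same stand-in, but a missing sliding_window falls back to None.
    dim: int
    sliding_window: Optional[int] = None


# Made-up fragment of a params.json that has no sliding_window key.
params = {"dim": 4096}

try:
    StrictArgs(**params)
    strict_error = None
except TypeError as exc:
    # Same kind of failure as the TypeError reported above.
    strict_error = str(exc)

relaxed = RelaxedArgs(**params)
print(strict_error)
print(relaxed)
```

Any code that later reads `sliding_window` would also have to cope with `None`; that part depends on the repo and is not covered by this sketch.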

Chat template

<s>[INST] Instruction [/INST] Model answer</s>[INST] Follow-up instruction [/INST]

Example: "[INST] What is your favourite condiment? [/INST]Well, I'm quite partial to a good squeeze of fresh lemon juice. It adds just the right amount of zesty flavour to whatever I'm cooking up in the kitchen! [INST] Do you have mayonnaise recipes? [/INST]"
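To make the template concrete, here is a minimal sketch that assembles the prompt string by hand from a list of messages. `build_prompt` is a hypothetical helper written for this post, not part of any library, and the exact whitespace around `[/INST]` is an assumption; the tokenizer's own chat template remains the authority.

```python
def build_prompt(messages, bos="<s>", eos="</s>"):
    """Assemble a Mistral-instruct style prompt from alternating user/assistant turns."""
    prompt = bos
    for index, message in enumerate(messages):
        # Turns must alternate, starting with the user.
        expected_role = "user" if index % 2 == 0 else "assistant"
        if message["role"] != expected_role:
            raise ValueError(f"turn {index} should be from {expected_role!r}")
        if message["role"] == "user":
            prompt += f"[INST] {message['content']} [/INST]"
        else:
            # Assumption: the answer follows [/INST] directly and is closed with eos.
            prompt += f"{message['content']}{eos}"
    return prompt


if __name__ == "__main__":
    demo = [
        {"role": "user", "content": "What is your favourite condiment?"},
        {"role": "assistant", "content": "Lemon juice."},
        {"role": "user", "content": "Do you have mayonnaise recipes?"},
    ]
    print(build_prompt(demo))
```

Ending the list on a user turn leaves the prompt open after the last `[/INST]`, which is where the model is expected to continue.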

An apply_chat_template example

Converting the raw model weights to Hugging Face format

Conversion script

$ python convert_mistral_weights_to_hf.py --input_dir Mistral-7B-v0.2-Instruct-raw --model_size 7B --output_dir Mistral-7B-v0.2-Instruct-hf
Traceback (most recent call last):
  File "/data/user/yicairun/repo/lm/mistralai/convert_mistral_weights_to_hf.py", line 276, in <module>
    main()
  File "/data/user/yicairun/repo/lm/mistralai/convert_mistral_weights_to_hf.py", line 264, in main
    write_model(
  File "/data/user/yicairun/repo/lm/mistralai/convert_mistral_weights_to_hf.py", line 92, in write_model
    sliding_window = int(params["sliding_window"])
KeyError: 'sliding_window'
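The KeyError comes from indexing params.json with a key that this release no longer contains. One possible workaround, not tried in this post, is to read the key tolerantly. `read_sliding_window` below is a hypothetical helper, and whether the rest of the conversion script accepts `None` is an assumption that would need checking.

```python
from typing import Optional


def read_sliding_window(params: dict) -> Optional[int]:
    """Return sliding_window as an int, or None when the key is absent or null."""
    value = params.get("sliding_window")
    return None if value is None else int(value)


if __name__ == "__main__":
    # Made-up params fragments: one without the key, one with it.
    print(read_sliding_window({"dim": 4096}))
    print(read_sliding_window({"dim": 4096, "sliding_window": 4096}))
```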

I gave up on the conversion and downloaded the Hugging Face weights directly, either from a mirror site or from ModelScope.

model-00001-of-00003.safetensors SHA256: 63654d601820b88b1fa8b4a98df5714f700fbc5b3df2cc4ecbabdced35096d31
model-00002-of-00003.safetensors SHA256: a42716540ecb2385d371f2109835921ff535406cac8fe8ff28f2f0b5fc7895bd
model-00003-of-00003.safetensors SHA256: 5f86e15cb3ed9078e30ae6e72445e109d0e337d9cde59b9aeea4ce8e44e54a5d

pytorch_model-00001-of-00003.bin SHA256: d8836f675fe1c4c43f3ff4e93f4cc0e97ef7a13e8c240fb39ad02d37ff303ef5
pytorch_model-00002-of-00003.bin SHA256: 58a7ddffb463397de5dbe1f1e2ec1ccf6aae2b549565f83f3ded124e0b4c5069
pytorch_model-00003-of-00003.bin SHA256: 75824d68dcf82d02b731b2bdfd3a9711acb7c58b8d566f4c0d3e9efac52f9a21
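Since the files come from a mirror, it is worth comparing them against the checksums above. The following is a small sketch that hashes each shard in chunks and reports its status; `sha256_of` and `verify_shards` are hypothetical helper names, and the model directory is assumed to be the current folder.

```python
import hashlib
from pathlib import Path

# Checksums copied from the list above.
EXPECTED_SHA256 = {
    "model-00001-of-00003.safetensors": "63654d601820b88b1fa8b4a98df5714f700fbc5b3df2cc4ecbabdced35096d31",
    "model-00002-of-00003.safetensors": "a42716540ecb2385d371f2109835921ff535406cac8fe8ff28f2f0b5fc7895bd",
    "model-00003-of-00003.safetensors": "5f86e15cb3ed9078e30ae6e72445e109d0e337d9cde59b9aeea4ce8e44e54a5d",
    "pytorch_model-00001-of-00003.bin": "d8836f675fe1c4c43f3ff4e93f4cc0e97ef7a13e8c240fb39ad02d37ff303ef5",
    "pytorch_model-00002-of-00003.bin": "58a7ddffb463397de5dbe1f1e2ec1ccf6aae2b549565f83f3ded124e0b4c5069",
    "pytorch_model-00003-of-00003.bin": "75824d68dcf82d02b731b2bdfd3a9711acb7c58b8d566f4c0d3e9efac52f9a21",
}


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in 1 MiB chunks so multi-GB shards never sit in memory at once."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_shards(model_dir, expected=EXPECTED_SHA256):
    """Return {filename: 'ok' | 'mismatch' | 'missing'} for every expected file."""
    results = {}
    for name, want in expected.items():
        path = Path(model_dir) / name
        if not path.is_file():
            results[name] = "missing"
        else:
            results[name] = "ok" if sha256_of(path) == want else "mismatch"
    return results


if __name__ == "__main__":
    for name, status in verify_shards(".").items():
        print(f"{status:8} {name}")
```

If only one of the two formats was downloaded, the files of the other format simply show up as missing.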

from transformers import AutoModelForCausalLM, AutoTokenizer

device = "cuda:7" # the device to load the model onto

model = AutoModelForCausalLM.from_pretrained("./")  # the safetensors files are loaded first; pytorch_model.bin is only loaded after model.safetensors.index.json is deleted
tokenizer = AutoTokenizer.from_pretrained("./")

messages = [
    {"role": "user", "content": "What is your favourite condiment?"},
    {"role": "assistant", "content": "Well, I'm quite partial to a good squeeze of fresh lemon juice. It adds just the right amount of zesty flavour to whatever I'm cooking up in the kitchen!"},
    {"role": "user", "content": "Do you have mayonnaise recipes?"}
]

encodeds = tokenizer.apply_chat_template(messages, return_tensors="pt")

model_inputs = encodeds.to(device)
model.to(device)

generated_ids = model.generate(model_inputs, max_new_tokens=1000, do_sample=True)
decoded = tokenizer.batch_decode(generated_ids)
print(decoded[0])
