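This post covers Lesson 2 of the second bootcamp session. I originally meant to supplement the official tutorial, but its quality turned out to be quite high: if you follow it step by step, there are basically no pitfalls. So these notes mainly take apart some of the things InternStudio wraps up for you, to head off problems when reproducing the setup locally.
Setting up the environment
First, the environment setup. The official tutorial says:
After entering the dev machine, type the environment setup command in the `terminal` (setup takes a while, so please be patient):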
studio-conda -o internlm-base -t demo
# Configuration equivalent to studio-conda
# conda create -n demo python==3.10 -y
# conda activate demo
# conda install pytorch==2.0.1 torchvision==0.15.2 torchaudio==2.0.2 pytorch-cuda=11.7 -c pytorch -c nvidia
A closer look at studio-conda
So what exactly is `studio-conda -o internlm-base -t demo`? Let's look at /root/.bashrc directly. It contains a single line:
source /share/.aide/config/bashrc
Next, look at /share/.aide/config/bashrc. This one is long, so here are just its last three lines:
export HF_ENDPOINT='https://hf-mirror.com'
alias studio-conda="/share/install_conda_env.sh"
alias studio-smi="/share/studio-smi"
Full contents of /share/.aide/config/bashrc:
#! /bin/bash
# ~/.bashrc: executed by bash(1) for non-login shells.
# see /usr/share/doc/bash/examples/startup-files (in the package bash-doc)
# for examples
# If not running interactively, don't do anything
case $- in
*i*) ;;
*) return;;
esac
# don't put duplicate lines or lines starting with space in the history.
# See bash(1) for more options
HISTCONTROL=ignoreboth
# append to the history file, don't overwrite it
shopt -s histappend
# for setting history length see HISTSIZE and HISTFILESIZE in bash(1)
HISTSIZE=1000
HISTFILESIZE=2000
# check the window size after each command and, if necessary,
# update the values of LINES and COLUMNS.
shopt -s checkwinsize
# If set, the pattern "**" used in a pathname expansion context will
# match all files and zero or more directories and subdirectories.
#shopt -s globstar
# make less more friendly for non-text input files, see lesspipe(1)
[ -x /usr/bin/lesspipe ] && eval "$(SHELL=/bin/sh lesspipe)"
# set variable identifying the chroot you work in (used in the prompt below)
if [ -z "${debian_chroot:-}" ] && [ -r /etc/debian_chroot ]; then
debian_chroot=$(cat /etc/debian_chroot)
fi
# set a fancy prompt (non-color, unless we know we "want" color)
case "$TERM" in
xterm-color|*-256color) color_prompt=yes;;
esac
# uncomment for a colored prompt, if the terminal has the capability; turned
# off by default to not distract the user: the focus in a terminal window
# should be on the output of commands, not on the prompt
#force_color_prompt=yes
if [ -n "$force_color_prompt" ]; then
if [ -x /usr/bin/tput ] && tput setaf 1 >&/dev/null; then
# We have color support; assume it's compliant with Ecma-48
# (ISO/IEC-6429). (Lack of such support is extremely rare, and such
# a case would tend to support setf rather than setaf.)
color_prompt=yes
else
color_prompt=
fi
fi
if [ "$color_prompt" = yes ]; then
PS1='${debian_chroot:+($debian_chroot)}\[\033[01;32m\]\u@\h\[\033[00m\]:\[\033[01;34m\]\w\[\033[00m\]\$ '
else
PS1='${debian_chroot:+($debian_chroot)}\u@\h:\w\$ '
fi
unset color_prompt force_color_prompt
# If this is an xterm set the title to user@host:dir
case "$TERM" in
xterm*|rxvt*)
PS1="\[\e]0;${debian_chroot:+($debian_chroot)}\u@\h: \w\a\]$PS1"
;;
*)
;;
esac
# enable color support of ls and also add handy aliases
if [ -x /usr/bin/dircolors ]; then
test -r ~/.dircolors && eval "$(dircolors -b ~/.dircolors)" || eval "$(dircolors -b)"
alias ls='ls --color=auto'
#alias dir='dir --color=auto'
#alias vdir='vdir --color=auto'
alias grep='grep --color=auto'
alias fgrep='fgrep --color=auto'
alias egrep='egrep --color=auto'
fi
# colored GCC warnings and errors
#export GCC_COLORS='error=01;31:warning=01;35:note=01;36:caret=01;32:locus=01:quote=01'
# some more ls aliases
alias ll='ls -alF'
alias la='ls -A'
alias l='ls -CF'
# Add an "alert" alias for long running commands. Use like so:
# sleep 10; alert
alias alert='notify-send --urgency=low -i "$([ $? = 0 ] && echo terminal || echo error)" "$(history|tail -n1|sed -e '\''s/^\s*[0-9]\+\s*//;s/[;&|]\s*alert$//'\'')"'
# Alias definitions.
# You may want to put all your additions into a separate file like
# ~/.bash_aliases, instead of adding them here directly.
# See /usr/share/doc/bash-doc/examples in the bash-doc package.
if [ -f ~/.bash_aliases ]; then
. ~/.bash_aliases
fi
# enable programmable completion features (you don't need to enable
# this, if it's already enabled in /etc/bash.bashrc and /etc/profile
# sources /etc/bash.bashrc).
if ! shopt -oq posix; then
if [ -f /usr/share/bash-completion/bash_completion ]; then
. /usr/share/bash-completion/bash_completion
elif [ -f /etc/bash_completion ]; then
. /etc/bash_completion
fi
fi
# >>> conda initialize >>>
# !! Contents within this block are managed by 'conda init' !!
__conda_setup="$('/root/.conda/condabin/conda' 'shell.bash' 'hook' 2> /dev/null)"
if [ $? -eq 0 ]; then
eval "$__conda_setup"
else
if [ -f "/root/.conda/etc/profile.d/conda.sh" ]; then
. "/root/.conda/etc/profile.d/conda.sh"
else
export PATH="/root/.conda/condabin:$PATH"
fi
fi
unset __conda_setup
# <<< conda initialize <<<
if [ -d "/root/.conda/envs/xtuner" ]; then
CONDA_ENV=xtuner
else
CONDA_ENV=base
fi
source activate $CONDA_ENV
cat /share/.aide/config/welcome_vgpu
#if [ $CONDA_ENV != "xtuner" ]; then
# echo -e """
# \033[31m 檢測到您尚未初始化xtuner環境, 建議執行> source init_xtuner_env.sh \033[0m
# """
#fi
export https_proxy=http://proxy.intern-ai.org.cn:50000
export http_proxy=http://proxy.intern-ai.org.cn:50000
export no_proxy='localhost,127.0.0.1,0.0.0.0,172.18.47.140'
export PATH=/root/.local/bin:$PATH
export HF_ENDPOINT='https://hf-mirror.com'
alias studio-conda="/share/install_conda_env.sh"
alias studio-smi="/share/studio-smi"
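The alias definitions at the end of this file can also be pulled out programmatically rather than by eye. The following is a minimal sketch, not anything InternStudio ships: `parse_aliases` and the sample text are my own, and the regex only handles simple one-line `alias name="value"` forms, not everything bash accepts.

```python
import re

# Matches simple one-line definitions such as: alias name="value" or alias name='value'
ALIAS_RE = re.compile(r"""^\s*alias\s+([\w.-]+)=(["'])(.*)\2\s*$""")


def parse_aliases(text):
    """Return {alias_name: command} for simple alias lines; commented lines are skipped."""
    aliases = {}
    for line in text.splitlines():
        if line.lstrip().startswith("#"):
            continue
        match = ALIAS_RE.match(line)
        if match:
            aliases[match.group(1)] = match.group(3)
    return aliases


# Sample text modeled on the tail of the bashrc shown above.
sample = '''
export HF_ENDPOINT='https://hf-mirror.com'
alias studio-conda="/share/install_conda_env.sh"
alias studio-smi="/share/studio-smi"
#alias dir='dir --color=auto'
'''

aliases = parse_aliases(sample)
print(aliases["studio-conda"])
```

On a real machine, the same function could be fed the contents of /share/.aide/config/bashrc instead of the sample string.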
Note the second-to-last line: alias studio-conda="/share/install_conda_env.sh". In other words, studio-conda is an alias for /share/install_conda_env.sh, so running `studio-conda -o internlm-base -t demo` actually invokes that script. Looking further into /share/install_conda_env.sh, the key part is:
HOME_DIR=/root
CONDA_HOME=$HOME_DIR/.conda
SHARE_CONDA_HOME=/share/conda_envs
SHARE_HOME=/share
echo -e "\033[34m [1/2] 開始安裝conda環境: <$target>. \033[0m"
sleep 3
tar --skip-old-files -xzvf /share/pkgs.tar.gz -C ${CONDA_HOME}
wait_echo&
wait_pid=$!
conda create -n $target --clone ${SHARE_CONDA_HOME}/${source}
if [ $? -ne 0 ]; then
echo -e "\033[31m 初始化conda環境: ${target}失敗 \033[0m"
exit 10
fi
kill $wait_pid
# for xtuner, re-install dependencies
case "$source" in
xtuner)
source_install_xtuner $target
;;
esac
echo -e "\033[34m [2/2] 同步當前conda環境至jupyterlab kernel \033[0m"
lab add $target
source $CONDA_HOME/bin/activate $target
cd $HOME_DIR
Full contents of /share/install_conda_env.sh:
#!/bin/bash
# clone internlm-base conda env to user's conda env
# created by xj on 01.07.2024
# modifed by xj on 01.19.2024 to fix bug of conda env clone
# modified by ljy on 01.26.2024 to extend
XTUNER_UPDATE_DATE=`cat /share/repos/UPDATE | grep xtuner |awk -F= '{print $2}'`
HOME_DIR=/root
CONDA_HOME=$HOME_DIR/.conda
SHARE_CONDA_HOME=/share/conda_envs
SHARE_HOME=/share
list() {
cat <<-EOF
預設環境 描述
internlm-base pytorch:2.0.1, pytorch-cuda:11.7
xtuner Xtuner(原始碼安裝: main $(echo -e "\033[4mhttps://github.com/InternLM/xtuner/tree/main\033[0m"), 更新日期:$XTUNER_UPDATE_DATE)
pytorch-2.1.2 pytorch:2.1.2, pytorch-cuda:11.8
EOF
}
help() {
cat <<-EOF
說明: 用於快速clone預設的conda環境
使用:
1. studio-conda env -l/list 列印預設的conda環境列表
2. studio-conda <target-conda-name> 快速clone: 預設複製internlm-base conda環境
3. studio-conda -t <target-conda-name> -o <origin-conda-name> 將預設的conda環境複製到指定的conda環境
EOF
}
clone() {
source=$1
target=$2
if [[ -z "$source" || -z "$target" ]]; then
echo -e "\033[31m 輸入不符合規範 \033[0m"
help
exit 1
fi
if [ ! -d "${SHARE_CONDA_HOME}/$source" ]; then
echo -e "\033[34m 指定的預設環境: $source不存在\033[0m"
list
exit 1
fi
if [ -d "${CONDA_HOME}/envs/$target" ]; then
echo -e "\033[34m 指定conda環境的目錄: ${CONDA_HOME}/envs/$target已存在, 將清空原目錄安裝 \033[0m"
wait_echo&
wait_pid=$!
rm -rf "${CONDA_HOME}/envs/$target"
kill $wait_pid
fi
echo -e "\033[34m [1/2] 開始安裝conda環境: <$target>. \033[0m"
sleep 3
tar --skip-old-files -xzvf /share/pkgs.tar.gz -C ${CONDA_HOME}
wait_echo&
wait_pid=$!
conda create -n $target --clone ${SHARE_CONDA_HOME}/${source}
if [ $? -ne 0 ]; then
echo -e "\033[31m 初始化conda環境: ${target}失敗 \033[0m"
exit 10
fi
kill $wait_pid
# for xtuner, re-install dependencies
case "$source" in
xtuner)
source_install_xtuner $target
;;
esac
echo -e "\033[34m [2/2] 同步當前conda環境至jupyterlab kernel \033[0m"
lab add $target
source $CONDA_HOME/bin/activate $target
cd $HOME_DIR
echo -e "\033[32m conda環境: $target安裝成功! \033[0m"
echo """
============================================
ALL DONE!
============================================
"""
}
source_install_xtuner() {
conda_env=$1
echo -e "\033[34m 原始碼安裝xtuner... \033[0m"
sleep 2
pip config set global.index-url https://pypi.tuna.tsinghua.edu.cn/simple
install=0
if [ -d "${HOME_DIR}/xtuner" ]; then
read -r -p "$HOME_DIR中已存在目錄xtuner: 是否清空目錄? [Y/N][yes/no]" input
case $input in
[yY][eE][sS]|[yY])
echo -e "\033[34m 清空目錄: $HOME_DIR/xtuner, 並同步原始碼至該目錄進行原始碼安裝... \033[0m"
install=1
;;
*)
echo -e "\033[34m 嘗試使用: $HOME_DIR/xtuner目錄進行原始碼安裝... \033[0m"
;;
esac
else
install=1
fi
if [ $install -eq 1 ]; then
rm -rf $HOME_DIR/xtuner
mkdir -p $HOME_DIR/xtuner
cp -rf $SHARE_HOME/repos/xtuner/* $HOME_DIR/xtuner/
fi
cd $HOME_DIR/xtuner
$CONDA_HOME/envs/$conda_env/bin/pip install -e '.[all]'
if [ $? -ne 0 ]; then
echo -e "\033[31m 原始碼安裝xtuner失敗 \033[0m"
exit 10
fi
$CONDA_HOME/envs/$conda_env/bin/pip install cchardet
$CONDA_HOME/envs/$conda_env/bin/pip install -U datasets
}
wait_echo() {
local i=0
local sp='/-\|'
local n=${#sp}
printf ' '
while sleep 0.1; do
printf '\b%s' "${sp:i++%n:1}"
done
}
dispatch() {
if [ $# -lt 1 ]; then
help
exit -2
fi
if [ $1 == "env" ]; then
list
exit 0
fi
if [[ $1 == "-h" || $1 == "help" ]]; then
help
exit 0
fi
origin_env=
target_env=
if [ $# -eq 1 ]; then
origin_env=internlm-base
target_env=$1
else
while getopts t:o: flag; do
case "${flag}" in
t) target_env=${OPTARG} ;;
o) origin_env=${OPTARG} ;;
esac
done
fi
echo -e "\033[32m 預設環境: $origin_env \033[0m"
echo -e "\033[32m 目標conda環境名稱: $target_env \033[0m"
sleep 3
clone $origin_env $target_env
}
dispatch $@
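This is the file that sets up the environment. The script defines a few variables and functions, then calls the dispatch function straight away. From there the flow is:
- Because the arguments we passed are `-o internlm-base -t demo`, dispatch goes directly to the script's clone function with the arguments `internlm-base demo`.
- CONDA_HOME is set to /root/.conda via `HOME_DIR=/root; CONDA_HOME=$HOME_DIR/.conda`, i.e. a folder inside the workspace.
- Then /share/pkgs.tar.gz is extracted into that directory, and the environment is built by cloning with `conda create --clone`.
So this command actually unpacks an archive and clones a preconfigured environment, which is quite different from the "equivalent" commands given in the tutorial.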
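The flow above can be sketched in Python to make the argument handling explicit. This is an illustration under my own naming, not the script itself: `resolve_envs` and `plan_commands` are hypothetical helpers, and the code only builds the command lists that the shell script would run. It does not execute anything.

```python
import argparse

SHARE_CONDA_HOME = "/share/conda_envs"  # value taken from the script
CONDA_HOME = "/root/.conda"             # HOME_DIR=/root, CONDA_HOME=$HOME_DIR/.conda


def resolve_envs(argv):
    """Mirror dispatch: a single bare argument means 'clone internlm-base to that name'."""
    if len(argv) == 1:
        return "internlm-base", argv[0]
    parser = argparse.ArgumentParser(prog="studio-conda", add_help=False)
    parser.add_argument("-o", dest="origin")
    parser.add_argument("-t", dest="target")
    args = parser.parse_args(argv)
    return args.origin, args.target


def plan_commands(origin, target):
    """List the two main steps of clone(): unpack the shared archive, then clone the env."""
    if not origin or not target:
        raise ValueError("both origin and target environment names are required")
    return [
        ["tar", "--skip-old-files", "-xzvf", "/share/pkgs.tar.gz", "-C", CONDA_HOME],
        ["conda", "create", "-n", target, "--clone", f"{SHARE_CONDA_HOME}/{origin}"],
    ]


origin, target = resolve_envs(["-o", "internlm-base", "-t", "demo"])
for cmd in plan_commands(origin, target):
    print(" ".join(cmd))
```

The sketch leaves out the script's other branches (the `env` and `help` subcommands, removal of an existing target directory, the xtuner reinstall, and the JupyterLab kernel sync).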
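Next, we need to run the following commands to finish configuring the environment. A small gripe: since the script already just unpacks an archive and runs a conda clone, why not ship a conda environment archive with these libraries preinstalled?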
conda activate demo
pip install huggingface-hub==0.17.3
pip install transformers==4.34
pip install psutil==5.9.8
pip install accelerate==0.24.1
pip install streamlit==1.32.2
pip install matplotlib==3.8.3
pip install modelscope==1.9.5
pip install sentencepiece==0.1.99
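When reproducing this locally, it can help to confirm that the pinned versions actually ended up in the active environment. Below is a minimal sketch using the standard library's `importlib.metadata`. The helpers `normalize` and `check_pins` are my own, and the pins are copied from the pip commands above.

```python
from importlib import metadata

# Pins copied from the pip commands above.
PINS = {
    "huggingface-hub": "0.17.3",
    "transformers": "4.34",
    "psutil": "5.9.8",
    "accelerate": "0.24.1",
    "streamlit": "1.32.2",
    "matplotlib": "3.8.3",
    "modelscope": "1.9.5",
    "sentencepiece": "0.1.99",
}


def normalize(version):
    """Drop trailing '.0' segments so that '4.34' and '4.34.0' compare equal."""
    parts = version.split(".")
    while len(parts) > 1 and parts[-1] == "0":
        parts.pop()
    return ".".join(parts)


def check_pins(pins):
    """Return {package: (expected, installed_or_None)} for every mismatch."""
    problems = {}
    for name, expected in pins.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None
        if installed is None or normalize(installed) != normalize(expected):
            problems[name] = (expected, installed)
    return problems


for name, (expected, installed) in check_pins(PINS).items():
    print(f"{name}: expected {expected}, found {installed}")
```

If the environment matches the pins, the script prints nothing.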
Downloading the model
Then download the model from ModelScope by calling modelscope.hub.snapshot_download:
import os
from modelscope.hub.snapshot_download import snapshot_download
os.system("mkdir /root/models")
save_dir="/root/models"
snapshot_download("Shanghai_AI_Laboratory/internlm2-chat-1_8b",
cache_dir=save_dir, revision='v1.1.0')
To be honest, having the official tutorial create the folder with `os.system("mkdir /root/models")` instead of calling os.mkdir is really bad practice. Don't copy it.
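One way to avoid shelling out is `os.makedirs` with `exist_ok=True`, which also creates missing parent directories and does not fail when the directory already exists. A minimal sketch follows. The `ensure_dir` helper is my own, and it works in a temporary directory rather than /root/models so that it runs anywhere.

```python
import os
import tempfile


def ensure_dir(path):
    """Create path (and any missing parents); do nothing if it already exists."""
    os.makedirs(path, exist_ok=True)
    return path


base = tempfile.mkdtemp()  # stand-in for /root in this sketch
save_dir = ensure_dir(os.path.join(base, "models"))
ensure_dir(save_dir)  # calling it a second time is harmless
print(os.path.isdir(save_dir))
```

In the download script, the equivalent change would be replacing the `os.system` line with `os.makedirs("/root/models", exist_ok=True)`.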
Model inference
Use the following code to run inference:
# Import the required libraries
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

model_name_or_path = "/root/models/Shanghai_AI_Laboratory/internlm2-chat-1_8b"

# Hugging Face's AutoTokenizer and AutoModelForCausalLM will be familiar to anyone who has worked with LLMs;
# they automatically load a pretrained model and its matching tokenizer.
tokenizer = AutoTokenizer.from_pretrained(model_name_or_path, trust_remote_code=True, device_map='cuda:0')
# trust_remote_code=True allows the custom model code shipped with the checkpoint to run.
# torch_dtype=torch.bfloat16 loads the weights in bf16 half precision (a dtype choice, not quantization) to save GPU memory.
# device_map='cuda:0' places the model on the first GPU.
model = AutoModelForCausalLM.from_pretrained(model_name_or_path, trust_remote_code=True, torch_dtype=torch.bfloat16, device_map='cuda:0')
model = model.eval()

system_prompt = """You are an AI assistant whose name is InternLM (書生·浦語).
- InternLM (書生·浦語) is a conversational language model that is developed by Shanghai AI Laboratory (上海人工智慧實驗室). It is designed to be helpful, honest, and harmless.
- InternLM (書生·浦語) can understand and communicate fluently in the language chosen by the user such as English and 中文.
"""
messages = [(system_prompt, '')]

print("=============Welcome to InternLM chatbot, type 'exit' to exit.=============")
while True:
    input_text = input("\nUser >>> ")
    input_text = input_text.replace(' ', '')  # Remove spaces from the user's input
    if input_text == "exit":  # Type exit to quit
        break
    length = 0
    # Iterate over the model's stream_chat method, which returns a generator for the conversation.
    # Each iteration yields the response so far and a second value that is ignored here (_).
    for response, _ in model.stream_chat(tokenizer, input_text, messages):
        # If the response is not empty, print the part from the last printed position (length)
        # to the end, and flush the output buffer.
        if response is not None:
            print(response[length:], flush=True, end="")
            # Update the last printed position so the next print starts in the right place.
            length = len(response)
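The incremental printing in the loop only works if each iteration yields the full response so far, which is what the `response[length:]` slicing implies. The sketch below isolates that logic with a stand-in generator, so no model is needed. `fake_stream_chat` and `new_chunks` are hypothetical names of my own, and the stand-in simply mimics pairs of (response so far, ignored second value).

```python
def fake_stream_chat(text):
    """Stand-in for model.stream_chat: yields (response_so_far, ignored) pairs."""
    so_far = ""
    for ch in text:
        so_far += ch
        yield so_far, []


def new_chunks(stream):
    """Turn cumulative responses into just the pieces that still need printing."""
    length = 0
    for response, _ in stream:
        if response is not None:
            yield response[length:]
            length = len(response)


chunks = list(new_chunks(fake_stream_chat("Hello!")))
print("".join(chunks))
```

Printing each chunk with `end=""` and `flush=True`, as the demo does, gives the typewriter effect without repeating text that is already on screen.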
Basic assignment results
Run the demo program with:
conda activate demo
python /root/demo/cli_demo.py
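The basic assignment was a breeze, haha. That said, the model's output did go off the rails once before:
Yes, the model just spat out the titles of 30 stories. I cut its output off.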
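Server GPU info
Out of curiosity, I took a look at the GPU info:
So it really is an A100. I am curious, though, how they cap a single dev machine's GPU memory usage at 10%, 30%, or 50%. Hahaha.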