A First Look at ollama

Posted by harrychinese on 2024-04-04

References

https://github.com/ollama/ollama
https://zhuanlan.zhihu.com/p/689555159
https://zhuanlan.zhihu.com/p/687099148
https://zhuanlan.zhihu.com/p/685166253
https://babyno.top/posts/2024/03/run-a-large-language-model-locally-2/ (includes a RAG example)
https://sspai.com/post/85193#!

Setting Environment Variables on Windows

  • OLLAMA_HOST, set to 0.0.0.0
  • OLLAMA_MODELS, set to D:\my_workspace\OLLAMA_MODELS\

Download and Install the Windows Version of ollama

After installation, ollama provides a command-line tool, ollama.exe:

ollama pull qwen:0.5b        # file: 395MB; pull only downloads the model
ollama run tinyllama         # file: 637MB; run downloads the model if needed, then starts inference
ollama run qwen:1.8b         # file: 1.1GB
ollama pull nomic-embed-text # file: 275MB; embedding model, used via the API rather than chat
ollama run qwen:7b           # file: 4.5GB
ollama run mistral
ollama run llama2
ollama run llama2-chinese
ollama serve                 # start the local ollama server, listening on port 11434
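
Once `ollama serve` is running, you can confirm the server is up and see which models have been pulled by calling the `GET /api/tags` endpoint. A minimal Python sketch (assumes the server is on the default port 11434):

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434"  # default ollama port

def model_names(tags_response):
    """Extract the model names from a GET /api/tags response body."""
    return [m["name"] for m in tags_response.get("models", [])]

def list_local_models(base_url=OLLAMA_URL):
    """Ask the running ollama server which models are available locally."""
    with urllib.request.urlopen(f"{base_url}/api/tags") as resp:
        return model_names(json.load(resp))
```

Calling `list_local_models()` against a running server returns the names of the models you have pulled, e.g. `qwen:0.5b` and `tinyllama:latest`.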

ollama API Examples

The requests below use the VS Code REST Client syntax. For some reason REST Client could not reach the server via localhost or 127.0.0.1.

Works:
GET http://0.0.0.0:11434/ HTTP/1.1

Does not work:
GET http://localhost:11434/ HTTP/1.1
GET http://127.0.0.1:11434/ HTTP/1.1


POST http://0.0.0.0:11434/api/embeddings HTTP/1.1
content-type: application/json

{
 "model": "qwen:0.5b",
 "prompt": "Here is an article about llamas..."
}

POST http://0.0.0.0:11434/api/embeddings HTTP/1.1
content-type: application/json

{
 "model": "nomic-embed-text",
 "prompt": "Here is an article about llamas..."
}
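
The embeddings endpoint returns a body of the form `{"embedding": [...]}`. A typical use, as in the RAG example linked above, is to compare two texts by the cosine similarity of their vectors. A hedged Python sketch (assumes a local server with `nomic-embed-text` pulled):

```python
import json
import math
import urllib.request

def embed(prompt, model="nomic-embed-text", base_url="http://localhost:11434"):
    """POST /api/embeddings and return the embedding vector for `prompt`."""
    body = json.dumps({"model": model, "prompt": prompt}).encode("utf-8")
    req = urllib.request.Request(
        f"{base_url}/api/embeddings",
        data=body,
        headers={"content-type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["embedding"]

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)
```

With a running server, `cosine_similarity(embed(q), embed(doc))` gives a rough relevance score between a query and a document.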

POST http://0.0.0.0:11434/api/show HTTP/1.1
content-type: application/json

{
 "name": "qwen:0.5b"
}


POST http://0.0.0.0:11434/api/generate HTTP/1.1
content-type: application/json

{
  "model": "qwen:0.5b",
  "prompt": "Here is an article about llamas...",
  "context": [
  ],
  "stream": false,
  "format":"json",
  "options": {
    "seed": 123,
    "temperature": 0
  }  
}
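
With `"stream": false`, `/api/generate` returns a single JSON object whose `response` field holds the generated text and whose `context` field can be passed back in the next request to continue the same conversation. A small sketch of building that payload (the helper name is mine, not part of the ollama API):

```python
def generate_payload(model, prompt, context=None, seed=123, temperature=0):
    """Build a non-streaming /api/generate request body. Pass the `context`
    array from a previous response to continue the same conversation;
    a fixed seed and temperature 0 make the output reproducible."""
    return {
        "model": model,
        "prompt": prompt,
        "context": context or [],
        "stream": False,
        "options": {"seed": seed, "temperature": temperature},
    }
```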

POST http://0.0.0.0:11434/api/chat HTTP/1.1
content-type: application/json

{
  "model": "qwen:0.5b",  
  "stream": false,
  "format":"json",
  "messages": [
    {
      "role": "user",
      "content": "why is the sky blue?"
    }
  ]  
}


## Send a chat message with a conversation history, and add a system role to set the system prompt.
POST http://0.0.0.0:11434/api/chat HTTP/1.1
content-type: application/json

{
  "model": "qwen:1.8b",  
  "stream": false,
  "format":"json",
  "messages": [
    {
      "role": "system",
      "content": "以海盜的口吻簡單作答, 以中文回覆"
    },    
    {
      "role": "user",
      "content": "why is the sky blue?"
    },
    {
      "role": "assistant",
      "content": "due to rayleigh scattering."
    },
    {
      "role": "user",
      "content": "請解釋一下光的折射?"
    }
  ]  
}
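
Multi-turn chat works by resending the whole `messages` array on every request; the non-streaming `/api/chat` response carries the reply in its `message` field. A hedged Python sketch of that loop (helper names are mine; assumes a local server):

```python
import json
import urllib.request

def chat_once(messages, model="qwen:1.8b", base_url="http://localhost:11434"):
    """POST the full message history to /api/chat (non-streaming) and
    return the assistant's reply, e.g. {"role": "assistant", "content": ...}."""
    body = json.dumps(
        {"model": model, "stream": False, "messages": messages}
    ).encode("utf-8")
    req = urllib.request.Request(
        f"{base_url}/api/chat",
        data=body,
        headers={"content-type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["message"]

def extend_history(messages, assistant_message, next_user_content):
    """Return a new history with the assistant's reply and the next user
    turn appended; the original list is left unchanged."""
    return messages + [
        assistant_message,
        {"role": "user", "content": next_user_content},
    ]
```

Each turn is then `reply = chat_once(history)` followed by `history = extend_history(history, reply, next_question)`.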