【ollama】(2): Set up the environment on Linux, build the ollama source, and test the Qwen model. It runs very fast locally, and is essentially a wrapper around the llama.cpp project.


About the ollama project

https://github.com/ollama/ollama/tree/main/docs

https://www.bilibili.com/video/BV1oS421w7aM/


1. Download the code and submodules

```
$ git clone --recurse-submodules https://github.com/ollama/ollama.git
Cloning into 'ollama'...
remote: Enumerating objects: 11260, done.
remote: Counting objects: 100% (494/494), done.
remote: Compressing objects: 100% (263/263), done.
remote: Total 11260 (delta 265), reused 357 (delta 190), pack-reused 10766
Receiving objects: 100% (11260/11260), 6.92 MiB | 57.00 KiB/s, done.
Resolving deltas: 100% (6984/6984), done.
Submodule 'llama.cpp' (https://github.com/ggerganov/llama.cpp.git) registered for path 'llm/llama.cpp'
Cloning into '/data/home/test/go/src/ollama/llm/llama.cpp'...
fatal: unable to access 'https://github.com/ggerganov/llama.cpp.git/': GnuTLS recv error (-110): The TLS connection was non-properly terminated.
```

If the clone fails partway like this, finish it by updating the submodule:

```
$ git submodule update
Cloning into '/data/home/test/go/src/ollama/llm/llama.cpp'...
remote: Enumerating objects: 12802, done.
remote: Counting objects: 100% (12802/12802), done.
remote: Compressing objects: 100% (3561/3561), done.
remote: Total 12483 (delta 9258), reused 12045 (delta 8826), pack-reused 0
Receiving objects: 100% (12483/12483), 10.19 MiB | 679.00 KiB/s, done.
Resolving deltas: 100% (9258/9258), completed with 260 local objects.
From https://github.com/ggerganov/llama.cpp
 * branch            c2101a2e909ac7c08976d414e64e96c90ee5fa9e -> FETCH_HEAD
Submodule path 'llm/llama.cpp': checked out 'c2101a2e909ac7c08976d414e64e96c90ee5fa9e'
```
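When GitHub access is this flaky, re-running the update until it completes usually gets there in the end. Below is a sketch of the two usual workarounds; the proxy address is a placeholder, not something from this walkthrough:

```bash
# --init registers any submodules that were never cloned; --recursive walks nested ones.
git submodule update --init --recursive

# If direct access keeps failing, route git's HTTPS traffic through a local proxy
# (http://127.0.0.1:7890 is a placeholder; substitute your own proxy address).
git config --global http.proxy http://127.0.0.1:7890
git submodule update --init --recursive
git config --global --unset http.proxy
```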

2. Try the build

```
$ go build
llm/payload_linux.go:7:12: pattern llama.cpp/build/linux/*/*/lib/*: no matching files found
```

The build fails because the llama.cpp code has to be compiled first:

```
test@thinkPadE15:~/go/src/ollama$ cd llm/generate/
test@thinkPadE15:~/go/src/ollama/llm/generate$ ls
gen_common.sh  gen_darwin.sh  generate_darwin.go  generate_linux.go  generate_windows.go  gen_linux.sh  gen_windows.ps1
test@thinkPadE15:~/go/src/ollama/llm/generate$ bash gen_linux.sh
-- Build files have been written to: /data/home/test/go/src/ollama/llm/llama.cpp/build/linux/x86_64/cpu
+ cmake --build ../llama.cpp/build/linux/x86_64/cpu --target ext_server -j8
[  6%] Generating build details from Git
[ 20%] Building C object CMakeFiles/ggml.dir/ggml-backend.c.o
[ 20%] Building C object CMakeFiles/ggml.dir/ggml.c.o
[ 20%] Building C object CMakeFiles/ggml.dir/ggml-quants.c.o
[ 26%] Building C object CMakeFiles/ggml.dir/ggml-alloc.c.o
-- Found Git: /usr/bin/git (found version "2.34.1")
[ 33%] Building CXX object common/CMakeFiles/build_info.dir/build-info.cpp.o
[ 33%] Built target build_info
```

After a fairly long compile, the output lands in the build folder, which now contains the required lib libraries.
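The glob in the earlier `go build` error (`llama.cpp/build/linux/*/*/lib/*`) shows exactly where the Go side expects these libraries, so you can verify the build landed in the right place before retrying:

```bash
# From the repo root: list what gen_linux.sh produced. The glob mirrors the
# pattern embedded in llm/payload_linux.go, so anything listed here will be
# picked up by the subsequent `go build`.
ls -lh llm/llama.cpp/build/linux/*/*/lib/
```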

3. Run go build again and it works

```
$ go build
$ ls -lha ollama
-rwxrwxr-x 1 test test 34M Mar  9 22:13 ollama
```
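Before pulling any models, it is worth a quick smoke test of the fresh binary. A minimal sketch, assuming the defaults: ollama runs as a client-server pair, and the server listens on port 11434 unless `OLLAMA_HOST` says otherwise:

```bash
# Print the version baked into the build.
./ollama --version

# Start the API server in the background; it listens on 127.0.0.1:11434 by default.
./ollama serve &
```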

The build prints a few compiler warnings, but they are harmless:

```
gpu_info.h:33:23: note: in definition of macro ‘LOG’
   33 |   fprintf(stderr, __VA_ARGS__); \
      |                   ^~~~~~~~~~~
gpu_info_cuda.c:159:41: note: format string is defined here
  159 |     LOG(h.verbose, "[%d] CUDA usedMem %ld\n", i, memInfo.used);
      |                                       ~~^
      |                                         |
      |                                         long int
      |                                       %lld
```

Run the Qwen model:

https://ollama.com/library/qwen

```
$ ./ollama run qwen:0.5b
pulling manifest
pulling fad2a06e4cc7...  25% ▕█████████████           ▏  96 MB/394 MB  368 KB/s  13m28s
```
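The weights are pulled automatically on first run. If you would rather separate the download from the chat session, `ollama pull` does just the download step:

```bash
# Fetch the model weights without opening an interactive session
# (assumes the server started in step 3 is still running).
./ollama pull qwen:0.5b
```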

Under the hood, the framework is still using the llama.cpp project.

Responses in testing are very fast:

```
>>> 你好
你好!有什么我可以帮您的吗?

>>> 北京景点推荐
在北京,有很多值得一去的地方。以下是一些推荐的北京景点:
1. 故宫:世界文化遗产,拥有丰富的历史文化和艺术价值。
2. 长城:世界自然遗产,拥有壮丽的山峦和河流景观。
3. 北京天安门广场:中国国家象征,拥有壮观的建筑景观。
以上是一些推荐的北京景点。希望对您有所帮助!

>>> 你是谁
我是阿里云研发的大规模语言模型“通义千问”。如果您有任何问题或需要帮助,请随时告诉我,我会尽力提供支持和解答。
```
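"Fast" can be quantified: `ollama run` accepts a `--verbose` flag that prints timing statistics (prompt and generation token rates) after each reply:

```bash
# After each response, prints stats such as "eval rate: ... tokens/s".
./ollama run qwen:0.5b --verbose
```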

4. Where the model files are stored

- macOS: `~/.ollama/models`
- Linux: `/usr/share/ollama/.ollama/models`
- Windows: `C:\Users\<username>\.ollama\models`
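On Linux you can check what is on disk under that directory. A sketch assuming the default layout, where model layers are stored as content-addressed blobs plus per-tag manifests:

```bash
# Total disk usage of everything ollama has downloaded.
du -sh /usr/share/ollama/.ollama/models

# Layers live in blobs/; manifests/ maps model tags (e.g. qwen:0.5b) to blobs.
ls -lh /usr/share/ollama/.ollama/models/blobs
ls -R  /usr/share/ollama/.ollama/models/manifests
```

Note that `/usr/share/ollama` belongs to the packaged system service; if you run the self-built binary as a regular user, the models land under `~/.ollama/models` instead.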

5. Testing with the OpenAI-compatible API

```
curl http://localhost:11434/api/chat -d '{
  "model": "qwen:7b",
  "messages": [
    { "role": "user", "content": "你好" }
  ]
}'

curl http://localhost:11434/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "qwen:7b",
    "stream": true,
    "messages": [
      { "role": "user", "content": "你好" }
    ]
  }'
```
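For scripting, the native `/api/chat` endpoint also accepts `"stream": false`, which returns a single JSON object and makes it easy to grab just the reply text. A sketch assuming `jq` is installed:

```bash
# Ask for one non-streamed JSON response and extract the assistant's text.
curl -s http://localhost:11434/api/chat -d '{
  "model": "qwen:7b",
  "stream": false,
  "messages": [{ "role": "user", "content": "你好" }]
}' | jq -r '.message.content'
```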

Streaming is supported; each chunk arrives as a server-sent-events (SSE) `data:` line:

data: {"id":"chatcmpl-263","object":"chat.completion.chunk","created":1710001611,"model":"qwen:7b","system_fingerprint":"fp_ollama","choices":[{"index":0,"delta":{"role":"assistant","content":"你好"},"finish_reason":null}]} data: {"id":"chatcmpl-263","object":"chat.completion.chunk","created":1710001611,"model":"qwen:7b","system_fingerprint":"fp_ollama","choices":[{"index":0,"delta":{"role":"assistant","content":"!"},"finish_reason":null}]} data: {"id":"chatcmpl-263","object":"chat.completion.chunk","created":1710001612,"model":"qwen:7b","system_fingerprint":"fp_ollama","choices":[{"index":0,"delta":{"role":"assistant","content":"有什么"},"finish_reason":null}]} data: {"id":"chatcmpl-263","object":"chat.completion.chunk","created":1710001612,"model":"qwen:7b","system_fingerprint":"fp_ollama","choices":[{"index":0,"delta":{"role":"assistant","content":"问题"},"finish_reason":null}]} data: {"id":"chatcmpl-263","object":"chat.completion.chunk","created":1710001612,"model":"qwen:7b","system_fingerprint":"fp_ollama","choices":[{"index":0,"delta":{"role":"assistant","content":"或者"},"finish_reason":null}]} data: {"id":"chatcmpl-263","object":"chat.completion.chunk","created":1710001612,"model":"qwen:7b","system_fingerprint":"fp_ollama","choices":[{"index":0,"delta":{"role":"assistant","content":"需要"},"finish_reason":null}]} data: {"id":"chatcmpl-263","object":"chat.completion.chunk","created":1710001612,"model":"qwen:7b","system_fingerprint":"fp_ollama","choices":[{"index":0,"delta":{"role":"assistant","content":"帮助"},"finish_reason":null}]} data: {"id":"chatcmpl-263","object":"chat.completion.chunk","created":1710001613,"model":"qwen:7b","system_fingerprint":"fp_ollama","choices":[{"index":0,"delta":{"role":"assistant","content":"吗"},"finish_reason":null}]} data: {"id":"chatcmpl-263","object":"chat.completion.chunk","created":1710001613,"model":"qwen:7b","system_fingerprint":"fp_ollama","choices":[{"index":0,"delta":{"role":"assistant","content":"?"},"finish_reason":null}]} data: {"id":"chatcmpl-263","object":"chat.completion.chunk","created":1710001613,"model":"qwen:7b","system_fingerprint":"fp_ollama","choices":[{"index":0,"delta":{"role":"assistant","content":"我"},"finish_reason":null}]} data: {"id":"chatcmpl-263","object":"chat.completion.chunk","created":1710001613,"model":"qwen:7b","system_fingerprint":"fp_ollama","choices":[{"index":0,"delta":{"role":"assistant","content":"在这里"},"finish_reason":null}]} data: {"id":"chatcmpl-263","object":"chat.completion.chunk","created":1710001613,"model":"qwen:7b","system_fingerprint":"fp_ollama","choices":[{"index":0,"delta":{"role":"assistant","content":"。"},"finish_reason":null}]} data: {"id":"chatcmpl-263","object":"chat.completion.chunk","created":1710001614,"model":"qwen:7b","system_fingerprint":"fp_ollama","choices":[{"index":0,"delta":{"role":"assistant","content":"\n"},"finish_reason":null}]} data: {"id":"chatcmpl-263","object":"chat.completion.chunk","created":1710001614,"model":"qwen:7b","system_fingerprint":"fp_ollama","choices":[{"index":0,"delta":{"role":"assistant","content":""},"finish_reason":"stop"}]} data: [DONE] 
