Merge pull request #4355 from MengqingCao/npu

Add docker-npu
This commit is contained in:
hoshi-hiyouga 2024-06-25 01:07:43 +08:00 committed by GitHub
commit d0e6059902
No known key found for this signature in database
GPG Key ID: B5690EEEBB952194
6 changed files with 179 additions and 30 deletions

View File

@ -383,11 +383,6 @@ source /usr/local/Ascend/ascend-toolkit/set_env.sh
| torch-npu | 2.1.0 | 2.1.0.post3 |
| deepspeed | 0.13.2 | 0.13.2 |
Docker image:
- 32GB: [Download page](http://mirrors.cn-central-221.ovaijisuan.com/detail/130.html)
- 64GB: [Download page](http://mirrors.cn-central-221.ovaijisuan.com/detail/131.html)
Remember to use `ASCEND_RT_VISIBLE_DEVICES` instead of `CUDA_VISIBLE_DEVICES` to specify the device to use.
If you cannot infer model on NPU devices, try setting `do_sample: false` in the configurations.
@ -424,17 +419,33 @@ llamafactory-cli webui
### Build Docker
#### Use Docker
For CUDA users:
```bash
docker build -f ./Dockerfile \
docker-compose -f ./docker/docker-cuda/docker-compose.yml up -d
docker-compose exec llamafactory bash
```
For Ascend NPU users:
```bash
docker-compose -f ./docker/docker-npu/docker-compose.yml up -d
docker-compose exec llamafactory bash
```
<details><summary>Build without Docker Compose</summary>
For CUDA users:
```bash
docker build -f ./docker/docker-cuda/Dockerfile \
--build-arg INSTALL_BNB=false \
--build-arg INSTALL_VLLM=false \
--build-arg INSTALL_DEEPSPEED=false \
--build-arg PIP_INDEX=https://pypi.org/simple \
-t llamafactory:latest .
docker run -it --gpus=all \
docker run -dit --gpus=all \
-v ./hf_cache:/root/.cache/huggingface/ \
-v ./data:/app/data \
-v ./output:/app/output \
@ -443,15 +454,43 @@ docker run -it --gpus=all \
--shm-size 16G \
--name llamafactory \
llamafactory:latest
docker exec -it llamafactory bash
```
#### Use Docker Compose
For Ascend NPU users:
```bash
docker-compose up -d
docker-compose exec llamafactory bash
# Change docker image upon your environment
docker build -f ./docker/docker-npu/Dockerfile \
--build-arg INSTALL_DEEPSPEED=false \
--build-arg PIP_INDEX=https://pypi.org/simple \
-t llamafactory:latest .
# Change `device` upon your resources
docker run -dit \
-v ./hf_cache:/root/.cache/huggingface/ \
-v ./data:/app/data \
-v ./output:/app/output \
-v /usr/local/dcmi:/usr/local/dcmi \
-v /usr/local/bin/npu-smi:/usr/local/bin/npu-smi \
-v /usr/local/Ascend/driver:/usr/local/Ascend/driver \
-v /etc/ascend_install.info:/etc/ascend_install.info \
-p 7860:7860 \
-p 8000:8000 \
--device /dev/davinci0 \
--device /dev/davinci_manager \
--device /dev/devmm_svm \
--device /dev/hisi_hdc \
--shm-size 16G \
--name llamafactory \
llamafactory:latest
docker exec -it llamafactory bash
```
</details>
<details><summary>Details about volume</summary>
- hf_cache: Utilize Hugging Face cache on the host machine. Reassignable if a cache already exists in a different directory.

View File

@ -360,7 +360,7 @@ pip install https://github.com/jllllll/bitsandbytes-windows-webui/releases/downl
<details><summary>昇腾 NPU 用户指南</summary>
在昇腾 NPU 设备上安装 LLaMA Factory 时,需要指定额外依赖项,使用 `pip install -e '.[torch-npu,metrics]'` 命令安装。此外,还需要安装 **[Ascend CANN Toolkit and Kernels](https://www.hiascend.com/developer/download/community/result?module=cann)**,安装方法请参考[安装教程](https://www.hiascend.com/document/detail/zh/CANNCommunityEdition/80RC2alpha002/quickstart/quickstart/quickstart_18_0004.html)或使用以下命令:
在昇腾 NPU 设备上安装 LLaMA Factory 时,需要指定额外依赖项,使用 `pip install -e ".[torch-npu,metrics]"` 命令安装。此外,还需要安装 **[Ascend CANN Toolkit and Kernels](https://www.hiascend.com/developer/download/community/result?module=cann)**,安装方法请参考[安装教程](https://www.hiascend.com/document/detail/zh/CANNCommunityEdition/80RC2alpha002/quickstart/quickstart/quickstart_18_0004.html)或使用以下命令:
```bash
# 请替换 URL 为 CANN 版本和设备型号对应的 URL
@ -383,11 +383,6 @@ source /usr/local/Ascend/ascend-toolkit/set_env.sh
| torch-npu | 2.1.0 | 2.1.0.post3 |
| deepspeed | 0.13.2 | 0.13.2 |
Docker 镜像:
- 32GB[下载地址](http://mirrors.cn-central-221.ovaijisuan.com/detail/130.html)
- 64GB[下载地址](http://mirrors.cn-central-221.ovaijisuan.com/detail/131.html)
请使用 `ASCEND_RT_VISIBLE_DEVICES` 而非 `CUDA_VISIBLE_DEVICES` 来指定运算设备。
如果遇到无法正常推理的情况,请尝试设置 `do_sample: false`
@ -424,17 +419,33 @@ llamafactory-cli webui
### 构建 Docker
#### 使用 Docker
CUDA 用户:
```bash
docker build -f ./Dockerfile \
docker-compose -f ./docker/docker-cuda/docker-compose.yml up -d
docker-compose exec llamafactory bash
```
昇腾 NPU 用户:
```bash
docker-compose -f ./docker/docker-npu/docker-compose.yml up -d
docker-compose exec llamafactory bash
```
<details><summary>不使用 Docker Compose 构建</summary>
CUDA 用户:
```bash
docker build -f ./docker/docker-cuda/Dockerfile \
--build-arg INSTALL_BNB=false \
--build-arg INSTALL_VLLM=false \
--build-arg INSTALL_DEEPSPEED=false \
--build-arg PIP_INDEX=https://pypi.org/simple \
-t llamafactory:latest .
docker run -it --gpus=all \
docker run -dit --gpus=all \
-v ./hf_cache:/root/.cache/huggingface/ \
-v ./data:/app/data \
-v ./output:/app/output \
@ -443,15 +454,43 @@ docker run -it --gpus=all \
--shm-size 16G \
--name llamafactory \
llamafactory:latest
docker exec -it llamafactory bash
```
#### 使用 Docker Compose
昇腾 NPU 用户:
```bash
docker-compose up -d
docker-compose exec llamafactory bash
# 根据您的环境选择镜像
docker build -f ./docker/docker-npu/Dockerfile \
--build-arg INSTALL_DEEPSPEED=false \
--build-arg PIP_INDEX=https://pypi.org/simple \
-t llamafactory:latest .
# 根据您的资源更改 `device`
docker run -dit \
-v ./hf_cache:/root/.cache/huggingface/ \
-v ./data:/app/data \
-v ./output:/app/output \
-v /usr/local/dcmi:/usr/local/dcmi \
-v /usr/local/bin/npu-smi:/usr/local/bin/npu-smi \
-v /usr/local/Ascend/driver:/usr/local/Ascend/driver \
-v /etc/ascend_install.info:/etc/ascend_install.info \
-p 7860:7860 \
-p 8000:8000 \
--device /dev/davinci0 \
--device /dev/davinci_manager \
--device /dev/devmm_svm \
--device /dev/hisi_hdc \
--shm-size 16G \
--name llamafactory \
llamafactory:latest
docker exec -it llamafactory bash
```
</details>
<details><summary>数据卷详情</summary>
- hf_cache使用宿主机的 Hugging Face 缓存文件夹,允许更改为新的目录。

View File

@ -12,13 +12,14 @@ ARG PIP_INDEX=https://pypi.org/simple
WORKDIR /app
# Install the requirements
COPY requirements.txt /app/
COPY requirements.txt /app
RUN pip config set global.index-url $PIP_INDEX
RUN pip config set global.extra-index-url $PIP_INDEX
RUN python -m pip install --upgrade pip
RUN python -m pip install -r requirements.txt
# Copy the rest of the application into the image
COPY . /app/
COPY . /app
# Install the LLaMA Factory
RUN EXTRA_PACKAGES="metrics"; \
@ -38,10 +39,9 @@ RUN EXTRA_PACKAGES="metrics"; \
VOLUME [ "/root/.cache/huggingface/", "/app/data", "/app/output" ]
# Expose port 7860 for the LLaMA Board
ENV GRADIO_SERVER_PORT 7860
EXPOSE 7860
# Expose port 8000 for the API service
ENV API_PORT 8000
EXPOSE 8000
# Launch LLaMA Board
CMD [ "llamafactory-cli", "webui" ]

View File

@ -1,8 +1,8 @@
services:
llamafactory:
build:
dockerfile: Dockerfile
context: .
dockerfile: ./docker/docker-cuda/Dockerfile
context: ../..
args:
INSTALL_BNB: false
INSTALL_VLLM: false

View File

@ -0,0 +1,41 @@
# Use the Ubuntu 22.04 image with CANN 8.0.rc1
# More versions can be found at https://hub.docker.com/r/cosdt/cann/tags
FROM cosdt/cann:8.0.rc1-910b-ubuntu22.04
ENV DEBIAN_FRONTEND=noninteractive
# Define installation arguments
ARG INSTALL_DEEPSPEED=false
ARG PIP_INDEX=https://pypi.org/simple
# Set the working directory
WORKDIR /app
# Install the requirements
COPY requirements.txt /app
RUN pip config set global.index-url $PIP_INDEX
RUN pip config set global.extra-index-url $PIP_INDEX
RUN python -m pip install --upgrade pip
RUN python -m pip install -r requirements.txt
# Copy the rest of the application into the image
COPY . /app
# Install the LLaMA Factory
RUN EXTRA_PACKAGES="torch-npu,metrics"; \
if [ "$INSTALL_DEEPSPEED" = "true" ]; then \
EXTRA_PACKAGES="${EXTRA_PACKAGES},deepspeed"; \
fi; \
pip install -e .[$EXTRA_PACKAGES] && \
pip uninstall -y transformer-engine flash-attn
# Set up volumes
VOLUME [ "/root/.cache/huggingface/", "/app/data", "/app/output" ]
# Expose port 7860 for the LLaMA Board
ENV GRADIO_SERVER_PORT 7860
EXPOSE 7860
# Expose port 8000 for the API service
ENV API_PORT 8000
EXPOSE 8000

View File

@ -0,0 +1,30 @@
services:
llamafactory:
build:
dockerfile: ./docker/docker-npu/Dockerfile
context: ../..
args:
INSTALL_DEEPSPEED: false
PIP_INDEX: https://pypi.org/simple
container_name: llamafactory
volumes:
- ./hf_cache:/root/.cache/huggingface/
- ./data:/app/data
- ./output:/app/output
- /usr/local/dcmi:/usr/local/dcmi
- /usr/local/bin/npu-smi:/usr/local/bin/npu-smi
- /usr/local/Ascend/driver:/usr/local/Ascend/driver
- /etc/ascend_install.info:/etc/ascend_install.info
ports:
- "7860:7860"
- "8000:8000"
ipc: host
tty: true
stdin_open: true
command: bash
devices:
- /dev/davinci0
- /dev/davinci_manager
- /dev/devmm_svm
- /dev/hisi_hdc
restart: unless-stopped