forked from p04798526/LLaMA-Factory-Mirror
commit d0e6059902

README.md
@@ -383,11 +383,6 @@ source /usr/local/Ascend/ascend-toolkit/set_env.sh
 | torch-npu | 2.1.0  | 2.1.0.post3 |
 | deepspeed | 0.13.2 | 0.13.2      |
 
-Docker image:
-
-- 32GB: [Download page](http://mirrors.cn-central-221.ovaijisuan.com/detail/130.html)
-- 64GB: [Download page](http://mirrors.cn-central-221.ovaijisuan.com/detail/131.html)
-
 Remember to use `ASCEND_RT_VISIBLE_DEVICES` instead of `CUDA_VISIBLE_DEVICES` to specify the device to use.
 
 If you cannot infer model on NPU devices, try setting `do_sample: false` in the configurations.
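As a usage sketch of the two tips kept above (the model name, template value, and `chat` invocation are illustrative assumptions, not part of this diff):

```bash
# Select NPU 0; this is the Ascend counterpart of CUDA_VISIBLE_DEVICES.
export ASCEND_RT_VISIBLE_DEVICES=0

# Hypothetical inference config: do_sample: false switches generation to
# greedy decoding, the workaround suggested above for NPU inference issues.
cat > npu_infer.yaml <<'EOF'
model_name_or_path: meta-llama/Meta-Llama-3-8B-Instruct
template: llama3
do_sample: false
EOF

llamafactory-cli chat npu_infer.yaml
```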
@@ -424,17 +419,33 @@ llamafactory-cli webui
 
 ### Build Docker
 
-#### Use Docker
+For CUDA users:
 
 ```bash
-docker build -f ./Dockerfile \
+docker-compose -f ./docker/docker-cuda/docker-compose.yml up -d
+docker-compose exec llamafactory bash
+```
+
+For Ascend NPU users:
+
+```bash
+docker-compose -f ./docker/docker-npu/docker-compose.yml up -d
+docker-compose exec llamafactory bash
+```
+
+<details><summary>Build without Docker Compose</summary>
+
+For CUDA users:
+
+```bash
+docker build -f ./docker/docker-cuda/Dockerfile \
     --build-arg INSTALL_BNB=false \
     --build-arg INSTALL_VLLM=false \
     --build-arg INSTALL_DEEPSPEED=false \
     --build-arg PIP_INDEX=https://pypi.org/simple \
     -t llamafactory:latest .
 
-docker run -it --gpus=all \
+docker run -dit --gpus=all \
     -v ./hf_cache:/root/.cache/huggingface/ \
     -v ./data:/app/data \
     -v ./output:/app/output \
@@ -443,15 +454,43 @@ docker run -it --gpus=all \
     --shm-size 16G \
     --name llamafactory \
     llamafactory:latest
+
+docker exec -it llamafactory bash
 ```
 
-#### Use Docker Compose
+For Ascend NPU users:
 
 ```bash
-docker-compose up -d
-docker-compose exec llamafactory bash
+# Change docker image upon your environment
+docker build -f ./docker/docker-npu/Dockerfile \
+    --build-arg INSTALL_DEEPSPEED=false \
+    --build-arg PIP_INDEX=https://pypi.org/simple \
+    -t llamafactory:latest .
+
+# Change `device` upon your resources
+docker run -dit \
+    -v ./hf_cache:/root/.cache/huggingface/ \
+    -v ./data:/app/data \
+    -v ./output:/app/output \
+    -v /usr/local/dcmi:/usr/local/dcmi \
+    -v /usr/local/bin/npu-smi:/usr/local/bin/npu-smi \
+    -v /usr/local/Ascend/driver:/usr/local/Ascend/driver \
+    -v /etc/ascend_install.info:/etc/ascend_install.info \
+    -p 7860:7860 \
+    -p 8000:8000 \
+    --device /dev/davinci0 \
+    --device /dev/davinci_manager \
+    --device /dev/devmm_svm \
+    --device /dev/hisi_hdc \
+    --shm-size 16G \
+    --name llamafactory \
+    llamafactory:latest
+
+docker exec -it llamafactory bash
 ```
+
+</details>
 
 <details><summary>Details about volume</summary>
 
 - hf_cache: Utilize Hugging Face cache on the host machine. Reassignable if a cache already exists in a different directory.
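A quick sanity check after bringing up the NPU variant (a sketch; it assumes the host driver paths bind-mounted above actually exist):

```bash
# Start the NPU stack and open a shell in the container (commands as above).
docker-compose -f ./docker/docker-npu/docker-compose.yml up -d
docker-compose exec llamafactory bash

# Inside the container: the compose file bind-mounts npu-smi and passes the
# davinci devices through, so both should be visible here.
npu-smi info
ls /dev/davinci*
```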

README_zh.md
@@ -360,7 +360,7 @@ pip install https://github.com/jllllll/bitsandbytes-windows-webui/releases/downl
 
 <details><summary>Ascend NPU user guide</summary>
 
-To install LLaMA Factory on Ascend NPU devices, specify the extra dependencies and install with `pip install -e '.[torch-npu,metrics]'`. You also need to install the **[Ascend CANN Toolkit and Kernels](https://www.hiascend.com/developer/download/community/result?module=cann)**; follow the [installation tutorial](https://www.hiascend.com/document/detail/zh/CANNCommunityEdition/80RC2alpha002/quickstart/quickstart/quickstart_18_0004.html) or use the following commands:
+To install LLaMA Factory on Ascend NPU devices, specify the extra dependencies and install with `pip install -e ".[torch-npu,metrics]"`. You also need to install the **[Ascend CANN Toolkit and Kernels](https://www.hiascend.com/developer/download/community/result?module=cann)**; follow the [installation tutorial](https://www.hiascend.com/document/detail/zh/CANNCommunityEdition/80RC2alpha002/quickstart/quickstart/quickstart_18_0004.html) or use the following commands:
 
 ```bash
 # Replace the URL with the one matching your CANN version and device model
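The only change in this hunk swaps single quotes for double quotes around the pip extras spec; a short sketch of why that matters (shell behavior noted from general experience, not from this commit):

```bash
# Both forms are equivalent in POSIX shells such as bash or zsh:
pip install -e ".[torch-npu,metrics]"
pip install -e '.[torch-npu,metrics]'

# On Windows, cmd.exe does not treat single quotes as quoting characters,
# so only the double-quoted form works across all common shells.
```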
@@ -383,11 +383,6 @@ source /usr/local/Ascend/ascend-toolkit/set_env.sh
 | torch-npu | 2.1.0  | 2.1.0.post3 |
 | deepspeed | 0.13.2 | 0.13.2      |
 
-Docker image:
-
-- 32GB: [Download page](http://mirrors.cn-central-221.ovaijisuan.com/detail/130.html)
-- 64GB: [Download page](http://mirrors.cn-central-221.ovaijisuan.com/detail/131.html)
-
 Please use `ASCEND_RT_VISIBLE_DEVICES` instead of `CUDA_VISIBLE_DEVICES` to specify the compute device.
 
 If inference does not work properly, try setting `do_sample: false`.
@@ -424,17 +419,33 @@ llamafactory-cli webui
 
 ### Build Docker
 
-#### Use Docker
+For CUDA users:
 
 ```bash
-docker build -f ./Dockerfile \
+docker-compose -f ./docker/docker-cuda/docker-compose.yml up -d
+docker-compose exec llamafactory bash
+```
+
+For Ascend NPU users:
+
+```bash
+docker-compose -f ./docker/docker-npu/docker-compose.yml up -d
+docker-compose exec llamafactory bash
+```
+
+<details><summary>Build without Docker Compose</summary>
+
+For CUDA users:
+
+```bash
+docker build -f ./docker/docker-cuda/Dockerfile \
     --build-arg INSTALL_BNB=false \
     --build-arg INSTALL_VLLM=false \
     --build-arg INSTALL_DEEPSPEED=false \
     --build-arg PIP_INDEX=https://pypi.org/simple \
     -t llamafactory:latest .
 
-docker run -it --gpus=all \
+docker run -dit --gpus=all \
     -v ./hf_cache:/root/.cache/huggingface/ \
     -v ./data:/app/data \
     -v ./output:/app/output \
@@ -443,15 +454,43 @@ docker run -it --gpus=all \
     --shm-size 16G \
     --name llamafactory \
     llamafactory:latest
+
+docker exec -it llamafactory bash
 ```
 
-#### Use Docker Compose
+For Ascend NPU users:
 
 ```bash
-docker-compose up -d
-docker-compose exec llamafactory bash
+# Choose the docker image according to your environment
+docker build -f ./docker/docker-npu/Dockerfile \
+    --build-arg INSTALL_DEEPSPEED=false \
+    --build-arg PIP_INDEX=https://pypi.org/simple \
+    -t llamafactory:latest .
+
+# Change `device` according to your resources
+docker run -dit \
+    -v ./hf_cache:/root/.cache/huggingface/ \
+    -v ./data:/app/data \
+    -v ./output:/app/output \
+    -v /usr/local/dcmi:/usr/local/dcmi \
+    -v /usr/local/bin/npu-smi:/usr/local/bin/npu-smi \
+    -v /usr/local/Ascend/driver:/usr/local/Ascend/driver \
+    -v /etc/ascend_install.info:/etc/ascend_install.info \
+    -p 7860:7860 \
+    -p 8000:8000 \
+    --device /dev/davinci0 \
+    --device /dev/davinci_manager \
+    --device /dev/devmm_svm \
+    --device /dev/hisi_hdc \
+    --shm-size 16G \
+    --name llamafactory \
+    llamafactory:latest
+
+docker exec -it llamafactory bash
 ```
+
+</details>
 
 <details><summary>Details about volumes</summary>
 
 - hf_cache: use the Hugging Face cache folder on the host machine; it may be redirected to a new directory.

Dockerfile → docker/docker-cuda/Dockerfile

@@ -12,13 +12,14 @@ ARG PIP_INDEX=https://pypi.org/simple
 WORKDIR /app
 
 # Install the requirements
-COPY requirements.txt /app/
+COPY requirements.txt /app
 RUN pip config set global.index-url $PIP_INDEX
+RUN pip config set global.extra-index-url $PIP_INDEX
 RUN python -m pip install --upgrade pip
 RUN python -m pip install -r requirements.txt
 
 # Copy the rest of the application into the image
-COPY . /app/
+COPY . /app
 
 # Install the LLaMA Factory
 RUN EXTRA_PACKAGES="metrics"; \
@@ -38,10 +39,9 @@ RUN EXTRA_PACKAGES="metrics"; \
 VOLUME [ "/root/.cache/huggingface/", "/app/data", "/app/output" ]
 
 # Expose port 7860 for the LLaMA Board
+ENV GRADIO_SERVER_PORT 7860
 EXPOSE 7860
 
 # Expose port 8000 for the API service
+ENV API_PORT 8000
 EXPOSE 8000
-
-# Launch LLaMA Board
-CMD [ "llamafactory-cli", "webui" ]
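The `EXTRA_PACKAGES` block that starts at the end of the first hunk assembles the pip extras from build arguments (the full pattern is visible in the new NPU Dockerfile below); a hedged build sketch using the arguments this commit documents:

```bash
# Build the CUDA image with DeepSpeed support baked in: INSTALL_DEEPSPEED=true
# makes the RUN block expand the extras to "metrics,deepspeed" before running
# `pip install -e .[...]`. The tag name is illustrative.
docker build -f ./docker/docker-cuda/Dockerfile \
    --build-arg INSTALL_DEEPSPEED=true \
    --build-arg PIP_INDEX=https://pypi.org/simple \
    -t llamafactory:deepspeed .
```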

docker-compose.yml → docker/docker-cuda/docker-compose.yml

@@ -1,8 +1,8 @@
 services:
   llamafactory:
     build:
-      dockerfile: Dockerfile
-      context: .
+      dockerfile: ./docker/docker-cuda/Dockerfile
+      context: ../..
     args:
       INSTALL_BNB: false
       INSTALL_VLLM: false

docker/docker-npu/Dockerfile (new file)

@@ -0,0 +1,41 @@
+# Use the Ubuntu 22.04 image with CANN 8.0.rc1
+# More versions can be found at https://hub.docker.com/r/cosdt/cann/tags
+FROM cosdt/cann:8.0.rc1-910b-ubuntu22.04
+
+ENV DEBIAN_FRONTEND=noninteractive
+
+# Define installation arguments
+ARG INSTALL_DEEPSPEED=false
+ARG PIP_INDEX=https://pypi.org/simple
+
+# Set the working directory
+WORKDIR /app
+
+# Install the requirements
+COPY requirements.txt /app
+RUN pip config set global.index-url $PIP_INDEX
+RUN pip config set global.extra-index-url $PIP_INDEX
+RUN python -m pip install --upgrade pip
+RUN python -m pip install -r requirements.txt
+
+# Copy the rest of the application into the image
+COPY . /app
+
+# Install the LLaMA Factory
+RUN EXTRA_PACKAGES="torch-npu,metrics"; \
+    if [ "$INSTALL_DEEPSPEED" = "true" ]; then \
+        EXTRA_PACKAGES="${EXTRA_PACKAGES},deepspeed"; \
+    fi; \
+    pip install -e .[$EXTRA_PACKAGES] && \
+    pip uninstall -y transformer-engine flash-attn
+
+# Set up volumes
+VOLUME [ "/root/.cache/huggingface/", "/app/data", "/app/output" ]
+
+# Expose port 7860 for the LLaMA Board
+ENV GRADIO_SERVER_PORT 7860
+EXPOSE 7860
+
+# Expose port 8000 for the API service
+ENV API_PORT 8000
+EXPOSE 8000
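Per the header comments, this image pins one CANN build in the `FROM` line; other builds are listed on the Docker Hub page it links. A sketch of switching the base (the replacement tag is a placeholder, check the tags page for real ones):

```bash
# Point the FROM line at a different cosdt/cann tag, then rebuild from the
# repository root; the rest of the Dockerfile is tag-agnostic.
sed -i 's|^FROM cosdt/cann:.*|FROM cosdt/cann:<other-tag>|' \
    docker/docker-npu/Dockerfile
docker build -f ./docker/docker-npu/Dockerfile -t llamafactory:latest .
```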

docker/docker-npu/docker-compose.yml (new file)

@@ -0,0 +1,30 @@
+services:
+  llamafactory:
+    build:
+      dockerfile: ./docker/docker-npu/Dockerfile
+      context: ../..
+      args:
+        INSTALL_DEEPSPEED: false
+        PIP_INDEX: https://pypi.org/simple
+    container_name: llamafactory
+    volumes:
+      - ./hf_cache:/root/.cache/huggingface/
+      - ./data:/app/data
+      - ./output:/app/output
+      - /usr/local/dcmi:/usr/local/dcmi
+      - /usr/local/bin/npu-smi:/usr/local/bin/npu-smi
+      - /usr/local/Ascend/driver:/usr/local/Ascend/driver
+      - /etc/ascend_install.info:/etc/ascend_install.info
+    ports:
+      - "7860:7860"
+      - "8000:8000"
+    ipc: host
+    tty: true
+    stdin_open: true
+    command: bash
+    devices:
+      - /dev/davinci0
+      - /dev/davinci_manager
+      - /dev/devmm_svm
+      - /dev/hisi_hdc
+    restart: unless-stopped
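The `devices` section passes through a single NPU plus the shared Ascend control devices; as a sketch, additional cards are exposed by listing one `/dev/davinciN` entry per card before starting the stack (my reading of the device layout, not stated in the commit):

```bash
# Host-side: each NPU shows up as /dev/davinciN; list the available cards,
# then mirror them as extra entries under `devices:` in the compose file.
# davinci_manager, devmm_svm and hisi_hdc are shared and stay listed once.
ls /dev/davinci*
docker-compose -f ./docker/docker-npu/docker-compose.yml up -d
```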