Update docs (#92)

* docs: standardize documentation

  Signed-off-by: YdrMaster <ydrml@hotmail.com>

* Update README.md

---------

Signed-off-by: YdrMaster <ydrml@hotmail.com>
Co-authored-by: zhengly123 <zhengly123@outlook.com>
2023-07-10 02:31:45 +08:00 · 7023454e32
parent ab74b6a321
commit 7023454e32
9 changed files with 211 additions and 172 deletions
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -0,0 +1,13 @@
 # Changelog
 All notable changes to this project will be documented in this file.
 The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/), and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
 ## Unreleased
 ### Added
 ### Modified
 ### Fixed
--- a/INSTALL_GUIDE_CN.md
+++ b/INSTALL_GUIDE_CN.md
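@@ -1,137 +0,0 @@
 # Installation and Deployment Manual
 ## Table of Contents
 - [Environment Setup](#environment-setup)
 - [Building the Project](#building-the-project)
 - [Technical Support](#technical-support)
 ## Environment Setup
 Current hardware and software support matrix:
 | Host CPU | Device        | OS           | Support |
 | -------- | ------------- | ------------ | ------- |
 | X86-64   | Nvidia GPU    | Ubuntu-22.04 | Yes     |
 | X86-64   | Cambricon MLU | Ubuntu-22.04 | Yes     |
 An X86-64 machine running Ubuntu-22.04 is recommended; this manual uses that environment as its example.
 1. Make sure GCC is a stable release, version 11.3 or later. If your GCC is older, build and install a newer one yourself with the help of either of the following:
   > [Official GCC documentation](https://gcc.gnu.org/onlinedocs/gcc-11.3.0/gcc/)
   > [Community installation walkthrough](https://zhuanlan.zhihu.com/p/509695395)
 2. Make sure CMake is a stable release, version 3.17 or later. If your CMake is older, build and install a newer one yourself with the help of either of the following:
   > [Official CMake documentation](https://cmake.org/install/)
   > [Community installation walkthrough](https://zhuanlan.zhihu.com/p/110793004)
 3. Install the software stack for your third-party accelerator. The project currently supports the following accelerators:
   > For an NVIDIA GPU, follow the official NVIDIA documentation:
   > > [Driver installation](https://www.nvidia.cn/geforce/drivers/),
   > > [CUDA Toolkit installation](https://developer.nvidia.com/cuda-toolkit),
   > > [cuDNN installation](https://developer.nvidia.com/rdp/cudnn-download),
   > > [cuBLAS installation](https://developer.nvidia.com/cublas),
   > > After installation, configure the environment variables so that the executable and library directories are on the paths the operating system searches, for example:
   > > ```bash
   > > # Add the following to your bashrc file, then source it
   > > export CUDA_HOME="/PATH/TO/YOUR/CUDA_HOME"
   > > export CUDNN_HOME="/PATH/TO/YOUR/CUDNN_HOME"
   > > export PATH="${CUDA_HOME}/bin:${PATH}"
   > > export LD_LIBRARY_PATH="${CUDA_HOME}/lib64:${LD_LIBRARY_PATH}"
   > > # If you would rather not keep these variables in bashrc permanently, set them in the env.sh file we provide and source it for temporary use
   > > source env.sh
   > > ```
   We strongly recommend a standard installation with everything under a single directory, to avoid unnecessary trouble.
   > For a Cambricon MLU, follow the official Cambricon documentation:
   > > [Driver installation](https://www.cambricon.com/docs/sdk_1.11.0/driver_5.10.6/user_guide_5.10.6/index.html),
   > > [CNToolkit installation](https://www.cambricon.com/docs/sdk_1.11.0/cntoolkit_3.4.1/cntoolkit_install_3.4.1/index.html),
   > > [CNNL installation](https://www.cambricon.com/docs/sdk_1.11.0/cambricon_cnnl_1.16.1/user_guide/index.html),
   > > After installation, configure the environment variables so that the executable and library directories are on the paths the operating system searches, for example:
   > > ```bash
   > > # Add the following to your bashrc file, then source it
   > > export NEUWARE_HOME="/usr/local/neuware"
   > > export PATH="${NEUWARE_HOME}/bin:${PATH}"
   > > export LD_LIBRARY_PATH="${NEUWARE_HOME}/lib64:${LD_LIBRARY_PATH}"
   > > # If you would rather not keep these variables in bashrc permanently, set them in the env.sh file we provide and source it for temporary use
   > > source env.sh
   > > ```
   > > We strongly recommend a standard installation with everything under a single directory, to avoid unnecessary trouble. Note also that the MLU upper-layer software supports a limited range of platforms: if you run on a machine or operating system outside that range, you need to use the Docker image of the upper-layer software after installing the driver.
 4. Make sure make, build-essential, python-is-python3, python-dev-is-python3, python3-pip, and libdw-dev are installed. Install any that are missing.
   > With apt-get, you can run:
   ```bash
   sudo apt-get install make cmake build-essential python-is-python3 python-dev-is-python3 python3-pip libdw-dev
   ```
   > For other package managers, please look up the equivalent commands yourself.
 5. Upgrade pip and switch to the Tsinghua (TUNA) mirror
   ```bash
   python -m pip install -i https://pypi.tuna.tsinghua.edu.cn/simple --upgrade pip
   pip config set global.index-url https://pypi.tuna.tsinghua.edu.cn/simple
   ```
 6. Install optional packages
   > To run the example code in this project, you need a few auxiliary packages. They are not required; skip them if you do not plan to run the examples.
   > > [PyTorch](https://pytorch.org/get-started/locally/): a widely used neural network programming framework
   > > [ONNX](https://onnx.ai/get-started.html): a widely used neural network model storage format and converter
   > > [onnxsim](https://pypi.org/project/onnxsim/): a small tool that simplifies ONNX models
   > > [onnx2torch](https://github.com/ENOT-AutoDL/onnx2torch): a small tool that converts ONNX models to PyTorch models
   > > [tqdm](https://pypi.org/project/tqdm/): a small tool that displays progress bars
   > To use the project's InfiniTest testing tool, you also need:
   > > [protobuf](https://github.com/protocolbuffers/protobuf): a serialization format together with its compiler, serializer, and parser
 ## Building the Project
 An X86-64 machine running Ubuntu-22.04 is recommended; this manual uses that environment as its example.
 1. Configure the environment
 Edit env.sh to set the environment variables, then run:
  ```bash
  source env.sh
  ```
 2. Build the project and install it as a Python package
 We provide one-step build options; run the commands below from the project root. The first run also installs the Python dependencies, so it takes somewhat longer. Please be patient.
   Build the CPU backend only, without any third-party accelerator:
   ```bash
   make install-python
   ```
   Build the CPU backend together with the NVIDIA GPU backend:
   ```bash
   export CUDA_HOME=/path/to/your/cuda_home
   make install-python CUDA=ON
   ```
   Build the CPU backend together with the Cambricon MLU backend:
   ```bash
   export NEUWARE_HOME=/path/to/your/neuware_home
   make install-python BANG=ON
   ```
 3. Usage
 Once installation succeeds, you can write and run code against the project's Python interface. For details, see the example code in example/Resnet/resnet.py and the user guide.
 ## Technical Support
 If you run into problems, please contact our technical support team.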
--- a/README.md
+++ b/README.md
@@ -1,20 +1,14 @@
 # InfiniTensor
-## Compilation on Lotus
+[中文项目简介](/README_CN.md) | Documentation | [中文文档](/docs/INDEX.md)
 # Compilation for cuda
 ``` bash
 # Enter the root of InfiniTensor
 source test/script/env_lotus.sh
 make CUDA=ON
 ```
 ## Compilation for intelcpu
 ``` bash
 # Enter the root of InfiniTensor
 source test/script/env_lotus.sh intelcpu
 mkdir build && cd build
 cmake -DUSE_INTELCPU=ON -DCMAKE_CXX_COMPILER=dpcpp .. && make -j 12
 ```
 [![Build](https://github.com/InfiniTensor/InfiniTensor/actions/workflows/workflow.yml/badge.svg?branch=master)](https://github.com/InfiniTensor/InfiniTensor/actions)
 [![issue](https://img.shields.io/github/issues/InfiniTensor/InfiniTensor)](https://github.com/InfiniTensor/InfiniTensor/issues)
 ![license](https://img.shields.io/github/license/InfiniTensor/InfiniTensor)
 InfiniTensor is a high-performance inference engine tailored for GPUs and AI accelerators. Its design focuses on effective deployment and swift academic validation.
 ## Get started 
 ### Make Commands
 - `make`/`make build`: Builds the project;
@@ -30,12 +24,22 @@ cmake -DUSE_INTELCPU=ON -DCMAKE_CXX_COMPILER=dpcpp .. && make -j 12
 ### CMake Options
-There are several configurable CMake options, see the [CMakeLists.txt file](/CMakeLists.txt#L5).
+There are several configurable CMake options, see the [CMakeLists.txt](/CMakeLists.txt#L5) file.
 - If `USE_BACKTRACE` is `ON`, `libdw-dev` has to be installed. See the README of [backward-cpp](https://github.com/bombela/backward-cpp) for details.
 - If `USE_PROTOBUF` is `ON`, `protobuf` has to be installed. See the README of [protobuf](https://github.com/protocolbuffers/protobuf) for details.
 - If `USE_CUDA` is `ON`, CUDA has to be installed.
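As a rough illustration of how the options above turn into a configure command, the sketch below builds `-D<NAME>=ON` flags from option names and prints the resulting `cmake` invocation without running it. The `cmake_flags` helper is hypothetical and not part of the project; only the option names `USE_CUDA`, `USE_BACKTRACE`, and `USE_PROTOBUF` come from the list above.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not shipped with the project):
# turns each option name into a -D<NAME>=ON flag.
cmake_flags() {
    local flags=()
    local name
    for name in "$@"; do
        flags+=("-D${name}=ON")
    done
    # Join the flags with single spaces.
    echo "${flags[*]}"
}

# Dry run: print the configure command instead of executing it.
echo "cmake $(cmake_flags USE_CUDA USE_BACKTRACE) .."
# prints: cmake -DUSE_CUDA=ON -DUSE_BACKTRACE=ON ..
```

Printing the command first makes it easy to confirm which dependencies (`libdw-dev`, `protobuf`, CUDA) the chosen options will require before starting a build.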
 ## Roadmap
 - [EinNet](https://github.com/InfiniTensor/InfiniTensor/tree/NNET_e2e) is going to be merged into the main branch.
 - Integration of [PET](https://github.com/thu-pacman/PET), a tensor program optimizer supporting partially equivalent transformations.
 - Supported hardware
    - ✔ NVIDIA GPU
    - ✔ Cambricon MLU
    - ⬜ Ascend NPU
    - ⬜ Kunlunxin XPU
 ## Contributor Guide
 InfiniTensor development is based on pull requests on GitHub. Before requesting a merge, a PR should satisfy the following requirements:
@@ -46,9 +50,22 @@ InfiniTensor development is based on the pull request on Github. Before requesti
 2. Receive at least one approval from reviewers.
 3. PR title should be concise since it is going to be the commit message in the main branch after merging and squashing.
-## Dependencies
+## Reference
 Please cite EinNet or PET in your publications if they help your research:
 ```
@article{zheng2023einnet,
  title={EINNET: Optimizing Tensor Programs with Derivation-Based Transformations},
  author={Zheng, Liyan and Wang, Haojie and Zhai, Jidong and Hu, Muyan and Ma, Zixuan and Wang, Tuowei and Huang, Shuhong and Miao, Xupeng and Tang, Shizhi and Huang, Kezhao and Jia, Zhihao},
  booktitle={17th USENIX Symposium on Operating Systems Design and Implementation (OSDI 23)},
  pages={739--755},
  year={2023}
 }
- [backward-cpp](https://github.com/bombela/backward-cpp): [v1.6](https://github.com/bombela/backward-cpp/releases/tag/v1.6)
+@inproceedings{wang2021pet,
- [googletest](https://github.com/google/googletest): [v1.13.0](https://github.com/google/googletest/releases/tag/v1.13.0)
+  title={PET: Optimizing tensor programs with partially equivalent transformations and automated corrections},
- [nlohmann_json_cmake_fetchcontent](https://github.com/ArthurSonzogni/nlohmann_json_cmake_fetchcontent): [v3.10.5](https://github.com/ArthurSonzogni/nlohmann_json_cmake_fetchcontent/releases/tag/v3.10.5)
+  author={Wang, Haojie and Zhai, Jidong and Gao, Mingyu and Ma, Zixuan and Tang, Shizhi and Zheng, Liyan and Li, Yuanzhi and Rong, Kaiyuan and Chen, Yuanyong and Jia, Zhihao},
- [pybind11](https://github.com/pybind/pybind11): [v2.10.3](https://github.com/pybind/pybind11/releases/tag/v2.10.3)
+  booktitle={15th USENIX Symposium on Operating Systems Design and Implementation (OSDI 21)},
  pages={37--54},
  year={2021}
 }
 ```
--- a/README_CN.md
+++ b/README_CN.md
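@@ -0,0 +1,13 @@
 # InfiniTensor
 ## Project Overview
 This project is a collection of compilers for deep learning that aims to narrow the gap between deep learning applications and backend hardware. It optimizes neural network models with compiler superoptimization techniques to obtain better performance. It also works together with deep learning frameworks to provide end-to-end compilation for different hardware backends, which makes migration and deployment easier for users.
 ## Project Design
 The design decouples the frontend from the backend and consists of three main modules:
 - Runtime module: wraps and supports the different accelerator backends and keeps them running. It also exposes a unified interface to the layers above, which simplifies building on top of it.
 - Compiler module: applies optimizing transformations to neural network models to obtain more efficient equivalent models.
 - Interface module: provides the programming and interaction interfaces through which users work with the system.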
--- a/docs/INDEX.md
+++ b/docs/INDEX.md
@@ -0,0 +1,5 @@
 # Project Documentation
 - [Installation and Deployment Guide](INSTALL_GUIDE_CN.md)
 - [Hardware Support](SUPPORT_MATRIX_CN.md)
 - [User Guide](USER_GUIDE_CN.md)
--- a/docs/INSTALL_GUIDE_CN.md
+++ b/docs/INSTALL_GUIDE_CN.md
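@@ -0,0 +1,142 @@
 # Installation and Deployment Guide
 ## Table of Contents
 - [Environment Setup](#environment-setup)
 - [Building the Project](#building-the-project)
 - [Technical Support](#technical-support)
 ## Environment Setup
 Current hardware and software support matrix:
 | Host CPU | Device        | OS           | Support |
 | -------- | ------------- | ------------ | ------- |
 | X86-64   | Nvidia GPU    | Ubuntu-22.04 | Yes     |
 | X86-64   | Cambricon MLU | Ubuntu-22.04 | Yes     |
 An X86-64 machine running Ubuntu-22.04 is recommended; this guide uses that environment as its example.
 1. Make sure GCC is a stable release, version 11.3 or later. If your GCC is older, build and install a newer one yourself with the help of either of the following:
   - [Official GCC documentation](https://gcc.gnu.org/onlinedocs/gcc-11.3.0/gcc/)
   - [Community installation walkthrough](https://zhuanlan.zhihu.com/p/509695395)
 2. Make sure CMake is a stable release, version 3.17 or later. If your CMake is older, build and install a newer one yourself with the help of either of the following:
   - [Official CMake documentation](https://cmake.org/install/)
   - [Community installation walkthrough](https://zhuanlan.zhihu.com/p/110793004)
 3. Install the software stack for your third-party accelerator. The project currently supports the following accelerators:
   - For an NVIDIA GPU, follow the official NVIDIA documentation:
     > [Driver installation](https://www.nvidia.cn/geforce/drivers/),
     > [CUDA Toolkit installation](https://developer.nvidia.com/cuda-toolkit),
     > [cuDNN installation](https://developer.nvidia.com/rdp/cudnn-download),
     > [cuBLAS installation](https://developer.nvidia.com/cublas),
     > After installation, configure the environment variables so that the executable and library directories are on the paths the operating system searches, for example:
     >
     > ```bash
     > # Add the following to your bashrc file, then source it
     > export CUDA_HOME="/PATH/TO/YOUR/CUDA_HOME"
     > export CUDNN_HOME="/PATH/TO/YOUR/CUDNN_HOME"
     > export PATH="${CUDA_HOME}/bin:${PATH}"
     > export LD_LIBRARY_PATH="${CUDA_HOME}/lib64:${LD_LIBRARY_PATH}"
     > # If you would rather not keep these variables in bashrc permanently, set them in the env.sh file we provide and source it for temporary use
     > source env.sh
     > ```
     We strongly recommend a standard installation with everything under a single directory, to avoid unnecessary trouble.
   - For a Cambricon MLU, follow the official Cambricon documentation:
     > [Driver installation](https://www.cambricon.com/docs/sdk_1.11.0/driver_5.10.6/user_guide_5.10.6/index.html),
     > [CNToolkit installation](https://www.cambricon.com/docs/sdk_1.11.0/cntoolkit_3.4.1/cntoolkit_install_3.4.1/index.html),
     > [CNNL installation](https://www.cambricon.com/docs/sdk_1.11.0/cambricon_cnnl_1.16.1/user_guide/index.html),
     > After installation, configure the environment variables so that the executable and library directories are on the paths the operating system searches, for example:
     >
     > ```bash
     > # Add the following to your bashrc file, then source it
     > export NEUWARE_HOME="/usr/local/neuware"
     > export PATH="${NEUWARE_HOME}/bin:${PATH}"
     > export LD_LIBRARY_PATH="${NEUWARE_HOME}/lib64:${LD_LIBRARY_PATH}"
     > # If you would rather not keep these variables in bashrc permanently, set them in the env.sh file we provide and source it for temporary use
     > source env.sh
     > ```
     We strongly recommend a standard installation with everything under a single directory, to avoid unnecessary trouble. Note also that the MLU upper-layer software supports a limited range of platforms: if you run on a machine or operating system outside that range, you need to use the Docker image of the upper-layer software after installing the driver.
 4. Make sure make, build-essential, python-is-python3, python-dev-is-python3, python3-pip, and libdw-dev are installed. Install any that are missing.
   - With apt-get, you can run:
     ```bash
     sudo apt-get install make cmake build-essential python-is-python3 python-dev-is-python3 python3-pip libdw-dev
     ```
 5. Upgrade pip and switch to the Tsinghua (TUNA) mirror
   ```bash
   python -m pip install -i https://pypi.tuna.tsinghua.edu.cn/simple --upgrade pip
   pip config set global.index-url https://pypi.tuna.tsinghua.edu.cn/simple
   ```
 6. Install optional packages
   - To run the example code in this project, you need a few auxiliary packages. They are not required; skip them if you do not plan to run the examples.
     > [PyTorch](https://pytorch.org/get-started/locally/): a widely used neural network programming framework
     > [ONNX](https://onnx.ai/get-started.html): a widely used neural network model storage format and converter
     > [onnxsim](https://pypi.org/project/onnxsim/): a small tool that simplifies ONNX models
     > [onnx2torch](https://github.com/ENOT-AutoDL/onnx2torch): a small tool that converts ONNX models to PyTorch models
     > [tqdm](https://pypi.org/project/tqdm/): a small tool that displays progress bars
   - To use the project's InfiniTest testing tool, you also need:
     > [protobuf](https://github.com/protocolbuffers/protobuf): a serialization format together with its compiler, serializer, and parser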
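The GCC 11.3 and CMake 3.17 minimums from steps 1 and 2 can be checked before going further. The script below is a minimal sketch, not something shipped with the project: `version_ge` and `check_tool` are hypothetical helpers, and the comparison assumes GNU `sort -V` is available, as it is on Ubuntu.

```bash
#!/usr/bin/env bash
# Hypothetical helpers (not part of the project) for checking the
# minimum tool versions named in this guide.

# version_ge A B: succeeds when version A is at least version B.
# Relies on GNU sort -V (version sort) placing the smaller version first.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# check_tool NAME CURRENT REQUIRED: prints a verdict, returns 1 if too old.
check_tool() {
    if version_ge "$2" "$3"; then
        echo "$1 $2: OK (>= $3)"
    else
        echo "$1 $2: too old (need >= $3)"
        return 1
    fi
}

# Only query tools that are actually installed.
if command -v gcc >/dev/null 2>&1; then
    check_tool gcc "$(gcc -dumpfullversion -dumpversion)" 11.3 || true
fi
if command -v cmake >/dev/null 2>&1; then
    check_tool cmake "$(cmake --version | head -n 1 | awk '{print $3}')" 3.17 || true
fi
```

The script only reports; it does not install anything, so it is safe to run before and after upgrading either tool.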
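 ## Building the Project
 An X86-64 machine running Ubuntu-22.04 is recommended; this guide uses that environment as its example.
 1. Configure the environment
   Edit env.sh to set the environment variables, then run:
   ```bash
   source env.sh
   ```
 2. Build the project and install it as a Python package
   We provide one-step build options; run the commands below from the project root. The first run also installs the Python dependencies, so it takes somewhat longer. Please be patient.
   Build the CPU backend only, without any third-party accelerator:
   ```bash
   make install-python
   ```
   Build the CPU backend together with the NVIDIA GPU backend:
   ```bash
   export CUDA_HOME=/path/to/your/cuda_home
   make install-python CUDA=ON
   ```
   Build the CPU backend together with the Cambricon MLU backend:
   ```bash
   export NEUWARE_HOME=/path/to/your/neuware_home
   make install-python BANG=ON
   ```
 3. Usage
   Once installation succeeds, you can write and run code against the project's Python interface. For details, see the example code in example/Resnet/resnet.py and the user guide.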
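The three build variants in step 2 differ only in the flag passed to `make install-python`. As a sketch of how a wrapper might choose between them, the hypothetical `backend_flag` function below picks the flag from whichever SDK variable is set. Preferring `CUDA_HOME` when both variables are set is an arbitrary assumption of this sketch, not project behavior.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of the project): choose the make flag
# based on which accelerator SDK variable is set in the environment.
backend_flag() {
    if [ -n "${CUDA_HOME:-}" ]; then
        echo "CUDA=ON"      # NVIDIA GPU backend
    elif [ -n "${NEUWARE_HOME:-}" ]; then
        echo "BANG=ON"      # Cambricon MLU backend
    fi
    # Neither variable set: print nothing, meaning a CPU-only build.
}

# Dry run: print the build command rather than executing it.
flag="$(backend_flag)"
echo "make install-python${flag:+ $flag}"
```

With neither variable set the sketch prints the CPU-only command, matching the first variant in step 2.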
 ## Technical Support
 If you run into problems, please contact our technical support team.
--- a/docs/SUPPORT_MATRIX_CN.md
+++ b/docs/SUPPORT_MATRIX_CN.md
--- a/docs/TODO.md
+++ b/docs/TODO.md
@@ -0,0 +1 @@
--- a/docs/USER_GUIDE_CN.md
+++ b/docs/USER_GUIDE_CN.md
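@@ -2,8 +2,6 @@
 ## Table of Contents
 - [Project Overview](#project-overview)
 - [Project Design](#project-design)
 - [Usage](#usage)
 - [Python Frontend Guide](#python-frontend-guide)
  - [Importing an ONNX Model](#importing-an-onnx-model)
@@ -13,18 +11,6 @@
 - [Technical Support](#technical-support)
 - [Testing](#testing)
 ## Project Overview
 This project is a collection of compilers for deep learning that aims to narrow the gap between deep learning applications and backend hardware. It optimizes neural network models with compiler superoptimization techniques to obtain better performance. It also works together with deep learning frameworks to provide end-to-end compilation for different hardware backends, which makes migration and deployment easier for users.
 ## Project Design
 The design decouples the frontend from the backend and consists of three main modules:
 - Runtime module: wraps and supports the different accelerator backends and keeps them running. It also exposes a unified interface to the layers above, which simplifies building on top of it.
 - Compiler module: applies optimizing transformations to neural network models to obtain more efficient equivalent models.
 - Interface module: provides the programming and interaction interfaces through which users work with the system.
 ## Usage
 Project management tasks are implemented in the [Makefile](Makefile), which supports the following:
@@ -162,7 +148,6 @@ python resnet.py -h
 If you run into problems with this project, please contact our technical support team.
 ## Testing
 Besides the unit tests `make test-cpp` and `make test-onnx`, there are other ways to test that importing, exporting, and optimizing an individual model is correct.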