forked from p04798526/LLaMA-Factory-Mirror
update readme, add starcoder2, cosmopedia
This commit is contained in:
parent
1006f372ae
commit
894d183214
31
README.md
31
README.md
|
@ -5,29 +5,21 @@
|
||||||
[![GitHub last commit](https://img.shields.io/github/last-commit/hiyouga/LLaMA-Factory)](https://github.com/hiyouga/LLaMA-Factory/commits/main)
|
[![GitHub last commit](https://img.shields.io/github/last-commit/hiyouga/LLaMA-Factory)](https://github.com/hiyouga/LLaMA-Factory/commits/main)
|
||||||
[![PyPI](https://img.shields.io/pypi/v/llmtuner)](https://pypi.org/project/llmtuner/)
|
[![PyPI](https://img.shields.io/pypi/v/llmtuner)](https://pypi.org/project/llmtuner/)
|
||||||
[![Downloads](https://static.pepy.tech/badge/llmtuner)](https://pypi.org/project/llmtuner/)
|
[![Downloads](https://static.pepy.tech/badge/llmtuner)](https://pypi.org/project/llmtuner/)
|
||||||
[![Citation](https://img.shields.io/badge/Citation-21-green)](#projects-using-llama-factory)
|
[![Citation](https://img.shields.io/badge/citation-21-green)](#projects-using-llama-factory)
|
||||||
[![GitHub pull request](https://img.shields.io/badge/PRs-welcome-blue)](https://github.com/hiyouga/LLaMA-Factory/pulls)
|
[![GitHub pull request](https://img.shields.io/badge/PRs-welcome-blue)](https://github.com/hiyouga/LLaMA-Factory/pulls)
|
||||||
[![Discord](https://dcbadge.vercel.app/api/server/rKfvV9r9FK?compact=true&style=flat)](https://discord.gg/rKfvV9r9FK)
|
[![Discord](https://dcbadge.vercel.app/api/server/rKfvV9r9FK?compact=true&style=flat)](https://discord.gg/rKfvV9r9FK)
|
||||||
[![Twitter](https://img.shields.io/twitter/follow/llamafactory_ai)](https://twitter.com/llamafactory_ai)
|
[![Twitter](https://img.shields.io/twitter/follow/llamafactory_ai)](https://twitter.com/llamafactory_ai)
|
||||||
[![Spaces](https://img.shields.io/badge/🤗-Open%20In%20Spaces-blue)](https://huggingface.co/spaces/hiyouga/LLaMA-Board)
|
[![Spaces](https://img.shields.io/badge/🤗-Open%20in%20Spaces-blue)](https://huggingface.co/spaces/hiyouga/LLaMA-Board)
|
||||||
[![Studios](https://img.shields.io/badge/ModelScope-Open%20In%20Studios-blue)](https://modelscope.cn/studios/hiyouga/LLaMA-Board)
|
[![Studios](https://img.shields.io/badge/ModelScope-Open%20in%20Studios-blue)](https://modelscope.cn/studios/hiyouga/LLaMA-Board)
|
||||||
[![Open in Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/1eRTPn37ltBbYsISy9Aw2NuI2Aq5CQrD9?usp=sharing)
|
[![Open in Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/1eRTPn37ltBbYsISy9Aw2NuI2Aq5CQrD9?usp=sharing)
|
||||||
|
|
||||||
👋 Join our [WeChat](assets/wechat.jpg).
|
👋 Join our [WeChat](assets/wechat.jpg).
|
||||||
|
|
||||||
\[ English | [中文](README_zh.md) \]
|
\[ English | [中文](README_zh.md) \]
|
||||||
|
|
||||||
## LLaMA Board: A One-stop Web UI for Getting Started with LLaMA Factory
|
|
||||||
|
|
||||||
Preview LLaMA Board at **[🤗 Spaces](https://huggingface.co/spaces/hiyouga/LLaMA-Board)** and **[ModelScope](https://modelscope.cn/studios/hiyouga/LLaMA-Board)**, or launch it locally with `CUDA_VISIBLE_DEVICES=0 python src/train_web.py`.
|
|
||||||
|
|
||||||
Here is an example of altering the self-cognition of an instruction-tuned language model within 10 minutes on a single GPU.
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
https://github.com/hiyouga/LLaMA-Factory/assets/16256802/9840a653-7e9c-41c8-ae89-7ace5698baf6
|
https://github.com/hiyouga/LLaMA-Factory/assets/16256802/9840a653-7e9c-41c8-ae89-7ace5698baf6
|
||||||
|
|
||||||
|
Open in [🤗 Spaces](https://huggingface.co/spaces/hiyouga/LLaMA-Board) / [ModelScope](https://modelscope.cn/studios/hiyouga/LLaMA-Board) / [Colab](https://colab.research.google.com/drive/1eRTPn37ltBbYsISy9Aw2NuI2Aq5CQrD9?usp=sharing) / [Local machine](#llama-board-gui).
|
||||||
|
|
||||||
## Table of Contents
|
## Table of Contents
|
||||||
|
|
||||||
|
@ -133,6 +125,7 @@ Compared to ChatGLM's [P-Tuning](https://github.com/THUDM/ChatGLM2-6B/tree/main/
|
||||||
| [Phi-1.5/2](https://huggingface.co/microsoft) | 1.3B/2.7B | q_proj,v_proj | - |
|
| [Phi-1.5/2](https://huggingface.co/microsoft) | 1.3B/2.7B | q_proj,v_proj | - |
|
||||||
| [Qwen](https://huggingface.co/Qwen) | 1.8B/7B/14B/72B | c_attn | qwen |
|
| [Qwen](https://huggingface.co/Qwen) | 1.8B/7B/14B/72B | c_attn | qwen |
|
||||||
| [Qwen1.5](https://huggingface.co/Qwen) | 0.5B/1.8B/4B/7B/14B/72B | q_proj,v_proj | qwen |
|
| [Qwen1.5](https://huggingface.co/Qwen) | 0.5B/1.8B/4B/7B/14B/72B | q_proj,v_proj | qwen |
|
||||||
|
| [StarCoder2](https://huggingface.co/bigcode) | 3B/7B/15B | q_proj,v_proj | - |
|
||||||
| [XVERSE](https://huggingface.co/xverse) | 7B/13B/65B | q_proj,v_proj | xverse |
|
| [XVERSE](https://huggingface.co/xverse) | 7B/13B/65B | q_proj,v_proj | xverse |
|
||||||
| [Yi](https://huggingface.co/01-ai) | 6B/34B | q_proj,v_proj | yi |
|
| [Yi](https://huggingface.co/01-ai) | 6B/34B | q_proj,v_proj | yi |
|
||||||
| [Yuan](https://huggingface.co/IEITYuan) | 2B/51B/102B | q_proj,v_proj | yuan |
|
| [Yuan](https://huggingface.co/IEITYuan) | 2B/51B/102B | q_proj,v_proj | yuan |
|
||||||
|
@ -210,6 +203,7 @@ Please refer to [constants.py](src/llmtuner/extras/constants.py) for a full list
|
||||||
- [LMSYS Chat 1M (en)](https://huggingface.co/datasets/lmsys/lmsys-chat-1m)
|
- [LMSYS Chat 1M (en)](https://huggingface.co/datasets/lmsys/lmsys-chat-1m)
|
||||||
- [Evol Instruct V2 (en)](https://huggingface.co/datasets/WizardLM/WizardLM_evol_instruct_V2_196k)
|
- [Evol Instruct V2 (en)](https://huggingface.co/datasets/WizardLM/WizardLM_evol_instruct_V2_196k)
|
||||||
- [Glaive Function Calling V2 (en)](https://huggingface.co/datasets/glaiveai/glaive-function-calling-v2)
|
- [Glaive Function Calling V2 (en)](https://huggingface.co/datasets/glaiveai/glaive-function-calling-v2)
|
||||||
|
- [Cosmopedia (en)](https://huggingface.co/datasets/HuggingFaceTB/cosmopedia)
|
||||||
- [Open Assistant (de)](https://huggingface.co/datasets/mayflowergmbh/oasst_de)
|
- [Open Assistant (de)](https://huggingface.co/datasets/mayflowergmbh/oasst_de)
|
||||||
- [Dolly 15k (de)](https://huggingface.co/datasets/mayflowergmbh/dolly-15k_de)
|
- [Dolly 15k (de)](https://huggingface.co/datasets/mayflowergmbh/dolly-15k_de)
|
||||||
- [Alpaca GPT4 (de)](https://huggingface.co/datasets/mayflowergmbh/alpaca-gpt4_de)
|
- [Alpaca GPT4 (de)](https://huggingface.co/datasets/mayflowergmbh/alpaca-gpt4_de)
|
||||||
|
@ -247,7 +241,7 @@ huggingface-cli login
|
||||||
| ------------ | ------- | --------- |
|
| ------------ | ------- | --------- |
|
||||||
| python | 3.8 | 3.10 |
|
| python | 3.8 | 3.10 |
|
||||||
| torch | 1.13.1 | 2.2.1 |
|
| torch | 1.13.1 | 2.2.1 |
|
||||||
| transformers | 4.37.2 | 4.38.1 |
|
| transformers | 4.37.2 | 4.38.2 |
|
||||||
| datasets | 2.14.3 | 2.17.1 |
|
| datasets | 2.14.3 | 2.17.1 |
|
||||||
| accelerate | 0.27.2 | 0.27.2 |
|
| accelerate | 0.27.2 | 0.27.2 |
|
||||||
| peft | 0.9.0 | 0.9.0 |
|
| peft | 0.9.0 | 0.9.0 |
|
||||||
|
@ -312,7 +306,7 @@ Then you can train the corresponding model by specifying a model ID of the Model
|
||||||
```bash
|
```bash
|
||||||
CUDA_VISIBLE_DEVICES=0 python src/train_bash.py \
|
CUDA_VISIBLE_DEVICES=0 python src/train_bash.py \
|
||||||
--model_name_or_path modelscope/Llama-2-7b-ms \
|
--model_name_or_path modelscope/Llama-2-7b-ms \
|
||||||
... # arguments (same as above)
|
... # arguments (same as below)
|
||||||
```
|
```
|
||||||
|
|
||||||
LLaMA Board also supports using the models and datasets on the ModelScope Hub.
|
LLaMA Board also supports using the models and datasets on the ModelScope Hub.
|
||||||
|
@ -326,6 +320,13 @@ CUDA_VISIBLE_DEVICES=0 USE_MODELSCOPE_HUB=1 python src/train_web.py
|
||||||
> [!IMPORTANT]
|
> [!IMPORTANT]
|
||||||
> If you want to train models on multiple GPUs, please refer to [Distributed Training](#distributed-training).
|
> If you want to train models on multiple GPUs, please refer to [Distributed Training](#distributed-training).
|
||||||
|
|
||||||
|
|
||||||
|
#### LLaMA Board GUI
|
||||||
|
|
||||||
|
```bash
|
||||||
|
CUDA_VISIBLE_DEVICES=0 python src/train_web.py
|
||||||
|
```
|
||||||
|
|
||||||
#### Pre-Training
|
#### Pre-Training
|
||||||
|
|
||||||
```bash
|
```bash
|
||||||
|
@ -653,7 +654,7 @@ CUDA_VISIBLE_DEVICES=0 python src/train_bash.py \
|
||||||
|
|
||||||
This repository is licensed under the [Apache-2.0 License](LICENSE).
|
This repository is licensed under the [Apache-2.0 License](LICENSE).
|
||||||
|
|
||||||
Please follow the model licenses to use the corresponding model weights: [Baichuan2](https://huggingface.co/baichuan-inc/Baichuan2-7B-Base/blob/main/Community%20License%20for%20Baichuan%202%20Model.pdf) / [BLOOM](https://huggingface.co/spaces/bigscience/license) / [ChatGLM3](https://github.com/THUDM/ChatGLM3/blob/main/MODEL_LICENSE) / [DeepSeek](https://github.com/deepseek-ai/DeepSeek-LLM/blob/main/LICENSE-MODEL) / [Falcon](https://huggingface.co/tiiuae/falcon-180B/blob/main/LICENSE.txt) / [Gemma](https://ai.google.dev/gemma/terms) / [InternLM2](https://github.com/InternLM/InternLM#license) / [LLaMA](https://github.com/facebookresearch/llama/blob/main/MODEL_CARD.md) / [LLaMA-2](https://ai.meta.com/llama/license/) / [Mistral](LICENSE) / [Phi-1.5/2](https://huggingface.co/microsoft/phi-1_5/resolve/main/Research%20License.docx) / [Qwen](https://github.com/QwenLM/Qwen/blob/main/Tongyi%20Qianwen%20LICENSE%20AGREEMENT) / [XVERSE](https://github.com/xverse-ai/XVERSE-13B/blob/main/MODEL_LICENSE.pdf) / [Yi](https://huggingface.co/01-ai/Yi-6B/blob/main/LICENSE) / [Yuan](https://github.com/IEIT-Yuan/Yuan-2.0/blob/main/LICENSE-Yuan)
|
Please follow the model licenses to use the corresponding model weights: [Baichuan2](https://huggingface.co/baichuan-inc/Baichuan2-7B-Base/blob/main/Community%20License%20for%20Baichuan%202%20Model.pdf) / [BLOOM](https://huggingface.co/spaces/bigscience/license) / [ChatGLM3](https://github.com/THUDM/ChatGLM3/blob/main/MODEL_LICENSE) / [DeepSeek](https://github.com/deepseek-ai/DeepSeek-LLM/blob/main/LICENSE-MODEL) / [Falcon](https://huggingface.co/tiiuae/falcon-180B/blob/main/LICENSE.txt) / [Gemma](https://ai.google.dev/gemma/terms) / [InternLM2](https://github.com/InternLM/InternLM#license) / [LLaMA](https://github.com/facebookresearch/llama/blob/main/MODEL_CARD.md) / [LLaMA-2](https://ai.meta.com/llama/license/) / [Mistral](LICENSE) / [Phi-1.5/2](https://huggingface.co/microsoft/phi-1_5/resolve/main/Research%20License.docx) / [Qwen](https://github.com/QwenLM/Qwen/blob/main/Tongyi%20Qianwen%20LICENSE%20AGREEMENT) / [StarCoder2](https://huggingface.co/spaces/bigcode/bigcode-model-license-agreement) / [XVERSE](https://github.com/xverse-ai/XVERSE-13B/blob/main/MODEL_LICENSE.pdf) / [Yi](https://huggingface.co/01-ai/Yi-6B/blob/main/LICENSE) / [Yuan](https://github.com/IEIT-Yuan/Yuan-2.0/blob/main/LICENSE-Yuan)
|
||||||
|
|
||||||
## Citation
|
## Citation
|
||||||
|
|
||||||
|
|
29
README_zh.md
29
README_zh.md
|
@ -5,28 +5,21 @@
|
||||||
[![GitHub last commit](https://img.shields.io/github/last-commit/hiyouga/LLaMA-Factory)](https://github.com/hiyouga/LLaMA-Factory/commits/main)
|
[![GitHub last commit](https://img.shields.io/github/last-commit/hiyouga/LLaMA-Factory)](https://github.com/hiyouga/LLaMA-Factory/commits/main)
|
||||||
[![PyPI](https://img.shields.io/pypi/v/llmtuner)](https://pypi.org/project/llmtuner/)
|
[![PyPI](https://img.shields.io/pypi/v/llmtuner)](https://pypi.org/project/llmtuner/)
|
||||||
[![Downloads](https://static.pepy.tech/badge/llmtuner)](https://pypi.org/project/llmtuner/)
|
[![Downloads](https://static.pepy.tech/badge/llmtuner)](https://pypi.org/project/llmtuner/)
|
||||||
[![Citation](https://img.shields.io/badge/Citation-21-green)](#使用了-llama-factory-的项目)
|
[![Citation](https://img.shields.io/badge/citation-21-green)](#使用了-llama-factory-的项目)
|
||||||
[![GitHub pull request](https://img.shields.io/badge/PRs-welcome-blue)](https://github.com/hiyouga/LLaMA-Factory/pulls)
|
[![GitHub pull request](https://img.shields.io/badge/PRs-welcome-blue)](https://github.com/hiyouga/LLaMA-Factory/pulls)
|
||||||
[![Discord](https://dcbadge.vercel.app/api/server/rKfvV9r9FK?compact=true&style=flat)](https://discord.gg/rKfvV9r9FK)
|
[![Discord](https://dcbadge.vercel.app/api/server/rKfvV9r9FK?compact=true&style=flat)](https://discord.gg/rKfvV9r9FK)
|
||||||
[![Twitter](https://img.shields.io/twitter/follow/llamafactory_ai)](https://twitter.com/llamafactory_ai)
|
[![Twitter](https://img.shields.io/twitter/follow/llamafactory_ai)](https://twitter.com/llamafactory_ai)
|
||||||
[![Spaces](https://img.shields.io/badge/🤗-Open%20In%20Spaces-blue)](https://huggingface.co/spaces/hiyouga/LLaMA-Board)
|
[![Spaces](https://img.shields.io/badge/🤗-Open%20in%20Spaces-blue)](https://huggingface.co/spaces/hiyouga/LLaMA-Board)
|
||||||
[![Studios](https://img.shields.io/badge/ModelScope-Open%20In%20Studios-blue)](https://modelscope.cn/studios/hiyouga/LLaMA-Board)
|
[![Studios](https://img.shields.io/badge/ModelScope-Open%20in%20Studios-blue)](https://modelscope.cn/studios/hiyouga/LLaMA-Board)
|
||||||
[![Open in Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/1eRTPn37ltBbYsISy9Aw2NuI2Aq5CQrD9?usp=sharing)
|
[![Open in Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/1eRTPn37ltBbYsISy9Aw2NuI2Aq5CQrD9?usp=sharing)
|
||||||
|
|
||||||
👋 加入我们的[微信群](assets/wechat.jpg)。
|
👋 加入我们的[微信群](assets/wechat.jpg)。
|
||||||
|
|
||||||
\[ [English](README.md) | 中文 \]
|
\[ [English](README.md) | 中文 \]
|
||||||
|
|
||||||
## LLaMA Board: 通过一站式网页界面快速上手 LLaMA Factory
|
|
||||||
|
|
||||||
通过 **[🤗 Spaces](https://huggingface.co/spaces/hiyouga/LLaMA-Board)** 或 **[ModelScope](https://modelscope.cn/studios/hiyouga/LLaMA-Board)** 预览 LLaMA Board,或者通过命令 `CUDA_VISIBLE_DEVICES=0 python src/train_web.py` 本地启动。
|
|
||||||
|
|
||||||
下面是使用单张 GPU 在 10 分钟内更改对话式大型语言模型自我认知的示例。
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
https://github.com/hiyouga/LLaMA-Factory/assets/16256802/ec36a9dd-37f4-4f72-81bd-d76c6d0a6594
|
https://github.com/hiyouga/LLaMA-Factory/assets/16256802/ec36a9dd-37f4-4f72-81bd-d76c6d0a6594
|
||||||
|
|
||||||
|
使用 [Hugging Face 空间](https://huggingface.co/spaces/hiyouga/LLaMA-Board) / [魔搭社区](https://modelscope.cn/studios/hiyouga/LLaMA-Board) / [Colab](https://colab.research.google.com/drive/1eRTPn37ltBbYsISy9Aw2NuI2Aq5CQrD9?usp=sharing) / [本地机器](#llama-board-gui) 打开。
|
||||||
|
|
||||||
## 目录
|
## 目录
|
||||||
|
|
||||||
|
@ -132,6 +125,7 @@ https://github.com/hiyouga/LLaMA-Factory/assets/16256802/ec36a9dd-37f4-4f72-81bd
|
||||||
| [Phi-1.5/2](https://huggingface.co/microsoft) | 1.3B/2.7B | q_proj,v_proj | - |
|
| [Phi-1.5/2](https://huggingface.co/microsoft) | 1.3B/2.7B | q_proj,v_proj | - |
|
||||||
| [Qwen](https://huggingface.co/Qwen) | 1.8B/7B/14B/72B | c_attn | qwen |
|
| [Qwen](https://huggingface.co/Qwen) | 1.8B/7B/14B/72B | c_attn | qwen |
|
||||||
| [Qwen1.5](https://huggingface.co/Qwen) | 0.5B/1.8B/4B/7B/14B/72B | q_proj,v_proj | qwen |
|
| [Qwen1.5](https://huggingface.co/Qwen) | 0.5B/1.8B/4B/7B/14B/72B | q_proj,v_proj | qwen |
|
||||||
|
| [StarCoder2](https://huggingface.co/bigcode) | 3B/7B/15B | q_proj,v_proj | - |
|
||||||
| [XVERSE](https://huggingface.co/xverse) | 7B/13B/65B | q_proj,v_proj | xverse |
|
| [XVERSE](https://huggingface.co/xverse) | 7B/13B/65B | q_proj,v_proj | xverse |
|
||||||
| [Yi](https://huggingface.co/01-ai) | 6B/34B | q_proj,v_proj | yi |
|
| [Yi](https://huggingface.co/01-ai) | 6B/34B | q_proj,v_proj | yi |
|
||||||
| [Yuan](https://huggingface.co/IEITYuan) | 2B/51B/102B | q_proj,v_proj | yuan |
|
| [Yuan](https://huggingface.co/IEITYuan) | 2B/51B/102B | q_proj,v_proj | yuan |
|
||||||
|
@ -209,6 +203,7 @@ https://github.com/hiyouga/LLaMA-Factory/assets/16256802/ec36a9dd-37f4-4f72-81bd
|
||||||
- [LMSYS Chat 1M (en)](https://huggingface.co/datasets/lmsys/lmsys-chat-1m)
|
- [LMSYS Chat 1M (en)](https://huggingface.co/datasets/lmsys/lmsys-chat-1m)
|
||||||
- [Evol Instruct V2 (en)](https://huggingface.co/datasets/WizardLM/WizardLM_evol_instruct_V2_196k)
|
- [Evol Instruct V2 (en)](https://huggingface.co/datasets/WizardLM/WizardLM_evol_instruct_V2_196k)
|
||||||
- [Glaive Function Calling V2 (en)](https://huggingface.co/datasets/glaiveai/glaive-function-calling-v2)
|
- [Glaive Function Calling V2 (en)](https://huggingface.co/datasets/glaiveai/glaive-function-calling-v2)
|
||||||
|
- [Cosmopedia (en)](https://huggingface.co/datasets/HuggingFaceTB/cosmopedia)
|
||||||
- [Open Assistant (de)](https://huggingface.co/datasets/mayflowergmbh/oasst_de)
|
- [Open Assistant (de)](https://huggingface.co/datasets/mayflowergmbh/oasst_de)
|
||||||
- [Dolly 15k (de)](https://huggingface.co/datasets/mayflowergmbh/dolly-15k_de)
|
- [Dolly 15k (de)](https://huggingface.co/datasets/mayflowergmbh/dolly-15k_de)
|
||||||
- [Alpaca GPT4 (de)](https://huggingface.co/datasets/mayflowergmbh/alpaca-gpt4_de)
|
- [Alpaca GPT4 (de)](https://huggingface.co/datasets/mayflowergmbh/alpaca-gpt4_de)
|
||||||
|
@ -246,7 +241,7 @@ huggingface-cli login
|
||||||
| ------------ | ------- | --------- |
|
| ------------ | ------- | --------- |
|
||||||
| python | 3.8 | 3.10 |
|
| python | 3.8 | 3.10 |
|
||||||
| torch | 1.13.1 | 2.2.1 |
|
| torch | 1.13.1 | 2.2.1 |
|
||||||
| transformers | 4.37.2 | 4.38.1 |
|
| transformers | 4.37.2 | 4.38.2 |
|
||||||
| datasets | 2.14.3 | 2.17.1 |
|
| datasets | 2.14.3 | 2.17.1 |
|
||||||
| accelerate | 0.27.2 | 0.27.2 |
|
| accelerate | 0.27.2 | 0.27.2 |
|
||||||
| peft | 0.9.0 | 0.9.0 |
|
| peft | 0.9.0 | 0.9.0 |
|
||||||
|
@ -311,7 +306,7 @@ export USE_MODELSCOPE_HUB=1 # Windows 使用 `set USE_MODELSCOPE_HUB=1`
|
||||||
```bash
|
```bash
|
||||||
CUDA_VISIBLE_DEVICES=0 python src/train_bash.py \
|
CUDA_VISIBLE_DEVICES=0 python src/train_bash.py \
|
||||||
--model_name_or_path modelscope/Llama-2-7b-ms \
|
--model_name_or_path modelscope/Llama-2-7b-ms \
|
||||||
... # 参数同上
|
... # 参数同下
|
||||||
```
|
```
|
||||||
|
|
||||||
LLaMA Board 同样支持魔搭社区的模型和数据集下载。
|
LLaMA Board 同样支持魔搭社区的模型和数据集下载。
|
||||||
|
@ -325,6 +320,12 @@ CUDA_VISIBLE_DEVICES=0 USE_MODELSCOPE_HUB=1 python src/train_web.py
|
||||||
> [!IMPORTANT]
|
> [!IMPORTANT]
|
||||||
> 如果您使用多张 GPU 训练模型,请移步[多 GPU 分布式训练](#多-gpu-分布式训练)部分。
|
> 如果您使用多张 GPU 训练模型,请移步[多 GPU 分布式训练](#多-gpu-分布式训练)部分。
|
||||||
|
|
||||||
|
#### LLaMA Board GUI
|
||||||
|
|
||||||
|
```bash
|
||||||
|
CUDA_VISIBLE_DEVICES=0 python src/train_web.py
|
||||||
|
```
|
||||||
|
|
||||||
#### 预训练
|
#### 预训练
|
||||||
|
|
||||||
```bash
|
```bash
|
||||||
|
@ -652,7 +653,7 @@ CUDA_VISIBLE_DEVICES=0 python src/train_bash.py \
|
||||||
|
|
||||||
本仓库的代码依照 [Apache-2.0](LICENSE) 协议开源。
|
本仓库的代码依照 [Apache-2.0](LICENSE) 协议开源。
|
||||||
|
|
||||||
使用模型权重时,请遵循对应的模型协议:[Baichuan2](https://huggingface.co/baichuan-inc/Baichuan2-7B-Base/blob/main/Community%20License%20for%20Baichuan%202%20Model.pdf) / [BLOOM](https://huggingface.co/spaces/bigscience/license) / [ChatGLM3](https://github.com/THUDM/ChatGLM3/blob/main/MODEL_LICENSE) / [DeepSeek](https://github.com/deepseek-ai/DeepSeek-LLM/blob/main/LICENSE-MODEL) / [Falcon](https://huggingface.co/tiiuae/falcon-180B/blob/main/LICENSE.txt) / [Gemma](https://ai.google.dev/gemma/terms) / [InternLM2](https://github.com/InternLM/InternLM#license) / [LLaMA](https://github.com/facebookresearch/llama/blob/main/MODEL_CARD.md) / [LLaMA-2](https://ai.meta.com/llama/license/) / [Mistral](LICENSE) / [Phi-1.5/2](https://huggingface.co/microsoft/phi-1_5/resolve/main/Research%20License.docx) / [Qwen](https://github.com/QwenLM/Qwen/blob/main/Tongyi%20Qianwen%20LICENSE%20AGREEMENT) / [XVERSE](https://github.com/xverse-ai/XVERSE-13B/blob/main/MODEL_LICENSE.pdf) / [Yi](https://huggingface.co/01-ai/Yi-6B/blob/main/LICENSE) / [Yuan](https://github.com/IEIT-Yuan/Yuan-2.0/blob/main/LICENSE-Yuan)
|
使用模型权重时,请遵循对应的模型协议:[Baichuan2](https://huggingface.co/baichuan-inc/Baichuan2-7B-Base/blob/main/Community%20License%20for%20Baichuan%202%20Model.pdf) / [BLOOM](https://huggingface.co/spaces/bigscience/license) / [ChatGLM3](https://github.com/THUDM/ChatGLM3/blob/main/MODEL_LICENSE) / [DeepSeek](https://github.com/deepseek-ai/DeepSeek-LLM/blob/main/LICENSE-MODEL) / [Falcon](https://huggingface.co/tiiuae/falcon-180B/blob/main/LICENSE.txt) / [Gemma](https://ai.google.dev/gemma/terms) / [InternLM2](https://github.com/InternLM/InternLM#license) / [LLaMA](https://github.com/facebookresearch/llama/blob/main/MODEL_CARD.md) / [LLaMA-2](https://ai.meta.com/llama/license/) / [Mistral](LICENSE) / [Phi-1.5/2](https://huggingface.co/microsoft/phi-1_5/resolve/main/Research%20License.docx) / [Qwen](https://github.com/QwenLM/Qwen/blob/main/Tongyi%20Qianwen%20LICENSE%20AGREEMENT) / [StarCoder2](https://huggingface.co/spaces/bigcode/bigcode-model-license-agreement) / [XVERSE](https://github.com/xverse-ai/XVERSE-13B/blob/main/MODEL_LICENSE.pdf) / [Yi](https://huggingface.co/01-ai/Yi-6B/blob/main/LICENSE) / [Yuan](https://github.com/IEIT-Yuan/Yuan-2.0/blob/main/LICENSE-Yuan)
|
||||||
|
|
||||||
## 引用
|
## 引用
|
||||||
|
|
||||||
|
|
|
@ -220,6 +220,13 @@
|
||||||
"ms_hub_url": "AI-ModelScope/WizardLM_evol_instruct_V2_196k",
|
"ms_hub_url": "AI-ModelScope/WizardLM_evol_instruct_V2_196k",
|
||||||
"formatting": "sharegpt"
|
"formatting": "sharegpt"
|
||||||
},
|
},
|
||||||
|
"cosmopedia": {
|
||||||
|
"hf_hub_url": "HuggingFaceTB/cosmopedia",
|
||||||
|
"columns": {
|
||||||
|
"prompt": "prompt",
|
||||||
|
"response": "text"
|
||||||
|
}
|
||||||
|
},
|
||||||
"oasst_de": {
|
"oasst_de": {
|
||||||
"hf_hub_url": "mayflowergmbh/oasst_de"
|
"hf_hub_url": "mayflowergmbh/oasst_de"
|
||||||
},
|
},
|
||||||
|
|
|
@ -743,6 +743,21 @@ register_model_group(
|
||||||
)
|
)
|
||||||
|
|
||||||
|
|
||||||
|
register_model_group(
|
||||||
|
models={
|
||||||
|
"StarCoder2-3B": {
|
||||||
|
DownloadSource.DEFAULT: "bigcode/starcoder2-3b",
|
||||||
|
},
|
||||||
|
"StarCoder2-7B": {
|
||||||
|
DownloadSource.DEFAULT: "bigcode/starcoder2-7b",
|
||||||
|
},
|
||||||
|
"StarCoder2-15B": {
|
||||||
|
DownloadSource.DEFAULT: "bigcode/starcoder2-15b",
|
||||||
|
},
|
||||||
|
}
|
||||||
|
)
|
||||||
|
|
||||||
|
|
||||||
register_model_group(
|
register_model_group(
|
||||||
models={
|
models={
|
||||||
"Vicuna1.5-7B-Chat": {
|
"Vicuna1.5-7B-Chat": {
|
||||||
|
|
|
@ -28,6 +28,7 @@ def save_model(
|
||||||
export_quantization_dataset: str,
|
export_quantization_dataset: str,
|
||||||
export_legacy_format: bool,
|
export_legacy_format: bool,
|
||||||
export_dir: str,
|
export_dir: str,
|
||||||
|
export_hub_model_id: str,
|
||||||
) -> Generator[str, None, None]:
|
) -> Generator[str, None, None]:
|
||||||
error = ""
|
error = ""
|
||||||
if not model_name:
|
if not model_name:
|
||||||
|
@ -59,6 +60,7 @@ def save_model(
|
||||||
finetuning_type=finetuning_type,
|
finetuning_type=finetuning_type,
|
||||||
template=template,
|
template=template,
|
||||||
export_dir=export_dir,
|
export_dir=export_dir,
|
||||||
|
export_hub_model_id=export_hub_model_id or None,
|
||||||
export_size=max_shard_size,
|
export_size=max_shard_size,
|
||||||
export_quantization_bit=int(export_quantization_bit) if export_quantization_bit in GPTQ_BITS else None,
|
export_quantization_bit=int(export_quantization_bit) if export_quantization_bit in GPTQ_BITS else None,
|
||||||
export_quantization_dataset=export_quantization_dataset,
|
export_quantization_dataset=export_quantization_dataset,
|
||||||
|
@ -77,7 +79,10 @@ def create_export_tab(engine: "Engine") -> Dict[str, "Component"]:
|
||||||
export_quantization_dataset = gr.Textbox(value="data/c4_demo.json")
|
export_quantization_dataset = gr.Textbox(value="data/c4_demo.json")
|
||||||
export_legacy_format = gr.Checkbox()
|
export_legacy_format = gr.Checkbox()
|
||||||
|
|
||||||
export_dir = gr.Textbox()
|
with gr.Row():
|
||||||
|
export_dir = gr.Textbox()
|
||||||
|
export_hub_model_id = gr.Textbox()
|
||||||
|
|
||||||
export_btn = gr.Button()
|
export_btn = gr.Button()
|
||||||
info_box = gr.Textbox(show_label=False, interactive=False)
|
info_box = gr.Textbox(show_label=False, interactive=False)
|
||||||
|
|
||||||
|
@ -95,6 +100,7 @@ def create_export_tab(engine: "Engine") -> Dict[str, "Component"]:
|
||||||
export_quantization_dataset,
|
export_quantization_dataset,
|
||||||
export_legacy_format,
|
export_legacy_format,
|
||||||
export_dir,
|
export_dir,
|
||||||
|
export_hub_model_id,
|
||||||
],
|
],
|
||||||
[info_box],
|
[info_box],
|
||||||
)
|
)
|
||||||
|
@ -105,6 +111,7 @@ def create_export_tab(engine: "Engine") -> Dict[str, "Component"]:
|
||||||
export_quantization_dataset=export_quantization_dataset,
|
export_quantization_dataset=export_quantization_dataset,
|
||||||
export_legacy_format=export_legacy_format,
|
export_legacy_format=export_legacy_format,
|
||||||
export_dir=export_dir,
|
export_dir=export_dir,
|
||||||
|
export_hub_model_id=export_hub_model_id,
|
||||||
export_btn=export_btn,
|
export_btn=export_btn,
|
||||||
info_box=info_box,
|
info_box=info_box,
|
||||||
)
|
)
|
||||||
|
|
|
@ -1019,6 +1019,20 @@ LOCALES = {
|
||||||
"info": "保存导出模型的文件夹路径。",
|
"info": "保存导出模型的文件夹路径。",
|
||||||
},
|
},
|
||||||
},
|
},
|
||||||
|
"export_hub_model_id": {
|
||||||
|
"en": {
|
||||||
|
"label": "HF Hub ID (optional)",
|
||||||
|
"info": "Repo ID for uploading model to Hugging Face hub.",
|
||||||
|
},
|
||||||
|
"ru": {
|
||||||
|
"label": "HF Hub ID (опционально)",
|
||||||
|
"info": "Идентификатор репозитория для загрузки модели на Hugging Face hub.",
|
||||||
|
},
|
||||||
|
"zh": {
|
||||||
|
"label": "HF Hub ID(非必填)",
|
||||||
|
"info": "用于将模型上传至 Hugging Face Hub 的仓库 ID。",
|
||||||
|
},
|
||||||
|
},
|
||||||
"export_btn": {
|
"export_btn": {
|
||||||
"en": {
|
"en": {
|
||||||
"value": "Export",
|
"value": "Export",
|
||||||
|
|
Loading…
Reference in New Issue