update readme

parent 949e5fe638
commit 11a6c1bad6

21 README.md
@@ -245,8 +245,6 @@ You also can add a custom chat template to [template.py](src/llmtuner/data/templ
</details>

Please refer to [data/README.md](data/README.md) for details.

Some datasets require confirmation before using them, so we recommend logging in with your Hugging Face account using these commands.

```bash
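# The diff hunk is truncated here, so the actual commands are not shown in this
# commit. As a hedged sketch only (an assumption, not taken from this diff),
# logging in with a Hugging Face account is normally done with the Hugging Face CLI:
pip install --upgrade huggingface_hub
huggingface-cli login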
@@ -366,8 +364,18 @@ docker compose -f ./docker-compose.yml up -d
See [examples](examples) for usage.

> [!TIP]
> Use `python src/train_bash.py -h` to display arguments description.

Use `python src/train_bash.py -h` to display arguments description.

### Deploy with OpenAI-style API and vLLM

```bash
CUDA_VISIBLE_DEVICES=0 API_PORT=8000 python src/api_demo.py \
    --model_name_or_path path_to_model \
    --adapter_name_or_path path_to_lora_adapter \
    --template default \
    --finetuning_type lora \
    --infer_backend vllm
```
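
As a usage sketch for the API server configured above (assuming it exposes the standard OpenAI-compatible `/v1/chat/completions` route on the configured `API_PORT`; the model name and message are placeholders), a request could look like:

```bash
# Hypothetical client call against the OpenAI-style endpoint started above.
curl http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "default", "messages": [{"role": "user", "content": "Hello!"}]}'
```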
### Use ModelScope Hub
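
The hunk below notes that training works by passing a ModelScope Hub model ID as `--model_name_or_path`. As a hedged sketch only (the model ID, dataset name, and output path are illustrative placeholders, and the exact flags should be checked with `python src/train_bash.py -h`), such a run might look like:

```bash
# Assumed example: pull the base model from the ModelScope Hub instead of Hugging Face.
export USE_MODELSCOPE_HUB=1   # on Windows: set USE_MODELSCOPE_HUB=1
CUDA_VISIBLE_DEVICES=0 python src/train_bash.py \
    --stage sft \
    --do_train \
    --model_name_or_path modelscope/Llama-2-7b-ms \
    --dataset alpaca_en \
    --template default \
    --finetuning_type lora \
    --output_dir path_to_sft_checkpoint
```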
@@ -381,6 +389,8 @@ Train the model by specifying a model ID of the ModelScope Hub as the `--model_n
## Projects using LLaMA Factory

If you have a project that should be incorporated, please contact via email or create a pull request.

<details><summary>Click to show</summary>

1. Wang et al. ESRL: Efficient Sampling-based Reinforcement Learning for Sequence Generation. 2023. [[arxiv]](https://arxiv.org/abs/2308.02223)

@@ -414,9 +424,6 @@ Train the model by specifying a model ID of the ModelScope Hub as the `--model_n
</details>

> [!TIP]
> If you have a project that should be incorporated, please contact via email or create a pull request.

## License
This repository is licensed under the [Apache-2.0 License](LICENSE).

22 README_zh.md
@@ -245,8 +245,6 @@ https://github.com/hiyouga/LLaMA-Factory/assets/16256802/ec36a9dd-37f4-4f72-81bd
</details>

Please refer to [data/README_zh.md](data/README_zh.md) for usage.

Some datasets require confirmation before use, so we recommend logging in to your Hugging Face account with the following commands.

```bash
@@ -337,7 +335,6 @@ CUDA_VISIBLE_DEVICES=0 python src/train_web.py
```bash
docker build -f ./Dockerfile -t llama-factory:latest .

docker run --gpus=all \
    -v ./hf_cache:/root/.cache/huggingface/ \
    -v ./data:/app/data \
@@ -367,8 +364,18 @@ docker compose -f ./docker-compose.yml up -d

See the [examples](examples) folder for usage.

> [!TIP]
> Use `python src/train_bash.py -h` to view the argument documentation.

Use `python src/train_bash.py -h` to view the argument documentation.

### Deploy with an OpenAI-style API and vLLM

```bash
CUDA_VISIBLE_DEVICES=0 API_PORT=8000 python src/api_demo.py \
    --model_name_or_path path_to_model \
    --adapter_name_or_path path_to_lora_adapter \
    --template default \
    --finetuning_type lora \
    --infer_backend vllm
```

### Use the ModelScope Hub

@@ -382,6 +389,8 @@ export USE_MODELSCOPE_HUB=1 # use `set USE_MODELSCOPE_HUB=1` on Windows

## Projects using LLaMA Factory

If you have a project that you would like added to the above list, please contact us by email or create a PR.

<details><summary>Click to show</summary>

1. Wang et al. ESRL: Efficient Sampling-based Reinforcement Learning for Sequence Generation. 2023. [[arxiv]](https://arxiv.org/abs/2308.02223)
@@ -415,9 +424,6 @@ export USE_MODELSCOPE_HUB=1 # use `set USE_MODELSCOPE_HUB=1` on Windows
</details>

> [!TIP]
> If you have a project that you would like added to the above list, please contact us by email or create a PR.

## License

The code in this repository is open-sourced under the [Apache-2.0](LICENSE) license.

@@ -0,0 +1,43 @@
We provide diverse examples about fine-tuning LLMs.

```
examples/
├── lora_single_gpu/
│   ├── pt.sh: Pre-training
│   ├── sft.sh: Supervised fine-tuning
│   ├── reward.sh: Reward modeling
│   ├── ppo.sh: PPO training
│   ├── dpo.sh: DPO training
│   ├── orpo.sh: ORPO training
│   ├── prepare.sh: Save tokenized dataset
│   └── predict.sh: Batch prediction
├── qlora_single_gpu/
│   ├── bitsandbytes.sh
│   ├── gptq.sh
│   ├── awq.sh
│   └── aqlm.sh
├── lora_multi_gpu/
│   ├── single_node.sh
│   └── multi_node.sh
├── full_multi_gpu/
│   ├── single_node.sh
│   └── multi_node.sh
├── merge_lora/
│   ├── merge.sh
│   └── quantize.sh
├── inference/
│   ├── cli_demo.sh
│   ├── api_demo.sh
│   ├── web_demo.sh
│   └── evaluate.sh
└── extras/
    ├── galore/
    │   └── sft.sh
    ├── loraplus/
    │   └── sft.sh
    ├── llama_pro/
    │   ├── expand.sh
    │   └── sft.sh
    └── fsdp_qlora/
        └── sft.sh
```
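
As a hedged usage note (assuming these scripts are meant to be launched from the repository root), a single recipe from the tree above can be run directly, for example:

```bash
# Supervised fine-tuning with LoRA on a single GPU.
bash examples/lora_single_gpu/sft.sh
```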