update baichuan2 template
parent 60603a94c6
commit 0531886e1f
README.md
@@ -48,19 +48,19 @@
 
 ## Supported Models
 
 | Model                                                    | Model size                  | Default module    | Template |
-| -------------------------------------------------------- | --------------------------- | ----------------- |----------|
+| -------------------------------------------------------- | --------------------------- | ----------------- | --------- |
 | [LLaMA](https://github.com/facebookresearch/llama)       | 7B/13B/33B/65B              | q_proj,v_proj     | -         |
 | [LLaMA-2](https://huggingface.co/meta-llama)             | 7B/13B/70B                  | q_proj,v_proj     | llama2    |
 | [BLOOM](https://huggingface.co/bigscience/bloom)         | 560M/1.1B/1.7B/3B/7.1B/176B | query_key_value   | -         |
 | [BLOOMZ](https://huggingface.co/bigscience/bloomz)       | 560M/1.1B/1.7B/3B/7.1B/176B | query_key_value   | -         |
 | [Falcon](https://huggingface.co/tiiuae/falcon-7b)        | 7B/40B                      | query_key_value   | -         |
 | [Baichuan](https://github.com/baichuan-inc/baichuan-13B) | 7B/13B                      | W_pack            | baichuan  |
-| [Baichuan2](https://github.com/baichuan-inc/Baichuan2)   | 7B/13B                      | W_pack            | baichuan  |
+| [Baichuan2](https://github.com/baichuan-inc/Baichuan2)   | 7B/13B                      | W_pack            | baichuan2 |
 | [InternLM](https://github.com/InternLM/InternLM)         | 7B                          | q_proj,v_proj     | intern    |
 | [Qwen](https://github.com/QwenLM/Qwen-7B)                | 7B                          | c_attn            | chatml    |
 | [XVERSE](https://github.com/xverse-ai/XVERSE-13B)        | 13B                         | q_proj,v_proj     | xverse    |
 | [ChatGLM2](https://github.com/THUDM/ChatGLM2-6B)         | 6B                          | query_key_value   | chatglm2  |
 
 - **Default module** is used for the `--lora_target` argument. Please use `python src/train_bash.py -h` to see all available options.
 - For the "base" models, the `--template` argument can be chosen from `default`, `alpaca`, `vicuna` etc. But make sure to use the corresponding template for the "chat" models.
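The notes above map the table's columns onto CLI flags: "Default module" feeds `--lora_target` and "Template" feeds `--template`, so the Baichuan2 row now calls for `--template baichuan2`. A minimal sketch of a LoRA SFT launch built around those two flags follows; the dataset name, output path, and the remaining flags are illustrative assumptions rather than values taken from this commit.

```python
# Minimal sketch (run from the repository root): launch LoRA SFT for Baichuan2-7B.
# Only --lora_target and --template come from the table above; everything else
# (dataset, output_dir, model path) is an assumed placeholder.
import subprocess

cmd = [
    "python", "src/train_bash.py",
    "--stage", "sft",
    "--do_train",
    "--model_name_or_path", "baichuan-inc/Baichuan2-7B-Base",
    "--dataset", "alpaca_gpt4_en",
    "--finetuning_type", "lora",
    "--lora_target", "W_pack",       # "Default module" column for Baichuan2
    "--template", "baichuan2",       # "Template" column for Baichuan2
    "--output_dir", "output/baichuan2-7b-lora",
]
subprocess.run(cmd, check=True)
```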
README_zh.md
@@ -48,19 +48,19 @@
 
 ## Models
 
 | Model                                                    | Model size                  | Default module    | Template |
-| -------------------------------------------------------- | --------------------------- | ----------------- |----------|
+| -------------------------------------------------------- | --------------------------- | ----------------- | --------- |
 | [LLaMA](https://github.com/facebookresearch/llama)       | 7B/13B/33B/65B              | q_proj,v_proj     | -         |
 | [LLaMA-2](https://huggingface.co/meta-llama)             | 7B/13B/70B                  | q_proj,v_proj     | llama2    |
 | [BLOOM](https://huggingface.co/bigscience/bloom)         | 560M/1.1B/1.7B/3B/7.1B/176B | query_key_value   | -         |
 | [BLOOMZ](https://huggingface.co/bigscience/bloomz)       | 560M/1.1B/1.7B/3B/7.1B/176B | query_key_value   | -         |
 | [Falcon](https://huggingface.co/tiiuae/falcon-7b)        | 7B/40B                      | query_key_value   | -         |
 | [Baichuan](https://github.com/baichuan-inc/baichuan-13B) | 7B/13B                      | W_pack            | baichuan  |
-| [Baichuan2](https://github.com/baichuan-inc/Baichuan2)   | 7B/13B                      | W_pack            | baichuan  |
+| [Baichuan2](https://github.com/baichuan-inc/Baichuan2)   | 7B/13B                      | W_pack            | baichuan2 |
 | [InternLM](https://github.com/InternLM/InternLM)         | 7B                          | q_proj,v_proj     | intern    |
 | [Qwen](https://github.com/QwenLM/Qwen-7B)                | 7B                          | c_attn            | chatml    |
 | [XVERSE](https://github.com/xverse-ai/XVERSE-13B)        | 13B                         | q_proj,v_proj     | xverse    |
 | [ChatGLM2](https://github.com/THUDM/ChatGLM2-6B)         | 6B                          | query_key_value   | chatglm2  |
 
 - **Default module** is a subset of the valid choices for the `--lora_target` argument. Please use `python src/train_bash.py -h` to see all available options.
 - For all "base" models, the `--template` argument can take any value such as `default`, `alpaca`, `vicuna`, etc. But make sure to use the corresponding template for the "chat" models.
@@ -78,7 +78,7 @@ DEFAULT_TEMPLATE = {
     "LLaMA2": "llama2",
     "ChineseLLaMA2": "llama2_zh",
     "Baichuan": "baichuan",
-    "Baichuan2": "baichuan",
+    "Baichuan2": "baichuan2",
     "InternLM": "intern",
     "Qwen": "chatml",
     "XVERSE": "xverse",
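`DEFAULT_TEMPLATE` maps a model family to the template selected for it by default, and the fix above points Baichuan2 at the new `baichuan2` template instead of reusing `baichuan`. A hypothetical lookup helper (not code from this repository) shows how such a mapping can be resolved from a checkpoint name:

```python
# Hypothetical illustration of consulting a name -> template mapping like the
# DEFAULT_TEMPLATE dict changed above; this helper is not part of the project.
DEFAULT_TEMPLATE = {
    "LLaMA2": "llama2",
    "ChineseLLaMA2": "llama2_zh",
    "Baichuan": "baichuan",
    "Baichuan2": "baichuan2",
    "InternLM": "intern",
    "Qwen": "chatml",
    "XVERSE": "xverse",
}

def guess_template(model_name: str) -> str:
    # Prefer the longest matching key so "Baichuan2" wins over "Baichuan".
    normalized = model_name.lower().replace("-", "")
    matches = [key for key in DEFAULT_TEMPLATE if key.lower() in normalized]
    return DEFAULT_TEMPLATE[max(matches, key=len)] if matches else "default"

print(guess_template("Baichuan2-13B-Chat"))  # -> "baichuan2"
print(guess_template("Baichuan-13B-Chat"))   # -> "baichuan"
```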
@@ -516,6 +516,49 @@ register_template(
 )
+
+
+r"""
+Supports: https://huggingface.co/baichuan-inc/Baichuan2-7B-Chat
+          https://huggingface.co/baichuan-inc/Baichuan2-13B-Chat
+Used for training and inference of the fine-tuned models.
+"""
+register_template(
+    name="baichuan2",
+    prefix=[
+        "{{system}}"
+    ],
+    prompt=[
+        {"token": "<reserved_106>"},  # user token
+        "{{query}}",
+        {"token": "<reserved_107>"}   # assistant token
+    ],
+    system="",
+    sep=[]
+)
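Per the registration above, a `baichuan2` training example is `{{system}}` followed by `<reserved_106>`, the user query, `<reserved_107>`, and then the response, with no separator between turns. A stand-alone sketch of that layout (the project itself inserts the reserved markers as special token ids rather than plain strings):

```python
# Stand-alone illustration of the prompt layout encoded by the "baichuan2" template.
# In the project the reserved markers are added as special token ids, not substrings.
USER_TOKEN = "<reserved_106>"
ASSISTANT_TOKEN = "<reserved_107>"

def render_baichuan2(query: str, response: str = "", system: str = "") -> str:
    # prefix = "{{system}}", prompt = user token + "{{query}}" + assistant token, sep = []
    return f"{system}{USER_TOKEN}{query}{ASSISTANT_TOKEN}{response}"

print(render_baichuan2("Who are you?", "I am a helpful assistant."))
# -> <reserved_106>Who are you?<reserved_107>I am a helpful assistant.
```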
+
+
+r"""
+Supports: https://huggingface.co/baichuan-inc/Baichuan2-7B-Chat
+          https://huggingface.co/baichuan-inc/Baichuan2-13B-Chat
+Used for inference of the original model.
+"""
+register_template(
+    name="baichuan2_eval",
+    prefix=[
+        "{{system}}",
+        {"token": "<reserved_106>"}  # user token
+    ],
+    prompt=[
+        "{{query}}",
+        {"token": "<reserved_107>"}  # assistant token
+    ],
+    system="",
+    sep=[],
+    stop_words=[
+        "<reserved_106>"  # user token
+    ]
+)
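The `baichuan2_eval` variant moves the first `<reserved_106>` into the prefix and registers it as a stop word, so generation with the unmodified chat checkpoints halts when the model starts a new user turn. To double-check that both reserved markers are single special tokens in the released tokenizer, something along these lines should work (assumes the model files can be downloaded and that `trust_remote_code=True` is acceptable, since Baichuan2 ships custom tokenizer code):

```python
# Sanity check: the reserved user/assistant markers used by the templates above
# should each map to a single id in the Baichuan2 tokenizer.
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained(
    "baichuan-inc/Baichuan2-7B-Chat", trust_remote_code=True
)
for marker in ("<reserved_106>", "<reserved_107>"):
    print(marker, "->", tok.convert_tokens_to_ids(marker))
```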
 
 
 r"""
 Supports: https://huggingface.co/HuggingFaceH4/starchat-alpha
           https://huggingface.co/HuggingFaceH4/starchat-beta