From 293bd95712330eb354220abf79384b8c594608ee Mon Sep 17 00:00:00 2001
From: codemayq
Date: Mon, 7 Aug 2023 09:30:23 +0800
Subject: [PATCH] add detailed model configs

---
 README.md    | 18 ++++++++++--------
 README_zh.md | 18 ++++++++++--------
 2 files changed, 20 insertions(+), 16 deletions(-)

diff --git a/README.md b/README.md
index b546a841..f69bd7e7 100644
--- a/README.md
+++ b/README.md
@@ -41,14 +41,16 @@
 [23/05/31] Now we support training the **BLOOM & BLOOMZ** models in this repo. Try `--model_name_or_path bigscience/bloomz-7b1-mt` and `--lora_target query_key_value` arguments to use the BLOOMZ model.
 
 ## Supported Models
-
-- [LLaMA](https://github.com/facebookresearch/llama) (7B/13B/33B/65B)
-- [LLaMA-2](https://huggingface.co/meta-llama) (7B/13B/70B)
-- [BLOOM](https://huggingface.co/bigscience/bloom) & [BLOOMZ](https://huggingface.co/bigscience/bloomz) (560M/1.1B/1.7B/3B/7.1B/176B)
-- [Falcon](https://huggingface.co/tiiuae/falcon-7b) (7B/40B)
-- [Baichuan](https://huggingface.co/baichuan-inc/baichuan-7B) (7B/13B)
-- [InternLM](https://github.com/InternLM/InternLM) (7B)
-- [Qwen](https://github.com/QwenLM/Qwen-7B) (7B)
+
+| model                                                       | model size                  | model_name_or_path             | lora_target     | template |
+|-------------------------------------------------------------|-----------------------------|--------------------------------|-----------------|----------|
+| [LLaMA](https://github.com/facebookresearch/llama)          | 7B/13B/33B/65B              | -                              | q_proj,v_proj   | default  |
+| [LLaMA-2](https://huggingface.co/meta-llama)                | 7B/13B/70B                  | meta-llama/Llama-2-7b-hf       | q_proj,v_proj   | llama2   |
+| [BLOOM](https://huggingface.co/bigscience/bloom)            | 560M/1.1B/1.7B/3B/7.1B/176B | bigscience/bloom-7b1           | query_key_value | default  |
+| [BLOOMZ](https://huggingface.co/bigscience/bloomz)          | 560M/1.1B/1.7B/3B/7.1B/176B | bigscience/bloomz-7b1-mt       | query_key_value | default  |
+| [Falcon](https://huggingface.co/tiiuae/falcon-7b)           | 7B/40B                      | tiiuae/falcon-7b               | query_key_value | default  |
+| [Baichuan](https://huggingface.co/baichuan-inc/baichuan-7B) | 7B/13B                      | baichuan-inc/Baichuan-13B-Chat | W_pack          | baichuan |
+| [InternLM](https://github.com/InternLM/InternLM)            | 7B                          | internlm/internlm-7b           | q_proj,v_proj   | intern   |
+| [Qwen](https://github.com/QwenLM/Qwen-7B)                   | 7B                          | Qwen/Qwen-7B-Chat              | c_attn          | chatml   |
 
 ## Supported Training Approaches
 
diff --git a/README_zh.md b/README_zh.md
index c8f11293..4acb2cf7 100644
--- a/README_zh.md
+++ b/README_zh.md
@@ -41,14 +41,16 @@
 [23/05/31] 现在我们支持了 **BLOOM & BLOOMZ** 模型的训练。请尝试使用 `--model_name_or_path bigscience/bloomz-7b1-mt` 和 `--lora_target query_key_value` 参数。
 
 ## 模型
-
-- [LLaMA](https://github.com/facebookresearch/llama) (7B/13B/33B/65B)
-- [LLaMA-2](https://huggingface.co/meta-llama) (7B/13B/70B)
-- [BLOOM](https://huggingface.co/bigscience/bloom) & [BLOOMZ](https://huggingface.co/bigscience/bloomz) (560M/1.1B/1.7B/3B/7.1B/176B)
-- [Falcon](https://huggingface.co/tiiuae/falcon-7b) (7B/40B)
-- [Baichuan](https://huggingface.co/baichuan-inc/baichuan-7B) (7B/13B)
-- [InternLM](https://github.com/InternLM/InternLM) (7B)
-- [Qwen](https://github.com/QwenLM/Qwen-7B) (7B)
+
+| model                                                       | model size                  | model_name_or_path             | lora_target     | template |
+|-------------------------------------------------------------|-----------------------------|--------------------------------|-----------------|----------|
+| [LLaMA](https://github.com/facebookresearch/llama)          | 7B/13B/33B/65B              | -                              | q_proj,v_proj   | default  |
+| [LLaMA-2](https://huggingface.co/meta-llama)                | 7B/13B/70B                  | meta-llama/Llama-2-7b-hf       | q_proj,v_proj   | llama2   |
+| [BLOOM](https://huggingface.co/bigscience/bloom)            | 560M/1.1B/1.7B/3B/7.1B/176B | bigscience/bloom-7b1           | query_key_value | default  |
+| [BLOOMZ](https://huggingface.co/bigscience/bloomz)          | 560M/1.1B/1.7B/3B/7.1B/176B | bigscience/bloomz-7b1-mt       | query_key_value | default  |
+| [Falcon](https://huggingface.co/tiiuae/falcon-7b)           | 7B/40B                      | tiiuae/falcon-7b               | query_key_value | default  |
+| [Baichuan](https://huggingface.co/baichuan-inc/baichuan-7B) | 7B/13B                      | baichuan-inc/Baichuan-13B-Chat | W_pack          | baichuan |
+| [InternLM](https://github.com/InternLM/InternLM)            | 7B                          | internlm/internlm-7b           | q_proj,v_proj   | intern   |
+| [Qwen](https://github.com/QwenLM/Qwen-7B)                   | 7B                          | Qwen/Qwen-7B-Chat              | c_attn          | chatml   |
 
 ## 微调方法
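
The new table pairs each model family with the CLI flags the README already documents (`--model_name_or_path`, `--lora_target`). As a minimal sketch of how one row translates into arguments, assuming the launcher also accepts a `--template` flag matching the table's last column (the actual training entry point is not part of this patch):

```shell
# Values copied from the Baichuan row of the new table.
model_name_or_path="baichuan-inc/Baichuan-13B-Chat"
lora_target="W_pack"
template="baichuan"

# Assemble the flag string as it would be passed to the trainer CLI.
args="--model_name_or_path $model_name_or_path --lora_target $lora_target --template $template"
echo "$args"
```

Swapping in a different row (e.g. Qwen with `c_attn`/`chatml`) changes only the three variables; the flag names stay the same.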