hiyouga | f621f7631a | add default template | 2023-06-16 21:12:17 +08:00
hiyouga | 334d1a6d26 | add belle multiturn dataset | 2023-06-16 20:01:16 +08:00
hiyouga | a6c4b141cd | fix freeze layers | 2023-06-16 17:38:21 +08:00
hiyouga | fc4d8155b3 | add source prefix | 2023-06-16 16:32:17 +08:00
hiyouga | 0574b590ef | support loading lora from hub | 2023-06-16 00:02:17 +08:00
hiyouga | 0cee6ad67f | support baichuan model | 2023-06-15 16:02:01 +08:00
hiyouga | c527399424 | fix bug in template vanilla | 2023-06-15 14:36:55 +08:00
hiyouga | 0a36658bb6 | Update wechat.jpg | 2023-06-15 13:48:53 +08:00
hiyouga | d668f8b501 | add BOS token in pre-training | 2023-06-15 01:46:17 +08:00
hiyouga | b6faf0207d | support multiturn training like FastChat | 2023-06-14 22:27:39 +08:00
hiyouga | 875e8e2349 | fix loading valuehead | 2023-06-13 11:13:06 +08:00
hiyouga | 531a3764d9 | fix generating args | 2023-06-13 01:33:56 +08:00
hiyouga | cec6524d6b | support RM metrics, add generating Args | 2023-06-12 15:48:48 +08:00
hoshi-hiyouga | e3f380c1be | Merge pull request #26 from BUAADreamer/main: add code for reading from multi files in one directory | 2023-06-11 19:06:29 +08:00
BUAADreamer | e3b53a67c7 | update json line file to .jsonl | 2023-06-11 18:59:19 +08:00
BUAADreamer | 676d910260 | add some | 2023-06-11 18:55:53 +08:00
BUAADreamer | a2af9df5a9 | add code for reading from multi files in one directory | 2023-06-10 16:27:30 +08:00
BUAADreamer | 3dd5f9a874 | add code for reading from multi files in one directory | 2023-06-10 15:53:47 +08:00
hiyouga | 2ba5d69c7f | tiny fix | 2023-06-07 16:42:31 +08:00
hiyouga | 16c2860d56 | tiny fix | 2023-06-07 16:02:07 +08:00
hiyouga | edafb97733 | tiny fix | 2023-06-07 12:58:14 +08:00
hiyouga | 3875b19a34 | add templates | 2023-06-07 12:40:44 +08:00
hiyouga | 17acf3a3eb | add belle template | 2023-06-07 12:30:11 +08:00
hiyouga | ce43386080 | tiny fix | 2023-06-07 12:08:39 +08:00
hiyouga | 909af8f496 | add prompt template class | 2023-06-07 11:55:25 +08:00
hiyouga | 5d021d4ad5 | fix inference, add prompt template | 2023-06-07 10:52:35 +08:00
hiyouga | 13d1f0709c | recover logging | 2023-06-06 21:36:37 +08:00
hiyouga | 4eb17bcf6c | support distributed quantized training | 2023-06-06 17:39:41 +08:00
hiyouga | 3d8d5ee5d5 | add API demo from #1 | 2023-06-05 21:32:18 +08:00
hoshi-hiyouga | 06e1b120e1 | Merge pull request #11 from hiyouga/api: Api | 2023-06-05 20:58:02 +08:00
hiyouga | a38d57ddd7 | fix bug in web demo | 2023-06-05 17:58:29 +08:00
hiyouga | 56eb99106a | increase max length in cli demo | 2023-06-05 16:49:14 +08:00
hiyouga | fe1d930816 | implement stream generating | 2023-06-05 16:43:44 +08:00
hiyouga | 44298c1235 | tiny fix | 2023-06-05 15:25:22 +08:00
hiyouga | 38b83533a4 | tiny fix | 2023-06-04 16:35:50 +08:00
hiyouga | eac9921e5c | tiny fix | 2023-06-04 12:55:40 +08:00
hiyouga | 3b9eee8cd2 | support QLoRA | 2023-06-04 00:08:56 +08:00
hiyouga | 1bd13d7ca1 | fix int8 inference | 2023-06-03 23:22:05 +08:00
hiyouga | 926291940d | reduce repetition penalty | 2023-06-03 21:57:39 +08:00
hiyouga | 0f69a0c19e | fix int8 inference | 2023-06-03 21:17:47 +08:00
hiyouga | de09ee1315 | add ziya prompt template | 2023-06-03 19:05:51 +08:00
hiyouga | 771f454ff1 | use low_cpu_mem_usage to speed up loading | 2023-06-03 18:19:01 +08:00
hiyouga | dca27b4412 | add logits processor | 2023-06-03 16:34:54 +08:00
hiyouga | ed6161fa6a | remove unused code | 2023-06-03 00:10:54 +08:00
hiyouga | 72a85ccc39 | add wechat | 2023-06-02 21:47:10 +08:00
hiyouga | b8a034807e | tiny fix | 2023-06-02 19:02:25 +08:00
hiyouga | e3aaef7d4a | fix layer norm name in PPO | 2023-06-02 17:30:01 +08:00
hiyouga | bd565af370 | fix #1 | 2023-06-02 14:25:00 +08:00
hiyouga | 50d9a20f81 | alter rewards data type | 2023-06-02 14:19:51 +08:00
hiyouga | e6126244c1 | fix possibly OOM error | 2023-06-01 23:54:44 +08:00