hiyouga | d2a676c8ba | improve model export | 2024-01-05 18:51:49 +08:00
hiyouga | f6fdd83f8a | fix #2098 | 2024-01-05 17:11:26 +08:00
hiyouga | ed216bbc46 | fix qwen template | 2024-01-05 16:14:56 +08:00
hiyouga | 33f2c0d4f8 | fix #2081 | 2024-01-04 23:19:08 +08:00
hiyouga | cc275abe09 | fix #2090 | 2024-01-04 23:05:08 +08:00
hiyouga | 368b31f6b7 | fix #2067 | 2024-01-04 22:53:03 +08:00
hiyouga | 1696698eb9 | fix dispatch | 2024-01-03 16:33:16 +08:00
hiyouga | 24d8d6f224 | fix valuehead patch | 2024-01-03 16:19:23 +08:00
hiyouga | 55021097d5 | fix rm server | 2024-01-03 15:30:46 +08:00
hiyouga | 3014e3c189 | Update wechat.jpg | 2023-12-31 15:05:59 +08:00
hiyouga | 4519d95923 | Update wechat.jpg | 2023-12-29 15:26:22 +08:00
hiyouga | ce2156eaa8 | fix #2014 | 2023-12-29 15:17:22 +08:00
hiyouga | c7ea17d616 | add yuan model | 2023-12-29 13:50:24 +08:00
hiyouga | 47da742fc9 | fix version | 2023-12-29 04:53:36 +08:00
hiyouga | 65c5b0477c | fix args | 2023-12-28 18:47:19 +08:00
hiyouga | e165354fac | fix export format | 2023-12-28 18:40:46 +08:00
hiyouga | 5431be42f9 | fix ppo trainer | 2023-12-28 18:09:28 +08:00
hiyouga | db6cb2d0e7 | add model link | 2023-12-25 19:44:38 +08:00
hiyouga | 5b93d545e2 | tiny update | 2023-12-25 18:29:34 +08:00
hiyouga | e4bb846c43 | fix bug | 2023-12-24 19:20:12 +08:00
hiyouga | 6629087e12 | update loader | 2023-12-24 19:10:23 +08:00
hiyouga | e44b82ee24 | update patcher | 2023-12-23 15:24:27 +08:00
hiyouga | 0bbf7118df | fix #1909 | 2023-12-23 14:42:20 +08:00
hiyouga | 0ad86a4f62 | update readme | 2023-12-23 02:17:41 +08:00
hiyouga | 779cfefb78 | fix unsloth dtype | 2023-12-23 01:59:49 +08:00
hiyouga | 074745b170 | fix dpo trainer | 2023-12-23 01:51:55 +08:00
hiyouga | 9a18a85639 | llama board: add unsloth | 2023-12-23 00:35:53 +08:00
hiyouga | 7aad0b889d | support unsloth | 2023-12-23 00:14:33 +08:00
hoshi-hiyouga | 315b8367cb | Merge pull request #1953 from ShaneTian/model-load-bf16: Fix slow model initialization in bfloat16 dtype. | 2023-12-22 17:29:54 +08:00
ShaneTian | d032daa4bd | Fix slow model initialization in bfloat16 dtype. | 2023-12-22 16:27:28 +08:00
hiyouga | ba69378841 | fix param type | 2023-12-21 17:33:01 +08:00
hiyouga | 083355fc05 | fix ds zero3 check | 2023-12-21 01:19:22 +08:00
hiyouga | af0194e6d9 | match version | 2023-12-20 22:17:35 +08:00
hoshi-hiyouga | ba4d32bf59 | Merge pull request #1932 from ShaneTian/main: Update transformers to 4.36.2 to resolve multi-node saving bug. | 2023-12-20 22:13:28 +08:00
ShaneTian | 390f0caf7f | Update transformers to 4.36.2 to resolve bug when saving a checkpoint in the multi-node setting. | 2023-12-20 22:00:41 +08:00
hiyouga | 7910dbae92 | Update wechat.jpg | 2023-12-20 19:24:37 +08:00
hiyouga | dec360d5ae | fix stop words | 2023-12-20 19:06:43 +08:00
hiyouga | 5af8841c4f | fix yi template #1895 | 2023-12-20 18:58:16 +08:00
hiyouga | 624cc21281 | improve quantization | 2023-12-20 18:27:16 +08:00
hiyouga | c4a3977ad7 | add max_memory for gptq #1923 | 2023-12-20 18:15:17 +08:00
hiyouga | 31165a9822 | fix #1073 #1462 #1735 #1908 | 2023-12-20 17:15:40 +08:00
hiyouga | ec1fe1daa9 | optimize data loading logic | 2023-12-20 16:15:41 +08:00
hiyouga | c6abbbfe90 | fix #1909 | 2023-12-20 16:11:07 +08:00
hiyouga | f86857bd9e | fix mixtral inference #1821 | 2023-12-20 15:11:15 +08:00
hiyouga | 0c6ab7c75e | fix #1900 | 2023-12-19 17:21:46 +08:00
hiyouga | edb7d177c2 | update readme | 2023-12-18 22:29:45 +08:00
hiyouga | a67a440644 | add codegeex template | 2023-12-18 19:52:35 +08:00
hiyouga | 2df923540c | add xverse-65B-2 model | 2023-12-18 19:24:09 +08:00
hiyouga | 709ac8870a | add models | 2023-12-18 19:09:31 +08:00
hiyouga | 71a9c16171 | fix tokenizer for Yi chat models #1617 #1875 | 2023-12-18 17:18:11 +08:00