hiyouga
72367307df
improve lora+ impl.
2024-03-13 23:32:51 +08:00
hoshi-hiyouga
4e5e99af43
Merge pull request #2830 from qibaoyuan/lora_plus
[FEATURE]: ADD LORA+ ALGORITHM
2024-03-13 20:15:46 +08:00
齐保元
a0965cd62c
[FEATURE]: ADD LORA+ ALGORITHM
2024-03-13 19:43:27 +08:00
hiyouga
dfd451b722
Update wechat.jpg
2024-03-13 19:03:00 +08:00
hiyouga
0b4a5bf509
fix #2817
2024-03-13 12:42:03 +08:00
hiyouga
b9f87cdc11
fix #2802
2024-03-13 12:33:45 +08:00
hiyouga
96ce76cd27
fix kv cache
2024-03-13 01:21:50 +08:00
hiyouga
19ef482649
support QDoRA
2024-03-12 22:12:42 +08:00
hiyouga
70a3052dd8
patch for gemma cpt
2024-03-12 21:21:54 +08:00
hiyouga
60cc17f3a8
fix plot issues
2024-03-12 18:41:35 +08:00
hiyouga
b3247d6a16
support olmo
2024-03-12 18:30:38 +08:00
hiyouga
8d8956bad5
fix #2802
2024-03-12 17:08:34 +08:00
hiyouga
06c97083e1
fix #2803
2024-03-12 16:57:39 +08:00
hiyouga
07f9b754a7
fix #2782 #2798
2024-03-12 15:53:29 +08:00
hoshi-hiyouga
c901aa63ff
Merge pull request #2743 from S3Studio/DockerizeSupport
Add dockerize support
2024-03-12 00:05:49 +08:00
hiyouga
e874c00906
fix #2775
2024-03-11 00:42:54 +08:00
hiyouga
352693e2dc
tiny fix
2024-03-11 00:17:18 +08:00
hiyouga
be99799413
update parser
2024-03-10 13:35:20 +08:00
hiyouga
8664262cde
support layerwise galore
2024-03-10 00:24:11 +08:00
hiyouga
18ffce36b5
fix #2732
2024-03-09 22:37:16 +08:00
hiyouga
bdb496644c
allow non-packing pretraining
2024-03-09 22:21:46 +08:00
hiyouga
412c52e325
fix #2766
2024-03-09 21:35:24 +08:00
hiyouga
af0e370fb1
use default arg for freeze tuning
2024-03-09 06:08:48 +08:00
hiyouga
818726e9bc
add GaLore results
2024-03-09 04:11:55 +08:00
hiyouga
393c2de27c
update hardware requirements
2024-03-09 03:58:18 +08:00
hiyouga
4c00bcdcae
update examples
2024-03-09 02:30:37 +08:00
hiyouga
e8dd38b7fd
fix #2756 , patch #2746
2024-03-09 02:01:26 +08:00
hoshi-hiyouga
516d0ddc66
Merge pull request #2746 from stephen-nju/main
fix deepspeed ppo RuntimeError
2024-03-09 01:37:00 +08:00
hiyouga
74ff8664d7
Update setup.py
2024-03-09 00:14:48 +08:00
hiyouga
10be2f0ecc
fix aqlm version
2024-03-09 00:09:09 +08:00
hiyouga
8a45213440
fix example params
2024-03-08 20:41:43 +08:00
stephen_zhu
aa71571b77
update
2024-03-08 12:47:44 +08:00
stephen
cdb7f82869
fix ppo runtime error
2024-03-08 11:48:26 +08:00
S3Studio
3d911ae713
Add dockerize support
Tested with the Qwen:1.8B model and the alpaca_data_zh dataset. Some Python libraries were added to the Dockerfile in response to exception messages raised during testing.
2024-03-08 10:47:28 +08:00
hiyouga
4a2cc60b94
update readme
2024-03-08 03:06:21 +08:00
hiyouga
5d956e2a51
fix chat engine, update webui
2024-03-08 03:01:53 +08:00
hiyouga
5cd4947650
Update setup.py
2024-03-08 01:23:00 +08:00
hiyouga
0ac6b40a47
update galore args
2024-03-08 01:17:32 +08:00
hiyouga
33a4c24a8a
fix galore
2024-03-08 00:44:51 +08:00
hiyouga
57452a4aa1
add Yi-9B model
2024-03-07 23:11:57 +08:00
hiyouga
7230e1177d
add galore examples
2024-03-07 22:53:45 +08:00
hiyouga
28f7862188
support galore
2024-03-07 22:41:36 +08:00
hiyouga
725f7cd70f
update readme
2024-03-07 20:34:49 +08:00
hiyouga
77211d9843
tiny fix
2024-03-07 20:29:34 +08:00
hoshi-hiyouga
a0dc721816
Merge pull request #2739 from hiyouga/dev-vllm
support vllm
2024-03-07 20:28:18 +08:00
hiyouga
d07ad5cc1c
support vllm
2024-03-07 20:26:31 +08:00
hiyouga
f74f804a71
fix #2735
2024-03-07 16:15:53 +08:00
hoshi-hiyouga
2185855bdb
Merge pull request #2730 from cx2333-gt/main
fix flash_attn in train_web
2024-03-07 14:37:18 +08:00
cx2333
94b7a1b915
revert choice name
2024-03-07 14:28:55 +08:00
hiyouga
921ee82267
fix chatglm3 template
2024-03-07 14:26:16 +08:00