Commit Graph

1164 Commits

Author SHA1 Message Date
hiyouga 72367307df improve lora+ impl. 2024-03-13 23:32:51 +08:00
hoshi-hiyouga 4e5e99af43 Merge pull request #2830 from qibaoyuan/lora_plus ([FEATURE]: ADD LORA+ ALGORITHM) 2024-03-13 20:15:46 +08:00
齐保元 a0965cd62c [FEATURE]: ADD LORA+ ALGORITHM 2024-03-13 19:43:27 +08:00
hiyouga dfd451b722 Update wechat.jpg 2024-03-13 19:03:00 +08:00
hiyouga 0b4a5bf509 fix #2817 2024-03-13 12:42:03 +08:00
hiyouga b9f87cdc11 fix #2802 2024-03-13 12:33:45 +08:00
hiyouga 96ce76cd27 fix kv cache 2024-03-13 01:21:50 +08:00
hiyouga 19ef482649 support QDoRA 2024-03-12 22:12:42 +08:00
hiyouga 70a3052dd8 patch for gemma cpt 2024-03-12 21:21:54 +08:00
hiyouga 60cc17f3a8 fix plot issues 2024-03-12 18:41:35 +08:00
hiyouga b3247d6a16 support olmo 2024-03-12 18:30:38 +08:00
hiyouga 8d8956bad5 fix #2802 2024-03-12 17:08:34 +08:00
hiyouga 06c97083e1 fix #2803 2024-03-12 16:57:39 +08:00
hiyouga 07f9b754a7 fix #2782 #2798 2024-03-12 15:53:29 +08:00
hoshi-hiyouga c901aa63ff Merge pull request #2743 from S3Studio/DockerizeSupport (Add dockerize support) 2024-03-12 00:05:49 +08:00
hiyouga e874c00906 fix #2775 2024-03-11 00:42:54 +08:00
hiyouga 352693e2dc tiny fix 2024-03-11 00:17:18 +08:00
hiyouga be99799413 update parser 2024-03-10 13:35:20 +08:00
hiyouga 8664262cde support layerwise galore 2024-03-10 00:24:11 +08:00
hiyouga 18ffce36b5 fix #2732 2024-03-09 22:37:16 +08:00
hiyouga bdb496644c allow non-packing pretraining 2024-03-09 22:21:46 +08:00
hiyouga 412c52e325 fix #2766 2024-03-09 21:35:24 +08:00
hiyouga af0e370fb1 use default arg for freeze tuning 2024-03-09 06:08:48 +08:00
hiyouga 818726e9bc add GaLore results 2024-03-09 04:11:55 +08:00
hiyouga 393c2de27c update hardware requirements 2024-03-09 03:58:18 +08:00
hiyouga 4c00bcdcae update examples 2024-03-09 02:30:37 +08:00
hiyouga e8dd38b7fd fix #2756 , patch #2746 2024-03-09 02:01:26 +08:00
hoshi-hiyouga 516d0ddc66 Merge pull request #2746 from stephen-nju/main (fix deepspeed ppo RuntimeError) 2024-03-09 01:37:00 +08:00
hiyouga 74ff8664d7 Update setup.py 2024-03-09 00:14:48 +08:00
hiyouga 10be2f0ecc fix aqlm version 2024-03-09 00:09:09 +08:00
hiyouga 8a45213440 fix example params 2024-03-08 20:41:43 +08:00
stephen_zhu aa71571b77 update 2024-03-08 12:47:44 +08:00
stephen cdb7f82869 fix ppo runtime error 2024-03-08 11:48:26 +08:00
S3Studio 3d911ae713 Add dockerize support (Already tested with the model of Qwen:1.8B and the dataset of alpaca_data_zh. Some python libraries are added to the Dockerfile as a result of the exception messages displayed throughout test procedure.) 2024-03-08 10:47:28 +08:00
hiyouga 4a2cc60b94 update readme 2024-03-08 03:06:21 +08:00
hiyouga 5d956e2a51 fix chat engine, update webui 2024-03-08 03:01:53 +08:00
hiyouga 5cd4947650 Update setup.py 2024-03-08 01:23:00 +08:00
hiyouga 0ac6b40a47 update galore args 2024-03-08 01:17:32 +08:00
hiyouga 33a4c24a8a fix galore 2024-03-08 00:44:51 +08:00
hiyouga 57452a4aa1 add Yi-9B model 2024-03-07 23:11:57 +08:00
hiyouga 7230e1177d add galore examples 2024-03-07 22:53:45 +08:00
hiyouga 28f7862188 support galore 2024-03-07 22:41:36 +08:00
hiyouga 725f7cd70f update readme 2024-03-07 20:34:49 +08:00
hiyouga 77211d9843 tiny fix 2024-03-07 20:29:34 +08:00
hoshi-hiyouga a0dc721816 Merge pull request #2739 from hiyouga/dev-vllm (support vllm) 2024-03-07 20:28:18 +08:00
hiyouga d07ad5cc1c support vllm 2024-03-07 20:26:31 +08:00
hiyouga f74f804a71 fix #2735 2024-03-07 16:15:53 +08:00
hoshi-hiyouga 2185855bdb Merge pull request #2730 from cx2333-gt/main (fix flash_attn in train_web) 2024-03-07 14:37:18 +08:00
cx2333 94b7a1b915 revert choice name 2024-03-07 14:28:55 +08:00
hiyouga 921ee82267 fix chatglm3 template 2024-03-07 14:26:16 +08:00