hiyouga
|
ad76482cf9
|
add papers
|
2024-02-25 15:18:58 +08:00 |
hiyouga
|
c99e19641a
|
support gemma
|
2024-02-21 23:27:36 +08:00 |
hiyouga
|
daa3185350
|
tiny fix
|
2024-02-21 18:30:29 +08:00 |
hoshi-hiyouga
|
869fd208a8
|
Update README.md
|
2024-02-20 16:07:55 +08:00 |
codemayq
|
d47e40633a
|
1. update the version of pre-built bitsandbytes library
2. add pre-built flash-attn library
|
2024-02-20 11:28:25 +08:00 |
codemayq
|
95f53a46bd
|
1. update the version of pre-built bitsandbytes library
2. add pre-built flash-attn library
|
2024-02-20 11:26:22 +08:00 |
hiyouga
|
7924ffc55d
|
support llama pro #2338 , add rslora
|
2024-02-15 02:27:36 +08:00 |
hiyouga
|
7d2dc83c5e
|
improve aligner
|
2024-02-10 16:39:19 +08:00 |
hiyouga
|
54ea9684ed
|
improve fix tokenizer
|
2024-02-09 14:53:14 +08:00 |
hoshi-hiyouga
|
d0daaa01f9
|
Merge pull request #2423 from mayflower/main
Support for german sft and dpo
|
2024-02-07 15:58:20 +08:00 |
hiyouga
|
ccabb5b04a
|
support qwen1.5
|
2024-02-06 00:10:51 +08:00 |
Johann-Peter Hartmann
|
d9a8301ed4
|
Add support for german datasets
|
2024-01-30 10:18:01 +01:00 |
hiyouga
|
a0d59aa4ec
|
release v0.5.0 (real)
|
2024-01-21 01:54:49 +08:00 |
hiyouga
|
5608a0da8e
|
update readme
|
2024-01-18 14:30:48 +08:00 |
hiyouga
|
5a207bb723
|
tiny fix
|
2024-01-15 23:34:23 +08:00 |
Junu Moon(Fran)
|
7a320de097
|
fix: typo on README.md
|
2024-01-15 19:50:35 +09:00 |
hiyouga
|
ca3933dc52
|
support deepseek moe
|
2024-01-14 00:14:49 +08:00 |
hiyouga
|
d1a73fe26c
|
fix phi modules
|
2024-01-13 23:12:47 +08:00 |
JessyTsu1
|
8c5e4a8896
|
Update README.md
|
2024-01-11 23:18:29 +08:00 |
JessyTsu1
|
d72aff5ae6
|
Update README.md
|
2024-01-11 23:17:00 +08:00 |
hiyouga
|
4571068e1e
|
fix #1789
|
2024-01-09 18:31:27 +08:00 |
hiyouga
|
c7ea17d616
|
add yuan model
|
2023-12-29 13:50:24 +08:00 |
hiyouga
|
65c5b0477c
|
fix args
|
2023-12-28 18:47:19 +08:00 |
hiyouga
|
5b93d545e2
|
tiny update
|
2023-12-25 18:29:34 +08:00 |
hiyouga
|
e44b82ee24
|
update patcher
|
2023-12-23 15:24:27 +08:00 |
hiyouga
|
0ad86a4f62
|
update readme
|
2023-12-23 02:17:41 +08:00 |
hiyouga
|
7aad0b889d
|
support unsloth
|
2023-12-23 00:14:33 +08:00 |
hiyouga
|
edb7d177c2
|
update readme
|
2023-12-18 22:29:45 +08:00 |
hiyouga
|
2b4e5f0d32
|
update readme
|
2023-12-18 15:46:45 +08:00 |
hiyouga
|
71389be37c
|
support autogptq in llama board #246
|
2023-12-16 16:31:30 +08:00 |
hiyouga
|
3524aa1e58
|
support quantization in export model
|
2023-12-15 23:44:50 +08:00 |
hiyouga
|
87ef3f47b5
|
update dc link
|
2023-12-15 22:11:31 +08:00 |
hiyouga
|
0716f5e470
|
refactor adapter hparam
|
2023-12-15 20:53:11 +08:00 |
hiyouga
|
3a8a50d4d4
|
remove loftq
|
2023-12-13 01:53:46 +08:00 |
hiyouga
|
28cc07868c
|
update readme
|
2023-12-12 23:30:29 +08:00 |
hiyouga
|
6219dfbd93
|
support loftq
|
2023-12-12 22:47:06 +08:00 |
hiyouga
|
0a9c6e0146
|
support system column #1765
|
2023-12-12 19:45:59 +08:00 |
hiyouga
|
8cace77808
|
update readme
|
2023-12-12 11:44:30 +08:00 |
hiyouga
|
96380f5e18
|
support mixtral
|
2023-12-12 11:39:04 +08:00 |
hiyouga
|
997b65f291
|
update readme
|
2023-12-04 11:22:01 +08:00 |
hiyouga
|
8ede3128df
|
update readme
|
2023-12-04 11:02:29 +08:00 |
hiyouga
|
5b78e269b6
|
add logo
|
2023-12-02 01:31:24 +08:00 |
hiyouga
|
0cb260f453
|
update readme
|
2023-12-01 22:58:29 +08:00 |
hiyouga
|
bd42c229b0
|
patch modelscope
|
2023-12-01 22:53:15 +08:00 |
hoshi-hiyouga
|
00f5c9ee16
|
Merge branch 'main' into feat/support_ms
|
2023-12-01 20:23:46 +08:00 |
yuze.zyz
|
5aa6751e52
|
add readme
|
2023-12-01 16:11:30 +08:00 |
hiyouga
|
bf6f6aeefe
|
fix #1696
|
2023-12-01 15:34:50 +08:00 |
hiyouga
|
509abe8864
|
add models
|
2023-11-30 19:16:13 +08:00 |
hiyouga
|
9d38e5687d
|
add gpu requirement #1657
|
2023-11-29 12:05:03 +08:00 |
hiyouga
|
5085b00a1d
|
update readme
|
2023-11-21 13:15:46 +08:00 |
hiyouga
|
9ea9380145
|
support GPTQ tuning #729 #1481 #1545 , fix chatglm template #1453 #1480 #1569
|
2023-11-20 22:52:11 +08:00 |
hiyouga
|
5021062493
|
update ppo trainer
|
2023-11-20 21:39:15 +08:00 |
hoshi-hiyouga
|
48211e3799
|
Merge pull request #1553 from hannlp/hans
Change the default argument settings for PPO training
|
2023-11-20 20:32:55 +08:00 |
hiyouga
|
a2019c8b61
|
update benchmark
|
2023-11-18 11:30:01 +08:00 |
hiyouga
|
90212280d6
|
update readme
|
2023-11-18 11:15:56 +08:00 |
hiyouga
|
329134f58c
|
add benchmark
|
2023-11-18 11:09:52 +08:00 |
Yuchen Han
|
c9b499fa7e
|
Update README.md
|
2023-11-17 00:17:36 -08:00 |
hiyouga
|
72e6699547
|
update readme
|
2023-11-16 15:58:37 +08:00 |
hiyouga
|
ce78303600
|
support full-parameter PPO
|
2023-11-16 02:08:04 +08:00 |
hiyouga
|
8350bcf85d
|
add demo mode for web UI
|
2023-11-15 23:51:26 +08:00 |
hiyouga
|
1e19cf242a
|
update readme and constants
|
2023-11-15 18:04:37 +08:00 |
hiyouga
|
88ab33254e
|
fix dc link
|
2023-11-13 23:22:56 +08:00 |
hiyouga
|
442aefb925
|
refactor evaluation, upgrade trl to 074
|
2023-11-13 22:20:35 +08:00 |
hiyouga
|
3697a3dc9a
|
refactor constants
|
2023-11-10 14:16:10 +08:00 |
hiyouga
|
b3572659f5
|
update readme
|
2023-11-09 16:00:24 +08:00 |
hiyouga
|
e1e04cb1f1
|
update readme (list in alphabetical order)
|
2023-11-06 17:18:12 +08:00 |
hiyouga
|
a7eeb8e17c
|
update templates
|
2023-11-06 12:25:47 +08:00 |
hiyouga
|
cc8ffa10d8
|
update data readme (zh)
|
2023-11-02 23:42:49 +08:00 |
hiyouga
|
a837172413
|
support sharegpt format, add datasets
|
2023-11-02 23:10:04 +08:00 |
hiyouga
|
640a520108
|
update projects
|
2023-10-29 22:53:47 +08:00 |
hiyouga
|
59f342e76f
|
add projects
|
2023-10-29 22:07:13 +08:00 |
hiyouga
|
52fc24d166
|
fix vicuna template
|
2023-10-27 22:15:25 +08:00 |
hiyouga
|
4600c29e93
|
update readme
|
2023-10-27 19:19:03 +08:00 |
hiyouga
|
1c0ab9a908
|
support chatglm3
|
2023-10-27 19:16:28 +08:00 |
hiyouga
|
7b4acf7265
|
reimplement neftune
|
2023-10-22 16:15:08 +08:00 |
anvie
|
57fb40aa04
|
add NEFTune optimization
|
2023-10-21 13:24:10 +07:00 |
hiyouga
|
b665e9e133
|
fix #1232
|
2023-10-20 23:28:52 +08:00 |
hiyouga
|
6496a99b7d
|
fix #1217
|
2023-10-19 15:52:24 +08:00 |
hoshi-hiyouga
|
beacb798ea
|
Update README.md
|
2023-10-16 00:23:37 +08:00 |
hiyouga
|
f5d0da4d2a
|
update readme
|
2023-10-15 20:28:14 +08:00 |
hoshi-hiyouga
|
25d326e135
|
Update README.md
|
2023-10-15 20:23:22 +08:00 |
hiyouga
|
ea82f8a82a
|
refactor export, fix #1190
|
2023-10-15 16:01:48 +08:00 |
hiyouga
|
cb42676694
|
update readme
|
2023-10-13 13:53:43 +08:00 |
hiyouga
|
c4102f306a
|
update discord link
|
2023-10-12 21:44:28 +08:00 |
hiyouga
|
197c754d73
|
rename repository
|
2023-10-12 21:42:29 +08:00 |
hiyouga
|
8e2ed6b8ce
|
update readme
|
2023-10-09 20:02:50 +08:00 |
hiyouga
|
d11a545463
|
fix #1068 #1074
|
2023-09-28 14:39:16 +08:00 |
hiyouga
|
4eae061464
|
update readme
|
2023-09-27 21:57:47 +08:00 |
hiyouga
|
90375f600d
|
support LongLoRA
|
2023-09-27 21:55:50 +08:00 |
hiyouga
|
4dd9b4d982
|
add CMMLU, update eval script
|
2023-09-23 21:10:17 +08:00 |
hiyouga
|
badd2735b5
|
move file
|
2023-09-23 11:52:12 +08:00 |
hiyouga
|
465ee8119a
|
add MMLU and C-Eval script
|
2023-09-23 00:34:17 +08:00 |
hiyouga
|
5cc7a44784
|
fix #1000
|
2023-09-22 15:00:48 +08:00 |
hiyouga
|
044d4425b4
|
update readme
|
2023-09-22 14:34:13 +08:00 |
hiyouga
|
ace3f85a72
|
tiny fix
|
2023-09-21 15:25:29 +08:00 |
hiyouga
|
acda45e463
|
update readme
|
2023-09-16 17:33:01 +08:00 |
hiyouga
|
026af87e7f
|
add MathInstruct dataset
|
2023-09-13 22:30:14 +08:00 |
hiyouga
|
d4be857e23
|
fix #762 #814
|
2023-09-12 16:10:10 +08:00 |
hiyouga
|
ccb3553576
|
Release v0.1.8
|
2023-09-11 17:31:34 +08:00 |
hiyouga
|
baac22f4f4
|
truncate readme
|
2023-09-10 21:04:20 +08:00 |
hiyouga
|
63611de7ae
|
update readme
|
2023-09-10 21:01:20 +08:00 |
hiyouga
|
34005252df
|
update readme
|
2023-09-10 20:52:21 +08:00 |
hiyouga
|
d8aa1404be
|
support FlashAttention2
|
2023-09-10 20:43:56 +08:00 |
hiyouga
|
bca1a247bc
|
support lora target auto find
|
2023-09-09 15:38:37 +08:00 |
hiyouga
|
d8d82ca281
|
fix chatglm2 tokenizer
|
2023-09-09 13:50:29 +08:00 |
hiyouga
|
85b1f6632a
|
fix baichuan templates
|
2023-09-07 18:54:14 +08:00 |
hiyouga
|
0531886e1f
|
update baichuan2 template
|
2023-09-06 21:43:06 +08:00 |
hiyouga
|
60603a94c6
|
add Baichuan2 models
|
2023-09-06 18:40:11 +08:00 |
hiyouga
|
a9d1fb72f7
|
refactor dataset_attr, add eos in pt, fix #757
|
2023-09-01 19:00:45 +08:00 |
codemayq
|
604f85487b
|
add ad gen dataset
|
2023-08-27 20:35:32 +08:00 |
hiyouga
|
4318347d3f
|
update template
|
2023-08-22 19:46:09 +08:00 |
hiyouga
|
9020524418
|
fix PPO trainer #551 , update readme
|
2023-08-18 11:43:10 +08:00 |
hiyouga
|
e4eec9ddfd
|
update readme
|
2023-08-18 01:51:55 +08:00 |
hiyouga
|
58f13e22da
|
update training resuming
|
2023-08-18 01:41:17 +08:00 |
hiyouga
|
ff0aa793b6
|
update readme
|
2023-08-17 11:00:22 +08:00 |
hiyouga
|
ec94274ca1
|
web UI integrating RLHF
|
2023-08-14 10:48:47 +08:00 |
hiyouga
|
8a79ded55d
|
update readme
|
2023-08-12 21:29:06 +08:00 |
hiyouga
|
2618e0b5a7
|
update readme
|
2023-08-12 21:23:05 +08:00 |
hiyouga
|
1836c020c5
|
update readme
|
2023-08-12 21:00:11 +08:00 |
hiyouga
|
a48cb0d474
|
Release v0.1.6
|
2023-08-11 23:25:57 +08:00 |
hiyouga
|
3ec4351cfd
|
support DPO training (2305.18290)
|
2023-08-11 03:02:53 +08:00 |
hiyouga
|
20cf27976f
|
update readme
|
2023-08-07 15:02:02 +08:00 |
codemayq
|
293bd95712
|
add detailed model configs
|
2023-08-07 09:30:23 +08:00 |
hiyouga
|
87f8f830e2
|
support Qwen-7B, fix InternLM-7B inference
|
2023-08-03 15:53:32 +08:00 |
hiyouga
|
c689857bbb
|
release v0.1.5
|
2023-08-02 16:10:31 +08:00 |
hiyouga
|
ccde51c5ea
|
update readme
|
2023-08-01 18:48:27 +08:00 |
hiyouga
|
ac88ce5233
|
fix RM save model
|
2023-08-01 11:56:17 +08:00 |
hiyouga
|
973a638665
|
release v0.1.4
|
2023-08-01 10:08:47 +08:00 |
hiyouga
|
62dca5bb82
|
update readme
|
2023-07-31 23:42:32 +08:00 |
hiyouga
|
0411a4b3e1
|
support streaming data, fix #284 #274 #268
|
2023-07-31 23:33:00 +08:00 |
hiyouga
|
5ee87138e4
|
update readme
|
2023-07-28 17:36:00 +08:00 |
hiyouga
|
f5c2ccdde4
|
update dataset
|
2023-07-26 17:05:12 +08:00 |
hiyouga
|
00efa8a07f
|
fix #242
|
2023-07-25 17:04:02 +08:00 |
hiyouga
|
182b425043
|
update dataset
|
2023-07-23 20:01:43 +08:00 |
hiyouga
|
035c966d5c
|
update readme, fix web ui postprocess
|
2023-07-22 14:29:22 +08:00 |
mrhan1993
|
9f0b57b370
|
Add Chinese README based on GLM Efficient Tuning; add server_port to web
|
2023-07-21 16:57:58 +08:00 |
hiyouga
|
c3fcb67486
|
Update README.md
|
2023-07-20 17:23:16 +08:00 |
hiyouga
|
7159bc54ed
|
add datasets
|
2023-07-19 20:59:15 +08:00 |
hiyouga
|
7a3ade8c69
|
support LLaMA-2
|
2023-07-19 16:42:14 +08:00 |
hiyouga
|
b447fa85aa
|
add web demo
|
2023-07-18 17:21:16 +08:00 |
hiyouga
|
f8193e8009
|
release v0.1.0
|
2023-07-18 00:18:25 +08:00 |
hiyouga
|
1e2b7e0c4b
|
Update README.md
|
2023-07-15 17:20:39 +08:00 |
hiyouga
|
f751376613
|
modify code structure
|
2023-07-15 16:54:28 +08:00 |
hiyouga
|
08439d29b2
|
fix Baichuan-13B
|
2023-07-13 23:08:45 +08:00 |
zxbsmk
|
4955dc9eed
|
Support for WebNovel dataset
|
2023-07-12 17:29:47 +08:00 |
hiyouga
|
1af031c02b
|
add baichuan template
|
2023-07-11 18:57:50 +08:00 |
hiyouga
|
f936a7af0b
|
support Baichuan-13B
|
2023-07-11 16:16:14 +08:00 |
hiyouga
|
8447206bbc
|
Update README.md
|
2023-07-10 23:09:11 +08:00 |
hiyouga
|
4182c7aa8b
|
Update README.md
|
2023-07-09 14:57:13 +08:00 |
hiyouga
|
233f20864b
|
Update README.md
|
2023-07-07 12:06:28 +08:00 |
hiyouga
|
a2f507c562
|
support InternLM
|
2023-07-07 11:02:28 +08:00 |
hiyouga
|
89c623e4bf
|
update readme
|
2023-07-05 23:03:58 +08:00 |
hiyouga
|
4abd2485e1
|
fix streaming response in API
|
2023-07-05 22:42:31 +08:00 |
hiyouga
|
c136f362c1
|
support falcon model #72
|
2023-07-05 15:00:06 +08:00 |
hiyouga
|
65e9ce2cdd
|
fix seq2seq predictions
|
2023-07-04 22:56:51 +08:00 |
codemayq
|
d3b30ecde3
|
add the pre-built version of bitsandbytes library for windows user
|
2023-07-03 13:58:10 +08:00 |
hiyouga
|
92fa515e97
|
fix typo
|
2023-06-30 10:09:59 +08:00 |
hiyouga
|
021b035c1e
|
Update README.md
|
2023-06-29 19:36:22 +08:00 |
hiyouga
|
70592035b8
|
Update README.md
|
2023-06-29 15:37:19 +08:00 |
hiyouga
|
9cb1af71f3
|
add star history
|
2023-06-27 23:56:29 +08:00 |
hiyouga
|
450910c1db
|
tiny fix
|
2023-06-27 23:54:24 +08:00 |
hiyouga
|
18f87c1b25
|
fix initializing data arguments
|
2023-06-27 22:50:23 +08:00 |
Jingsong-Yan
|
90bb5b6f37
|
Update README.md with baichuan-7b-rtx3090
Add a description of the baichuan-7b-rtx3090 branch to the Changelog
|
2023-06-26 19:45:41 +08:00 |
hiyouga
|
0697643358
|
update readme
|
2023-06-23 00:17:05 +08:00 |
hiyouga
|
0574b590ef
|
support loading lora from hub
|
2023-06-16 00:02:17 +08:00 |
hiyouga
|
0cee6ad67f
|
support baichuan model
|
2023-06-15 16:02:01 +08:00 |
hiyouga
|
4eb17bcf6c
|
support distributed quantized training
|
2023-06-06 17:39:41 +08:00 |
hiyouga
|
44298c1235
|
tiny fix
|
2023-06-05 15:25:22 +08:00 |
hiyouga
|
3b9eee8cd2
|
support QLoRA
|
2023-06-04 00:08:56 +08:00 |
hiyouga
|
72a85ccc39
|
add wechat
|
2023-06-02 21:47:10 +08:00 |
hiyouga
|
38ca429228
|
update readme
|
2023-05-31 16:57:43 +08:00 |
hiyouga
|
740a5daf56
|
support BLOOM models
|
2023-05-31 16:54:06 +08:00 |
hiyouga
|
6ccdfb4001
|
update readme
|
2023-05-29 21:54:01 +08:00 |
hiyouga
|
7698f9aa9a
|
update readme
|
2023-05-29 21:53:02 +08:00 |
hiyouga
|
769c6ab56b
|
Initial commit
|
2023-05-28 18:09:04 +08:00 |