hiyouga
894d183214
update readme, add starcoder2, cosmopedia
2024-03-03 01:01:46 +08:00
hiyouga
32884523c5
update data
2024-03-02 19:37:18 +08:00
hiyouga
1630a4cb8f
fix #2533
2024-02-21 22:47:48 +08:00
hiyouga
22acab8aff
fix #2481
2024-02-15 19:07:47 +08:00
hiyouga
7d2dc83c5e
improve aligner
2024-02-10 16:39:19 +08:00
Mark Mueller
1d3598afa1
Slim Orca data parsing
2024-02-08 19:32:20 +01:00
Johann-Peter Hartmann
49c69ea4b9
WS fix
2024-02-06 20:13:04 +01:00
Johann-Peter Hartmann
1126563505
add ranking to dpo dataset
2024-02-06 20:12:36 +01:00
Johann-Peter Hartmann
870182c3a9
remove comma
2024-02-03 08:48:39 +01:00
Johann-Peter Hartmann
d9a8301ed4
Add support for german datasets
2024-01-30 10:18:01 +01:00
hiyouga
dbaaa4546e
Update dataset_info.json
2024-01-23 00:10:32 +08:00
hiyouga
f1067d2b58
enable cutoff len
2024-01-18 12:25:42 +08:00
hiyouga
d9f1cae351
support function calling
2024-01-18 09:54:23 +08:00
hiyouga
5b93d545e2
tiny update
2023-12-25 18:29:34 +08:00
hiyouga
71389be37c
support autogptq in llama board #246
2023-12-16 16:31:30 +08:00
hiyouga
0a9c6e0146
support system column #1765
2023-12-12 19:45:59 +08:00
hiyouga
d5b2c57a35
fix modelscope data hub
2023-12-12 18:33:06 +08:00
hoshi-hiyouga
6382efec52
Merge branch 'main' into feat/support_ms
2023-12-12 17:55:32 +08:00
xingjun.wang
e80a989d49
modify guanaco
2023-12-12 15:00:37 +08:00
xingjun.wang
73b50a26b9
update dataset info
2023-12-12 14:53:59 +08:00
xingjun.wang
09533e95ed
update args for MsDataset.load
2023-12-12 13:02:54 +08:00
xingjun.wang
fe4acc66b0
add new datasets
2023-12-12 12:44:15 +08:00
xingjun.wang
0ce18a3782
add open orca
2023-12-12 12:34:04 +08:00
hiyouga
28d5de7e78
fix #1784
2023-12-09 20:53:18 +08:00
yuze.zyz
e4cf2a75ca
fix typo
2023-12-08 18:13:26 +08:00
yuze.zyz
9c2247d700
support ms dataset
2023-12-08 18:00:57 +08:00
hiyouga
bf6f6aeefe
fix #1696
2023-12-01 15:34:50 +08:00
Marco
9468ee9012
Update dataset_info.json
Added the Nectar dataset, already preprocessed and divided into sft and rl, with a preprompt added to each instruction, since this has been seen to increase instruction following.
2023-11-30 16:21:34 +01:00
hiyouga
7b1aa6f63c
update dataset
2023-11-17 23:19:12 +08:00
hiyouga
ce78303600
support full-parameter PPO
2023-11-16 02:08:04 +08:00
hiyouga
386f590209
add template, modify datasets
2023-11-09 15:53:23 +08:00
hiyouga
cc8ffa10d8
update data readme (zh)
2023-11-02 23:42:49 +08:00
hiyouga
a837172413
support sharegpt format, add datasets
2023-11-02 23:10:04 +08:00
hiyouga
026af87e7f
add MathInstruct dataset
2023-09-13 22:30:14 +08:00
hiyouga
a9d1fb72f7
refactor dataset_attr, add eos in pt, fix #757
2023-09-01 19:00:45 +08:00
codemayq
604f85487b
add ad gen dataset
2023-08-27 20:35:32 +08:00
codemayq
c0e4d1e81b
add dataset stage and filter dataset when stage chosen in webui
2023-08-23 18:54:23 +08:00
hiyouga
3ec4351cfd
support DPO training (2305.18290)
2023-08-11 03:02:53 +08:00
hiyouga
b9cdff41bb
restore from git lfs
2023-08-01 16:33:25 +08:00
hiyouga
82e793ddb4
use git lfs
2023-08-01 10:14:08 +08:00
hiyouga
f5c2ccdde4
update dataset
2023-07-26 17:05:12 +08:00
hiyouga
182b425043
update dataset
2023-07-23 20:01:43 +08:00
hiyouga
7159bc54ed
add datasets
2023-07-19 20:59:15 +08:00
hiyouga
08439d29b2
fix Baichuan-13B
2023-07-13 23:08:45 +08:00
zxbsmk
4955dc9eed
Support for WebNovel dataset
2023-07-12 17:29:47 +08:00
hiyouga
3154fec979
add open assistant dataset
2023-06-28 23:09:33 +08:00
hiyouga
334d1a6d26
add belle multiturn dataset
2023-06-16 20:01:16 +08:00
hiyouga
cec6524d6b
support RM metrics, add generating Args
2023-06-12 15:48:48 +08:00
BUAADreamer
3dd5f9a874
add code for reading from multi files in one directory
2023-06-10 15:53:47 +08:00
hiyouga
a72492e649
remove dummy code
2023-05-30 16:28:00 +08:00