Marco
9468ee9012
Update dataset_info.json
Added the Nectar dataset, already preprocessed and split into SFT and RL subsets. A preprompt was added to each instruction, since this has been seen to improve instruction following.
2023-11-30 16:21:34 +01:00
hiyouga
7b1aa6f63c
update dataset
2023-11-17 23:19:12 +08:00
hiyouga
ce78303600
support full-parameter PPO
2023-11-16 02:08:04 +08:00
hiyouga
386f590209
add template, modify datasets
2023-11-09 15:53:23 +08:00
hiyouga
2b5e33c338
update data readme
2023-11-03 00:15:23 +08:00
hiyouga
cc8ffa10d8
update data readme (zh)
2023-11-02 23:42:49 +08:00
hiyouga
a837172413
support sharegpt format, add datasets
2023-11-02 23:10:04 +08:00
hiyouga
026af87e7f
add MathInstruct dataset
2023-09-13 22:30:14 +08:00
hiyouga
a9d1fb72f7
refactor dataset_attr, add eos in pt, fix #757
2023-09-01 19:00:45 +08:00
codemayq
604f85487b
add ad gen dataset
2023-08-27 20:35:32 +08:00
codemayq
cece66d48a
add readme for dataset
2023-08-23 19:55:45 +08:00
codemayq
c0e4d1e81b
add dataset stage and filter dataset when stage chosen in webui
2023-08-23 18:54:23 +08:00
hiyouga
4318347d3f
update template
2023-08-22 19:46:09 +08:00
Peter Pan
b0ca8fe634
add rm dataset explanation
Signed-off-by: Peter Pan <Peter.Pan@daocloud.io>
2023-08-22 01:33:59 -04:00
hiyouga
3ec4351cfd
support DPO training (2305.18290)
2023-08-11 03:02:53 +08:00
hiyouga
b9cdff41bb
restore from git lfs
2023-08-01 16:33:25 +08:00
hiyouga
82e793ddb4
use git lfs
2023-08-01 10:14:08 +08:00
hiyouga
f5c2ccdde4
update dataset
2023-07-26 17:05:12 +08:00
hiyouga
182b425043
update dataset
2023-07-23 20:01:43 +08:00
hiyouga
035c966d5c
update readme, fix web ui postprocess
2023-07-22 14:29:22 +08:00
mrhan1993
9f0b57b370
Add Chinese README based on GLM Efficient Tuning; add server_port to the web UI
2023-07-21 16:57:58 +08:00
hiyouga
7159bc54ed
add datasets
2023-07-19 20:59:15 +08:00
hiyouga
08439d29b2
fix Baichuan-13B
2023-07-13 23:08:45 +08:00
zxbsmk
4955dc9eed
Support for WebNovel dataset
2023-07-12 17:29:47 +08:00
hiyouga
3154fec979
add open assistant dataset
2023-06-28 23:09:33 +08:00
hiyouga
334d1a6d26
add belle multiturn dataset
2023-06-16 20:01:16 +08:00
hiyouga
cec6524d6b
support RM metrics, add generating Args
2023-06-12 15:48:48 +08:00
BUAADreamer
e3b53a67c7
update json line file to .jsonl
2023-06-11 18:59:19 +08:00
BUAADreamer
676d910260
add some
2023-06-11 18:55:53 +08:00
BUAADreamer
a2af9df5a9
add code for reading from multi files in one directory
2023-06-10 16:27:30 +08:00
BUAADreamer
3dd5f9a874
add code for reading from multi files in one directory
2023-06-10 15:53:47 +08:00
hiyouga
a72492e649
remove dummy code
2023-05-30 16:28:00 +08:00
hiyouga
8ff96509fa
add pre-training script
2023-05-29 21:37:22 +08:00
hiyouga
769c6ab56b
Initial commit
2023-05-28 18:09:04 +08:00