forked from jiuyuan/CPM-9G-8B
Fix some formatting issues
parent d4da91b29c
commit 7ca0eb0bcc
@@ -148,6 +148,7 @@ cat pretrain.txt | python convert_txt2jsonl.py > pretrain.jsonl
```

2. Convert the jsonl format to index. The script is located at ./quick_start_clean/convert_json2index.py and is used as follows:

```shell
python convert_json2index.py \
--path ../data_process/data \ # directory containing the jsonl files