Docker image on Quark Netdisk: https://pan.quark.cn/s/4cda395f13e8
(If you do not have a membership, contact me and I will help with the download.)
1. The Jiuge (九格) model is fine-tuned with full-parameter fine-tuning using LLaMA-Factory. For the dataset, see dataset.
2. Both training and inference have been verified on a machine with 8 A100 GPUs.
Start Docker: sudo docker run -it --runtime=nvidia --gpus all --shm-size=256g wjf:train
Inference: python inference.py
Training:
cd training
sh training.sh
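3. Inference fuses the results of multiple checkpoints and multiple inference runs.
4. All materials are packaged into the Docker image; nothing other than Docker is needed.
5. Starting training will overwrite the submitted checkpoint.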
6. If Docker hangs during data processing, the machine may be the cause; try entering the following inside the container:
export NCCL_DEBUG=INFO
export NCCL_SHM_DISABLE=1
export NCCL_P2P_DISABLE=1
Because multiple checkpoints need to be saved, make sure there is enough disk space: more than 500 GB.
7. Preparing this submission took real effort; if you run into any problem, please contact me promptly by phone: 13121813131
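
Item 3 above says that inference fuses results from multiple checkpoints and multiple runs, but the fusion rule itself is not described here. As a rough sketch only, assuming the fusion is a plain majority vote over the answers each run produces (the function name `fuse_predictions` and its input layout are hypothetical, and inference.py may do something different):

```python
from collections import Counter


def fuse_predictions(runs):
    """Majority-vote fusion over several inference runs (assumed scheme).

    `runs` is a list of prediction lists, one per (checkpoint, run) pair.
    Every inner list holds one answer per sample, in the same sample order.
    Ties go to the answer that appears first among the runs.
    """
    if not runs:
        raise ValueError("need at least one run")
    n_samples = len(runs[0])
    if any(len(r) != n_samples for r in runs):
        raise ValueError("all runs must cover the same samples")

    fused = []
    for answers in zip(*runs):
        # most_common keeps first-seen order among equally frequent answers
        fused.append(Counter(answers).most_common(1)[0][0])
    return fused
```

With three runs over two samples, `fuse_predictions([["A", "B"], ["A", "C"], ["B", "C"]])` picks the answer that most runs agree on for each sample.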
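
The disk-space requirement above (more than 500 GB, because several checkpoints are saved) can be checked before starting training. A minimal sketch using Python's standard `shutil.disk_usage`; the helper name `has_enough_space`, the path you pass in, and reading "500G" as 500 decimal gigabytes are all assumptions:

```python
import shutil

# Assumption: "500G" is read as 500 * 10^9 bytes.
REQUIRED_BYTES = 500 * 10**9


def has_enough_space(path=".", required=REQUIRED_BYTES):
    """Return True if the filesystem holding `path` has more than `required` free bytes."""
    return shutil.disk_usage(path).free > required
```

Calling `has_enough_space(".")` from the directory where checkpoints will be written, before running sh training.sh, gives an early warning instead of a failure partway through training.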