Commit Graph

10 Commits

Author SHA1 Message Date
hiyouga 344b9a36b2 tiny fix 2024-06-18 23:32:18 +08:00
Eli Costa df12621dae
Fix Dockerfile
Adds the commands to correctly execute LLaMA-Factory servers
2024-06-16 19:16:23 -03:00
hiyouga 577de2fa07 fix #4242 2024-06-12 16:50:11 +08:00
hiyouga 949e9908ad fix #4145
Fix the Docker image
2024-06-11 00:19:17 +08:00
hiyouga 308edbc426 rename package 2024-05-16 18:39:08 +08:00
junwooo.lee 4598734a0d fix: split Dockerfile's CMD 2024-05-07 15:09:48 +09:00
hiyouga 245fe47ece update webui and add CLIs 2024-05-03 02:58:23 +08:00
S3Studio e75407febd Use official Nvidia base image
Note that the flash-attn library is installed in this image and the Qwen model will use it automatically.
However, if the host machine's GPU is not compatible with the library, an exception will be raised during training as follows:
FlashAttention only supports Ampere GPUs or newer.
So if the --flash_attn flag is not set, an additional patch to the Qwen model's config is needed to change the default value of use_flash_attn from "auto" to False (a sketch of such a patch follows this entry).
2024-03-15 08:59:13 +08:00
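Commit e75407febd describes working around incompatible GPUs by overriding the Qwen config's use_flash_attn default of "auto". A minimal sketch of such a patch, assuming a hypothetical checkpoint path and the config.json field named in the commit message:

```python
# Minimal sketch: force use_flash_attn off for a Qwen checkpoint whose config
# defaults it to "auto". MODEL_DIR is a hypothetical local path; the
# "use_flash_attn" key is the field named in the commit message above.
import json
from pathlib import Path

MODEL_DIR = Path("/app/models/Qwen-1_8B")  # hypothetical checkpoint location

config_path = MODEL_DIR / "config.json"
config = json.loads(config_path.read_text())

if config.get("use_flash_attn", "auto") == "auto":
    config["use_flash_attn"] = False  # fall back to standard attention on pre-Ampere GPUs
    config_path.write_text(json.dumps(config, indent=2))
```

With the value pinned to False, training no longer aborts with "FlashAttention only supports Ampere GPUs or newer" on hosts whose GPUs the library does not support.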
S3Studio 6a5693d11d improve Docker build and runtime parameters
Modify the installation method of the extra Python libraries.
Utilize the shared memory of the host machine to increase training performance (see the sketch after this entry).
2024-03-15 08:57:46 +08:00
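Commit 6a5693d11d gives the container access to the host's shared memory for better training throughput. A minimal sketch of such a launch, assuming a hypothetical image tag and mount path; --gpus, --shm-size, and -v are standard docker run options, not LLaMA-Factory-specific flags:

```python
# Minimal sketch: start a training container with enlarged shared memory.
# The image name, bind mount, and 16g size are placeholders; only the
# docker run flags themselves are standard.
import subprocess

subprocess.run(
    [
        "docker", "run", "--rm",
        "--gpus", "all",           # expose the host GPUs to the container
        "--shm-size", "16g",       # enlarge /dev/shm available to training
        "-v", "./data:/app/data",  # hypothetical bind mount for datasets
        "llama-factory:latest",    # hypothetical image tag
    ],
    check=True,
)
```

Passing --ipc=host instead of --shm-size is another common way to let the container use the host's shared memory segments.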
S3Studio 3d911ae713 Add dockerize support
Already tested with the Qwen:1.8B model and the alpaca_data_zh dataset. Some Python libraries were added to the Dockerfile in response to exception messages raised during testing.
2024-03-08 10:47:28 +08:00