LLaMA-Factory-Mirror

Commit Graph

Author	SHA1	Message	Date
hiyouga	cd75b1fe9d	fix tool formatter, allow parallel function #4362	2024-06-19 03:23:51 +08:00
hoshi-hiyouga	c0ca42566c	Merge pull request #4173 from mMrBun/main Implemented the tool_formatter and tool_extractor for glm4 and Qwen2 tool_format	2024-06-19 03:18:55 +08:00
hiyouga	a233fbc258	add deepseek coder v2 #4346	2024-06-18 22:53:54 +08:00
hiyouga	4bd77d8563	fix #4357	2024-06-18 22:42:45 +08:00
hiyouga	c96264bc47	fix #4335	2024-06-18 22:08:56 +08:00
Jonery	8f7c78b641	fix typo	2024-06-18 12:39:26 +08:00
Jonery	0f72aac8c9	Support distributed BAdam.	2024-06-18 12:27:47 +08:00
hiyouga	24c160df3d	lint	2024-06-17 22:35:56 +08:00
hiyouga	7857c0990b	update chat engine #4335	2024-06-17 19:07:17 +08:00
Jonery	ea1f3ba5e0	Merge remote-tracking branch 'upstream/main'	2024-06-17 18:44:51 +08:00
Jonery	33b4372778	adapt for badam with ds zero3	2024-06-17 18:18:10 +08:00
hiyouga	e2665e71c7	fix #4326	2024-06-17 18:17:48 +08:00
ancv	238f5c3d99	update packing with sdpa and eager attention mode	2024-06-16 02:25:47 +07:00
hoshi-hiyouga	29c1f31baa	Update parser.py	2024-06-16 02:57:00 +08:00
hiyouga	46093b5786	fix tol	2024-06-16 01:38:44 +08:00
hiyouga	8c1046d78a	support pissa	2024-06-16 01:08:12 +08:00
hiyouga	38b6b0f52e	tiny fix	2024-06-16 01:06:41 +08:00
ancv	04315c3d92	remove some unused params	2024-06-15 23:00:55 +07:00
hiyouga	80a9e6bf94	use fixture	2024-06-15 20:06:17 +08:00
hiyouga	1b834f50be	add tests	2024-06-15 19:51:20 +08:00
hiyouga	572d8bbfdd	add minicpm #4227	2024-06-15 17:58:52 +08:00
hiyouga	d87108daa6	add license	2024-06-15 17:54:33 +08:00
hiyouga	d519b4d76d	disable DP	2024-06-15 04:57:19 +08:00
hiyouga	9092f963db	fix #4292	2024-06-15 04:47:13 +08:00
hiyouga	78589cf90c	fix #4295	2024-06-15 04:34:55 +08:00
hiyouga	b27269bd2b	add test cases	2024-06-15 04:05:54 +08:00
hiyouga	c94e6c9411	add quant check in webui export tab	2024-06-13 03:19:18 +08:00
hiyouga	6baafd4eb3	fix #4221	2024-06-13 02:48:21 +08:00
hiyouga	cf9f2d6c42	fix #4209 DeepSpeed ZeRO3 has inflight param error when calling model.eval()	2024-06-13 02:25:50 +08:00
hiyouga	2ed8270112	clean code	2024-06-13 01:58:16 +08:00
hoshi-hiyouga	1f23f25226	Merge pull request #4246 from hzhaoy/adapt-vllm-v0.5.0 adapt vllm==0.5.0	2024-06-13 01:54:02 +08:00
hiyouga	713fde4259	fix lint	2024-06-13 00:48:44 +08:00
hzhaoy	8fb6366ebe	adapt vllm==0.5.0	2024-06-12 18:29:03 +08:00
hiyouga	577de2fa07	fix #4242	2024-06-12 16:50:11 +08:00
Arthur Kim	d65a3f7cb6	Support vllm==0.5.0	2024-06-12 16:49:12 +09:00
ancv	b2c367bc61	implement efficient packing without cross-contamination attention	2024-06-12 11:56:01 +07:00
hoshi-hiyouga	9049aab911	Merge pull request #4204 from dignfei/main fixbug：llama3在增量预训练时应该使用<\|end_of_text\|>标识文本的结束	2024-06-11 17:06:10 +08:00
hoshi-hiyouga	0c29233237	Update pretrain.py	2024-06-11 17:02:14 +08:00
hiyouga	cca6f35108	fix deepspeed version	2024-06-11 16:52:36 +08:00
d	6979f3f848	经过大量的增量预训练，进行对比试验，发现这个bug：llama3在预训练时使用的tokenizer.eos_toke是'<\|end_of_text\|>' ，这里在每条数据后面也得用这个，而不是'<\|eot_id\|>'，否则很容易导致严重的性能下降	2024-06-11 16:23:40 +08:00
hiyouga	89f2bd8c8c	fix #4198	2024-06-11 15:38:38 +08:00
hiyouga	90e14a960d	tiny fix	2024-06-11 12:48:53 +08:00
hiyouga	3f24337a8a	tiny fix	2024-06-11 01:04:16 +08:00
hiyouga	91e62a098f	set dev version	2024-06-11 00:50:53 +08:00
hiyouga	2b6ebd6b51	release v0.8.1	2024-06-11 00:44:26 +08:00
hiyouga	a793e8456b	fix #4160 The split heads should be concatenated in dim=2	2024-06-11 00:37:17 +08:00
hiyouga	0012762b04	update evaluator	2024-06-10 23:56:00 +08:00
hiyouga	c907d81667	fix #2666	2024-06-10 21:24:15 +08:00
mMrBun	950e360ca0	Optimize the handling of QWEN2 in scenarios involving multiple tool calls.	2024-06-10 02:00:14 +08:00
mMrBun	6ed0b0c800	Removed unnecessary comments.	2024-06-09 18:25:22 +08:00
mMrBun	0f2609ce19	Merge branch 'hiyouga:main' into main	2024-06-09 18:17:24 +08:00
mMrBun	cb1cbcb293	Implemented the tool_formatter and tool_extractor for glm4 tool_format	2024-06-09 18:16:15 +08:00
hiyouga	972ec9c668	fix llamafactory-cli env	2024-06-08 07:15:45 +08:00
hiyouga	3ac11e77cc	set dev version	2024-06-08 06:46:09 +08:00
hiyouga	5aa4ce4756	release v0.8.0	2024-06-08 05:20:54 +08:00
hiyouga	54cd743ebf	reorganize adapter code	2024-06-08 00:47:23 +08:00
hoshi-hiyouga	cfd62283a9	fix #4139	2024-06-08 00:45:02 +08:00
hiyouga	06e5d136a4	add resume args in webui	2024-06-08 00:22:16 +08:00
hiyouga	8bf9da659c	fix #4137	2024-06-07 19:16:06 +08:00
hiyouga	f8d8690bf4	tiny fix	2024-06-07 05:19:21 +08:00
hiyouga	4489d73ac7	fix ppo trainer save zero3 model accelerator.get_state_dict(ds_model) should be called at all ranks	2024-06-07 05:14:19 +08:00
hiyouga	2702d7e952	fix ppo in trl 0.8.6	2024-06-07 04:48:29 +08:00
hiyouga	f9e818d79c	fix #4120	2024-06-07 04:18:05 +08:00
hiyouga	ccc8b64cc2	update data processors	2024-06-07 04:15:40 +08:00
hoshi-hiyouga	181dbb0d05	Merge pull request #4009 from AlongWY/main supervised packing with greedy knapsack algorithm	2024-06-07 03:48:46 +08:00
hoshi-hiyouga	c09ad8bab3	Update supervised.py	2024-06-07 03:42:08 +08:00
hoshi-hiyouga	788e8232fc	Update supervised.py	2024-06-07 03:38:23 +08:00
hoshi-hiyouga	8cecade708	Update supervised.py	2024-06-07 03:38:04 +08:00
hiyouga	8e95648850	add qwen2 models	2024-06-07 00:22:57 +08:00
hiyouga	74f96efef9	rename files	2024-06-07 00:09:06 +08:00
hiyouga	45d8be8f93	add DISABLE_TORCHRUN option	2024-06-06 23:44:58 +08:00
hoshi-hiyouga	55c18c49b0	Merge pull request #4082 from MengqingCao/bugfix Fix #4077	2024-06-06 23:38:40 +08:00
hoshi-hiyouga	751dd77bc0	Update cli.py	2024-06-06 23:38:09 +08:00
hiyouga	76c61905b2	fix ppo+zero3 #3108	2024-06-06 23:30:07 +08:00
hiyouga	451b6693c0	fix torch gc	2024-06-06 20:30:25 +08:00
hiyouga	149610c636	fix ppo dataset bug #4012	2024-06-06 19:03:20 +08:00
hiyouga	fad2591e31	update trainers	2024-06-06 18:45:49 +08:00
hiyouga	67aa78cde0	fix base64 image read #4061	2024-06-06 17:29:19 +08:00
hiyouga	cae4737907	lora modules: all by default	2024-06-06 03:53:28 +08:00
hiyouga	c23cc63d3d	add codestral 22B	2024-06-06 03:42:50 +08:00
hiyouga	7daf8366db	lint	2024-06-06 03:33:44 +08:00
hoshi-hiyouga	f2580ad403	Merge pull request #4066 from injet-zhou/main add throughput entry to training log	2024-06-06 03:32:04 +08:00
hoshi-hiyouga	ca459f67eb	Merge pull request #4080 from MengqingCao/npu Add npu option for model exporting	2024-06-06 03:15:44 +08:00
hoshi-hiyouga	feaee36c46	Update export.py	2024-06-06 03:14:46 +08:00
hoshi-hiyouga	af2c3cbee4	Update model_args.py	2024-06-06 03:14:23 +08:00
hoshi-hiyouga	0e740aa463	Merge pull request #4053 from hzhaoy/feature/add_select_config_file Support selecting saved configuration files	2024-06-06 03:06:03 +08:00
hiyouga	8fcc79e1e6	add vllm_dtype arg #3387 #3717	2024-06-06 02:53:27 +08:00
hiyouga	a12a506c3d	support train from scratch #4033 #4075	2024-06-06 02:43:19 +08:00
hiyouga	946f601136	support image input in api #3971 #4061	2024-06-06 02:29:55 +08:00
hiyouga	dc4a00dd63	update train hparams	2024-06-06 01:49:20 +08:00
hiyouga	d4908d5708	add llamafactory-cli env	2024-06-06 01:28:14 +08:00
hiyouga	67fe822324	fix #4090	2024-06-06 00:50:32 +08:00
MengqingCao	2c03052662	modify export_device option	2024-06-05 09:37:36 +00:00
MengqingCao	90ed3cae92	fix #4077	2024-06-05 08:03:30 +00:00
hiyouga	f48f5e646e	support glm-4	2024-06-05 15:16:38 +08:00
MengqingCao	07045c876a	add npu for model export	2024-06-05 07:06:40 +00:00
faddddeout	b2f0459542	add throughput entry to log	2024-06-04 11:04:29 +00:00
hzhaoy	b27c4cfcb3	add: support selecting saved configuration files and loading training parameters	2024-06-04 10:33:43 +08:00
hiyouga	5a13b3baa6	tiny fix	2024-06-04 00:31:10 +08:00
hiyouga	91611d68c4	fix #3873	2024-06-04 00:21:50 +08:00
hiyouga	a18acf2abe	fix #3992	2024-06-04 00:17:36 +08:00
hiyouga	2187518762	fix abort in webui DDP mode	2024-06-04 00:10:24 +08:00
hoshi-hiyouga	ae18e1e251	Merge pull request #3987 from injet-zhou/main Fix cann't interrupt training when using multi GPUs in webui	2024-06-04 00:04:07 +08:00
hiyouga	79784ebeb6	fix #4043	2024-06-03 23:30:37 +08:00
hiyouga	f9a206509e	remove gc warnings in DPO&KTO	2024-06-03 22:53:54 +08:00
hoshi-hiyouga	24499f40dc	Update trainer.py	2024-06-03 22:08:38 +08:00
enji.zhou	34a2c5087a	fix KTO Trainer Sampler	2024-06-03 21:32:38 +08:00
hoshi-hiyouga	0f01500b68	Merge pull request #4006 from Uminosachi/scheduler-kwargs Set scheduler_specific_kwargs to get_scheduler	2024-06-03 19:27:53 +08:00
hiyouga	eed33862bc	fix #4005 #4013	2024-06-03 19:12:29 +08:00
hoshi-hiyouga	1539c72b94	Merge pull request #4007 from xu-song/patch-3 Update model_args.py	2024-06-03 18:54:37 +08:00
hiyouga	24e1c0e2ee	fix #4022	2024-06-03 18:38:36 +08:00
hiyouga	876bc92865	bump versions transformers 4.37.2->4.41.2 datasets 2.14.3->2.16.0 accelerate 0.27.2->0.30.1 peft 0.10.0->0.11.1 trl 0.8.1->0.8.6	2024-06-03 18:29:38 +08:00
hiyouga	49b1e88e3d	fix data loader hint	2024-06-03 18:28:27 +08:00
ylfeng	b47e317447	remove empty line	2024-05-31 21:43:08 +08:00
ylfeng	84aee57901	fix eos	2024-05-31 21:40:41 +08:00
ylfeng	f9db439cb7	supervised packing with greedy knapsack algorithm	2024-05-31 15:33:54 +08:00
Xu Song	dade2f083d	Update model_args.py	2024-05-31 14:35:48 +08:00
Uminosachi	14e97dc119	Set scheduler_specific_kwargs to get_scheduler	2024-05-31 13:45:39 +09:00
faddddeout	b13d03946e	fix cann't interrupt training when using multi GPUs in webui	2024-05-30 08:39:21 +00:00
hoshi-hiyouga	483eb47e5d	Merge pull request #3829 from seanzhang-zhichen/add_dataset_sample_num Add dataset sample num	2024-05-30 00:25:45 +08:00
hoshi-hiyouga	ca5dd7c6c1	Update loader.py	2024-05-30 00:20:20 +08:00
hoshi-hiyouga	f9a88b89ca	Update loader.py	2024-05-30 00:17:21 +08:00
hoshi-hiyouga	b55fb611c5	Update loader.py	2024-05-30 00:12:12 +08:00
hoshi-hiyouga	51dd454337	Update parser.py	2024-05-30 00:05:20 +08:00
hiyouga	8070871732	better llamaboard * easily resume from checkpoint * support full and freeze checkpoints * faster ui	2024-05-29 23:55:38 +08:00
hiyouga	d0aa36b8ad	fix cohere system	2024-05-29 20:58:23 +08:00
hiyouga	0930f58699	fix #3965	2024-05-29 20:55:51 +08:00
hiyouga	89ca832740	update readme	2024-05-29 18:39:11 +08:00
hzhaoy	0dd632fe9e	add TeleChat-12B/TeleChat-12B-v2 models	2024-05-29 15:00:37 +08:00
hiyouga	97346c1d3d	fix hf chat engine	2024-05-29 01:20:07 +08:00
hiyouga	e4b420c146	add ds config to webui	2024-05-29 01:13:17 +08:00
hiyouga	65cd8bdbdb	10x generate in ppo w/ zero3 https://github.com/huggingface/trl/pull/1483	2024-05-29 00:23:23 +08:00
hiyouga	7c8e01bb74	update dpo, kto trainer	2024-05-29 00:14:29 +08:00
hiyouga	900e1ea622	clean kto trainer	2024-05-28 21:43:26 +08:00
hiyouga	1e80a3a638	bump vllm version to 0.4.1	2024-05-28 21:27:27 +08:00
hiyouga	087b9faa39	update readme	2024-05-28 19:35:52 +08:00
hiyouga	7c016b22aa	support DDP in webui	2024-05-28 19:24:22 +08:00
Yimi81	dc07413e7d	fix yi template	2024-05-27 13:11:25 +00:00
hiyouga	c1fdf81df6	tiny fix	2024-05-27 20:54:26 +08:00
hoshi-hiyouga	87ea0a8bcd	Merge pull request #3921 from gusye1234/main Add openchat-3.6-8B support	2024-05-27 20:52:37 +08:00
hoshi-hiyouga	f1002b9f93	Update template.py	2024-05-27 20:51:56 +08:00
hoshi-hiyouga	122213a7a7	Update template.py	2024-05-27 20:51:26 +08:00
Jianbai Ye	cff815391f	add openchat-3.6-8B support	2024-05-27 20:42:08 +08:00
hiyouga	08564838bd	fix full/freeze tuning for mllm	2024-05-27 20:37:57 +08:00
hoshi-hiyouga	838f2fb3e4	Merge pull request #3835 from BUAADreamer/main fix some features in llava-style training	2024-05-27 20:23:45 +08:00
hiyouga	e626e26446	support Aya23	2024-05-27 20:23:24 +08:00
BUAADreamer	ea2afd429e	Merge branch 'hiyouga:main' into main	2024-05-27 19:00:48 +08:00
BUAADreamer	57eb13b75d	add regex of only tune lm and mm_proj	2024-05-27 18:59:00 +08:00
hiyouga	efa4b196ca	add phi-3 7b/14b, mistral v0.3 models	2024-05-27 18:20:16 +08:00
hiyouga	5581cb2e4e	update readme	2024-05-27 18:14:02 +08:00
BUAADreamer	4bc7c10c00	Merge branch 'hiyouga:main' into main	2024-05-27 11:54:01 +08:00
hiyouga	cb63b32986	support SimPO #3900	2024-05-26 23:46:33 +08:00
BUAADreamer	60170a1da4	Merge branch 'hiyouga:main' into main	2024-05-25 14:18:49 +08:00
hiyouga	063f91cc80	fix #3853	2024-05-24 23:29:45 +08:00
seanzhang-zhichen	27cb51f7f8	Merge branch 'main' into add_dataset_sample_num	2024-05-24 15:57:47 +08:00
BUAADreamer	047a06a1e5	Merge branch 'hiyouga:main' into main	2024-05-24 09:50:00 +08:00
hiyouga	3a023bca2a	refactor data preprocessing, fix mllm rlhf	2024-05-24 04:08:25 +08:00
hiyouga	de0e67aff1	fix paligemma sft requires transformers>=4.41.1	2024-05-24 00:23:40 +08:00
hiyouga	67ebc7b388	fix oom issues in export	2024-05-23 23:32:45 +08:00
BUAADreamer	8d53ec2b5f	Merge branch 'hiyouga:main' into main	2024-05-21 22:18:20 +08:00
hiyouga	7134fb02bb	fix paligemma sft	2024-05-21 20:03:09 +08:00
hiyouga	335501e228	fix #3847	2024-05-21 17:53:06 +08:00
BUAADreamer	29a6d5bdb8	support pretraining of llava	2024-05-21 08:57:14 +08:00
hiyouga	2a67457e39	support paligemma	2024-05-21 00:01:22 +08:00
hiyouga	e55c85ac72	fix paligemma data preprocess	2024-05-20 23:51:32 +08:00
hiyouga	542229abb3	fix paligemma inference	2024-05-20 23:36:43 +08:00
hiyouga	9b0f4d7602	add kto to webui	2024-05-20 21:20:25 +08:00
zhangzc	d956041640	fix conflict	2024-05-20 17:10:01 +08:00
hiyouga	d52fae2fa8	fix chat engines do not use pop(key, default) since api assigns None to dict values	2024-05-20 00:36:43 +08:00
hoshi-hiyouga	aa0bca49e9	Merge pull request #3812 from ycjcl868/feat/chat-support-system-prompt feat: cli chat support system_message	2024-05-20 00:31:32 +08:00
hoshi-hiyouga	a0e8d3d159	Update vllm_engine.py	2024-05-20 00:31:04 +08:00
hoshi-hiyouga	a943a1034b	Update hf_engine.py	2024-05-20 00:30:45 +08:00
hoshi-hiyouga	a1fa7aa63b	Update generating_args.py	2024-05-20 00:29:31 +08:00
hoshi-hiyouga	896c656185	Update chat_model.py	2024-05-20 00:29:12 +08:00
hiyouga	10573e1639	fix jinja template	2024-05-19 23:38:30 +08:00
ycjcl868	a08ba254c8	feat: cli chat support system_message	2024-05-19 23:17:46 +08:00
hiyouga	31a0564d4f	fix zero2 high ram usage	2024-05-19 21:53:54 +08:00
hiyouga	70214b71b1	fix hf gen args	2024-05-19 19:39:32 +08:00
hiyouga	8ee8ac6eba	fix envs	2024-05-19 18:27:18 +08:00
hiyouga	1ebc890a5f	fix #3807	2024-05-19 17:07:57 +08:00
hiyouga	3c2a992caa	safe output path in webui	2024-05-18 22:42:28 +08:00
hiyouga	d43822fcc2	fix jetmoe z3 block	2024-05-18 22:28:45 +08:00
hiyouga	a851056229	improve data process logger	2024-05-18 22:02:42 +08:00
hiyouga	0edc16769f	fix #3803	2024-05-18 16:13:14 +08:00
hiyouga	c450ee87a3	improve KTO impl., replace datasets	2024-05-18 03:44:56 +08:00
hoshi-hiyouga	33a354548e	Merge pull request #3785 from enji-zhou/feature/add_kto add kto	2024-05-18 03:07:18 +08:00
hoshi-hiyouga	9646727453	Update model_args.py	2024-05-17 16:16:41 +08:00
juejuezi	b20d62ba3c	feat: pass the `max_lora_rank` parameter to vLLM backend	2024-05-17 16:07:39 +08:00
hiyouga	8af9817605	add deepseek v2 lite model	2024-05-17 13:25:36 +08:00
enji.zhou	db1d5a4f51	add kto	2024-05-17 13:09:17 +08:00
hiyouga	d9f190ff1e	better dtype handle in loading	2024-05-17 02:14:56 +08:00
hiyouga	694a05fd04	enable inbrowser in webui	2024-05-17 00:08:56 +08:00
hiyouga	d77bed4091	add falcon 11b	2024-05-17 00:08:33 +08:00
hiyouga	308edbc426	rename package	2024-05-16 18:39:08 +08:00
hiyouga	b2fc7aeb03	set dev version	2024-05-16 02:17:31 +08:00
hiyouga	1c910079d8	release v0.7.1	2024-05-16 00:57:16 +08:00
hiyouga	2a67ab3925	fix #3694	2024-05-16 00:35:28 +08:00
hiyouga	44cfa9a1cd	fix #3606 https://github.com/huggingface/peft/pull/1706	2024-05-15 23:05:02 +08:00
hiyouga	a388cadfc0	add Yi-VL-34B model	2024-05-15 22:58:19 +08:00
hiyouga	73845fcc46	add yi-vl 6b model	2024-05-15 20:02:41 +08:00

... 2 3 4 5 6 ...

1370 Commits