LLaMA-Factory-310P3

Commit Graph

Author	SHA1	Message	Date
hiyouga	e43809bced	fix #4683	2024-07-05 00:58:05 +08:00
hiyouga	ed232311e8	fix #4674	2024-07-05 00:41:03 +08:00
hiyouga	226a9e563f	Merge branch 'main' of https://github.com/hiyouga/LLaMA-Factory	2024-07-04 14:23:37 +08:00
hiyouga	1e27e8c776	fix #4677	2024-07-04 14:22:07 +08:00
hzhaoy	738df47748	tiny fix	2024-07-04 10:20:28 +08:00
hiyouga	0c699de39d	tiny fix	2024-07-04 03:47:05 +08:00
hiyouga	44747cebd2	tiny fix	2024-07-04 03:02:23 +08:00
hiyouga	b5d101e1bf	fix data map for packing	2024-07-04 03:01:31 +08:00
hiyouga	6fd6aa4530	fix packing for eager/sdpa attn	2024-07-04 01:52:43 +08:00
hoshi-hiyouga	87d9b2d005	Merge pull request #4224 from chuan298/main Implement efficient packing without cross-contamination attention	2024-07-04 01:18:54 +08:00
hiyouga	cce7083024	update packing	2024-07-04 01:10:55 +08:00
hoshi-hiyouga	a36e8f2dd5	Update packing.py	2024-07-03 23:36:01 +08:00
hiyouga	c346f79f99	update func name	2024-07-03 23:29:33 +08:00
hiyouga	8a6a7b9c8a	update arg name	2024-07-03 23:23:24 +08:00
hiyouga	575a02a23d	update hparams	2024-07-03 23:18:58 +08:00
hiyouga	7f770f6895	update ui	2024-07-03 23:13:49 +08:00
hiyouga	8845e94f91	fix #4609 unwrap_model_for_generation(reward_model) is necessary for zero3 training	2024-07-03 19:45:51 +08:00
hiyouga	8b1172b910	tiny fix	2024-07-03 02:31:50 +08:00
hiyouga	71cdf8956e	tiny fix	2024-07-02 23:06:13 +08:00
hiyouga	821bb6660e	remove rlhf support for chatglm2&3	2024-07-02 23:03:17 +08:00
hiyouga	c13ae2df19	upcast logits	2024-07-02 22:32:05 +08:00
hiyouga	c47ab6c072	improve rlhf	2024-07-02 22:23:08 +08:00
ancv	e8e13b0942	move efficient_packing from data_args to model_args	2024-07-02 18:37:55 +07:00
hoshi-hiyouga	4e4b3cc905	Merge pull request #4651 from hzhaoy/add-telechat-1b Add TeleChat-1B	2024-07-02 17:56:43 +08:00
hzhaoy	57b7c00430	add TeleChat-1B	2024-07-02 17:49:04 +08:00
hiyouga	4c296001c4	fix ppo callbacks	2024-07-02 17:34:56 +08:00
hoshi-hiyouga	e8e6af2651	Merge branch 'main' into main	2024-07-01 21:01:09 +08:00
hiyouga	73280b7dc7	tiny fix	2024-07-01 05:43:17 +08:00
hiyouga	8c41a0aa6d	tiny fix	2024-07-01 03:55:20 +08:00
hiyouga	1856a08e87	add eval acc	2024-07-01 03:51:20 +08:00
hiyouga	1771251ce3	fix #4402 #4617 Deprecate reserved_label_len arg	2024-07-01 01:19:27 +08:00
hiyouga	d74244d568	fix #4398 #4592	2024-06-30 21:28:51 +08:00
hiyouga	2f4b89ace1	loose gemma2 attention	2024-06-29 01:42:14 +08:00
hiyouga	4d35e218b1	bf16 by default, gemma2 attns Gemma2 finetuning cannot work until merging https://github.com/huggingface/transformers/pull/31674	2024-06-28 06:00:26 +08:00
hiyouga	64f4337dac	increase pissa_iter for stability	2024-06-28 03:18:54 +08:00
hiyouga	6f63050e1b	add Gemma2 models	2024-06-28 01:26:50 +08:00
hiyouga	8baf3b22b0	refactor pissa, improve llamaboard	2024-06-28 01:04:24 +08:00
hoshi-hiyouga	ef38daa0a4	Merge pull request #4580 from hzhaoy/bugfix-deepspeed-pissa Fix bug when using pissa method with deepspeed	2024-06-28 00:46:51 +08:00
hiyouga	8ed6b367e2	fix #4549	2024-06-28 00:41:58 +08:00
hiyouga	e44a4f07f0	tiny fix	2024-06-27 20:14:48 +08:00
faddddeout	f6b62f0070	Exit the process with the subprocess's return code when utilizing the CLI	2024-06-27 09:58:00 +00:00
hzhaoy	677c86594e	fix #4579	2024-06-27 13:49:57 +08:00
hiyouga	96a5044394	add quant checks	2024-06-27 01:12:25 +08:00
hiyouga	f17c9dfd84	tiny fix	2024-06-27 00:46:41 +08:00
hiyouga	29c710da3a	tiny fix	2024-06-27 00:36:04 +08:00
hiyouga	ad144c2265	support HQQ/EETQ #4113	2024-06-27 00:29:42 +08:00
hiyouga	addca926de	improve autogptq integration	2024-06-26 22:11:44 +08:00
hiyouga	8d6cd69ac4	fix #4458	2024-06-26 19:52:35 +08:00
hiyouga	59e0b4f616	fix #4556	2024-06-26 19:43:16 +08:00
hiyouga	555ca8d780	lint	2024-06-25 02:55:50 +08:00
hiyouga	1e9d0aa1e4	fix #4432	2024-06-25 02:34:04 +08:00
hiyouga	cc016461e6	fix #4379	2024-06-25 02:31:44 +08:00
hiyouga	095fab58d3	tiny fix about badam	2024-06-25 01:54:53 +08:00
hoshi-hiyouga	d0f953bf5b	Merge pull request #4352 from Ledzy/main [Enhancement] Support ZeRO-3 when using BAdam	2024-06-25 01:49:13 +08:00
hiyouga	41086059b1	tiny fix	2024-06-25 01:15:19 +08:00
hoshi-hiyouga	3bed18c644	Merge pull request #4409 from kno10/patch-2 Print help if no arguments given	2024-06-24 23:21:31 +08:00
hoshi-hiyouga	acb61f7ab7	Update cli.py	2024-06-24 23:21:10 +08:00
hoshi-hiyouga	def6d280db	Merge pull request #4417 from mMrBun/main Add tool_format parameter to rewrite templates for different function call formats.	2024-06-24 23:17:55 +08:00
hoshi-hiyouga	1240bd57d8	Update template.py	2024-06-24 23:12:59 +08:00
hoshi-hiyouga	dddfd516ee	Update loader.py	2024-06-24 23:06:18 +08:00
hiyouga	fca893d73c	fix #4410	2024-06-24 22:34:31 +08:00
hoshi-hiyouga	cc452c32c7	Merge pull request #4446 from stceum/bug-fix Bug Fix: `off` is parsed as `False` in yaml file	2024-06-24 21:41:28 +08:00
hoshi-hiyouga	e90c424f55	Update parser.py	2024-06-24 21:37:42 +08:00
stceum	3ed063f281	Bug Fix: `off` is parsed as `False` in yaml file, changed to `disabled` to avoid this.	2024-06-24 20:39:31 +08:00
hiyouga	e507e60638	update readme	2024-06-24 18:22:12 +08:00
mMrBun	20e2e6fdcb	Add tool_format to overwrite tool formatter template	2024-06-22 02:13:23 +08:00
hiyouga	db9a1912e3	remove dup template	2024-06-22 01:31:32 +08:00
hiyouga	3ce44dda99	fix api	2024-06-22 00:00:38 +08:00
Erich Schubert	7d70ba7fb8	Print help if no arguments given	2024-06-21 09:14:21 +02:00
ancv	770f75dc83	move configure_packing to llamafactory.model.patcher and fix constants	2024-06-21 00:45:06 +07:00
hiyouga	8d4f5093cf	tiny fix	2024-06-20 22:56:05 +08:00
hiyouga	f22d8f9ca4	improve llamaboard	2024-06-19 23:46:03 +08:00
hiyouga	3f84411b5d	fix llamaboard abort	2024-06-19 23:22:28 +08:00
hiyouga	3b040e8e0f	update patcher	2024-06-19 21:27:00 +08:00
hiyouga	42e69a3c63	set dev version	2024-06-19 21:08:16 +08:00
hiyouga	71327ba85a	release v0.8.2	2024-06-19 20:42:09 +08:00
hiyouga	2b596fb55f	fix jinja template	2024-06-19 20:03:50 +08:00
hiyouga	4cff6a4ad5	fix templates	2024-06-19 17:44:05 +08:00
Jonery	5c2ff1b749	Cleaner integration.	2024-06-19 12:29:40 +08:00
hiyouga	6d2bf216ac	fix bug	2024-06-19 03:49:23 +08:00
hiyouga	4f22eae8f4	use prefix to replace force system	2024-06-19 03:39:52 +08:00
hiyouga	cd75b1fe9d	fix tool formatter, allow parallel function #4362	2024-06-19 03:23:51 +08:00
hoshi-hiyouga	c0ca42566c	Merge pull request #4173 from mMrBun/main Implemented the tool_formatter and tool_extractor for glm4 and Qwen2 tool_format	2024-06-19 03:18:55 +08:00
hiyouga	a233fbc258	add deepseek coder v2 #4346	2024-06-18 22:53:54 +08:00
hiyouga	4bd77d8563	fix #4357	2024-06-18 22:42:45 +08:00
hiyouga	c96264bc47	fix #4335	2024-06-18 22:08:56 +08:00
Jonery	8f7c78b641	fix typo	2024-06-18 12:39:26 +08:00
Jonery	0f72aac8c9	Support distributed BAdam.	2024-06-18 12:27:47 +08:00
hiyouga	24c160df3d	lint	2024-06-17 22:35:56 +08:00
hiyouga	7857c0990b	update chat engine #4335	2024-06-17 19:07:17 +08:00
Jonery	ea1f3ba5e0	Merge remote-tracking branch 'upstream/main'	2024-06-17 18:44:51 +08:00
Jonery	33b4372778	adapt for badam with ds zero3	2024-06-17 18:18:10 +08:00
hiyouga	e2665e71c7	fix #4326	2024-06-17 18:17:48 +08:00
ancv	238f5c3d99	update packing with sdpa and eager attention mode	2024-06-16 02:25:47 +07:00
hoshi-hiyouga	29c1f31baa	Update parser.py	2024-06-16 02:57:00 +08:00
hiyouga	46093b5786	fix tol	2024-06-16 01:38:44 +08:00
hiyouga	8c1046d78a	support pissa	2024-06-16 01:08:12 +08:00
hiyouga	38b6b0f52e	tiny fix	2024-06-16 01:06:41 +08:00
ancv	04315c3d92	remove some unused params	2024-06-15 23:00:55 +07:00
hiyouga	80a9e6bf94	use fixture	2024-06-15 20:06:17 +08:00
hiyouga	1b834f50be	add tests	2024-06-15 19:51:20 +08:00
hiyouga	572d8bbfdd	add minicpm #4227	2024-06-15 17:58:52 +08:00
hiyouga	d87108daa6	add license	2024-06-15 17:54:33 +08:00
hiyouga	d519b4d76d	disable DP	2024-06-15 04:57:19 +08:00
hiyouga	9092f963db	fix #4292	2024-06-15 04:47:13 +08:00
hiyouga	78589cf90c	fix #4295	2024-06-15 04:34:55 +08:00
hiyouga	b27269bd2b	add test cases	2024-06-15 04:05:54 +08:00
hiyouga	c94e6c9411	add quant check in webui export tab	2024-06-13 03:19:18 +08:00
hiyouga	6baafd4eb3	fix #4221	2024-06-13 02:48:21 +08:00
hiyouga	cf9f2d6c42	fix #4209 DeepSpeed ZeRO3 has inflight param error when calling model.eval()	2024-06-13 02:25:50 +08:00
hiyouga	2ed8270112	clean code	2024-06-13 01:58:16 +08:00
hoshi-hiyouga	1f23f25226	Merge pull request #4246 from hzhaoy/adapt-vllm-v0.5.0 adapt vllm==0.5.0	2024-06-13 01:54:02 +08:00
hiyouga	713fde4259	fix lint	2024-06-13 00:48:44 +08:00
hzhaoy	8fb6366ebe	adapt vllm==0.5.0	2024-06-12 18:29:03 +08:00
hiyouga	577de2fa07	fix #4242	2024-06-12 16:50:11 +08:00
Arthur Kim	d65a3f7cb6	Support vllm==0.5.0	2024-06-12 16:49:12 +09:00
ancv	b2c367bc61	implement efficient packing without cross-contamination attention	2024-06-12 11:56:01 +07:00
hoshi-hiyouga	9049aab911	Merge pull request #4204 from dignfei/main fixbug：llama3在增量预训练时应该使用<\|end_of_text\|>标识文本的结束	2024-06-11 17:06:10 +08:00
hoshi-hiyouga	0c29233237	Update pretrain.py	2024-06-11 17:02:14 +08:00
hiyouga	cca6f35108	fix deepspeed version	2024-06-11 16:52:36 +08:00
d	6979f3f848	经过大量的增量预训练，进行对比试验，发现这个bug：llama3在预训练时使用的tokenizer.eos_toke是'<\|end_of_text\|>' ，这里在每条数据后面也得用这个，而不是'<\|eot_id\|>'，否则很容易导致严重的性能下降	2024-06-11 16:23:40 +08:00
hiyouga	89f2bd8c8c	fix #4198	2024-06-11 15:38:38 +08:00
hiyouga	90e14a960d	tiny fix	2024-06-11 12:48:53 +08:00
hiyouga	3f24337a8a	tiny fix	2024-06-11 01:04:16 +08:00
hiyouga	91e62a098f	set dev version	2024-06-11 00:50:53 +08:00
hiyouga	2b6ebd6b51	release v0.8.1	2024-06-11 00:44:26 +08:00
hiyouga	a793e8456b	fix #4160 The split heads should be concatenated in dim=2	2024-06-11 00:37:17 +08:00
hiyouga	0012762b04	update evaluator	2024-06-10 23:56:00 +08:00
hiyouga	c907d81667	fix #2666	2024-06-10 21:24:15 +08:00
mMrBun	950e360ca0	Optimize the handling of QWEN2 in scenarios involving multiple tool calls.	2024-06-10 02:00:14 +08:00
mMrBun	6ed0b0c800	Removed unnecessary comments.	2024-06-09 18:25:22 +08:00
mMrBun	0f2609ce19	Merge branch 'hiyouga:main' into main	2024-06-09 18:17:24 +08:00
mMrBun	cb1cbcb293	Implemented the tool_formatter and tool_extractor for glm4 tool_format	2024-06-09 18:16:15 +08:00
hiyouga	972ec9c668	fix llamafactory-cli env	2024-06-08 07:15:45 +08:00
hiyouga	3ac11e77cc	set dev version	2024-06-08 06:46:09 +08:00
hiyouga	5aa4ce4756	release v0.8.0	2024-06-08 05:20:54 +08:00
hiyouga	54cd743ebf	reorganize adapter code	2024-06-08 00:47:23 +08:00
hoshi-hiyouga	cfd62283a9	fix #4139	2024-06-08 00:45:02 +08:00
hiyouga	06e5d136a4	add resume args in webui	2024-06-08 00:22:16 +08:00
hiyouga	8bf9da659c	fix #4137	2024-06-07 19:16:06 +08:00
hiyouga	f8d8690bf4	tiny fix	2024-06-07 05:19:21 +08:00
hiyouga	4489d73ac7	fix ppo trainer save zero3 model accelerator.get_state_dict(ds_model) should be called at all ranks	2024-06-07 05:14:19 +08:00
hiyouga	2702d7e952	fix ppo in trl 0.8.6	2024-06-07 04:48:29 +08:00
hiyouga	f9e818d79c	fix #4120	2024-06-07 04:18:05 +08:00
hiyouga	ccc8b64cc2	update data processors	2024-06-07 04:15:40 +08:00
hoshi-hiyouga	181dbb0d05	Merge pull request #4009 from AlongWY/main supervised packing with greedy knapsack algorithm	2024-06-07 03:48:46 +08:00
hoshi-hiyouga	c09ad8bab3	Update supervised.py	2024-06-07 03:42:08 +08:00
hoshi-hiyouga	788e8232fc	Update supervised.py	2024-06-07 03:38:23 +08:00
hoshi-hiyouga	8cecade708	Update supervised.py	2024-06-07 03:38:04 +08:00
hiyouga	8e95648850	add qwen2 models	2024-06-07 00:22:57 +08:00

1 2 3 4 5 ...

1401 Commits