LLaMA-Factory-310P3

Commit Graph

Author	SHA1	Message	Date
hiyouga	44747cebd2	tiny fix	2024-07-04 03:02:23 +08:00
hiyouga	b5d101e1bf	fix data map for packing	2024-07-04 03:01:31 +08:00
hiyouga	b03e4a74ba	update wechat	2024-07-04 01:55:05 +08:00
hiyouga	6fd6aa4530	fix packing for eager/sdpa attn	2024-07-04 01:52:43 +08:00
hoshi-hiyouga	87d9b2d005	Merge pull request #4224 from chuan298/main Implement efficient packing without cross-contamination attention	2024-07-04 01:18:54 +08:00
hiyouga	cce7083024	update packing	2024-07-04 01:10:55 +08:00
hoshi-hiyouga	a36e8f2dd5	Update packing.py	2024-07-03 23:36:01 +08:00
hiyouga	c346f79f99	update func name	2024-07-03 23:29:33 +08:00
hiyouga	8a6a7b9c8a	update arg name	2024-07-03 23:23:24 +08:00
hiyouga	575a02a23d	update hparams	2024-07-03 23:18:58 +08:00
hiyouga	7f770f6895	update ui	2024-07-03 23:13:49 +08:00
hiyouga	a4a1ddbcb9	test	2024-07-03 23:05:39 +08:00
hiyouga	1e0c860c8c	update scripts	2024-07-03 20:07:44 +08:00
hiyouga	8845e94f91	fix #4609 unwrap_model_for_generation(reward_model) is necessary for zero3 training	2024-07-03 19:45:51 +08:00
hiyouga	87346c0946	update readme	2024-07-03 19:39:05 +08:00
hoshi-hiyouga	3449c3531f	Merge pull request #4662 from wzh1994/wzh/readme Add `LazyLLM` to `Projects using LLaMA Factory` in `README.md`	2024-07-03 15:51:02 +08:00
wangzhihong	6f8f53f879	Update README_zh.md	2024-07-03 14:59:09 +08:00
wangzhihong	22da47ba27	add LazyLLM to `Projects using LLaMA Factory` in `README.md`	2024-07-03 11:12:20 +08:00
hiyouga	8b1172b910	tiny fix	2024-07-03 02:31:50 +08:00
hiyouga	71cdf8956e	tiny fix	2024-07-02 23:06:13 +08:00
hiyouga	821bb6660e	remove rlhf support for chatglm2&3	2024-07-02 23:03:17 +08:00
hiyouga	c13ae2df19	upcast logits	2024-07-02 22:32:05 +08:00
hiyouga	c47ab6c072	improve rlhf	2024-07-02 22:23:08 +08:00
ancv	e8e13b0942	move efficient_packing from data_args to model_args	2024-07-02 18:37:55 +07:00
hiyouga	9dcff3a5b5	Update bug-report.yml	2024-07-02 19:18:56 +08:00
hiyouga	c81687963a	Update bug-report.yml	2024-07-02 19:16:12 +08:00
hoshi-hiyouga	4e4b3cc905	Merge pull request #4651 from hzhaoy/add-telechat-1b Add TeleChat-1B	2024-07-02 17:56:43 +08:00
hzhaoy	57b7c00430	add TeleChat-1B	2024-07-02 17:49:04 +08:00
hiyouga	4c296001c4	fix ppo callbacks	2024-07-02 17:34:56 +08:00
hoshi-hiyouga	e8e6af2651	Merge branch 'main' into main	2024-07-01 21:01:09 +08:00
hiyouga	33f2ddb8b6	Update wechat_npu.jpg	2024-07-01 16:28:54 +08:00
hiyouga	73280b7dc7	tiny fix	2024-07-01 05:43:17 +08:00
hiyouga	8c41a0aa6d	tiny fix	2024-07-01 03:55:20 +08:00
hiyouga	1856a08e87	add eval acc	2024-07-01 03:51:20 +08:00
hiyouga	fc2c15d713	Update label_issue.yml	2024-07-01 01:29:09 +08:00
hiyouga	1771251ce3	fix #4402 #4617 Deprecate reserved_label_len arg	2024-07-01 01:19:27 +08:00
hiyouga	d4e2af1fa4	update readme	2024-07-01 00:22:52 +08:00
hiyouga	d74244d568	fix #4398 #4592	2024-06-30 21:28:51 +08:00
hiyouga	93e6fbb37d	update npu docker	2024-06-30 21:05:31 +08:00
hiyouga	2f4b89ace1	loose gemma2 attention	2024-06-29 01:42:14 +08:00
hiyouga	0e0d69b77c	update readme	2024-06-28 06:55:19 +08:00
hiyouga	4d35e218b1	bf16 by default, gemma2 attns Gemma2 finetuning cannot work until merging https://github.com/huggingface/transformers/pull/31674	2024-06-28 06:00:26 +08:00
hiyouga	64f4337dac	increase pissa_iter for stability	2024-06-28 03:18:54 +08:00
hiyouga	e3141f5f1b	fix docker flashattn	2024-06-28 01:28:59 +08:00
hiyouga	6f63050e1b	add Gemma2 models	2024-06-28 01:26:50 +08:00
hiyouga	2f78b5d62a	update examples	2024-06-28 01:17:07 +08:00
hiyouga	8baf3b22b0	refactor pissa, improve llamaboard	2024-06-28 01:04:24 +08:00
hoshi-hiyouga	ef38daa0a4	Merge pull request #4580 from hzhaoy/bugfix-deepspeed-pissa Fix bug when using pissa method with deepspeed	2024-06-28 00:46:51 +08:00
hiyouga	8ed6b367e2	fix #4549	2024-06-28 00:41:58 +08:00
hiyouga	0f421055da	fix docker file	2024-06-27 20:29:16 +08:00

1979 Commits