InfiniTensor/examples/distributed
Latest commit: feccd4f318 by constroy Li
fix tensor parallel for llama (#159)
* fix Slice

* change the default number of timeit rounds to 10 to reduce measurement time

* fix Slice with large ends

* Reshape supports Int64

* support position_ids as input

* skip last MatMul in Llama

* skip infer_shapes to parse large models

* update launch.py

* fix split_concat_kernel

* print more messages in launch.py

* Reshape supports both Int32 and Int64

* try infer_shapes and warn on failure (see the sketch after this commit message)

* fix format

---------

Co-authored-by: whjthu <haojie0429@gmail.com>
2023-10-30 15:04:16 +08:00
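
The "try infer_shapes and warn on failure" item above makes shape inference best-effort so that very large models still parse. Below is a minimal sketch of that pattern, assuming the standard ONNX Python API; `load_onnx_model` is a hypothetical helper name, not necessarily what parallel_opt.py uses:

```python
import warnings

import onnx
from onnx import shape_inference


def load_onnx_model(path: str) -> onnx.ModelProto:
    """Load an ONNX model; treat shape inference as best-effort."""
    model = onnx.load(path)
    try:
        # infer_shapes can fail (or hit protobuf size limits) on very
        # large models such as Llama, so failure is non-fatal here.
        model = shape_inference.infer_shapes(model)
    except Exception as exc:
        warnings.warn(
            f"infer_shapes failed; continuing without inferred shapes: {exc}"
        )
    return model
```

Downgrading the failure to a warning matches the earlier "skip infer_shapes to parse large models" change while still inferring shapes whenever possible.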
File               Last commit                                Date
launch.py          fix tensor parallel for llama (#159)       2023-10-30 15:04:16 +08:00
launch_kvcache.py  Support kvcache (#134)                     2023-09-18 14:17:02 +08:00
parallel.py        impl distributed launch with NCCL (#106)   2023-09-05 09:47:35 +08:00
parallel_opt.py    fix tensor parallel for llama (#159)       2023-10-30 15:04:16 +08:00
placement.py       tensor parallel for transformer (#125)     2023-09-14 14:19:45 +08:00