InfiniTensor/src/operators
Latest commit: Accelerate llama (#219)
Author: xiaonans (a98573990b) · Co-authored-by: Haojie Wang <haojie0429@gmail.com>
Date: 2024-04-01 08:46:05 +08:00

* [feature] add cudagraph support
* modify code to pass the cuda_all_reduce test
* modify rope op
* support rmsnorm
* add fp16 support to silu cuda op
* fix bugs in rmsnorm op
* uncomment simplify in onnx.py
| File | Last commit | Date |
|---|---|---|
| G2BMM.cc | support Dynamic tensor infer shape and fix memory pool (#176) | 2023-11-23 13:11:50 +08:00 |
| GBMM.cc | support Dynamic tensor infer shape and fix memory pool (#176) | 2023-11-23 13:11:50 +08:00 |
| activation_backward.cc | support Dynamic tensor infer shape and fix memory pool (#176) | 2023-11-23 13:11:50 +08:00 |
| all_gather.cc | support Dynamic tensor infer shape and fix memory pool (#176) | 2023-11-23 13:11:50 +08:00 |
| all_reduce.cc | impl distributed launch with NCCL (#106) | 2023-09-05 09:47:35 +08:00 |
| attention_kvcache.cc | use workspace to optimize kvcache attention | 2024-01-25 10:33:01 +08:00 |
| batch_norm.cc | support Dynamic tensor infer shape and fix memory pool (#176) | 2023-11-23 13:11:50 +08:00 |
| broadcast.cc | impl distributed launch with NCCL (#106) | 2023-09-05 09:47:35 +08:00 |
| concat.cc | support Dynamic tensor infer shape and fix memory pool (#176) | 2023-11-23 13:11:50 +08:00 |
| conv.cc | support Dynamic tensor infer shape and fix memory pool (#176) | 2023-11-23 13:11:50 +08:00 |
| det.cc | support Dynamic tensor infer shape and fix memory pool (#176) | 2023-11-23 13:11:50 +08:00 |
| dropout.cc | support Dynamic tensor infer shape and fix memory pool (#176) | 2023-11-23 13:11:50 +08:00 |
| element_wise.cc | support Dynamic tensor infer shape and fix memory pool (#176) | 2023-11-23 13:11:50 +08:00 |
| expand.cc | support Dynamic tensor infer shape and fix memory pool (#176) | 2023-11-23 13:11:50 +08:00 |
| extend.cc | support Dynamic tensor infer shape and fix memory pool (#176) | 2023-11-23 13:11:50 +08:00 |
| gather.cc | support Dynamic tensor infer shape and fix memory pool (#176) | 2023-11-23 13:11:50 +08:00 |
| gather_elements.cc | support Dynamic tensor infer shape and fix memory pool (#176) | 2023-11-23 13:11:50 +08:00 |
| layer_norm.cc | Modify kernel registration & support fp16 (#205) | 2024-01-15 11:02:13 +08:00 |
| lrn.cc | Fix bang (#198) | 2023-12-28 13:44:10 +08:00 |
| matmul.cc | feature: add parameter to config matmul compute type (#218) | 2024-03-26 09:00:45 +08:00 |
| membound.cc | support Dynamic tensor infer shape and fix memory pool (#176) | 2023-11-23 13:11:50 +08:00 |
| pad.cc | support Dynamic tensor infer shape and fix memory pool (#176) | 2023-11-23 13:11:50 +08:00 |
| pooling.cc | support Dynamic tensor infer shape and fix memory pool (#176) | 2023-11-23 13:11:50 +08:00 |
| recv.cc | Add send and recv operators based on NCCL (#182) | 2023-12-14 16:38:03 +08:00 |
| reduce.cc | Add ReduceSum op and kernel (#160) | 2023-11-24 09:29:58 +08:00 |
| reshape.cc | support Dynamic tensor infer shape and fix memory pool (#176) | 2023-11-23 13:11:50 +08:00 |
| resize.cc | support Dynamic tensor infer shape and fix memory pool (#176) | 2023-11-23 13:11:50 +08:00 |
| rms_norm.cc | Accelerate llama (#219) | 2024-04-01 08:46:05 +08:00 |
| rope.cc | add test for rotary embedding cuda kernel | 2024-02-04 10:24:20 +08:00 |
| send.cc | Add send and recv operators based on NCCL (#182) | 2023-12-14 16:38:03 +08:00 |
| slice.cc | support Dynamic tensor infer shape and fix memory pool (#176) | 2023-11-23 13:11:50 +08:00 |
| softmax.cc | support mixed dtype (#102) | 2023-08-16 21:49:43 +08:00 |
| split.cc | support Dynamic tensor infer shape and fix memory pool (#176) | 2023-11-23 13:11:50 +08:00 |
| squeeze.cc | Remove the frontend's dependency on onnx infer-shape functionality (#206) | 2024-01-12 14:54:27 +08:00 |
| transpose.cc | support Dynamic tensor infer shape and fix memory pool (#176) | 2023-11-23 13:11:50 +08:00 |
| unary.cc | XCCL support (#171) | 2024-02-29 11:48:35 +08:00 |
| unsqueeze.cc | Remove the frontend's dependency on onnx infer-shape functionality (#206) | 2024-01-12 14:54:27 +08:00 |
| where.cc | support Dynamic tensor infer shape and fix memory pool (#176) | 2023-11-23 13:11:50 +08:00 |