| File | Last commit message | Last commit date |
| --- | --- | --- |
| G2BMM.cc | support Dynamic tensor infer shape and fix memory pool (#176) | 2023-11-23 13:11:50 +08:00 |
| GBMM.cc | support Dynamic tensor infer shape and fix memory pool (#176) | 2023-11-23 13:11:50 +08:00 |
| activation_backward.cc | support Dynamic tensor infer shape and fix memory pool (#176) | 2023-11-23 13:11:50 +08:00 |
| all_gather.cc | support Dynamic tensor infer shape and fix memory pool (#176) | 2023-11-23 13:11:50 +08:00 |
| all_reduce.cc | impl distributed launch with NCCL (#106) | 2023-09-05 09:47:35 +08:00 |
| attention_kvcache.cc | use workspace to optimize kvcache attention | 2024-01-25 10:33:01 +08:00 |
| batch_norm.cc | support Dynamic tensor infer shape and fix memory pool (#176) | 2023-11-23 13:11:50 +08:00 |
| broadcast.cc | impl distributed launch with NCCL (#106) | 2023-09-05 09:47:35 +08:00 |
| concat.cc | support Dynamic tensor infer shape and fix memory pool (#176) | 2023-11-23 13:11:50 +08:00 |
| conv.cc | support Dynamic tensor infer shape and fix memory pool (#176) | 2023-11-23 13:11:50 +08:00 |
| det.cc | support Dynamic tensor infer shape and fix memory pool (#176) | 2023-11-23 13:11:50 +08:00 |
| dropout.cc | support Dynamic tensor infer shape and fix memory pool (#176) | 2023-11-23 13:11:50 +08:00 |
| element_wise.cc | support Dynamic tensor infer shape and fix memory pool (#176) | 2023-11-23 13:11:50 +08:00 |
| expand.cc | support Dynamic tensor infer shape and fix memory pool (#176) | 2023-11-23 13:11:50 +08:00 |
| extend.cc | support Dynamic tensor infer shape and fix memory pool (#176) | 2023-11-23 13:11:50 +08:00 |
| gather.cc | support Dynamic tensor infer shape and fix memory pool (#176) | 2023-11-23 13:11:50 +08:00 |
| gather_elements.cc | support Dynamic tensor infer shape and fix memory pool (#176) | 2023-11-23 13:11:50 +08:00 |
| layer_norm.cc | Modify kernel registration & support fp16 (#205) | 2024-01-15 11:02:13 +08:00 |
| lrn.cc | Fix bang (#198) | 2023-12-28 13:44:10 +08:00 |
| matmul.cc | feature: add parameter to config matmul compute type (#218) | 2024-03-26 09:00:45 +08:00 |
| membound.cc | support Dynamic tensor infer shape and fix memory pool (#176) | 2023-11-23 13:11:50 +08:00 |
| pad.cc | support Dynamic tensor infer shape and fix memory pool (#176) | 2023-11-23 13:11:50 +08:00 |
| pooling.cc | support Dynamic tensor infer shape and fix memory pool (#176) | 2023-11-23 13:11:50 +08:00 |
| recv.cc | Add send and recv operators based on NCCL (#182) | 2023-12-14 16:38:03 +08:00 |
| reduce.cc | Add ReduceSum op and kernel (#160) | 2023-11-24 09:29:58 +08:00 |
| reshape.cc | support Dynamic tensor infer shape and fix memory pool (#176) | 2023-11-23 13:11:50 +08:00 |
| resize.cc | support Dynamic tensor infer shape and fix memory pool (#176) | 2023-11-23 13:11:50 +08:00 |
| rms_norm.cc | Accelerate llama (#219) | 2024-04-01 08:46:05 +08:00 |
| rope.cc | add test for rotary embedding cuda kernel | 2024-02-04 10:24:20 +08:00 |
| send.cc | Add send and recv operators based on NCCL (#182) | 2023-12-14 16:38:03 +08:00 |
| slice.cc | support Dynamic tensor infer shape and fix memory pool (#176) | 2023-11-23 13:11:50 +08:00 |
| softmax.cc | support mixed dtype (#102) | 2023-08-16 21:49:43 +08:00 |
| split.cc | support Dynamic tensor infer shape and fix memory pool (#176) | 2023-11-23 13:11:50 +08:00 |
| squeeze.cc | Remove the frontend's dependency on onnx infershape (#206) | 2024-01-12 14:54:27 +08:00 |
| transpose.cc | support Dynamic tensor infer shape and fix memory pool (#176) | 2023-11-23 13:11:50 +08:00 |
| unary.cc | XCCL support (#171) | 2024-02-29 11:48:35 +08:00 |
| unsqueeze.cc | Remove the frontend's dependency on onnx infershape (#206) | 2024-01-12 14:54:27 +08:00 |
| where.cc | support Dynamic tensor infer shape and fix memory pool (#176) | 2023-11-23 13:11:50 +08:00 |