test_cuda_G2BMM.cc
ADD: add mkl runtime for intel cpu , and add mkl kernel for matmul/conv/convtransposed. ( #61 )
2023-03-27 21:28:49 +08:00
test_cuda_GBMM.cc
ADD: add mkl runtime for intel cpu , and add mkl kernel for matmul/conv/convtransposed. ( #61 )
2023-03-27 21:28:49 +08:00
test_cuda_all_gather.cc
impl distributed launch with NCCL ( #106 )
2023-09-05 09:47:35 +08:00
test_cuda_all_reduce.cc
impl distributed launch with NCCL ( #106 )
2023-09-05 09:47:35 +08:00
test_cuda_batch_norm.cc
ADD: add mkl runtime for intel cpu , and add mkl kernel for matmul/conv/convtransposed. ( #61 )
2023-03-27 21:28:49 +08:00
test_cuda_broadcast.cc
impl distributed launch with NCCL ( #106 )
2023-09-05 09:47:35 +08:00
test_cuda_clip.cc
memory_allocator ( #103 )
2023-08-13 13:39:35 +08:00
test_cuda_concat.cc
Fix incorrect results of split and concat when dim=0 ( #138 )
2023-09-25 10:25:54 +08:00
test_cuda_conv.cc
memory_allocator ( #103 )
2023-08-13 13:39:35 +08:00
test_cuda_conv_fp16.cc
memory_allocator ( #103 )
2023-08-13 13:39:35 +08:00
test_cuda_conv_transposed_2d.cc
memory_allocator ( #103 )
2023-08-13 13:39:35 +08:00
test_cuda_element_wise.cc
add CUDNN impl for Min and Max ( #118 )
2023-08-22 16:19:29 +08:00
test_cuda_expand.cc
Framework support for building bert/gpt2 model graphs ( #94 )
2023-08-29 16:06:52 +08:00
test_cuda_extend.cc
memory_allocator ( #103 )
2023-08-13 13:39:35 +08:00
test_cuda_gather.cc
Framework support for building bert/gpt2 model graphs ( #94 )
2023-08-29 16:06:52 +08:00
test_cuda_gather_elements.cc
Add GatherElements op and cuda kernel ( #149 )
2023-10-12 09:18:12 +08:00
test_cuda_inception.cc
Pooling ceil mode ( #155 )
2023-10-09 20:51:39 +08:00
test_cuda_matmul.cc
memory_allocator ( #103 )
2023-08-13 13:39:35 +08:00
test_cuda_pad.cc
memory_allocator ( #103 )
2023-08-13 13:39:35 +08:00
test_cuda_pooling.cc
Pooling ceil mode ( #155 )
2023-10-09 20:51:39 +08:00
test_cuda_reduce_mean.cc
memory_allocator ( #103 )
2023-08-13 13:39:35 +08:00
test_cuda_reshape.cc
memory_allocator ( #103 )
2023-08-13 13:39:35 +08:00
test_cuda_resize.cc
memory_allocator ( #103 )
2023-08-13 13:39:35 +08:00
test_cuda_slice.cc
memory_allocator ( #103 )
2023-08-13 13:39:35 +08:00
test_cuda_softmax.cc
memory_allocator ( #103 )
2023-08-13 13:39:35 +08:00
test_cuda_split.cc
Fix incorrect results of split and concat when dim=0 ( #138 )
2023-09-25 10:25:54 +08:00
test_cuda_transpose.cc
Add cuda transpose kernel ( #115 )
2023-08-22 14:22:15 +08:00
test_cuda_unary.cc
Add HardSigmoid and HardSwish ( #156 )
2023-10-10 22:41:06 +08:00
test_cuda_where.cc
"modified where" ( #131 )
2023-09-14 10:45:57 +08:00
test_perfengine.cc
ADD: add mkl runtime for intel cpu , and add mkl kernel for matmul/conv/convtransposed. ( #61 )
2023-03-27 21:28:49 +08:00