| File | Last commit | Date |
| --- | --- | --- |
| test_cuda_G2BMM.cc | ADD: add mkl runtime for intel cpu, and add mkl kernel for matmul/conv/convtransposed. (#61) | 2023-03-27 21:28:49 +08:00 |
| test_cuda_GBMM.cc | ADD: add mkl runtime for intel cpu, and add mkl kernel for matmul/conv/convtransposed. (#61) | 2023-03-27 21:28:49 +08:00 |
| test_cuda_all_gather.cc | impl distributed launch with NCCL (#106) | 2023-09-05 09:47:35 +08:00 |
| test_cuda_all_reduce.cc | impl distributed launch with NCCL (#106) | 2023-09-05 09:47:35 +08:00 |
| test_cuda_attention.cc | use workspace to optimize kvcache attention | 2024-01-25 10:33:01 +08:00 |
| test_cuda_batch_norm.cc | ADD: add mkl runtime for intel cpu, and add mkl kernel for matmul/conv/convtransposed. (#61) | 2023-03-27 21:28:49 +08:00 |
| test_cuda_broadcast.cc | impl distributed launch with NCCL (#106) | 2023-09-05 09:47:35 +08:00 |
| test_cuda_clip.cc | memory_allocator (#103) | 2023-08-13 13:39:35 +08:00 |
| test_cuda_concat.cc | Modify kernel registration & support fp16 (#205) | 2024-01-15 11:02:13 +08:00 |
| test_cuda_conv.cc | memory_allocator (#103) | 2023-08-13 13:39:35 +08:00 |
| test_cuda_conv_fp16.cc | memory_allocator (#103) | 2023-08-13 13:39:35 +08:00 |
| test_cuda_conv_transposed_2d.cc | Modify kernel registration & support fp16 (#205) | 2024-01-15 11:02:13 +08:00 |
| test_cuda_element_wise.cc | add CUDNN impl for Min and Max (#118) | 2023-08-22 16:19:29 +08:00 |
| test_cuda_expand.cc | Framework supports graph construction for bert/gpt2 models (#94) | 2023-08-29 16:06:52 +08:00 |
| test_cuda_extend.cc | memory_allocator (#103) | 2023-08-13 13:39:35 +08:00 |
| test_cuda_gather.cc | Framework supports graph construction for bert/gpt2 models (#94) | 2023-08-29 16:06:52 +08:00 |
| test_cuda_gather_elements.cc | Add GatherElements op and cuda kernel (#149) | 2023-10-12 09:18:12 +08:00 |
| test_cuda_inception.cc | Pooling ceil mode (#155) | 2023-10-09 20:51:39 +08:00 |
| test_cuda_layernorm.cc | Modify kernel registration & support fp16 (#205) | 2024-01-15 11:02:13 +08:00 |
| test_cuda_matmul.cc | memory_allocator (#103) | 2023-08-13 13:39:35 +08:00 |
| test_cuda_pad.cc | memory_allocator (#103) | 2023-08-13 13:39:35 +08:00 |
| test_cuda_pooling.cc | Pooling ceil mode (#155) | 2023-10-09 20:51:39 +08:00 |
| test_cuda_reduce.cc | Add ReduceSum op and kernel (#160) | 2023-11-24 09:29:58 +08:00 |
| test_cuda_reshape.cc | memory_allocator (#103) | 2023-08-13 13:39:35 +08:00 |
| test_cuda_resize.cc | memory_allocator (#103) | 2023-08-13 13:39:35 +08:00 |
| test_cuda_rope.cc | add test for rotary embedding cuda kernel | 2024-02-04 10:24:20 +08:00 |
| test_cuda_sendrecv.cc | Add send and recv operators based on NCCL (#182) | 2023-12-14 16:38:03 +08:00 |
| test_cuda_slice.cc | memory_allocator (#103) | 2023-08-13 13:39:35 +08:00 |
| test_cuda_softmax.cc | Modify kernel registration & support fp16 (#205) | 2024-01-15 11:02:13 +08:00 |
| test_cuda_split.cc | Modify kernel registration & support fp16 (#205) | 2024-01-15 11:02:13 +08:00 |
| test_cuda_transpose.cc | Add cuda transpose kernel (#115) | 2023-08-22 14:22:15 +08:00 |
| test_cuda_unary.cc | add unittest of silu kernel | 2024-01-30 10:40:13 +08:00 |
| test_cuda_where.cc | Modify kernel registration & support fp16 (#205) | 2024-01-15 11:02:13 +08:00 |
| test_perfengine.cc | ADD: add mkl runtime for intel cpu, and add mkl kernel for matmul/conv/convtransposed. (#61) | 2023-03-27 21:28:49 +08:00 |