InfiniTensor

Chenjie Duan 51086d2b8d	Modify kernel registration & support fp16 (#205 )	2024-01-15 11:02:13 +08:00

* - Remove dataType from the kernel registration.
* - support fp16 for conv
* - cpu kernel: adapt the new registration mechanism
* modified all register kernel
* add where fp16
* add layernorm fp16
* add split_concat fp16
* - element_wise support fp16
* feat: support transpose fp16
* feat: support sliceOp fp16
* - unary support fp16
* - feat: support reduceOp fp16
* feat: support matmulOp/expandOp fp16
* feat: support powOp int8
* add cuda cast & support half-precision for gather
* style: fix style
* feat:support int8 for gather
* style:fix style
* modified test_cuda_conv_transposed
* fix: fix dist code to support fp16
* fix(graph.cc): fix topo_sort
* fix: fix recv and send kernel registration
* feat: add field tensors for stub
* refactor(frontend): sort first, then build the graph Signed-off-by: YdrMaster <ydrml@hotmail.com>
* fix: provide a tensor-to-node mapping for intermediate results
* fix (slice): add guard for area out of range
* fix: fix matmul fp16
* fix: fix re-dataMalloc for weight tensor and use of naive allocator
* feat: add dataType filter for cuda kernel
* feat: bang kernel adapt the new registration mechanism
* fix: fix some error on mlu
* feat: intelcpu kernel adapt the new registration mechanism
* feat: modify kernel registration on kunlun
* fix intelcpu compiler bug
* feat: bang reshape support all dataType
* fix: fix bang reduce
* fix(all_reduce.cc): fix as reviewer suggested
* fix: fix style and restore unary test codes

---------

Signed-off-by: YdrMaster <ydrml@hotmail.com>
Co-authored-by: xgqdut2016 <kenan_gewei@163.com>
Co-authored-by: xgqdut2016 <140036308+xgqdut2016@users.noreply.github.com>
Co-authored-by: zhangyunze <z13785159769@163.com>
Co-authored-by: OdinaryWord <sx-hz@163.com>
Co-authored-by: YdrMaster <ydrml@hotmail.com>
Co-authored-by: panzezhong <panzezhong@qiyuanlab.com>

test_all_gather.cc	impl distributed launch with NCCL (#106 )	2023-09-05 09:47:35 +08:00
test_all_reduce.cc	impl distributed launch with NCCL (#106 )	2023-09-05 09:47:35 +08:00
test_batch_norm.cc	refactor(core): add new `OpType` definitions (#99 )	2023-08-07 11:17:05 +08:00
test_broadcast.cc	impl distributed launch with NCCL (#106 )	2023-09-05 09:47:35 +08:00
test_clip.cc	memory_allocator (#103 )	2023-08-13 13:39:35 +08:00
test_concat.cc	support Dynamic tensor infer shape and fix memory pool (#176 )	2023-11-23 13:11:50 +08:00
test_conv.cc	Add layer normalization (#181 )	2023-11-24 15:15:14 +08:00
test_conv_transposed_2d.cc	ADD: add mkl runtime for intel cpu , and add mkl kernel for matmul/conv/convtransposed. (#61 )	2023-03-27 21:28:49 +08:00
test_element_wise.cc	ADD: add mkl runtime for intel cpu , and add mkl kernel for matmul/conv/convtransposed. (#61 )	2023-03-27 21:28:49 +08:00
test_expand.cc	Framework supports graph construction for bert/gpt2 models (#94 )	2023-08-29 16:06:52 +08:00
test_extend.cc	ADD: add mkl runtime for intel cpu , and add mkl kernel for matmul/conv/convtransposed. (#61 )	2023-03-27 21:28:49 +08:00
test_gather.cc	Framework supports graph construction for bert/gpt2 models (#94 )	2023-08-29 16:06:52 +08:00
test_gather_elements.cc	Add GatherElements op and cuda kernel (#149 )	2023-10-12 09:18:12 +08:00
test_matmul.cc	support mixed dtype (#102 )	2023-08-16 21:49:43 +08:00
test_pad.cc	ADD: add mkl runtime for intel cpu , and add mkl kernel for matmul/conv/convtransposed. (#61 )	2023-03-27 21:28:49 +08:00
test_pooling.cc	Pooling ceil mode (#155 )	2023-10-09 20:51:39 +08:00
test_reduce.cc	Add ReduceSum op and kernel (#160 )	2023-11-24 09:29:58 +08:00
test_reshape.cc	Remove the frontend's dependency on onnx infershape (#206 )	2024-01-12 14:54:27 +08:00
test_resize.cc	ADD: add mkl runtime for intel cpu , and add mkl kernel for matmul/conv/convtransposed. (#61 )	2023-03-27 21:28:49 +08:00
test_sendrecv.cc	Add send and recv operators based on NCCL (#182 )	2023-12-14 16:38:03 +08:00
test_slice.cc	Dev for 202303ddl (#66 )	2023-04-18 15:10:33 +08:00
test_split.cc	impl sqrt on CUDA (#109 )	2023-08-18 12:17:47 +08:00
test_transpose.cc	support mixed dtype (#102 )	2023-08-16 21:49:43 +08:00
test_unary.cc	Modify kernel registration & support fp16 (#205 )	2024-01-15 11:02:13 +08:00
test_where.cc	Modify kernel registration & support fp16 (#205 )	2024-01-15 11:02:13 +08:00