InfiniTensor

Commit Graph

Author	SHA1	Message	Date
constroy Li	48847958d0	impl sqrt on CUDA (#109 ) * impl sqrt on CUDA fix parser of Gather and ReduceMean * fix test_gather * fix test_cuda_gather * impl sqrt cpu and add sqrt to test_cuda_unary * cuda_unary supports arbitary shapes * fix SplitOp with dim=-1 * fix SplitOp with dim=-1	2023-08-18 12:17:47 +08:00
wendy12022	86ec4036ce	ADD: add mkl runtime for intel cpu , and add mkl kernel for matmul/conv/convtransposed. (#61 ) * move memory format transformation to TensorObj clang format add MemoryFormat for tensorObj. use post_ops for fused conv/deconv Distinguish mkl op_timer from cuda op timer. add act optype to conv and deconv add operator timer add mkl kernel for convTransposed minor fix for group conv do not use cblas_sgemm_batch CpuRuntimeObj->NativeCpuRuntimeObj add matmul op for mkl * fix: fix bugs when rebasing from master fix: fix bugs when rebasing from master * fix: update api after rebasing * fix: fix format; fix onnx import * fix: fix clang-format * [fix] fix conv_transpose test * [fix] use stronger test case for transposed conv * [fix] remove tensor memory format; fix mkl transpose conv * [fix] add FIXME tag for op_timer python api --------- Co-authored-by: whjthu <haojie0429@gmail.com>	2023-03-27 21:28:49 +08:00
wendy12022	a4d6426589	ADD: batch norm operator and cuda kernel. (#44 ) fix numInputs of batchNorm, add new line in file ending. ADD: batch norm operator and cuda kernel. add training remove comments. fix compile error. add batch norm operator and cuda kernel.	2022-10-15 16:29:28 +08:00
wendy12022	3c6e208f42	ADD:concat/split operator and cuda kernels (#29 ) * ADD:concat/split operator and cuda kernels refector minor change comment ADD:concat/split operator and cuda kernels merge split_kernel and concat_kernel to split_concat_kernel. Revert "fix" This reverts commit 459926be09a838658ec55f1e0a72b3cf17037d5c. fix ADD:concat/split operator and cuda kernels change whole tensor name to composed tensor fix some remove unused header. rebase add CudaKernel add test for split. ADD split operator and cuda kernel. modify test. ADD:concat operator and cuda kernel. ADD:concat/split operator and cuda kernels fix some remove unused header. rebase add CudaKernel ADD:concat/split operator and cuda kernels add test for split. ADD split operator and cuda kernel. modify test. ADD:concat operator and cuda kernel. * remove extra comment; typo fix. Co-authored-by: Haojie Wang <haojie0429@gmail.com>	2022-09-29 11:01:30 +08:00

4 Commits