InfiniTensor/include
wendy12022 86ec4036ce
ADD: add mkl runtime for intel cpu , and add mkl kernel for matmul/conv/convtransposed. (#61)
* move memory format transformation to TensorObj

clang format

add MemoryFormat for tensorObj.

use post_ops for fused conv/deconv

Distinguish mkl  op_timer from cuda op timer.

add act optype to conv and deconv

add operator timer

add mkl kernel for convTransposed

minor fix for group conv

do not use cblas_sgemm_batch

CpuRuntimeObj->NativeCpuRuntimeObj

add  matmul op for mkl

* fix: fix bugs when rebasing from master

fix: fix bugs when rebasing from master

* fix: update api after rebasing

* fix: fix format; fix onnx import

* fix: fix clang-format

* [fix] fix conv_transpose test

* [fix] use stronger test case for transposed conv

* [fix] remove tensor memory format; fix mkl transpose conv

* [fix] add FIXME tag for op_timer python api

---------

Co-authored-by: whjthu <haojie0429@gmail.com>
2023-03-27 21:28:49 +08:00
..
bang Support bang c kernel wanghailu 0927 (#43) 2022-09-30 11:01:52 +08:00
core ADD: add mkl runtime for intel cpu , and add mkl kernel for matmul/conv/convtransposed. (#61) 2023-03-27 21:28:49 +08:00
cuda Add search engine (#64) 2023-02-12 18:27:52 +08:00
ffi Add TVM codegen for MemboundOp (#35) 2022-09-22 18:06:45 +08:00
mkl ADD: add mkl runtime for intel cpu , and add mkl kernel for matmul/conv/convtransposed. (#61) 2023-03-27 21:28:49 +08:00
nnet Fix NNet tests after migration (#27) 2022-09-13 15:17:22 +08:00
operators fix: remove inline keyword in class; rename getter and setter for inputOf and outputOf 2023-03-25 12:04:24 +08:00
utils ADD: batch norm operator and cuda kernel. (#44) 2022-10-15 16:29:28 +08:00
test.h Add python interface for CUDA operator evaluation (#42) 2022-09-27 10:41:12 +08:00