InfiniTensor

Commit Graph

Author	SHA1	Message	Date
wendy12022	86ec4036ce	ADD: add mkl runtime for intel cpu , and add mkl kernel for matmul/conv/convtransposed. (#61) * move memory format transformation to TensorObj clang format add MemoryFormat for tensorObj. use post_ops for fused conv/deconv Distinguish mkl op_timer from cuda op timer. add act optype to conv and deconv add operator timer add mkl kernel for convTransposed minor fix for group conv do not use cblas_sgemm_batch CpuRuntimeObj->NativeCpuRuntimeObj add matmul op for mkl * fix: fix bugs when rebasing from master fix: fix bugs when rebasing from master * fix: update api after rebasing * fix: fix format; fix onnx import * fix: fix clang-format * [fix] fix conv_transpose test * [fix] use stronger test case for transposed conv * [fix] remove tensor memory format; fix mkl transpose conv * [fix] add FIXME tag for op_timer python api --------- Co-authored-by: whjthu <haojie0429@gmail.com>	2023-03-27 21:28:49 +08:00
zhengly123	1152adc94a	Add: python API for timing ConvTranspose (#46) * Add: python interfaced for timing operators * Fix: CUDA Runtime run Co-authored-by: Liyan Zheng <liyan-zheng@outlook.com>	2022-10-07 16:03:11 +08:00
zhengly123	1aefc1b27e	Add python interface for CUDA operator evaluation (#42) * Refactor: seperate data generator * Add: python bindings for opTimer * Fix: test_perfengine Co-authored-by: Liyan Zheng <liyan-zheng@outlook.com>	2022-09-27 10:41:12 +08:00

3 Commits