InfiniTensor/include/core
deathwings602 11d5aa1ccc
Add TVM codegen for MemboundOp (#35)
* Add: interface for membound TVM kernel and test

* add getAnsorCode

* add evaluation, but link failed

* add evaluation of kernel, but link failed

* Fix: link libcuda and nvrtc

* add print

* Add: const for source of copy

* compile and evaluate the kernel

* add compute

* fix gen_ansor_op.py

* fix membound_TVM

* format and fix CMakeLists.txt

* fix memory leak

Co-authored-by: Liyan Zheng <liyan-zheng@outlook.com>
Co-authored-by: huangshuhong <huangsh19@mails.tsinghua.edu.cn>
2022-09-22 18:06:45 +08:00
blob.h Add CUDA runtime (#6) 2022-08-22 15:01:03 +08:00
common.h Add: ConvTransposed (#33) 2022-09-19 15:05:39 +08:00
constants.h Add activation operators and kernels 2022-09-16 13:58:57 +08:00
data_type.h Json perfrecord (#32) 2022-09-22 15:34:34 +08:00
graph.h Fix NNet tests after migration (#27) 2022-09-13 15:17:22 +08:00
hash.h Tensor hash and inferShape (#4) 2022-08-15 15:08:56 +08:00
kernel.h Json perfrecord (#32) 2022-09-22 15:34:34 +08:00
mutator.h Update: OpAttrs -> OpPerfKey 2022-08-09 14:58:45 +08:00
object.h Add: perf engine 2022-08-07 21:12:17 +08:00
operator.h Json perfrecord (#32) 2022-09-22 15:34:34 +08:00
perf_engine.h Json perfrecord (#32) 2022-09-22 15:34:34 +08:00
ref.h Add: ConvTransposed (#33) 2022-09-19 15:05:39 +08:00
runtime.h Add TVM codegen for MemboundOp (#35) 2022-09-22 18:06:45 +08:00
tensor.h Add TVM codegen for MemboundOp (#35) 2022-09-22 18:06:45 +08:00
tensor_base.h Simplify tensor transfer between CPU and CUDA (#10) 2022-08-25 11:29:16 +08:00