Derui Yang
57ac94d893
refactor(core): add new `OpType` definition ( #99 )
* feat: add new OpType definition
Signed-off-by: YdrMaster <ydrml@hotmail.com>
* refactor: replace the old OpType with the new one and update the whole project
Signed-off-by: YdrMaster <ydrml@hotmail.com>
* fix: onnx import
Signed-off-by: YdrMaster <ydrml@hotmail.com>
* fix: fix issues in the cuda and bang kernels
Signed-off-by: YdrMaster <ydrml@hotmail.com>
* fix: filter out bang test
Signed-off-by: YdrMaster <ydrml@hotmail.com>
* fix: filter out bang test
Signed-off-by: YdrMaster <ydrml@hotmail.com>
* fix bang code.
* fix code on bang
* fmt
Signed-off-by: YdrMaster <ydrml@hotmail.com>
* fix: delete the specified files
Signed-off-by: YdrMaster <ydrml@hotmail.com>
* fix: delete two unused files and remove a comment of unknown purpose
Signed-off-by: YdrMaster <ydrml@hotmail.com>
---------
Signed-off-by: YdrMaster <ydrml@hotmail.com>
Co-authored-by: wanghailu <wanghailu@qiyuanlab.com>
2023-08-07 11:17:05 +08:00
zhengly123
c7ec9ee6e7
Add search engine ( #64 )
* Add: tensor fuid
* [Intermediate state] Add: Graph ctor for OpVec
* Add: clone for operators
* tmp: search_engine
* search: init search Engine.
* Add: dummy mutator for the test of search engine
* search: add print graph.
* search: add partition.
* search: update comments.
* Fix: retain FUID in Tensor::clone
* Chore: rename GUidBaseType to UidBaseType
* Fix: connect NMutator to SearchEngine
* Chore: output
* Fix test_memboundOp: nmutator uses input runtime
* Chore: clang-format
* Chore: clang-format
* Fix: comments in the review
---------
Co-authored-by: Liyan Zheng <liyan-zheng@outlook.com>
Co-authored-by: mazx <dyxdy@live.com>
2023-02-12 18:27:52 +08:00
zhengly123
1152adc94a
Add: python API for timing ConvTranspose ( #46 )
* Add: python interface for timing operators
* Fix: CUDA Runtime run
Co-authored-by: Liyan Zheng <liyan-zheng@outlook.com>
2022-10-07 16:03:11 +08:00
Anmuliar
90eb9d05a8
Json perfrecord ( #32 )
Added perfengine serialization & deserialization and corresponding test case.
* Add: perfrecord json representation.
* Add: perfrecord virtual func. to_json&from_json.
* Add: perfengine serialization and deserialization.
* Modify: tune func type to support derived struct serialization.
* Fix: structure after rebase
* Chore: Remove empty line in conv.h
Co-authored-by: wcz112 <wcz19@mails.tsinghua.edu.cn>
Co-authored-by: Liyan Zheng <liyan-zheng@outlook.com>
Co-authored-by: zhengly123 <zhengly123@outlook.com>
2022-09-22 15:34:34 +08:00
zhengly123
d39328afce
Fix: PerfRecord in shared pointers ( #31 )
* Fix: PerfData in a shared pointer
* Add: abstraction for kernels without configuration
Co-authored-by: Liyan Zheng <liyan-zheng@outlook.com>
2022-09-18 20:27:18 +08:00
wendy12022
48293576c0
Add maxpool and avgpool operators ( #17 )
* ADD:maxpool&&avgpool operators.
add OperatorObj::getDType()
clang format
FIX:timeit API has changed.
* Fix: Tensor::getInputs is const method
* Chore
Co-authored-by: Liyan Zheng <liyan-zheng@outlook.com>
2022-08-31 14:44:53 +08:00
zhengly123
04ea5eed38
Add CUDA runtime ( #6 )
* Fix: add warm-up and repetition in timing
* Add: CUDA runtime and float support
* Refactor: Cuda and Cpu runtimes inherit Runtime
* Add: environment script for Lotus
* Add: Lotus build instructions
* Update README.md
Co-authored-by: Liyan Zheng <liyan-zheng@outlook.com>
2022-08-22 15:01:03 +08:00
Liyan Zheng
8219b0f7ff
Add: comments for Kernel
2022-08-09 20:05:01 +08:00
Liyan Zheng
8b685ae4a6
Update: OpAttrs -> OpPerfKey
2022-08-09 14:58:45 +08:00
Liyan Zheng
efa966a3e2
Add: perf engine
2022-08-07 21:12:17 +08:00
Liyan Zheng
6c356d5b42
Add: kernel registry and naive Matmul kernel
2022-08-06 15:58:40 +08:00