Commit Graph

89 Commits

Author SHA1 Message Date
wendy12022 c8b2c8ed32
Cpu backend2 (#77)
fix review

change Device::MKL to Device::INTELCPU

fix mkl linkage

fix errors according to merge from master

now can call mkl backend

fix softmax/flatten with axis from onnx.

modify README.md

fix memory refree

add env_lotus_intelcpu.sh

fix compile

merge from branch cpu_backend

fix something add gather

fix something

FIX: directory rename from "mkl" to "intelcpu"

ADD: use oneMKL dpcpp interface to implement matmul kernel.

ADD: add dpcpp as compiler for mkl, and fix warnings for clang compiling.
add dpcpp kernel for pow.

ADD: mkl kernel for pad.

ADD: slice mkl kernel.

ADD: reshape/flatten/identity mkl kernel.

ADD: split mkl kernel.

fix compile error

FIX: fix flattenObj with axis.

ADD reduce_mean mkl kernel.

Add concat mkl kernel.

bathNorm for mkl kernel.

sigmoid mkl kernel.

ADD:add mkl kernel for pooling

add more tests for softmax

Now softmax cuda kernel supports any axises.

mkl kernel for softmax

softmax

add axis to softmax operator

add mkl kernel for abs tanh

ADD: relu kernel for mkl

fix binary mkl primitives.

add mkl kernel for binary operators

fix compiler error

move stream to runtime

clang format

add MemoryFormat for tensorObj.

use post_ops for fused conv/deconv

Distinguish mkl  op_timer from cuda op timer.

add act optype to conv and deconv

add operator timer

add mkl kernel for convTransposed

minor fix for group conv

do not use cblas_sgemm_batch

CpuRuntimeObj->NativeCpuRuntimeObj

add  matmul op for mkl
2023-04-17 12:15:23 +08:00
Hardy fe1afe38fa
fix code of bang conv (#76)
* fix code of bang conv

* test: 向 master push 时也执行 ci

Signed-off-by: YdrMaster <ydrml@hotmail.com>

---------

Signed-off-by: YdrMaster <ydrml@hotmail.com>
Co-authored-by: wanghailu <wanghailu@qiyuanlab.com>
Co-authored-by: YdrMaster <ydrml@hotmail.com>
2023-03-29 15:47:32 +08:00
Hardy 823e66a9ff
Support perf bang 1115 (#57)
* support matmul

* add matmul

* add matmul

* add code for cnnl matmul operation and test

* add conv

* add code for conv test on mlu

* add code for test cnnl conv on mlu

* add code for perf conv and matmul on mlu

* clang format

* fix convolution operation

* fxi cmaklist

* code format

* fix code

* code format

---------

Co-authored-by: wanghailu <wanghailu@qiyuanlab.com>
Co-authored-by: wanghailu <wanghailu0717@163.com>
2023-03-29 13:52:56 +08:00
wendy12022 86ec4036ce
ADD: add mkl runtime for intel cpu , and add mkl kernel for matmul/conv/convtransposed. (#61)
* move memory format transformation to TensorObj

clang format

add MemoryFormat for tensorObj.

use post_ops for fused conv/deconv

Distinguish mkl  op_timer from cuda op timer.

add act optype to conv and deconv

add operator timer

add mkl kernel for convTransposed

minor fix for group conv

do not use cblas_sgemm_batch

CpuRuntimeObj->NativeCpuRuntimeObj

add  matmul op for mkl

* fix: fix bugs when rebasing from master

fix: fix bugs when rebasing from master

* fix: update api after rebasing

* fix: fix format; fix onnx import

* fix: fix clang-format

* [fix] fix conv_transpose test

* [fix] use stronger test case for transposed conv

* [fix] remove tensor memory format; fix mkl transpose conv

* [fix] add FIXME tag for op_timer python api

---------

Co-authored-by: whjthu <haojie0429@gmail.com>
2023-03-27 21:28:49 +08:00
whjthu d9886e9de3 fix: remove inline keyword in class; rename getter and setter for inputOf and outputOf 2023-03-25 12:04:24 +08:00
YdrMaster 5aeacedab3 fix: 从模板导出每个类型的 python 接口
Signed-off-by: YdrMaster <ydrml@hotmail.com>
2023-03-22 09:46:40 +08:00
YdrMaster 9db97eb212 refactor: 整合操作张量数据的方法
Signed-off-by: YdrMaster <ydrml@hotmail.com>
2023-03-21 14:00:04 +08:00
YdrMaster e294e46436 feat: 导出 pool 到 onnx
Signed-off-by: YdrMaster <ydrml@hotmail.com>
2023-03-15 17:23:32 +08:00
YdrMaster 40fb8390b1 feat: 导入时保存权重
Signed-off-by: YdrMaster <ydrml@hotmail.com>
2023-03-15 17:23:32 +08:00
YdrMaster a5e692baea feat: 导出 batchnorm 到 onnx
Signed-off-by: YdrMaster <ydrml@hotmail.com>
2023-03-15 17:23:32 +08:00
YdrMaster 5b6698bac7 feat: 导出全图的输出张量到 onnx
Signed-off-by: YdrMaster <ydrml@hotmail.com>
2023-03-15 17:23:32 +08:00
YdrMaster 3d122aebfe feat: 支持导出浮点向量
Signed-off-by: YdrMaster <ydrml@hotmail.com>
2023-03-15 17:23:32 +08:00
YdrMaster f44a4daf70 feat: 导出未初始化的张量
Signed-off-by: YdrMaster <ydrml@hotmail.com>
2023-03-15 17:23:32 +08:00
YdrMaster ed81861375 temp: 实现初始值导入,但 resnet 报错
Signed-off-by: YdrMaster <ydrml@hotmail.com>
2023-03-15 17:23:32 +08:00
YdrMaster 71a87c27d1 feat: 导出 ReduceMean 到 onnx
Signed-off-by: YdrMaster <ydrml@hotmail.com>
2023-03-15 15:09:12 +08:00
YdrMaster 2a23669394 feat: 导出 Reshape 到 onnx
Signed-off-by: YdrMaster <ydrml@hotmail.com>
2023-03-15 15:09:12 +08:00
YdrMaster fe81fccf76 feat: 导出 OperatorObj
Signed-off-by: YdrMaster <ydrml@hotmail.com>
2023-03-15 15:09:12 +08:00
YdrMaster 45a3cdfa30 feat: GraphObj 增加一个拓扑排序方法及其测试
Signed-off-by: YdrMaster <ydrml@hotmail.com>
2023-03-15 15:09:12 +08:00
YdrMaster f20e791cf5 style: 修改 graph.h/graph.cc
Signed-off-by: YdrMaster <ydrml@hotmail.com>
2023-03-15 15:09:12 +08:00
Haojie Wang 0f52d04882
Merge branch 'master' into dev-onnx 2023-03-15 14:52:03 +08:00
deathwings602 40d1b1c91b
Add ConvTransposedNHWC (#67)
* Add: IT_ASSERT_TODO

* [WIP] Add: ConvTranspose2d mutation test

* add ConvTransposedNHWC

* fix test_cuda_transposed_2d

---------

Co-authored-by: Liyan Zheng <liyan-zheng@outlook.com>
Co-authored-by: huangshuhong <huangsh19@mails.tsinghua.edu.cn>
2023-03-01 14:15:02 +08:00
YdrMaster 6871fff02b feat: 导出分配内存和运行推理的接口
Signed-off-by: YdrMaster <ydrml@hotmail.com>
2023-02-23 11:08:00 +08:00
YdrMaster 4c7fdf44c5 feat: 前端支持 Conv 及单元测试
Signed-off-by: YdrMaster <ydrml@hotmail.com>
2023-02-22 15:05:44 +08:00
YdrMaster 6a4de807e6 style: remove non-ascii comments from cpp
Signed-off-by: YdrMaster <ydrml@hotmail.com>
2023-02-16 14:57:51 +08:00
YdrMaster 315763a83a feat: 前端支持 pad 及单元测试
Signed-off-by: YdrMaster <ydrml@hotmail.com>
2023-02-15 11:41:06 +08:00
YdrMaster 8fae67b4b4 feat: 前端支持 slice 及单元测试
Signed-off-by: YdrMaster <ydrml@hotmail.com>
2023-02-14 17:35:18 +08:00
YdrMaster f9d0076a86 opt: 优化 SliceObj 构造器实现
Signed-off-by: YdrMaster <ydrml@hotmail.com>
2023-02-14 16:44:08 +08:00
YdrMaster 341cf1f943 feat: 前端支持 pool 及单元测试
Signed-off-by: YdrMaster <ydrml@hotmail.com>
2023-02-14 16:26:47 +08:00
YdrMaster 62ceb78ae3 feat: 前端支持 reduceMean 及单元测试
Signed-off-by: YdrMaster <ydrml@hotmail.com>
2023-02-14 15:35:01 +08:00
YdrMaster fb9d84dbb7 opt: 优化 ReduceMeanObj 构造器实现
Signed-off-by: YdrMaster <ydrml@hotmail.com>
2023-02-14 15:14:28 +08:00
YdrMaster d11fb0ad5f feat: 前端支持 gather 及单元测试
Signed-off-by: YdrMaster <ydrml@hotmail.com>
2023-02-14 14:16:01 +08:00
YdrMaster 45aa0237da feat: 前端支持 concat 及单元测试
Signed-off-by: YdrMaster <ydrml@hotmail.com>
2023-02-14 13:42:35 +08:00
YdrMaster a7e58bd8d0 feat: 补充 DataType 类型
- 增加了 6 个代数类型,与 onnx 的序号对应
- 现在可以导入 reshape 了

Signed-off-by: YdrMaster <ydrml@hotmail.com>
2023-02-14 11:27:57 +08:00
YdrMaster 7626efbfa8 feat: 前端支持 reshape
- 无法测试,因为后端不支持 shape 的 INT64 类型

opt: ReshapeObj 构造改为全部传值并在内部 move
Signed-off-by: YdrMaster <ydrml@hotmail.com>
2023-02-14 09:51:11 +08:00
whjthu 26be533faa Add documentation for operators. 2023-02-13 22:51:15 +08:00
YdrMaster cca4d2a491 feat: 前端支持 batchNorm(无单元测试)
Signed-off-by: YdrMaster <ydrml@hotmail.com>
2023-02-13 17:15:35 +08:00
YdrMaster e194dd943b feat: 前端支持 flatten 及单元测试
Signed-off-by: YdrMaster <ydrml@hotmail.com>
2023-02-13 13:50:07 +08:00
YdrMaster e4ec9c4230 feat: 前端支持 identity 及单元测试
Signed-off-by: YdrMaster <ydrml@hotmail.com>
2023-02-13 12:26:11 +08:00
YdrMaster 7f0c8ebae3 feat: 前端支持 relu sigmoid tanh softmax abs 及单元测试
Signed-off-by: YdrMaster <ydrml@hotmail.com>
2023-02-13 11:54:54 +08:00
YdrMaster 6e5beceadd feat: 增加 add sub mul div pow 前端
- 添加每个算子的单元测试
- 添加线性回归模型导入测试

Signed-off-by: YdrMaster <ydrml@hotmail.com>
2023-02-13 11:25:54 +08:00
YdrMaster 296fcc5aa0 feat: 创建 pyinfinitensor 前端
- python 前端项目结构及打包和安装脚本
- 后端编译出 so 改名为 backend,增加 GraphHandler 修改图结构
- ci 支持测试这些功能

Signed-off-by: YdrMaster <ydrml@hotmail.com>
2023-02-13 09:19:05 +08:00
zhengly123 c7ec9ee6e7
Add search engine (#64)
* Add: tensor fuid

* [Intermediate state] Add: Graph ctor for OpVec

* Add: clone for operators

* tmp: search_engine

* search: init search Engine.

* Add: dummy mutator for the test of search engine

* search: add print graph.

* search: add partition.

* search: update comments.

* Fix: remain FUID in Tensor::clone

* Chore: rename GUidBaseType to UidBaseType

* Fix: connect NMutator to SearchEngine

* Chore: output

* Fix test_memboundOp: nmutator uses input runtime

* Chore: clang-format

* Chore: clang-format

* Fix: comments in the review

---------

Co-authored-by: Liyan Zheng <liyan-zheng@outlook.com>
Co-authored-by: mazx <dyxdy@live.com>
2023-02-12 18:27:52 +08:00
wendy12022 d780f687fc
ADD: reconfig ResizeObj, support "tf_crop_and_resize " and cubic coeff kernel. (#59)
add cubic coef

add tf_crop_and_resize
2022-12-24 04:02:21 +08:00
wendy12022 c5966f8d81
Add: resize operator and cuda kernel,support nearest/linear coef. (#51)
ADD: resize operator and cuda kernel,support nearest/linear coef.

fix some

fix tests

add more tests for linear mode.

add linear coef mode.

add scales

add tests

fix tests.

add notLarger notSmaller

fix

add test

ADD:resize operator and cuda kernel
2022-11-14 09:30:22 +08:00
zhengly123 63d8aff985
Fix: cuCtxCreate before other initialization (#49)
Fix: create cuCtx at the very beginning

Co-authored-by: Liyan Zheng <liyan-zheng@outlook.com>
2022-10-19 15:41:48 +08:00
zhengly123 4e0040c8a0
Add: connection among tensors and operators (#45)
* Add: refs_to_wrefs and wrefs_to_refs

* Add: op and tensor connection

* Add: inception-v3 block test

* Refactor: addOperatorAndConnect

Co-authored-by: Liyan Zheng <liyan-zheng@outlook.com>
2022-10-18 22:02:51 +08:00
wendy12022 d1c913010f
ADD:reduce_mean operator and cuda kernel. (#47)
add new line at file ending.
2022-10-15 16:53:58 +08:00
wendy12022 a4d6426589
ADD: batch norm operator and cuda kernel. (#44)
fix numInputs of batchNorm, add new line in file ending.

ADD: batch norm operator and cuda kernel.

add training

remove comments.

fix compile error.

add batch norm operator and cuda kernel.
2022-10-15 16:29:28 +08:00
zhengly123 1152adc94a
Add: python API for timing ConvTranspose (#46)
* Add: python interfaced for timing operators

* Fix: CUDA Runtime run

Co-authored-by: Liyan Zheng <liyan-zheng@outlook.com>
2022-10-07 16:03:11 +08:00
Hardy b0c2a08252
Support bang c kernel wanghailu 0927 (#43)
* fix a little bug which found by new verison CMake

* add code for support BangC language kernel , just like Cuda kernel, not
library

* add bangc kernel

* support BangC kernel

* add code for support BangC kernel

* support bangc kernel

* fix some code from reviewer

* fix code of template fumction

* add code for support bangc kernel

* fix bangc format

Co-authored-by: wanghailu <wanghailu@qiyuanlab.com>
Co-authored-by: Haojie Wang <haojie0429@gmail.com>
2022-09-30 11:01:52 +08:00