whjthu
1ab2118716
add AnyOp and cuda kernel
2023-04-23 00:16:03 +08:00
Liyan Zheng
acc64fd32c
Merge branch 'NNET_transpose' into NNET_e2e
...
Fix: gridSize and blockSize in Reshape kernel
2023-04-22 21:32:31 +08:00
Liyan Zheng
e2f18272c9
Add: no malloc for reshape outputs
2023-04-22 21:13:57 +08:00
whjthu
34ed298725
fix format
2023-04-22 17:00:52 +08:00
whjthu
664f0dbe02
support cuda transpose
2023-04-22 16:57:27 +08:00
Liyan Zheng
16a8c5dce5
Add: Conv1x1 rule
2023-04-21 23:21:04 +08:00
Liyan Zheng
9ce21200c4
Add: NMutator mode in python
2023-04-21 21:31:22 +08:00
Liyan Zheng
b943658713
Finish: GAN
2023-04-21 21:25:43 +08:00
Liyan Zheng
2cd75bd79b
Merge branch 'NNET_e2e_fix' into NNET_e2e
...
Support CUDA Graph for TVM kernels
2023-04-21 13:18:44 +08:00
Liyan Zheng
f0fcbe825f
Add: python verification
2023-04-21 13:18:24 +08:00
huangshuhong
8c91faa948
remove expect
2023-04-21 00:17:04 +08:00
huangshuhong
c0ae03a2d7
fix tvm stream
2023-04-21 00:09:47 +08:00
YdrMaster
8bc2d3e48d
fix: test graph handler
...
Signed-off-by: YdrMaster <ydrml@hotmail.com>
2023-04-20 21:51:47 +08:00
YdrMaster
28b123753e
feat: 导入 Tensor 类型
...
Signed-off-by: YdrMaster <ydrml@hotmail.com>
2023-04-20 21:51:47 +08:00
Liyan Zheng
94730d93b5
Add: hash match for membound kernels
2023-04-20 17:16:01 +08:00
Liyan Zheng
6d17c4caa2
Add: getPerfTime in run_models_nnet
2023-04-20 10:54:49 +08:00
Liyan Zheng
15d0eb79cd
Add: import ONNX with membound Op
2023-04-20 10:45:28 +08:00
Liyan Zheng
34ca6bf149
Fix: skip check when Graph is exported to ONNX
2023-04-20 10:45:28 +08:00
YdrMaster
0edd138919
feat: 正反序列化分离为到 string 的和到 file 的
...
fix: 正确设置 `USE_CUDA` cfg
todo: test_search 不过
Signed-off-by: YdrMaster <ydrml@hotmail.com>
2023-04-20 10:44:39 +08:00
Liyan Zheng
0b23a065ca
Add: debug hacks for InfoGAN
2023-04-20 10:42:56 +08:00
Liyan Zheng
e86e993ed4
Add: CUDA graph stream capture (MemboundOp fails)
2023-04-19 16:32:16 +08:00
Liyan Zheng
537b3b4ea4
Add: Membound operator serialization
2023-04-18 21:53:48 +08:00
Liyan Zheng
01fc19795d
Add: time non-compile-cime-computable operators
2023-04-18 17:21:16 +08:00
Liyan Zheng
99b5c95455
Add: nnet::Serializer supports FuncNode
2023-04-18 17:21:16 +08:00
Liyan Zheng
bc31219bde
Add: exclude compile-time computable operator time
2023-04-18 17:21:16 +08:00
Liyan Zheng
da49e91ab0
Add: fuse membound operators
2023-04-18 17:19:05 +08:00
Liyan Zheng
307614d95d
Add: infogan python interface
2023-04-18 17:16:25 +08:00
YdrMaster
26f0d13c26
Dev for 202303ddl ( #66 )
...
* add activation operatiopn relu, tanh, sigmoid on mlu
* commit for format
* add activation backward operation
* add test for activation_backward
* add test
* add convbpfilter
* fix
* add transpsoe code and test
* add trigon function operation on mlu: sin,cos,tan,asin,sinh,asinh
* add copy operation on mlu
* add ceil operation and floor operation
* add operation clip
* add operation cnnl div, test and test for divdemo bangc kernel
* add divnonan operation and test
* add erf operation
* add exp operation
* add operation fill
* add log operation
* add log1p operation
* add l2loss operation
* add maximum and minimum operation
* add mseloss operation
* add negTensor operation
* add power operation
* add reciprocal operation
* add sqrt and rsqrt operation
* add transform operation
* add addn operation
* add muln operation
* cherrry pick some operation
* add floordiv operation and floordivtrunc operation
* add floormod operation
* add cumsum operation
* add det operation
* add pad operation
* format
* add concat operation
* format
* add split operation
* fix concat and split operation
* add round operation
* add pooling operation
* add square operation
* add squaredDifference operation
* code format fix
* add flip operation
* code format fix
* add hardtanh operation
* add logic operation
* add addcdiv and addcmul operation
* add arange operation
* add bitcompute operation
* add net test
* fmt
Signed-off-by: YdrMaster <ydrml@hotmail.com>
* style: rename
Signed-off-by: YdrMaster <ydrml@hotmail.com>
* fix: 用 NativeCpuRuntime 替换 CpuRuntime
Signed-off-by: YdrMaster <ydrml@hotmail.com>
* fix code
* fix code
* fix code by review suggestion
* remove operation which is not the onnx operation
* fix format
* clang format
* refactor: tensor 的 print 加一层模板的 dataToString
Signed-off-by: YdrMaster <ydrml@hotmail.com>
* fix: onnx 导出
Signed-off-by: YdrMaster <ydrml@hotmail.com>
* feat: 增加计算图优化接口
Signed-off-by: YdrMaster <ydrml@hotmail.com>
* add clip operation
* feat: 支持导入 clip
Signed-off-by: YdrMaster <ydrml@hotmail.com>
* test: 导入导出测试加入 ci
Signed-off-by: YdrMaster <ydrml@hotmail.com>
* fix batch norm
* feat: 增加 Shape 算子
Signed-off-by: YdrMaster <ydrml@hotmail.com>
* feat: 支持导入 unsqueeze
Signed-off-by: YdrMaster <ydrml@hotmail.com>
* fix: 修正 clip 接口
feat: 支持导入 transpose
Signed-off-by: YdrMaster <ydrml@hotmail.com>
* add broadcast operation
* fix elementwise-broadcast
* fix elementwise broadcast
* add broadcast for gpu elementsie
* feat: pad 支持 axes 负数
feat: 不支持的 padding 导出为独立的 pad 算子
feat: 支持导入 onnxsim 过的 inception
Signed-off-by: YdrMaster <ydrml@hotmail.com>
* fix: 修正池化的测试
Signed-off-by: YdrMaster <ydrml@hotmail.com>
* feat: 导出 pads,支持 inception 导入导出,已加入 ci
Signed-off-by: YdrMaster <ydrml@hotmail.com>
* feat: 支持 densenet 导入导出,并加入 ci
Signed-off-by: YdrMaster <ydrml@hotmail.com>
* feat: 导入 squeeze
Signed-off-by: YdrMaster <ydrml@hotmail.com>
* fix softmax
* feat: 导出 clip 和 transpose
Signed-off-by: YdrMaster <ydrml@hotmail.com>
* feat: 支持 Conv 的 bias
Signed-off-by: YdrMaster <ydrml@hotmail.com>
* fix: bias of conv
Signed-off-by: YdrMaster <ydrml@hotmail.com>
* fix: bias of conv
Signed-off-by: YdrMaster <ydrml@hotmail.com>
* feat: 导入 split
Signed-off-by: YdrMaster <ydrml@hotmail.com>
* feat: 导出 split
Signed-off-by: YdrMaster <ydrml@hotmail.com>
* fix: conv
Signed-off-by: YdrMaster <ydrml@hotmail.com>
* fix: conv group
Signed-off-by: YdrMaster <ydrml@hotmail.com>
* fix: matmul 的 bias 没有放在输入里,修正
Signed-off-by: YdrMaster <ydrml@hotmail.com>
* fix exmaple
* fix: 改正 reduce_mean 导出
Signed-off-by: YdrMaster <ydrml@hotmail.com>
* refactor: 修改 slice 实现与 onnx 一致
Signed-off-by: YdrMaster <ydrml@hotmail.com>
* style: 不导出两个 runtime 函数
Signed-off-by: YdrMaster <ydrml@hotmail.com>
* doc: 中文使用指南
Signed-off-by: YdrMaster <ydrml@hotmail.com>
* doc: 补全指南
Signed-off-by: YdrMaster <ydrml@hotmail.com>
* fix: 修复导入数据的问题
Signed-off-by: YdrMaster <ydrml@hotmail.com>
* fmt
Signed-off-by: YdrMaster <ydrml@hotmail.com>
* feat: 添加 Dropout 基本结构,但不支持两个输出是不同的类型
Signed-off-by: YdrMaster <ydrml@hotmail.com>
* feat: 重新导出优化接口
feat: dropout 导入
Signed-off-by: YdrMaster <ydrml@hotmail.com>
* build: BANG 选项加入 Makefile
Signed-off-by: YdrMaster <ydrml@hotmail.com>
* fxi code, change of test/kernels/bang/test* is use NativeCpuRuntime.
chaneg of include/bang/bang_runtime is for the cntoolkit upgrade.
* feat: 导出 bang runtime
Signed-off-by: YdrMaster <ydrml@hotmail.com>
* add USE_BANG=1
* fix matmul
* fix reshape
* fix
* fix activation
* fix transpose
* format
* format
* update Makefile
Signed-off-by: YdrMaster <ydrml@hotmail.com>
* feat: 支持导入导出 ConvTranspose
Signed-off-by: YdrMaster <ydrml@hotmail.com>
* add prelu on mlu
* fix: ConvTranspose
Signed-off-by: YdrMaster <ydrml@hotmail.com>
* feat: 支持导入导出 PRelu
Signed-off-by: YdrMaster <ydrml@hotmail.com>
* add convtrans on mlu
* fmt
Signed-off-by: YdrMaster <ydrml@hotmail.com>
* docs: 更新 README_CN.md
Signed-off-by: YdrMaster <ydrml@hotmail.com>
* fix code by review suggestions
* style
Signed-off-by: YdrMaster <ydrml@hotmail.com>
* fix: Softmax 的 axis 可以用默认值?感觉是 onnx 不标准
Signed-off-by: YdrMaster <ydrml@hotmail.com>
* fix cuda & intelcpu bugs after merging
---------
Signed-off-by: YdrMaster <ydrml@hotmail.com>
Co-authored-by: wanghailu <wanghailu0717@163.com>
Co-authored-by: wanghailu <wanghailu@qiyuanlab.com>
Co-authored-by: whjthu <haojie0429@gmail.com>
2023-04-18 15:10:33 +08:00
zhengly123
a1974aabcd
NNET supports TVM backend and kernels ( #78 )
...
* Add: mutator InfoGAN minimum test
* Add: cache and padding (bugs!!)
* Add: expression reader as a cmake target
* Fix: [Intermediate] NMutator::expressionToGraph
To be fix: matmul with implicit broadcast
* Add: matmul broadcast
* Fix: GraphObj ctor should use cloneTensor
* Fix: cuBLAS failure when codegen is enabled
* Add: Exception for checkCuError
* Fix: graph OpList ctor
* Add: expr simplication for TVM
* Add: TVM headers and CMake include paths
* Add: CMake config
* Add: PackedFunc (broken)
* Fix: remove cuCtxCreate which makes TVM fails
* Fix: membound_tvm
* Fix: test_memboundOp
* Add: PRelu Expr and AsTVMVisitor
* Add: Random generator
* Add: support TVM packed function
* Fix: specify runtime
* Add: CMake support of TVM
* Add: detailed output of Matmul
* Add: comments for Matmul
* Chore: format and comments
* Chore: GraphObj::selfCheck without assert control
* Fix: CMAKE_CXX_FLAGS in CMakeLists
* fix merge bug
* update api for mkl batchnorm test
* fix lotus env
* fig header bug
---------
Co-authored-by: Liyan Zheng <liyan-zheng@outlook.com>
Co-authored-by: huangshuhong <huangsh19@mails.tsinghua.edu.cn>
Co-authored-by: whjthu <haojie0429@gmail.com>
2023-04-18 00:26:36 +08:00
wendy12022
43d4798323
ADD: sub graph replacement. ( #56 )
...
reconfig: connections among op and tensor now is managered by GraphObj .
add some comments
merge from master
merge from master
ADD: sub graph replacement
reconfig inputs of op resize, due to the check of operator inputs.
ResizeObj::clone
clang format
fix some and add test for multi-output.
replacement support multi-inputs and multi-outputs.
add clone for all operators
add replaceSubGraph addSubGraph
remove extra code
add more test
remove extra print
Co-authored-by: Haojie Wang <haojie0429@gmail.com>
2023-04-17 13:09:07 +08:00
wendy12022
c8b2c8ed32
Cpu backend2 ( #77 )
...
fix review
change Device::MKL to Device::INTELCPU
fix mkl linkage
fix errors according to merge from master
now can call mkl backend
fix softmax/flatten with axis from onnx.
modify README.md
fix memory refree
add env_lotus_intelcpu.sh
fix compile
merge from branch cpu_backend
fix something add gather
fix something
FIX: directory rename from "mkl" to "intelcpu"
ADD: use oneMKL dpcpp interface to implement matmul kernel.
ADD: add dpcpp as compiler for mkl, and fix warnings for clang compiling.
add dpcpp kernel for pow.
ADD: mkl kernel for pad.
ADD: slice mkl kernel.
ADD: reshape/flatten/identity mkl kernel.
ADD: split mkl kernel.
fix compile error
FIX: fix flattenObj with axis.
ADD reduce_mean mkl kernel.
Add concat mkl kernel.
bathNorm for mkl kernel.
sigmoid mkl kernel.
ADD:add mkl kernel for pooling
add more tests for softmax
Now softmax cuda kernel supports any axises.
mkl kernel for softmax
softmax
add axis to softmax operator
add mkl kernel for abs tanh
ADD: relu kernel for mkl
fix binary mkl primitives.
add mkl kernel for binary operators
fix compiler error
move stream to runtime
clang format
add MemoryFormat for tensorObj.
use post_ops for fused conv/deconv
Distinguish mkl op_timer from cuda op timer.
add act optype to conv and deconv
add operator timer
add mkl kernel for convTransposed
minor fix for group conv
do not use cblas_sgemm_batch
CpuRuntimeObj->NativeCpuRuntimeObj
add matmul op for mkl
2023-04-17 12:15:23 +08:00
Hardy
fe1afe38fa
fix code of bang conv ( #76 )
...
* fix code of bang conv
* test: 向 master push 时也执行 ci
Signed-off-by: YdrMaster <ydrml@hotmail.com>
---------
Signed-off-by: YdrMaster <ydrml@hotmail.com>
Co-authored-by: wanghailu <wanghailu@qiyuanlab.com>
Co-authored-by: YdrMaster <ydrml@hotmail.com>
2023-03-29 15:47:32 +08:00
Hardy
823e66a9ff
Support perf bang 1115 ( #57 )
...
* support matmul
* add matmul
* add matmul
* add code for cnnl matmul operation and test
* add conv
* add code for conv test on mlu
* add code for test cnnl conv on mlu
* add code for perf conv and matmul on mlu
* clang format
* fix convolution operation
* fxi cmaklist
* code format
* fix code
* code format
---------
Co-authored-by: wanghailu <wanghailu@qiyuanlab.com>
Co-authored-by: wanghailu <wanghailu0717@163.com>
2023-03-29 13:52:56 +08:00
wendy12022
86ec4036ce
ADD: add mkl runtime for intel cpu , and add mkl kernel for matmul/conv/convtransposed. ( #61 )
...
* move memory format transformation to TensorObj
clang format
add MemoryFormat for tensorObj.
use post_ops for fused conv/deconv
Distinguish mkl op_timer from cuda op timer.
add act optype to conv and deconv
add operator timer
add mkl kernel for convTransposed
minor fix for group conv
do not use cblas_sgemm_batch
CpuRuntimeObj->NativeCpuRuntimeObj
add matmul op for mkl
* fix: fix bugs when rebasing from master
fix: fix bugs when rebasing from master
* fix: update api after rebasing
* fix: fix format; fix onnx import
* fix: fix clang-format
* [fix] fix conv_transpose test
* [fix] use stronger test case for transposed conv
* [fix] remove tensor memory format; fix mkl transpose conv
* [fix] add FIXME tag for op_timer python api
---------
Co-authored-by: whjthu <haojie0429@gmail.com>
2023-03-27 21:28:49 +08:00
whjthu
d9886e9de3
fix: remove inline keyword in class; rename getter and setter for inputOf and outputOf
2023-03-25 12:04:24 +08:00
YdrMaster
9db97eb212
refactor: 整合操作张量数据的方法
...
Signed-off-by: YdrMaster <ydrml@hotmail.com>
2023-03-21 14:00:04 +08:00
YdrMaster
a27391fcdc
fix: 修正 batchNorm 实现
...
- onnx 和 pytorch 认为 batchNorm 的 4 个参数是 [c] 形状的,cuDNN 可能认为是 [1,c,1,...]。
优化已改为 [c],但 cuDNN 推理没有改;
Signed-off-by: YdrMaster <ydrml@hotmail.com>
2023-03-15 17:23:32 +08:00
YdrMaster
45a3cdfa30
feat: GraphObj 增加一个拓扑排序方法及其测试
...
Signed-off-by: YdrMaster <ydrml@hotmail.com>
2023-03-15 15:09:12 +08:00
Haojie Wang
0f52d04882
Merge branch 'master' into dev-onnx
2023-03-15 14:52:03 +08:00
deathwings602
40d1b1c91b
Add ConvTransposedNHWC ( #67 )
...
* Add: IT_ASSERT_TODO
* [WIP] Add: ConvTranspose2d mutation test
* add ConvTransposedNHWC
* fix test_cuda_transposed_2d
---------
Co-authored-by: Liyan Zheng <liyan-zheng@outlook.com>
Co-authored-by: huangshuhong <huangsh19@mails.tsinghua.edu.cn>
2023-03-01 14:15:02 +08:00
YdrMaster
a7e58bd8d0
feat: 补充 DataType 类型
...
- 增加了 6 个代数类型,与 onnx 的序号对应
- 现在可以导入 reshape 了
Signed-off-by: YdrMaster <ydrml@hotmail.com>
2023-02-14 11:27:57 +08:00
YdrMaster
296fcc5aa0
feat: 创建 pyinfinitensor 前端
...
- python 前端项目结构及打包和安装脚本
- 后端编译出 so 改名为 backend,增加 GraphHandler 修改图结构
- ci 支持测试这些功能
Signed-off-by: YdrMaster <ydrml@hotmail.com>
2023-02-13 09:19:05 +08:00
zhengly123
c7ec9ee6e7
Add search engine ( #64 )
...
* Add: tensor fuid
* [Intermediate state] Add: Graph ctor for OpVec
* Add: clone for operators
* tmp: search_engine
* search: init search Engine.
* Add: dummy mutator for the test of search engine
* search: add print graph.
* search: add partition.
* search: update comments.
* Fix: remain FUID in Tensor::clone
* Chore: rename GUidBaseType to UidBaseType
* Fix: connect NMutator to SearchEngine
* Chore: output
* Fix test_memboundOp: nmutator uses input runtime
* Chore: clang-format
* Chore: clang-format
* Fix: comments in the review
---------
Co-authored-by: Liyan Zheng <liyan-zheng@outlook.com>
Co-authored-by: mazx <dyxdy@live.com>
2023-02-12 18:27:52 +08:00
wendy12022
d780f687fc
ADD: reconfig ResizeObj, support "tf_crop_and_resize " and cubic coeff kernel. ( #59 )
...
add cubic coef
add tf_crop_and_resize
2022-12-24 04:02:21 +08:00
wendy12022
c5966f8d81
Add: resize operator and cuda kernel,support nearest/linear coef. ( #51 )
...
ADD: resize operator and cuda kernel,support nearest/linear coef.
fix some
fix tests
add more tests for linear mode.
add linear coef mode.
add scales
add tests
fix tests.
add notLarger notSmaller
fix
add test
ADD:resize operator and cuda kernel
2022-11-14 09:30:22 +08:00
Zixuan Ma
00b2f18c17
Fix: unsigned compare in test ( #50 )
...
fix: unsigned compare in test.
Test project /home/mazx/git/InfiniTensor/build
Start 1: test_graph
1/18 Test #1 : test_graph ....................... Passed 0.03 sec
Start 2: test_hash
2/18 Test #2 : test_hash ........................ Passed 0.01 sec
Start 3: test_tensor_save
3/18 Test #3 : test_tensor_save ................. Passed 0.02 sec
Start 4: test_verify
4/18 Test #4 : test_verify ...................... Passed 0.01 sec
Start 5: test_batch_norm
5/18 Test #5 : test_batch_norm .................. Passed 0.01 sec
Start 6: test_concat
6/18 Test #6 : test_concat ...................... Passed 0.01 sec
Start 7: test_conv
7/18 Test #7 : test_conv ........................ Passed 0.24 sec
Start 8: test_conv_transposed_2d
8/18 Test #8 : test_conv_transposed_2d .......... Passed 0.01 sec
Start 9: test_element_wise
9/18 Test #9 : test_element_wise ................ Passed 0.01 sec
Start 10: test_extend
10/18 Test #10 : test_extend ...................... Passed 0.01 sec
Start 11: test_gather
11/18 Test #11 : test_gather ...................... Passed 0.01 sec
Start 12: test_matmul
12/18 Test #12 : test_matmul ...................... Passed 0.01 sec
Start 13: test_pad
13/18 Test #13 : test_pad ......................... Passed 0.01 sec
Start 14: test_pooling
14/18 Test #14 : test_pooling ..................... Passed 0.01 sec
Start 15: test_reduce_mean
15/18 Test #15 : test_reduce_mean ................. Passed 0.01 sec
Start 16: test_reshape
16/18 Test #16 : test_reshape ..................... Passed 0.01 sec
Start 17: test_slice
17/18 Test #17 : test_slice ....................... Passed 0.01 sec
Start 18: test_split
18/18 Test #18 : test_split ....................... Passed 0.02 sec
100% tests passed, 0 tests failed out of 18
2022-10-19 15:03:03 +08:00
zhengly123
4e0040c8a0
Add: connection among tensors and operators ( #45 )
...
* Add: refs_to_wrefs and wrefs_to_refs
* Add: op and tensor connection
* Add: inception-v3 block test
* Refactor: addOperatorAndConnect
Co-authored-by: Liyan Zheng <liyan-zheng@outlook.com>
2022-10-18 22:02:51 +08:00
wendy12022
d1c913010f
ADD:reduce_mean operator and cuda kernel. ( #47 )
...
add new line at file ending.
2022-10-15 16:53:58 +08:00
wendy12022
a4d6426589
ADD: batch norm operator and cuda kernel. ( #44 )
...
fix numInputs of batchNorm, add new line in file ending.
ADD: batch norm operator and cuda kernel.
add training
remove comments.
fix compile error.
add batch norm operator and cuda kernel.
2022-10-15 16:29:28 +08:00
Hardy
b0c2a08252
Support bang c kernel wanghailu 0927 ( #43 )
...
* fix a little bug which found by new verison CMake
* add code for support BangC language kernel , just like Cuda kernel, not
library
* add bangc kernel
* support BangC kernel
* add code for support BangC kernel
* support bangc kernel
* fix some code from reviewer
* fix code of template fumction
* add code for support bangc kernel
* fix bangc format
Co-authored-by: wanghailu <wanghailu@qiyuanlab.com>
Co-authored-by: Haojie Wang <haojie0429@gmail.com>
2022-09-30 11:01:52 +08:00