forked from jiuyuan/InfiniTensor
6 Commits
Author | SHA1 | Message | Date |
---|---|---|---|
YdrMaster |
26f0d13c26
|
Dev for 202303ddl (#66)
* add activation operations relu, tanh, sigmoid on mlu * commit for format * add activation backward operation * add test for activation_backward * add test * add convbpfilter * fix * add transpose code and test * add trigonometric function operations on mlu: sin, cos, tan, asin, sinh, asinh * add copy operation on mlu * add ceil operation and floor operation * add operation clip * add operation cnnl div, test and test for divdemo bangc kernel * add divnonan operation and test * add erf operation * add exp operation * add operation fill * add log operation * add log1p operation * add l2loss operation * add maximum and minimum operation * add mseloss operation * add negTensor operation * add power operation * add reciprocal operation * add sqrt and rsqrt operation * add transform operation * add addn operation * add muln operation * cherry-pick some operations * add floordiv operation and floordivtrunc operation * add floormod operation * add cumsum operation * add det operation * add pad operation * format * add concat operation * format * add split operation * fix concat and split operation * add round operation * add pooling operation * add square operation * add squaredDifference operation * code format fix * add flip operation * code format fix * add hardtanh operation * add logic operation * add addcdiv and addcmul operation * add arange operation * add bitcompute operation * add net test * fmt Signed-off-by: YdrMaster <ydrml@hotmail.com> * style: rename Signed-off-by: YdrMaster <ydrml@hotmail.com> * fix: replace CpuRuntime with NativeCpuRuntime Signed-off-by: YdrMaster <ydrml@hotmail.com> * fix code * fix code * fix code by review suggestion * remove operations that are not onnx operations * fix format * clang format * refactor: add a templated dataToString layer to tensor print Signed-off-by: YdrMaster <ydrml@hotmail.com> * fix: onnx export Signed-off-by: YdrMaster <ydrml@hotmail.com> * feat: add computation graph optimization interface Signed-off-by: YdrMaster <ydrml@hotmail.com> * add clip operation * feat: support importing clip 
Signed-off-by: YdrMaster <ydrml@hotmail.com> * test: add import/export tests to ci Signed-off-by: YdrMaster <ydrml@hotmail.com> * fix batch norm * feat: add Shape operator Signed-off-by: YdrMaster <ydrml@hotmail.com> * feat: support importing unsqueeze Signed-off-by: YdrMaster <ydrml@hotmail.com> * fix: correct clip interface feat: support importing transpose Signed-off-by: YdrMaster <ydrml@hotmail.com> * add broadcast operation * fix elementwise-broadcast * fix elementwise broadcast * add broadcast for gpu elementwise * feat: pad supports negative axes feat: export unsupported padding as a standalone pad operator feat: support importing inception processed by onnxsim Signed-off-by: YdrMaster <ydrml@hotmail.com> * fix: correct pooling tests Signed-off-by: YdrMaster <ydrml@hotmail.com> * feat: export pads, support inception import/export, added to ci Signed-off-by: YdrMaster <ydrml@hotmail.com> * feat: support densenet import/export and add to ci Signed-off-by: YdrMaster <ydrml@hotmail.com> * feat: import squeeze Signed-off-by: YdrMaster <ydrml@hotmail.com> * fix softmax * feat: export clip and transpose Signed-off-by: YdrMaster <ydrml@hotmail.com> * feat: support bias for Conv Signed-off-by: YdrMaster <ydrml@hotmail.com> * fix: bias of conv Signed-off-by: YdrMaster <ydrml@hotmail.com> * fix: bias of conv Signed-off-by: YdrMaster <ydrml@hotmail.com> * feat: import split Signed-off-by: YdrMaster <ydrml@hotmail.com> * feat: export split Signed-off-by: YdrMaster <ydrml@hotmail.com> * fix: conv Signed-off-by: YdrMaster <ydrml@hotmail.com> * fix: conv group Signed-off-by: YdrMaster <ydrml@hotmail.com> * fix: matmul bias was not placed in the inputs; corrected Signed-off-by: YdrMaster <ydrml@hotmail.com> * fix example * fix: correct reduce_mean export Signed-off-by: YdrMaster <ydrml@hotmail.com> * refactor: change slice implementation to match onnx Signed-off-by: YdrMaster <ydrml@hotmail.com> * style: do not export two runtime functions Signed-off-by: YdrMaster <ydrml@hotmail.com> * doc: Chinese usage guide Signed-off-by: YdrMaster <ydrml@hotmail.com> * doc: complete the guide Signed-off-by: YdrMaster <ydrml@hotmail.com> * fix: fix data import issue Signed-off-by: YdrMaster <ydrml@hotmail.com> * fmt Signed-off-by: YdrMaster <ydrml@hotmail.com> * feat: add basic Dropout structure, but two outputs of different types are not supported 
Signed-off-by: YdrMaster <ydrml@hotmail.com> * feat: re-export optimization interface feat: import dropout Signed-off-by: YdrMaster <ydrml@hotmail.com> * build: add BANG option to Makefile Signed-off-by: YdrMaster <ydrml@hotmail.com> * fix code: test/kernels/bang/test* changed to use NativeCpuRuntime; include/bang/bang_runtime changed for the cntoolkit upgrade. * feat: export bang runtime Signed-off-by: YdrMaster <ydrml@hotmail.com> * add USE_BANG=1 * fix matmul * fix reshape * fix * fix activation * fix transpose * format * format * update Makefile Signed-off-by: YdrMaster <ydrml@hotmail.com> * feat: support importing and exporting ConvTranspose Signed-off-by: YdrMaster <ydrml@hotmail.com> * add prelu on mlu * fix: ConvTranspose Signed-off-by: YdrMaster <ydrml@hotmail.com> * feat: support importing and exporting PRelu Signed-off-by: YdrMaster <ydrml@hotmail.com> * add convtrans on mlu * fmt Signed-off-by: YdrMaster <ydrml@hotmail.com> * docs: update README_CN.md Signed-off-by: YdrMaster <ydrml@hotmail.com> * fix code by review suggestions * style Signed-off-by: YdrMaster <ydrml@hotmail.com> * fix: can Softmax axis use the default value? Looks like non-standard onnx Signed-off-by: YdrMaster <ydrml@hotmail.com> * fix cuda & intelcpu bugs after merging --------- Signed-off-by: YdrMaster <ydrml@hotmail.com> Co-authored-by: wanghailu <wanghailu0717@163.com> Co-authored-by: wanghailu <wanghailu@qiyuanlab.com> Co-authored-by: whjthu <haojie0429@gmail.com> |
|
zhengly123 |
a1974aabcd
|
NNET supports TVM backend and kernels (#78)
* Add: mutator InfoGAN minimum test * Add: cache and padding (bugs!!) * Add: expression reader as a cmake target * Fix: [Intermediate] NMutator::expressionToGraph To be fixed: matmul with implicit broadcast * Add: matmul broadcast * Fix: GraphObj ctor should use cloneTensor * Fix: cuBLAS failure when codegen is enabled * Add: Exception for checkCuError * Fix: graph OpList ctor * Add: expr simplification for TVM * Add: TVM headers and CMake include paths * Add: CMake config * Add: PackedFunc (broken) * Fix: remove cuCtxCreate which makes TVM fail * Fix: membound_tvm * Fix: test_memboundOp * Add: PRelu Expr and AsTVMVisitor * Add: Random generator * Add: support TVM packed function * Fix: specify runtime * Add: CMake support of TVM * Add: detailed output of Matmul * Add: comments for Matmul * Chore: format and comments * Chore: GraphObj::selfCheck without assert control * Fix: CMAKE_CXX_FLAGS in CMakeLists * fix merge bug * update api for mkl batchnorm test * fix lotus env * fix header bug --------- Co-authored-by: Liyan Zheng <liyan-zheng@outlook.com> Co-authored-by: huangshuhong <huangsh19@mails.tsinghua.edu.cn> Co-authored-by: whjthu <haojie0429@gmail.com> |
|
wendy12022 |
c8b2c8ed32
|
Cpu backend2 (#77)
* fix review change Device::MKL to Device::INTELCPU * fix mkl linkage * fix errors according to merge from master * now can call mkl backend * fix softmax/flatten with axis from onnx. * modify README.md * fix memory refree * add env_lotus_intelcpu.sh * fix compile * merge from branch cpu_backend * fix something * add gather * fix something * FIX: directory rename from "mkl" to "intelcpu" * ADD: use oneMKL dpcpp interface to implement matmul kernel. * ADD: add dpcpp as compiler for mkl, and fix warnings for clang compiling. * add dpcpp kernel for pow. * ADD: mkl kernel for pad. * ADD: slice mkl kernel. * ADD: reshape/flatten/identity mkl kernel. * ADD: split mkl kernel. * fix compile error * FIX: fix flattenObj with axis. * ADD reduce_mean mkl kernel. * Add concat mkl kernel. * batchNorm for mkl kernel. * sigmoid mkl kernel. * ADD: add mkl kernel for pooling * add more tests for softmax * Now softmax cuda kernel supports any axes. * mkl kernel for softmax * softmax add axis to softmax operator * add mkl kernel for abs tanh * ADD: relu kernel for mkl * fix binary mkl primitives. * add mkl kernel for binary operators * fix compiler error * move stream to runtime * clang format * add MemoryFormat for tensorObj. * use post_ops for fused conv/deconv * Distinguish mkl op_timer from cuda op timer. * add act optype to conv and deconv * add operator timer * add mkl kernel for convTransposed * minor fix for group conv * do not use cblas_sgemm_batch * CpuRuntimeObj->NativeCpuRuntimeObj * add matmul op for mkl |
|
zhengly123 |
172d03d6f2
|
Fix NNet tests after migration (#27)
* Fix: interpreter ``` 4 - readlog (Failed) 8 - test_TConv2gemm (Failed) 11 - test_conv2conv (Failed) 12 - test_conv2gemm (Failed) 15 - test_g2bmm (Failed) 16 - test_guidedDLT (Subprocess aborted) 22 - test_mergeStage (Subprocess aborted) ``` * Exclude readlog from ctest * Fix: change the path of logs ``` 85% tests passed, 4 tests failed out of 27 Total Test time (real) = 100.69 sec The following tests FAILED: 10 - test_conv2conv (Timeout) 11 - test_conv2gemm (Timeout) 15 - test_guidedDLT (Subprocess aborted) 21 - test_mergeStage (Subprocess aborted) Errors while running CTest ``` - test_conv2conv 38529 ms total - test_conv2gemm 37098 ms total * Fix: test_mergeStage * Fix: test_guidedDLT ``` Start 1: test_graph 1/27 Test #1: test_graph ....................... Passed 0.05 sec Start 2: test_hash 2/27 Test #2: test_hash ........................ Passed 0.02 sec Start 3: test_conv 3/27 Test #3: test_conv ........................ Passed 4.98 sec Start 4: test_Interpreter 4/27 Test #4: test_Interpreter ................. Passed 6.30 sec Start 5: test_OpSearch 5/27 Test #5: test_OpSearch .................... Passed 0.02 sec Start 6: test_Rule2VariableMerging 6/27 Test #6: test_Rule2VariableMerging ........ Passed 0.03 sec Start 7: test_TConv2gemm 7/27 Test #7: test_TConv2gemm .................. Passed 29.45 sec Start 8: test_as_tvm 8/27 Test #8: test_as_tvm ...................... Passed 0.02 sec Start 9: test_compareFormulas 9/27 Test #9: test_compareFormulas ............. Passed 0.02 sec Start 10: test_conv2conv 10/27 Test #10: test_conv2conv ................... Passed 36.55 sec Start 11: test_conv2gemm 11/27 Test #11: test_conv2gemm ................... Passed 39.70 sec Start 12: test_dlt 12/27 Test #12: test_dlt ......................... Passed 0.03 sec Start 13: test_exprHash 13/27 Test #13: test_exprHash .................... Passed 0.02 sec Start 14: test_g2bmm 14/27 Test #14: test_g2bmm ....................... 
Passed 0.16 sec Start 15: test_guidedDLT 15/27 Test #15: test_guidedDLT ................... Passed 0.07 sec Start 16: test_matchConv 16/27 Test #16: test_matchConv ................... Passed 0.02 sec Start 17: test_matchElementWise 17/27 Test #17: test_matchElementWise ............ Passed 0.03 sec Start 18: test_matchMatmul 18/27 Test #18: test_matchMatmul ................. Passed 0.02 sec Start 19: test_matchReshape 19/27 Test #19: test_matchReshape ................ Passed 0.02 sec Start 20: test_memboundOp 20/27 Test #20: test_memboundOp .................. Passed 0.02 sec Start 21: test_mergeStage 21/27 Test #21: test_mergeStage .................. Passed 0.02 sec Start 22: test_oobChecker 22/27 Test #22: test_oobChecker .................. Passed 0.02 sec Start 23: test_rangeMagnify 23/27 Test #23: test_rangeMagnify ................ Passed 0.02 sec Start 24: test_relaxation 24/27 Test #24: test_relaxation .................. Passed 0.02 sec Start 25: test_serializer 25/27 Test #25: test_serializer .................. Passed 0.03 sec Start 26: test_simplify 26/27 Test #26: test_simplify .................... Passed 0.02 sec Start 27: test_subset 27/27 Test #27: test_subset ...................... Passed 0.01 sec 100% tests passed, 0 tests failed out of 27 Total Test time (real) = 117.72 sec ``` * Fix: format * Replace nnet:Ref with infini::Ref ``` Start 1: test_graph 1/27 Test 1: test_graph ....................... Passed 0.02 sec Start 2: test_hash 2/27 Test 2: test_hash ........................ Passed 0.02 sec Start 3: test_conv 3/27 Test 3: test_conv ........................ Passed 4.45 sec Start 4: test_Interpreter 4/27 Test 4: test_Interpreter ................. Passed 4.37 sec Start 5: test_OpSearch 5/27 Test 5: test_OpSearch .................... Passed 0.02 sec Start 6: test_Rule2VariableMerging 6/27 Test 6: test_Rule2VariableMerging ........ Passed 0.02 sec Start 7: test_TConv2gemm 7/27 Test 7: test_TConv2gemm .................. 
Passed 23.40 sec Start 8: test_as_tvm 8/27 Test 8: test_as_tvm ...................... Passed 0.02 sec Start 9: test_compareFormulas 9/27 Test 9: test_compareFormulas ............. Passed 0.01 sec Start 10: test_conv2conv 10/27 Test 10: test_conv2conv ................... Passed 32.28 sec Start 11: test_conv2gemm 11/27 Test 11: test_conv2gemm ................... Passed 29.41 sec Start 12: test_dlt 12/27 Test 12: test_dlt ......................... Passed 0.02 sec Start 13: test_exprHash 13/27 Test 13: test_exprHash .................... Passed 0.01 sec Start 14: test_g2bmm 14/27 Test 14: test_g2bmm ....................... Passed 0.14 sec Start 15: test_guidedDLT 15/27 Test 15: test_guidedDLT ................... Passed 0.06 sec Start 16: test_matchConv 16/27 Test 16: test_matchConv ................... Passed 0.02 sec Start 17: test_matchElementWise 17/27 Test 17: test_matchElementWise ............ Passed 0.02 sec Start 18: test_matchMatmul 18/27 Test 18: test_matchMatmul ................. Passed 0.02 sec Start 19: test_matchReshape 19/27 Test 19: test_matchReshape ................ Passed 0.01 sec Start 20: test_memboundOp 20/27 Test 20: test_memboundOp .................. Passed 0.02 sec Start 21: test_mergeStage 21/27 Test 21: test_mergeStage .................. Passed 0.01 sec Start 22: test_oobChecker 22/27 Test 22: test_oobChecker .................. Passed 0.01 sec Start 23: test_rangeMagnify 23/27 Test 23: test_rangeMagnify ................ Passed 0.01 sec Start 24: test_relaxation 24/27 Test 24: test_relaxation .................. Passed 0.01 sec Start 25: test_serializer 25/27 Test 25: test_serializer .................. Passed 0.02 sec Start 26: test_simplify 26/27 Test 26: test_simplify .................... Passed 0.01 sec Start 27: test_subset 27/27 Test 27: test_subset ...................... 
Passed 0.00 sec 100% tests passed, 0 tests failed out of 27 Total Test time (real) = 94.47 sec ``` * Relax time limit for CPU conv ``` Start 1: test_graph 1/29 Test 1: test_graph ....................... Passed 0.02 sec Start 2: test_hash 2/29 Test 2: test_hash ........................ Passed 0.02 sec Start 3: test_conv 3/29 Test 3: test_conv ........................ Passed 4.47 sec Start 4: test_matmul 4/29 Test 4: test_matmul ...................... Passed 2.61 sec Start 5: test_pooling 5/29 Test 5: test_pooling ..................... Passed 2.57 sec Start 6: test_Interpreter 6/29 Test 6: test_Interpreter ................. Passed 4.35 sec Start 7: test_OpSearch 7/29 Test 7: test_OpSearch .................... Passed 0.02 sec Start 8: test_Rule2VariableMerging 8/29 Test 8: test_Rule2VariableMerging ........ Passed 0.02 sec Start 9: test_TConv2gemm 9/29 Test 9: test_TConv2gemm .................. Passed 23.32 sec Start 10: test_as_tvm 10/29 Test 10: test_as_tvm ...................... Passed 0.02 sec Start 11: test_compareFormulas 11/29 Test 11: test_compareFormulas ............. Passed 0.02 sec Start 12: test_conv2conv 12/29 Test 12: test_conv2conv ................... Passed 32.12 sec Start 13: test_conv2gemm 13/29 Test 13: test_conv2gemm ................... Passed 30.59 sec Start 14: test_dlt 14/29 Test 14: test_dlt ......................... Passed 0.02 sec Start 15: test_exprHash 15/29 Test 15: test_exprHash .................... Passed 0.01 sec Start 16: test_g2bmm 16/29 Test 16: test_g2bmm ....................... Passed 0.14 sec Start 17: test_guidedDLT 17/29 Test 17: test_guidedDLT ................... Passed 0.07 sec Start 18: test_matchConv 18/29 Test 18: test_matchConv ................... Passed 0.02 sec Start 19: test_matchElementWise 19/29 Test 19: test_matchElementWise ............ Passed 0.02 sec Start 20: test_matchMatmul 20/29 Test 20: test_matchMatmul ................. 
Passed 0.02 sec Start 21: test_matchReshape 21/29 Test 21: test_matchReshape ................ Passed 0.02 sec Start 22: test_memboundOp 22/29 Test 22: test_memboundOp .................. Passed 0.02 sec Start 23: test_mergeStage 23/29 Test 23: test_mergeStage .................. Passed 0.01 sec Start 24: test_oobChecker 24/29 Test 24: test_oobChecker .................. Passed 0.02 sec Start 25: test_rangeMagnify 25/29 Test 25: test_rangeMagnify ................ Passed 0.02 sec Start 26: test_relaxation 26/29 Test 26: test_relaxation .................. Passed 0.02 sec Start 27: test_serializer 27/29 Test 27: test_serializer .................. Passed 0.03 sec Start 28: test_simplify 28/29 Test 28: test_simplify .................... Passed 0.02 sec Start 29: test_subset 29/29 Test 29: test_subset ...................... Passed 0.00 sec 100% tests passed, 0 tests failed out of 29 Total Test time (real) = 100.65 sec ``` * Remove out-of-date tests Co-authored-by: Liyan Zheng <liyan-zheng@outlook.com> |
|
Liyan Zheng | a4fb9fa413 | Chore: format dbg | |
Liyan Zheng | b7e2096a26 | Add: nnet code | |