zhangyunze
331f7ab2b8
support Dynamic tensor infer shape and fix memory pool ( #176 )
...
* feat: support dynamic tensor, part 1
* feat: support dynamic tensor, part 2
* feat: support dynamic tensor, part 3
* fix: fix some ..
* - add kvcache example
* feat: support concat to identity kernel
* add a simple memory pool for the allocator
* fix: rebase to master
* fix bug after merging
* - remove outdated script
* fix: fix as review
---------
Co-authored-by: kilinchange <kilinchange@163.com>
Co-authored-by: Haojie Wang <haojie0429@gmail.com>
2023-11-23 13:11:50 +08:00
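The "simple memory pool for the allocator" in the commit above can be sketched as a first-fit free list over one preallocated arena. This is a hypothetical illustration only; `MemoryPool` and its methods are assumed names, not the actual InfiniTensor allocator:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <iterator>
#include <map>

// Minimal first-fit memory pool: hands out byte offsets into a single
// preallocated arena and recycles freed blocks, coalescing neighbors.
// Hypothetical sketch, not the project's actual implementation.
class MemoryPool {
    size_t capacity, used = 0;
    std::map<size_t, size_t> freeBlocks; // offset -> block size
  public:
    explicit MemoryPool(size_t cap) : capacity(cap) { freeBlocks[0] = cap; }

    // Returns an offset into the arena, or SIZE_MAX when no block fits.
    size_t alloc(size_t size) {
        for (auto it = freeBlocks.begin(); it != freeBlocks.end(); ++it) {
            if (it->second >= size) {
                size_t off = it->first, rest = it->second - size;
                freeBlocks.erase(it);
                if (rest > 0)
                    freeBlocks[off + size] = rest; // keep the remainder free
                used += size;
                return off;
            }
        }
        return SIZE_MAX;
    }

    void free(size_t off, size_t size) {
        used -= size;
        auto it = freeBlocks.emplace(off, size).first;
        // Coalesce with the next free block if adjacent.
        auto next = std::next(it);
        if (next != freeBlocks.end() && off + size == next->first) {
            it->second += next->second;
            freeBlocks.erase(next);
        }
        // Coalesce with the previous free block if adjacent.
        if (it != freeBlocks.begin()) {
            auto prev = std::prev(it);
            if (prev->first + prev->second == it->first) {
                prev->second += it->second;
                freeBlocks.erase(it);
            }
        }
    }

    size_t bytesInUse() const { return used; }
};
```

Returning offsets rather than raw pointers lets the same bookkeeping serve both host and device arenas, which is one common reason runtimes pool tensor memory instead of calling `cudaMalloc` per tensor.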
YdrMaster
26f0d13c26
Dev for 202303ddl ( #66 )
...
* add activation operations relu, tanh, sigmoid on mlu
* commit for format
* add activation backward operation
* add test for activation_backward
* add test
* add convbpfilter
* fix
* add transpose code and test
* add trigonometric operations on mlu: sin, cos, tan, asin, sinh, asinh
* add copy operation on mlu
* add ceil operation and floor operation
* add operation clip
* add cnnl div operation and tests, including a test for the divdemo bangc kernel
* add divnonan operation and test
* add erf operation
* add exp operation
* add operation fill
* add log operation
* add log1p operation
* add l2loss operation
* add maximum and minimum operation
* add mseloss operation
* add negTensor operation
* add power operation
* add reciprocal operation
* add sqrt and rsqrt operation
* add transform operation
* add addn operation
* add muln operation
* cherry-pick some operations
* add floordiv operation and floordivtrunc operation
* add floormod operation
* add cumsum operation
* add det operation
* add pad operation
* format
* add concat operation
* format
* add split operation
* fix concat and split operation
* add round operation
* add pooling operation
* add square operation
* add squaredDifference operation
* code format fix
* add flip operation
* code format fix
* add hardtanh operation
* add logic operation
* add addcdiv and addcmul operation
* add arange operation
* add bitcompute operation
* add net test
* fmt
Signed-off-by: YdrMaster <ydrml@hotmail.com>
* style: rename
Signed-off-by: YdrMaster <ydrml@hotmail.com>
* fix: replace CpuRuntime with NativeCpuRuntime
Signed-off-by: YdrMaster <ydrml@hotmail.com>
* fix code
* fix code
* fix code by review suggestion
* remove operation which is not the onnx operation
* fix format
* clang format
* refactor: add a templated dataToString layer to tensor print
Signed-off-by: YdrMaster <ydrml@hotmail.com>
* fix: onnx export
Signed-off-by: YdrMaster <ydrml@hotmail.com>
* feat: add a graph-optimization interface
Signed-off-by: YdrMaster <ydrml@hotmail.com>
* add clip operation
* feat: support importing clip
Signed-off-by: YdrMaster <ydrml@hotmail.com>
* test: add import/export tests to ci
Signed-off-by: YdrMaster <ydrml@hotmail.com>
* fix batch norm
* feat: add Shape operator
Signed-off-by: YdrMaster <ydrml@hotmail.com>
* feat: support importing unsqueeze
Signed-off-by: YdrMaster <ydrml@hotmail.com>
* fix: correct the clip interface
feat: support importing transpose
Signed-off-by: YdrMaster <ydrml@hotmail.com>
* add broadcast operation
* fix elementwise-broadcast
* fix elementwise broadcast
* add broadcast for gpu elementsie
* feat: pad supports negative axes
feat: export unsupported padding as a standalone pad operator
feat: support importing inception simplified by onnxsim
Signed-off-by: YdrMaster <ydrml@hotmail.com>
* fix: correct the pooling tests
Signed-off-by: YdrMaster <ydrml@hotmail.com>
* feat: export pads; support inception import/export, added to ci
Signed-off-by: YdrMaster <ydrml@hotmail.com>
* feat: support densenet import/export and add it to ci
Signed-off-by: YdrMaster <ydrml@hotmail.com>
* feat: import squeeze
Signed-off-by: YdrMaster <ydrml@hotmail.com>
* fix softmax
* feat: export clip and transpose
Signed-off-by: YdrMaster <ydrml@hotmail.com>
* feat: support bias for Conv
Signed-off-by: YdrMaster <ydrml@hotmail.com>
* fix: bias of conv
Signed-off-by: YdrMaster <ydrml@hotmail.com>
* fix: bias of conv
Signed-off-by: YdrMaster <ydrml@hotmail.com>
* feat: import split
Signed-off-by: YdrMaster <ydrml@hotmail.com>
* feat: export split
Signed-off-by: YdrMaster <ydrml@hotmail.com>
* fix: conv
Signed-off-by: YdrMaster <ydrml@hotmail.com>
* fix: conv group
Signed-off-by: YdrMaster <ydrml@hotmail.com>
* fix: matmul's bias was not placed in the inputs; corrected
Signed-off-by: YdrMaster <ydrml@hotmail.com>
* fix example
* fix: correct reduce_mean export
Signed-off-by: YdrMaster <ydrml@hotmail.com>
* refactor: change the slice implementation to match onnx
Signed-off-by: YdrMaster <ydrml@hotmail.com>
* style: do not export the two runtime functions
Signed-off-by: YdrMaster <ydrml@hotmail.com>
* doc: Chinese user guide
Signed-off-by: YdrMaster <ydrml@hotmail.com>
* doc: complete the guide
Signed-off-by: YdrMaster <ydrml@hotmail.com>
* fix: fix the data-import issue
Signed-off-by: YdrMaster <ydrml@hotmail.com>
* fmt
Signed-off-by: YdrMaster <ydrml@hotmail.com>
* feat: add the basic Dropout structure; two outputs of different types are not yet supported
Signed-off-by: YdrMaster <ydrml@hotmail.com>
* feat: re-export the optimization interface
feat: import dropout
Signed-off-by: YdrMaster <ydrml@hotmail.com>
* build: add the BANG option to the Makefile
Signed-off-by: YdrMaster <ydrml@hotmail.com>
* fix code: change test/kernels/bang/test* to use NativeCpuRuntime; the change to include/bang/bang_runtime is for the cntoolkit upgrade
* feat: export bang runtime
Signed-off-by: YdrMaster <ydrml@hotmail.com>
* add USE_BANG=1
* fix matmul
* fix reshape
* fix
* fix activation
* fix transpose
* format
* format
* update Makefile
Signed-off-by: YdrMaster <ydrml@hotmail.com>
* feat: support importing/exporting ConvTranspose
Signed-off-by: YdrMaster <ydrml@hotmail.com>
* add prelu on mlu
* fix: ConvTranspose
Signed-off-by: YdrMaster <ydrml@hotmail.com>
* feat: support importing/exporting PRelu
Signed-off-by: YdrMaster <ydrml@hotmail.com>
* add convtrans on mlu
* fmt
Signed-off-by: YdrMaster <ydrml@hotmail.com>
* docs: update README_CN.md
Signed-off-by: YdrMaster <ydrml@hotmail.com>
* fix code by review suggestions
* style
Signed-off-by: YdrMaster <ydrml@hotmail.com>
* fix: can Softmax's axis use a default value? This seems non-standard in onnx
Signed-off-by: YdrMaster <ydrml@hotmail.com>
* fix cuda & intelcpu bugs after merging
---------
Signed-off-by: YdrMaster <ydrml@hotmail.com>
Co-authored-by: wanghailu <wanghailu0717@163.com>
Co-authored-by: wanghailu <wanghailu@qiyuanlab.com>
Co-authored-by: whjthu <haojie0429@gmail.com>
2023-04-18 15:10:33 +08:00
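Several entries in the commit above add elementwise broadcast support ("add broadcast operation", "fix elementwise-broadcast", "add broadcast for gpu elementsie"). The shape rule these kernels implement is ONNX/numpy-style multidirectional broadcasting, which can be sketched as follows (`broadcastShape` is a hypothetical helper, not the project's actual API):

```cpp
#include <algorithm>
#include <cassert>
#include <optional>
#include <vector>

using Shape = std::vector<int>;

// ONNX/numpy multidirectional broadcasting: align shapes from the
// rightmost dimension; each pair of dims must be equal or one of them 1.
// Returns nullopt when the shapes are not broadcastable.
std::optional<Shape> broadcastShape(const Shape &a, const Shape &b) {
    size_t n = std::max(a.size(), b.size());
    Shape out(n);
    for (size_t i = 0; i < n; ++i) {
        // Missing leading dimensions are treated as 1.
        int da = i < a.size() ? a[a.size() - 1 - i] : 1;
        int db = i < b.size() ? b[b.size() - 1 - i] : 1;
        if (da != db && da != 1 && db != 1)
            return std::nullopt;
        out[n - 1 - i] = std::max(da, db);
    }
    return out;
}
```

For example, shapes `{2, 1, 4}` and `{3, 1}` broadcast to `{2, 3, 4}`, while `{2, 3}` and `{4}` do not broadcast at all.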
zhengly123
a1974aabcd
NNET supports TVM backend and kernels ( #78 )
...
* Add: mutator InfoGAN minimum test
* Add: cache and padding (bugs!!)
* Add: expression reader as a cmake target
* Fix: [Intermediate] NMutator::expressionToGraph
To be fix: matmul with implicit broadcast
* Add: matmul broadcast
* Fix: GraphObj ctor should use cloneTensor
* Fix: cuBLAS failure when codegen is enabled
* Add: Exception for checkCuError
* Fix: graph OpList ctor
* Add: expr simplification for TVM
* Add: TVM headers and CMake include paths
* Add: CMake config
* Add: PackedFunc (broken)
* Fix: remove cuCtxCreate which makes TVM fail
* Fix: membound_tvm
* Fix: test_memboundOp
* Add: PRelu Expr and AsTVMVisitor
* Add: Random generator
* Add: support TVM packed function
* Fix: specify runtime
* Add: CMake support of TVM
* Add: detailed output of Matmul
* Add: comments for Matmul
* Chore: format and comments
* Chore: GraphObj::selfCheck without assert control
* Fix: CMAKE_CXX_FLAGS in CMakeLists
* fix merge bug
* update api for mkl batchnorm test
* fix lotus env
* fix header bug
---------
Co-authored-by: Liyan Zheng <liyan-zheng@outlook.com>
Co-authored-by: huangshuhong <huangsh19@mails.tsinghua.edu.cn>
Co-authored-by: whjthu <haojie0429@gmail.com>
2023-04-18 00:26:36 +08:00
whjthu
26be533faa
Add documentation for operators.
2023-02-13 22:51:15 +08:00
zhengly123
c7ec9ee6e7
Add search engine ( #64 )
...
* Add: tensor fuid
* [Intermediate state] Add: Graph ctor for OpVec
* Add: clone for operators
* tmp: search_engine
* search: init search Engine.
* Add: dummy mutator for the test of search engine
* search: add print graph.
* search: add partition.
* search: update comments.
* Fix: remain FUID in Tensor::clone
* Chore: rename GUidBaseType to UidBaseType
* Fix: connect NMutator to SearchEngine
* Chore: output
* Fix test_memboundOp: nmutator uses input runtime
* Chore: clang-format
* Chore: clang-format
* Fix: comments in the review
---------
Co-authored-by: Liyan Zheng <liyan-zheng@outlook.com>
Co-authored-by: mazx <dyxdy@live.com>
2023-02-12 18:27:52 +08:00
zhengly123
172d03d6f2
Fix NNet tests after migration ( #27 )
...
* Fix: interpreter
```
4 - readlog (Failed)
8 - test_TConv2gemm (Failed)
11 - test_conv2conv (Failed)
12 - test_conv2gemm (Failed)
15 - test_g2bmm (Failed)
16 - test_guidedDLT (Subprocess aborted)
22 - test_mergeStage (Subprocess aborted)
```
* Exclude readlog from ctest
* Fix: change the path of logs
```
85% tests passed, 4 tests failed out of 27
Total Test time (real) = 100.69 sec
The following tests FAILED:
10 - test_conv2conv (Timeout)
11 - test_conv2gemm (Timeout)
15 - test_guidedDLT (Subprocess aborted)
21 - test_mergeStage (Subprocess aborted)
Errors while running CTest
```
- test_conv2conv 38529 ms total
- test_conv2gemm 37098 ms total
* Fix: test_mergeStage
* Fix: test_guidedDLT
```
Start 1: test_graph
1/27 Test #1 : test_graph ....................... Passed 0.05 sec
Start 2: test_hash
2/27 Test #2 : test_hash ........................ Passed 0.02 sec
Start 3: test_conv
3/27 Test #3 : test_conv ........................ Passed 4.98 sec
Start 4: test_Interpreter
4/27 Test #4 : test_Interpreter ................. Passed 6.30 sec
Start 5: test_OpSearch
5/27 Test #5 : test_OpSearch .................... Passed 0.02 sec
Start 6: test_Rule2VariableMerging
6/27 Test #6 : test_Rule2VariableMerging ........ Passed 0.03 sec
Start 7: test_TConv2gemm
7/27 Test #7 : test_TConv2gemm .................. Passed 29.45 sec
Start 8: test_as_tvm
8/27 Test #8 : test_as_tvm ...................... Passed 0.02 sec
Start 9: test_compareFormulas
9/27 Test #9 : test_compareFormulas ............. Passed 0.02 sec
Start 10: test_conv2conv
10/27 Test #10 : test_conv2conv ................... Passed 36.55 sec
Start 11: test_conv2gemm
11/27 Test #11 : test_conv2gemm ................... Passed 39.70 sec
Start 12: test_dlt
12/27 Test #12 : test_dlt ......................... Passed 0.03 sec
Start 13: test_exprHash
13/27 Test #13 : test_exprHash .................... Passed 0.02 sec
Start 14: test_g2bmm
14/27 Test #14 : test_g2bmm ....................... Passed 0.16 sec
Start 15: test_guidedDLT
15/27 Test #15 : test_guidedDLT ................... Passed 0.07 sec
Start 16: test_matchConv
16/27 Test #16 : test_matchConv ................... Passed 0.02 sec
Start 17: test_matchElementWise
17/27 Test #17 : test_matchElementWise ............ Passed 0.03 sec
Start 18: test_matchMatmul
18/27 Test #18 : test_matchMatmul ................. Passed 0.02 sec
Start 19: test_matchReshape
19/27 Test #19 : test_matchReshape ................ Passed 0.02 sec
Start 20: test_memboundOp
20/27 Test #20 : test_memboundOp .................. Passed 0.02 sec
Start 21: test_mergeStage
21/27 Test #21 : test_mergeStage .................. Passed 0.02 sec
Start 22: test_oobChecker
22/27 Test #22 : test_oobChecker .................. Passed 0.02 sec
Start 23: test_rangeMagnify
23/27 Test #23 : test_rangeMagnify ................ Passed 0.02 sec
Start 24: test_relaxation
24/27 Test #24 : test_relaxation .................. Passed 0.02 sec
Start 25: test_serializer
25/27 Test #25 : test_serializer .................. Passed 0.03 sec
Start 26: test_simplify
26/27 Test #26 : test_simplify .................... Passed 0.02 sec
Start 27: test_subset
27/27 Test #27 : test_subset ...................... Passed 0.01 sec
100% tests passed, 0 tests failed out of 27
Total Test time (real) = 117.72 sec
```
* Fix: format
* Replace nnet:Ref with infini::Ref
```
Start 1: test_graph
1/27 Test 1: test_graph ....................... Passed 0.02 sec
Start 2: test_hash
2/27 Test 2: test_hash ........................ Passed 0.02 sec
Start 3: test_conv
3/27 Test 3: test_conv ........................ Passed 4.45 sec
Start 4: test_Interpreter
4/27 Test 4: test_Interpreter ................. Passed 4.37 sec
Start 5: test_OpSearch
5/27 Test 5: test_OpSearch .................... Passed 0.02 sec
Start 6: test_Rule2VariableMerging
6/27 Test 6: test_Rule2VariableMerging ........ Passed 0.02 sec
Start 7: test_TConv2gemm
7/27 Test 7: test_TConv2gemm .................. Passed 23.40 sec
Start 8: test_as_tvm
8/27 Test 8: test_as_tvm ...................... Passed 0.02 sec
Start 9: test_compareFormulas
9/27 Test 9: test_compareFormulas ............. Passed 0.01 sec
Start 10: test_conv2conv
10/27 Test 10: test_conv2conv ................... Passed 32.28 sec
Start 11: test_conv2gemm
11/27 Test 11: test_conv2gemm ................... Passed 29.41 sec
Start 12: test_dlt
12/27 Test 12: test_dlt ......................... Passed 0.02 sec
Start 13: test_exprHash
13/27 Test 13: test_exprHash .................... Passed 0.01 sec
Start 14: test_g2bmm
14/27 Test 14: test_g2bmm ....................... Passed 0.14 sec
Start 15: test_guidedDLT
15/27 Test 15: test_guidedDLT ................... Passed 0.06 sec
Start 16: test_matchConv
16/27 Test 16: test_matchConv ................... Passed 0.02 sec
Start 17: test_matchElementWise
17/27 Test 17: test_matchElementWise ............ Passed 0.02 sec
Start 18: test_matchMatmul
18/27 Test 18: test_matchMatmul ................. Passed 0.02 sec
Start 19: test_matchReshape
19/27 Test 19: test_matchReshape ................ Passed 0.01 sec
Start 20: test_memboundOp
20/27 Test 20: test_memboundOp .................. Passed 0.02 sec
Start 21: test_mergeStage
21/27 Test 21: test_mergeStage .................. Passed 0.01 sec
Start 22: test_oobChecker
22/27 Test 22: test_oobChecker .................. Passed 0.01 sec
Start 23: test_rangeMagnify
23/27 Test 23: test_rangeMagnify ................ Passed 0.01 sec
Start 24: test_relaxation
24/27 Test 24: test_relaxation .................. Passed 0.01 sec
Start 25: test_serializer
25/27 Test 25: test_serializer .................. Passed 0.02 sec
Start 26: test_simplify
26/27 Test 26: test_simplify .................... Passed 0.01 sec
Start 27: test_subset
27/27 Test 27: test_subset ...................... Passed 0.00 sec
100% tests passed, 0 tests failed out of 27
Total Test time (real) = 94.47 sec
```
* Relax time limit for CPU conv
```
Start 1: test_graph
1/29 Test 1: test_graph ....................... Passed 0.02 sec
Start 2: test_hash
2/29 Test 2: test_hash ........................ Passed 0.02 sec
Start 3: test_conv
3/29 Test 3: test_conv ........................ Passed 4.47 sec
Start 4: test_matmul
4/29 Test 4: test_matmul ...................... Passed 2.61 sec
Start 5: test_pooling
5/29 Test 5: test_pooling ..................... Passed 2.57 sec
Start 6: test_Interpreter
6/29 Test 6: test_Interpreter ................. Passed 4.35 sec
Start 7: test_OpSearch
7/29 Test 7: test_OpSearch .................... Passed 0.02 sec
Start 8: test_Rule2VariableMerging
8/29 Test 8: test_Rule2VariableMerging ........ Passed 0.02 sec
Start 9: test_TConv2gemm
9/29 Test 9: test_TConv2gemm .................. Passed 23.32 sec
Start 10: test_as_tvm
10/29 Test 10: test_as_tvm ...................... Passed 0.02 sec
Start 11: test_compareFormulas
11/29 Test 11: test_compareFormulas ............. Passed 0.02 sec
Start 12: test_conv2conv
12/29 Test 12: test_conv2conv ................... Passed 32.12 sec
Start 13: test_conv2gemm
13/29 Test 13: test_conv2gemm ................... Passed 30.59 sec
Start 14: test_dlt
14/29 Test 14: test_dlt ......................... Passed 0.02 sec
Start 15: test_exprHash
15/29 Test 15: test_exprHash .................... Passed 0.01 sec
Start 16: test_g2bmm
16/29 Test 16: test_g2bmm ....................... Passed 0.14 sec
Start 17: test_guidedDLT
17/29 Test 17: test_guidedDLT ................... Passed 0.07 sec
Start 18: test_matchConv
18/29 Test 18: test_matchConv ................... Passed 0.02 sec
Start 19: test_matchElementWise
19/29 Test 19: test_matchElementWise ............ Passed 0.02 sec
Start 20: test_matchMatmul
20/29 Test 20: test_matchMatmul ................. Passed 0.02 sec
Start 21: test_matchReshape
21/29 Test 21: test_matchReshape ................ Passed 0.02 sec
Start 22: test_memboundOp
22/29 Test 22: test_memboundOp .................. Passed 0.02 sec
Start 23: test_mergeStage
23/29 Test 23: test_mergeStage .................. Passed 0.01 sec
Start 24: test_oobChecker
24/29 Test 24: test_oobChecker .................. Passed 0.02 sec
Start 25: test_rangeMagnify
25/29 Test 25: test_rangeMagnify ................ Passed 0.02 sec
Start 26: test_relaxation
26/29 Test 26: test_relaxation .................. Passed 0.02 sec
Start 27: test_serializer
27/29 Test 27: test_serializer .................. Passed 0.03 sec
Start 28: test_simplify
28/29 Test 28: test_simplify .................... Passed 0.02 sec
Start 29: test_subset
29/29 Test 29: test_subset ...................... Passed 0.00 sec
100% tests passed, 0 tests failed out of 29
Total Test time (real) = 100.65 sec
```
* Remove out-of-date tests
Co-authored-by: Liyan Zheng <liyan-zheng@outlook.com>
2022-09-13 15:17:22 +08:00
wendy12022
c3bc278c12
Op matmul ( #20 )
...
ADD: add cuda kernel for matmul.
matmul tune
Add test_matmul.cc
2022-09-01 21:06:55 +08:00
zhengly123
93f86d3f4d
Simplify tensor transfer between CPU and CUDA ( #10 )
...
* Add: OP infers data type & Graph clones tensor
* Fix: vecToString format
* Add: static assert for Tensor methods
* Rename: getDataRawPtr -> getRawDataPtr
Co-authored-by: Liyan Zheng <liyan-zheng@outlook.com>
2022-08-25 11:29:16 +08:00
zhengly123
a26890abce
Tensor hash and inferShape ( #4 )
...
* Refactor: operator hash and inferShape
* Add: hash without shape
* Add: inferShape interface for given input tensors
* Add: construct outputs in op ctor
* Add: comments for matmul
* Add: opType in AttrVector and WorkloadVector
* Chore: _graph -> graph in Op ctor
* Chore: change the "Node" suffix to "Obj"
Co-authored-by: Liyan Zheng <liyan-zheng@outlook.com>
2022-08-15 15:08:56 +08:00
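The "inferShape interface for given input tensors" in the commit above reflects a common pattern: each operator validates its input shapes and derives its output shapes, so outputs can be constructed in the op ctor. A minimal sketch for matmul (hypothetical names; `inferMatmulShape` is not the actual interface):

```cpp
#include <cassert>
#include <optional>
#include <vector>

using Shape = std::vector<int>;

// Derive the output shape of a 2-D matmul from its input shapes.
// Returns nullopt when the inputs are incompatible, so the caller
// (e.g. an operator ctor) can reject the graph early.
std::optional<Shape> inferMatmulShape(const Shape &a, const Shape &b) {
    if (a.size() != 2 || b.size() != 2)
        return std::nullopt; // this sketch handles only rank-2 inputs
    if (a[1] != b[0])
        return std::nullopt; // inner dimensions must agree
    return Shape{a[0], b[1]};
}
```

Running shape inference at construction time means shape mismatches surface when the graph is built, not when a kernel launches.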
Liyan Zheng
2054b0eda4
Chore: rename getOpAttrs to getOpPerfKey
2022-08-09 15:34:28 +08:00
Liyan Zheng
8b685ae4a6
Update: OpAttrs -> OpPerfKey
2022-08-09 14:58:45 +08:00
Liyan Zheng
1205240218
Add: mutator abstract class
2022-08-08 15:54:17 +08:00