Commit Graph

200 Commits

Author SHA1 Message Date
zhengly123 24f6eb273e Chore: remove redundant semicolon 2023-04-23 00:25:33 +08:00
whjthu f820117acd fix unused code 2023-04-23 00:18:26 +08:00
whjthu 1ab2118716 add AnyOp and cuda kernel 2023-04-23 00:16:03 +08:00
Liyan Zheng acc64fd32c Merge branch 'NNET_transpose' into NNET_e2e
Fix: gridSize and blockSize in Reshape kernel
2023-04-22 21:32:31 +08:00
Liyan Zheng 33ab5dcd3e Fix: gbmm kernel 2023-04-22 21:14:52 +08:00
Liyan Zheng e2f18272c9 Add: no malloc for reshape outputs 2023-04-22 21:13:57 +08:00
Liyan Zheng 40e6db6608 Add: tensor FUID in exported ONNX 2023-04-22 20:28:17 +08:00
Liyan Zheng c451918224 Fix: tensor size overflow 2023-04-22 20:28:00 +08:00
whjthu 34ed298725 fix format 2023-04-22 17:00:52 +08:00
whjthu 664f0dbe02 support cuda transpose 2023-04-22 16:57:27 +08:00
Liyan Zheng a732b6f176 Fix: ignore transpose in CudaGraph since no kernel 2023-04-22 16:08:40 +08:00
Liyan Zheng 0865f8d823 Chore: move TensorObj::clone to .cc 2023-04-22 16:03:16 +08:00
Liyan Zheng 84f9d6731a Add: Longformer models 2023-04-22 16:00:29 +08:00
Liyan Zheng 4f02eeb08c Add: G2BMM kernels generated by tvm 0.10 2023-04-22 15:40:59 +08:00
whjthu 225a42f22d add rule for dilated conv 2023-04-21 23:40:45 +08:00
Liyan Zheng 4e9ece76f4 Chore: remove out-of-date code 2023-04-21 23:22:40 +08:00
Liyan Zheng 16a8c5dce5 Add: Conv1x1 rule 2023-04-21 23:21:04 +08:00
Liyan Zheng d051460c23 Chore: suppress output 2023-04-21 22:58:18 +08:00
Liyan Zheng d8a133684e Add: remove independent tensors in graph 2023-04-21 22:57:23 +08:00
Liyan Zheng 9ce21200c4 Add: NMutator mode in python 2023-04-21 21:31:22 +08:00
Liyan Zheng b943658713 Finish: GAN 2023-04-21 21:25:43 +08:00
Liyan Zheng 2cd75bd79b Merge branch 'NNET_e2e_fix' into NNET_e2e
Support CUDA Graph for TVM kernels
2023-04-21 13:18:44 +08:00
Liyan Zheng f0fcbe825f Add: python verification 2023-04-21 13:18:24 +08:00
huangshuhong 8c91faa948 remove expect 2023-04-21 00:17:04 +08:00
huangshuhong c0ae03a2d7 fix tvm stream 2023-04-21 00:09:47 +08:00
Liyan Zheng 0cb8729bc1 Add: different ONNX names for inputs and weights 2023-04-20 21:51:47 +08:00
YdrMaster 8bc2d3e48d fix: test graph handler
Signed-off-by: YdrMaster <ydrml@hotmail.com>
2023-04-20 21:51:47 +08:00
YdrMaster 28b123753e feat: import Tensor type
Signed-off-by: YdrMaster <ydrml@hotmail.com>
2023-04-20 21:51:47 +08:00
Liyan Zheng 94730d93b5 Add: hash match for membound kernels 2023-04-20 17:16:01 +08:00
Liyan Zheng 6d17c4caa2 Add: getPerfTime in run_models_nnet 2023-04-20 10:54:49 +08:00
Liyan Zheng 15d0eb79cd Add: import ONNX with membound Op 2023-04-20 10:45:28 +08:00
Liyan Zheng 2a343e240e Add: shape of intermediate tensor in exported ONNX 2023-04-20 10:45:28 +08:00
Liyan Zheng 34ca6bf149 Fix: skip check when Graph is exported to ONNX 2023-04-20 10:45:28 +08:00
YdrMaster a6019e79e3 feat(py): support creating OnnxStub directly from Graph
Signed-off-by: YdrMaster <ydrml@hotmail.com>
2023-04-20 10:45:28 +08:00
YdrMaster 4e1cc8d3e4 refactor(py): create OnnxStub via factory methods
Signed-off-by: YdrMaster <ydrml@hotmail.com>
2023-04-20 10:44:39 +08:00
YdrMaster 725f9260cf feat: support exporting membound
Signed-off-by: YdrMaster <ydrml@hotmail.com>
2023-04-20 10:44:39 +08:00
YdrMaster 0edd138919 feat: split serialization and deserialization into to-string and to-file variants
fix: set `USE_CUDA` cfg correctly

todo: test_search does not pass

Signed-off-by: YdrMaster <ydrml@hotmail.com>
2023-04-20 10:44:39 +08:00
Liyan Zheng 0b23a065ca Add: debug hacks for InfoGAN 2023-04-20 10:42:56 +08:00
Liyan Zheng e86e993ed4 Add: CUDA graph stream capture (MemboundOp fails) 2023-04-19 16:32:16 +08:00
Liyan Zheng e4c20a9ae2 Add: warmup and repeat args in timeNonCtcOperators 2023-04-19 16:22:59 +08:00
Liyan Zheng 537b3b4ea4 Add: Membound operator serialization 2023-04-18 21:53:48 +08:00
Liyan Zheng 2812900ea2 Fix: OpType and print device tensors 2023-04-18 20:28:08 +08:00
Liyan Zheng 01fc19795d Add: time non-compile-time-computable operators 2023-04-18 17:21:16 +08:00
Liyan Zheng afc4123328 Chore: remove deprecated function 2023-04-18 17:21:16 +08:00
Liyan Zheng b981951a47 Add: NMutator::memboundToJson to export memboundOp 2023-04-18 17:21:16 +08:00
Liyan Zheng 99b5c95455 Add: nnet::Serializer supports FuncNode 2023-04-18 17:21:16 +08:00
Liyan Zheng 9d50b30af8 Chore: disable nnet_unimplemented_continue output 2023-04-18 17:21:16 +08:00
Liyan Zheng bc31219bde Add: exclude compile-time computable operator time 2023-04-18 17:21:16 +08:00
Liyan Zheng edf4e33353 Add: C++ callback to export ONNX 2023-04-18 17:19:05 +08:00
Liyan Zheng 872f3504a9 Add: RangeOpNode::getFullExpression() 2023-04-18 17:19:05 +08:00