whjthu
34ed298725
fix format
2023-04-22 17:00:52 +08:00
whjthu
664f0dbe02
support cuda transpose
2023-04-22 16:57:27 +08:00
Liyan Zheng
307614d95d
Add: infogan python interface
2023-04-18 17:16:25 +08:00
zhengly123
a1974aabcd
NNET supports TVM backend and kernels ( #78 )
...
* Add: mutator InfoGAN minimum test
* Add: cache and padding (bugs!!)
* Add: expression reader as a cmake target
* Fix: [Intermediate] NMutator::expressionToGraph
To be fix: matmul with implicit broadcast
* Add: matmul broadcast
* Fix: GraphObj ctor should use cloneTensor
* Fix: cuBLAS failure when codegen is enabled
* Add: Exception for checkCuError
* Fix: graph OpList ctor
* Add: expr simplication for TVM
* Add: TVM headers and CMake include paths
* Add: CMake config
* Add: PackedFunc (broken)
* Fix: remove cuCtxCreate which makes TVM fails
* Fix: membound_tvm
* Fix: test_memboundOp
* Add: PRelu Expr and AsTVMVisitor
* Add: Random generator
* Add: support TVM packed function
* Fix: specify runtime
* Add: CMake support of TVM
* Add: detailed output of Matmul
* Add: comments for Matmul
* Chore: format and comments
* Chore: GraphObj::selfCheck without assert control
* Fix: CMAKE_CXX_FLAGS in CMakeLists
* fix merge bug
* update api for mkl batchnorm test
* fix lotus env
* fig header bug
---------
Co-authored-by: Liyan Zheng <liyan-zheng@outlook.com>
Co-authored-by: huangshuhong <huangsh19@mails.tsinghua.edu.cn>
Co-authored-by: whjthu <haojie0429@gmail.com>
2023-04-18 00:26:36 +08:00
wendy12022
a4d6426589
ADD: batch norm operator and cuda kernel. ( #44 )
...
fix numInputs of batchNorm, add new line in file ending.
ADD: batch norm operator and cuda kernel.
add training
remove comments.
fix compile error.
add batch norm operator and cuda kernel.
2022-10-15 16:29:28 +08:00
zhengly123
1aefc1b27e
Add python interface for CUDA operator evaluation ( #42 )
...
* Refactor: seperate data generator
* Add: python bindings for opTimer
* Fix: test_perfengine
Co-authored-by: Liyan Zheng <liyan-zheng@outlook.com>
2022-09-27 10:41:12 +08:00
Anmuliar
90eb9d05a8
Json perfrecord ( #32 )
...
Added perfengine serialization&deserialization and corresponding test case.
* Add: perfrecord json representation.
* Add: perfrecord virtual func. to_json&from_json.
* Add: perfengine serilization and deserilization.
* Modify: tune func type to supp derived struct serilization.
* Fix: structure after rebase
* Chore: Remove empty line in conv.h
Co-authored-by: wcz112 <wcz19@mails.tsinghua.edu.cn>
Co-authored-by: Liyan Zheng <liyan-zheng@outlook.com>
Co-authored-by: zhengly123 <zhengly123@outlook.com>
2022-09-22 15:34:34 +08:00
Hardy
03de74f4bc
Tensor serialization ( #25 )
...
* use protobuf for tensor data save,write,read, in chinese 序列化和反序列化
* add protobuf
* add code for tensor load & save from/to file
* add code for tensor laod & save
* add code for tensor load & save
* add code for tensor save & load
* add code for tensor save & load
* add code for save & load
* add code for load & save
* add code for tensor load & save
* add code for tensor save & load
Co-authored-by: wanghailu <wanghailu@qiyuanlab.com>
2022-09-13 11:27:41 +08:00
Hardy
e1d43202d7
Verify wanghailu 0902 ( #22 )
...
* commit for verify, add some difference function
* add code for verify
* add code for verify
Co-authored-by: wanghailu <wanghailu@qiyuanlab.com>
2022-09-05 15:45:52 +08:00
Hardy
32a01efbbe
add code for backtrace ( #21 )
...
* add code for backtrace
* Add: infini::Exception
```
Test project /home/zly/InfiniTensor_aux/build
Start 1: test_graph
1/4 Test #1 : test_graph ....................... Passed 0.05 sec
Start 2: test_hash
2/4 Test #2 : test_hash ........................ Passed 0.02 sec
Start 3: test_conv
3/4 Test #3 : test_conv ........................ Passed 4.40 sec
Start 4: test_pooling
4/4 Test #4 : test_pooling ..................... Passed 2.47 sec
100% tests passed, 0 tests failed out of 4
Total Test time (real) = 6.94 sec
```
* Fix: USE_BACKTRACE in cmake
Co-authored-by: wanghailu <wanghailu@qiyuanlab.com>
Co-authored-by: Liyan Zheng <liyan-zheng@outlook.com>
2022-09-01 20:30:12 +08:00