zhengly123
a1974aabcd
NNET supports TVM backend and kernels ( #78 )
...
* Add: mutator InfoGAN minimum test
* Add: cache and padding (bugs!!)
* Add: expression reader as a cmake target
* Fix: [Intermediate] NMutator::expressionToGraph
To be fixed: matmul with implicit broadcast
* Add: matmul broadcast
* Fix: GraphObj ctor should use cloneTensor
* Fix: cuBLAS failure when codegen is enabled
* Add: Exception for checkCuError
* Fix: graph OpList ctor
* Add: expr simplification for TVM
* Add: TVM headers and CMake include paths
* Add: CMake config
* Add: PackedFunc (broken)
* Fix: remove cuCtxCreate, which makes TVM fail
* Fix: membound_tvm
* Fix: test_memboundOp
* Add: PRelu Expr and AsTVMVisitor
* Add: Random generator
* Add: support TVM packed function
* Fix: specify runtime
* Add: CMake support of TVM
* Add: detailed output of Matmul
* Add: comments for Matmul
* Chore: format and comments
* Chore: GraphObj::selfCheck without assert control
* Fix: CMAKE_CXX_FLAGS in CMakeLists
* fix merge bug
* update api for mkl batchnorm test
* fix lotus env
* fix header bug
---------
Co-authored-by: Liyan Zheng <liyan-zheng@outlook.com>
Co-authored-by: huangshuhong <huangsh19@mails.tsinghua.edu.cn>
Co-authored-by: whjthu <haojie0429@gmail.com>
2023-04-18 00:26:36 +08:00
whjthu
d9886e9de3
fix: remove inline keyword in class; rename getter and setter for inputOf and outputOf
2023-03-25 12:04:24 +08:00
YdrMaster
5aeacedab3
fix: export the python interface for each type from a template
...
Signed-off-by: YdrMaster <ydrml@hotmail.com>
2023-03-22 09:46:40 +08:00
YdrMaster
9db97eb212
refactor: consolidate the methods that operate on tensor data
...
Signed-off-by: YdrMaster <ydrml@hotmail.com>
2023-03-21 14:00:04 +08:00
YdrMaster
40fb8390b1
feat: save weights on import
...
Signed-off-by: YdrMaster <ydrml@hotmail.com>
2023-03-15 17:23:32 +08:00
YdrMaster
5b6698bac7
feat: export the whole graph's output tensors to onnx
...
Signed-off-by: YdrMaster <ydrml@hotmail.com>
2023-03-15 17:23:32 +08:00
YdrMaster
3d122aebfe
feat: support exporting float vectors
...
Signed-off-by: YdrMaster <ydrml@hotmail.com>
2023-03-15 17:23:32 +08:00
deathwings602
40d1b1c91b
Add ConvTransposedNHWC ( #67 )
...
* Add: IT_ASSERT_TODO
* [WIP] Add: ConvTranspose2d mutation test
* add ConvTransposedNHWC
* fix test_cuda_transposed_2d
---------
Co-authored-by: Liyan Zheng <liyan-zheng@outlook.com>
Co-authored-by: huangshuhong <huangsh19@mails.tsinghua.edu.cn>
2023-03-01 14:15:02 +08:00
zhengly123
c7ec9ee6e7
Add search engine ( #64 )
...
* Add: tensor fuid
* [Intermediate state] Add: Graph ctor for OpVec
* Add: clone for operators
* tmp: search_engine
* search: init search Engine.
* Add: dummy mutator for the test of search engine
* search: add print graph.
* search: add partition.
* search: update comments.
* Fix: retain FUID in Tensor::clone
* Chore: rename GUidBaseType to UidBaseType
* Fix: connect NMutator to SearchEngine
* Chore: output
* Fix test_memboundOp: nmutator uses input runtime
* Chore: clang-format
* Chore: clang-format
* Fix: comments in the review
---------
Co-authored-by: Liyan Zheng <liyan-zheng@outlook.com>
Co-authored-by: mazx <dyxdy@live.com>
2023-02-12 18:27:52 +08:00
wendy12022
5560d0f2fb
ADD:pad/slice operator and cuda kernel. ( #39 )
...
fix compile error
refactor
clang format
split test.
fix compile error.
ADD slice cuda kernel.
ADD slice operator.
ADD:pad operator and cuda kernel.
2022-09-29 10:29:24 +08:00
deathwings602
11d5aa1ccc
Add TVM codegen for MemboundOp ( #35 )
...
* Add: interface for membound TVM kernel and test
* add getAnsorCode
* add evaluation, but link failed
* add evaluation of kernel, but link failed
* Fix: link libcuda and nvrtc
* add print
* Add: const for source of copy
* compile and evaluate the kernel
* add compute
* fix gen_ansor_op.py
* fix membound_TVM
* format and fix CMakeLists.txt
* fix memory leak
Co-authored-by: Liyan Zheng <liyan-zheng@outlook.com>
Co-authored-by: huangshuhong <huangsh19@mails.tsinghua.edu.cn>
2022-09-22 18:06:45 +08:00
Hardy
03de74f4bc
Tensor serialization ( #25 )
...
* use protobuf to save, write, and read tensor data (serialization and deserialization)
* add protobuf
* add code for tensor load & save from/to file
* add code for tensor load & save
* add code for tensor load & save
* add code for tensor save & load
* add code for tensor save & load
* add code for save & load
* add code for load & save
* add code for tensor load & save
* add code for tensor save & load
Co-authored-by: wanghailu <wanghailu@qiyuanlab.com>
2022-09-13 11:27:41 +08:00
wendy12022
13b7a2604b
ADD add/mul/sub/div/pow operators and CPU/CUDA kernels ( #26 )
...
Fix some
remove useless code.
add div/pow kernel
Add add/mul/sub operators.
fix cpu kernel.
add element wise kernel for cuda.
ADD element wise operator.
2022-09-09 13:43:59 +08:00
wendy12022
48293576c0
Add maxpool and avgpool operators ( #17 )
...
* ADD:maxpool&&avgpool operators.
add OperatorObj::getDType()
clang format
FIX:timeit API has changed.
* Fix: Tensor::getInputs is const method
* Chore
Co-authored-by: Liyan Zheng <liyan-zheng@outlook.com>
2022-08-31 14:44:53 +08:00
zhengly123
93f86d3f4d
Simplify tensor transfer between CPU and CUDA ( #10 )
...
* Add: OP infers data type & Graph clones tensor
* Fix: vecToString format
* Add: static assert for Tensor methods
* Rename: getDataRawPtr -> getRawDataPtr
Co-authored-by: Liyan Zheng <liyan-zheng@outlook.com>
2022-08-25 11:29:16 +08:00
zhengly123
af08df32d2
Extended DataType class and Runtime interaction ( #9 )
...
* Add: DataType class
* Add: data-type-oblivious tensor interface
* Rename: copyBlobToCPU
Co-authored-by: Liyan Zheng <liyan-zheng@outlook.com>
2022-08-23 16:55:59 +08:00
zhengly123
04ea5eed38
Add CUDA runtime ( #6 )
...
* Fix: add warm-up and repetition in timing
* Add: CUDA runtime and float support
* Refactor: Cuda and Cpu runtimes inherit Runtime
* Add: environment script for Lotus
* Add: Lotus build instructions
* Update README.md
Co-authored-by: Liyan Zheng <liyan-zheng@outlook.com>
2022-08-22 15:01:03 +08:00
zhengly123
9303ddda8e
Add Conv operator and naive CPU implementation ( #5 )
...
* Add: Conv definition
* Add: tensor copy data from vector
* Add: CPU conv kernel
* Fix: replace Int32 with UInt32 in DataType
Co-authored-by: Liyan Zheng <liyan-zheng@outlook.com>
2022-08-17 14:16:01 +08:00
zhengly123
a26890abce
Tensor hash and inferShape ( #4 )
...
* Refactor: operator hash and inferShape
* Add: hash without shape
* Add: inferShape interface for given input tensors
* Add: construct outputs in op ctor
* Add: comments for matmul
* Add: opType in AttrVector and WorkloadVector
* Chore: _graph -> graph in Op ctor
* Chore: change the "Node" suffix to "Obj"
Co-authored-by: Liyan Zheng <liyan-zheng@outlook.com>
2022-08-15 15:08:56 +08:00
Liyan Zheng
efa966a3e2
Add: perf engine
2022-08-07 21:12:17 +08:00
Liyan Zheng
6c356d5b42
Add: kernel registry and naive Matmul kernel
2022-08-06 15:58:40 +08:00
Liyan Zheng
559be5866d
Add: Matmul operator
2022-08-05 12:50:34 +08:00
Liyan Zheng
e6101b0336
Add: graph, tensor, and operator
2022-07-31 21:44:03 +08:00