ADD: resize operator and cuda kernel,support nearest/linear coef.
fix some
fix tests
add more tests for linear mode.
add linear coef mode.
add scales
add tests
fix tests.
add notLarger notSmaller
fix
add test
ADD:resize operator and cuda kernel
fix numInputs of batchNorm, add new line in file ending.
ADD: batch norm operator and cuda kernel.
add training
remove comments.
fix compile error.
add batch norm operator and cuda kernel.
* fix a little bug which found by new verison CMake
* add code for support BangC language kernel , just like Cuda kernel, not
library
* add bangc kernel
* support BangC kernel
* add code for support BangC kernel
* support bangc kernel
* fix some code from reviewer
* fix code of template fumction
* add code for support bangc kernel
* fix bangc format
Co-authored-by: wanghailu <wanghailu@qiyuanlab.com>
Co-authored-by: Haojie Wang <haojie0429@gmail.com>
* ADD:concat/split operator and cuda kernels
refector
minor change comment
ADD:concat/split operator and cuda kernels
merge split_kernel and concat_kernel to split_concat_kernel.
Revert "fix"
This reverts commit 459926be09a838658ec55f1e0a72b3cf17037d5c.
fix
ADD:concat/split operator and cuda kernels
change whole tensor name to composed tensor
fix some
remove unused header.
rebase
add CudaKernel
add test for split.
ADD split operator and cuda kernel.
modify test.
ADD:concat operator and cuda kernel.
ADD:concat/split operator and cuda kernels
fix some
remove unused header.
rebase
add CudaKernel
ADD:concat/split operator and cuda kernels
add test for split.
ADD split operator and cuda kernel.
modify test.
ADD:concat operator and cuda kernel.
* remove extra comment; typo fix.
Co-authored-by: Haojie Wang <haojie0429@gmail.com>
* add code for cambricon mlu, bang, cnnl
* add code for support cambricon mlu,cnnl,cnrt
* add code for support mlu
* add code for support cambricon cnnl
* add code for support mlu
* add code for mlu
* add code for mlu
`
* Update CMakeLists.txt
Co-authored-by: wanghailu <wanghailu@qiyuanlab.com>
Co-authored-by: zhengly123 <zhengly123@outlook.com>
* ADD:reshape/flatten/identity operators and cuda kernel.
fix: use cudaMemcpyAsync
clang format.
ADD flatten/identity operator.
add test for reshape.
ADD: reshape operator and cuda kernel.
* Fix: seperate CUDA tests & remove old header
Co-authored-by: Liyan Zheng <liyan-zheng@outlook.com>