Commit Graph

9 Commits

Author SHA1 Message Date
Derui Yang 1dc65e2788
build: 实现格式化 git added c/c++ 源码的脚本 (#98)
* build: 实现格式化 git added c/c++ 源码的脚本

Signed-off-by: YdrMaster <ydrml@hotmail.com>

* feat: 扩充 c 风格文件类型

Signed-off-by: YdrMaster <ydrml@hotmail.com>

* feat: format py files

Signed-off-by: YdrMaster <ydrml@hotmail.com>

* feat: 支持从任意 commit 开始格式化所有修改的文件

Signed-off-by: YdrMaster <ydrml@hotmail.com>

---------

Signed-off-by: YdrMaster <ydrml@hotmail.com>
2023-07-21 12:29:50 +08:00
zhengly123 a1974aabcd
NNET supports TVM backend and kernels (#78)
* Add: mutator InfoGAN minimum test

* Add: cache and padding (bugs!!)

* Add: expression reader as a cmake target

* Fix: [Intermediate] NMutator::expressionToGraph

To be fix: matmul with implicit broadcast

* Add: matmul broadcast

* Fix: GraphObj ctor should use cloneTensor

* Fix: cuBLAS failure when codegen is enabled

* Add: Exception for checkCuError

* Fix: graph OpList ctor

* Add: expr simplication for TVM

* Add: TVM headers and CMake include paths

* Add: CMake config

* Add: PackedFunc (broken)

* Fix: remove cuCtxCreate which makes TVM fails

* Fix: membound_tvm

* Fix: test_memboundOp

* Add: PRelu Expr and AsTVMVisitor

* Add: Random generator

* Add: support TVM packed function

* Fix: specify runtime

* Add: CMake support of TVM

* Add: detailed output of Matmul

* Add: comments for Matmul

* Chore: format and comments

* Chore: GraphObj::selfCheck without assert control

* Fix: CMAKE_CXX_FLAGS in CMakeLists

* fix merge bug

* update api for mkl batchnorm test

* fix lotus env

* fig header bug

---------

Co-authored-by: Liyan Zheng <liyan-zheng@outlook.com>
Co-authored-by: huangshuhong <huangsh19@mails.tsinghua.edu.cn>
Co-authored-by: whjthu <haojie0429@gmail.com>
2023-04-18 00:26:36 +08:00
wendy12022 c8b2c8ed32
Cpu backend2 (#77)
fix review

change Device::MKL to Device::INTELCPU

fix mkl linkage

fix errors according to merge from master

now can call mkl backend

fix softmax/flatten with axis from onnx.

modify README.md

fix memory refree

add env_lotus_intelcpu.sh

fix compile

merge from branch cpu_backend

fix something add gather

fix something

FIX: directory rename from "mkl" to "intelcpu"

ADD: use oneMKL dpcpp interface to implement matmul kernel.

ADD: add dpcpp as compiler for mkl, and fix warnings for clang compiling.
add dpcpp kernel for pow.

ADD: mkl kernel for pad.

ADD: slice mkl kernel.

ADD: reshape/flatten/identity mkl kernel.

ADD: split mkl kernel.

fix compile error

FIX: fix flattenObj with axis.

ADD reduce_mean mkl kernel.

Add concat mkl kernel.

bathNorm for mkl kernel.

sigmoid mkl kernel.

ADD:add mkl kernel for pooling

add more tests for softmax

Now softmax cuda kernel supports any axises.

mkl kernel for softmax

softmax

add axis to softmax operator

add mkl kernel for abs tanh

ADD: relu kernel for mkl

fix binary mkl primitives.

add mkl kernel for binary operators

fix compiler error

move stream to runtime

clang format

add MemoryFormat for tensorObj.

use post_ops for fused conv/deconv

Distinguish mkl  op_timer from cuda op timer.

add act optype to conv and deconv

add operator timer

add mkl kernel for convTransposed

minor fix for group conv

do not use cblas_sgemm_batch

CpuRuntimeObj->NativeCpuRuntimeObj

add  matmul op for mkl
2023-04-17 12:15:23 +08:00
wendy12022 86ec4036ce
ADD: add mkl runtime for intel cpu , and add mkl kernel for matmul/conv/convtransposed. (#61)
* move memory format transformation to TensorObj

clang format

add MemoryFormat for tensorObj.

use post_ops for fused conv/deconv

Distinguish mkl  op_timer from cuda op timer.

add act optype to conv and deconv

add operator timer

add mkl kernel for convTransposed

minor fix for group conv

do not use cblas_sgemm_batch

CpuRuntimeObj->NativeCpuRuntimeObj

add  matmul op for mkl

* fix: fix bugs when rebasing from master

fix: fix bugs when rebasing from master

* fix: update api after rebasing

* fix: fix format; fix onnx import

* fix: fix clang-format

* [fix] fix conv_transpose test

* [fix] use stronger test case for transposed conv

* [fix] remove tensor memory format; fix mkl transpose conv

* [fix] add FIXME tag for op_timer python api

---------

Co-authored-by: whjthu <haojie0429@gmail.com>
2023-03-27 21:28:49 +08:00
Anmuliar bd63f738dc
cuDNN conv tuning (#16)
* Function tune and corresponding testcase.

*Add: Tune function in /src/kernel/cuda/conv.cc and corresponding testcase in test_conv.

*Fix: A little bug of perfRecord using in /src/core/runtime.cc.

* Tune part debug

*Add: recover the code, fixed the commit error.

*Add: some anotations in tune function

* clang formmat test

* Fix: mem leak in CUDA Runtime and Conv

* Fix: sync in conv and default sync in timeit

* Change the way to tune operator conv.

Timeit function cudNNUnfused -> Timeit function cudnnConvolutionForward.

* Change: merge the common part of cudnnunfused&tune into cudnndescriptoraccess

* clang test

* clang-format

* clang-format bash.

* Chore: remove print and blank lines

Co-authored-by: wcz112 <wcz19@mails.tsinghua.edu.cn>
Co-authored-by: Liyan Zheng <liyan-zheng@outlook.com>
2022-08-29 21:37:07 +08:00
Anmuliar e076991f2f
Revert "Operator serialization (#14)" (#15)
This reverts commit 25f0c441d2.
2022-08-29 16:02:48 +08:00
Anmuliar 25f0c441d2
Operator serialization (#14)
Class "Cuda Runtime" fulfills function "tune" and adds corresponding testcase.

*Add: convCudnn::tune, convCudnn::cuDNNdescriptorAccess.

*Add: testcase tune.

*Fix: a brief bug in CPU Runtime.
2022-08-29 15:59:03 +08:00
zhengly123 04ea5eed38
Add CUDA runtime (#6)
* Fix: add warm-up and repetition in timing

* Add: CUDA runtime and float support

* Refactor: Cuda and Cpu runtimes inherit Runtime

* Add: environment script for Lotus

* Add: Lotus build instructions

* Update README.md

Co-authored-by: Liyan Zheng <liyan-zheng@outlook.com>
2022-08-22 15:01:03 +08:00
Liyan Zheng ce5d49c79b Add: clang format script 2022-08-09 19:50:23 +08:00