Commit Graph

7 Commits

Author SHA1 Message Date
wendy12022 c8b2c8ed32
Cpu backend2 (#77)
fix review

change Device::MKL to Device::INTELCPU

fix mkl linkage

fix errors according to merge from master

now can call mkl backend

fix softmax/flatten with axis from onnx.

modify README.md

fix memory refree

add env_lotus_intelcpu.sh

fix compile

merge from branch cpu_backend

fix something add gather

fix something

FIX: directory rename from "mkl" to "intelcpu"

ADD: use oneMKL dpcpp interface to implement matmul kernel.

ADD: add dpcpp as compiler for mkl, and fix warnings for clang compiling.
add dpcpp kernel for pow.

ADD: mkl kernel for pad.

ADD: slice mkl kernel.

ADD: reshape/flatten/identity mkl kernel.

ADD: split mkl kernel.

fix compile error

FIX: fix flattenObj with axis.

ADD reduce_mean mkl kernel.

Add concat mkl kernel.

batchNorm for mkl kernel.

sigmoid mkl kernel.

ADD: add mkl kernel for pooling

add more tests for softmax

Now softmax cuda kernel supports any axes.

mkl kernel for softmax

softmax

add axis to softmax operator

add mkl kernel for abs tanh

ADD: relu kernel for mkl

fix binary mkl primitives.

add mkl kernel for binary operators

fix compiler error

move stream to runtime

clang format

add MemoryFormat for tensorObj.

use post_ops for fused conv/deconv

Distinguish mkl op_timer from cuda op timer.

add act optype to conv and deconv

add operator timer

add mkl kernel for convTransposed

minor fix for group conv

do not use cblas_sgemm_batch

CpuRuntimeObj->NativeCpuRuntimeObj

add matmul op for mkl
2023-04-17 12:15:23 +08:00
YdrMaster 59bf59c10b
docs: update README.md
Signed-off-by: YdrMaster <ydrml@hotmail.com>
2023-03-15 17:23:32 +08:00
YdrMaster 14c9c82dab
test: enhance ci (#62)
* test: enhance ci

Signed-off-by: YdrMaster <ydrml@hotmail.com>

* typo: README.md

Signed-off-by: YdrMaster <ydrml@hotmail.com>

* fix: typo in workflow files

Signed-off-by: YdrMaster <ydrml@hotmail.com>

* test: install protobuf in ci

Signed-off-by: YdrMaster <ydrml@hotmail.com>

* test: cache protobuf

Signed-off-by: YdrMaster <ydrml@hotmail.com>

* docs: update README.md

Signed-off-by: YdrMaster <ydrml@hotmail.com>

* test: ci debugging finished, restore running only on code updates

Signed-off-by: YdrMaster <ydrml@hotmail.com>

* test: run tests on cpu in ci

Signed-off-by: YdrMaster <ydrml@hotmail.com>

* fix: action paths

Signed-off-by: YdrMaster <ydrml@hotmail.com>

* build: pin 4 submodules to release version numbers

> <https://github.com/ArthurSonzogni/nlohmann_json_cmake_fetchcontent>
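> This project cannot use the latest version because the API changes with every minor version; v3.10.5, the closest to the original version, is used for now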

Signed-off-by: YdrMaster <ydrml@hotmail.com>

* typo: README.md

Signed-off-by: YdrMaster <ydrml@hotmail.com>

* test: widen the test execution scope so that checks can be extended later

Signed-off-by: YdrMaster <ydrml@hotmail.com>

* docs: update README.md

Signed-off-by: YdrMaster <ydrml@hotmail.com>

---------

Signed-off-by: YdrMaster <ydrml@hotmail.com>
2023-02-12 00:01:36 +08:00
zhengly123 ba0b11a499
Update README.md
2022-09-22 17:38:15 +08:00
zhengly123 fb067e46f9
Add contributor guide
2022-09-13 14:19:07 +08:00
zhengly123 04ea5eed38
Add CUDA runtime (#6)
* Fix: add warm-up and repetition in timing

* Add: CUDA runtime and float support

* Refactor: Cuda and Cpu runtimes inherit Runtime

* Add: environment script for Lotus

* Add: Lotus build instructions

* Update README.md

Co-authored-by: Liyan Zheng <liyan-zheng@outlook.com>
2022-08-22 15:01:03 +08:00
Haojie Wang b89495a782
Initial commit
2022-07-27 22:40:23 +08:00