Commit Graph

5 Commits

Author SHA1 Message Date
zhangyue 00e6cc2587
XCCL support (#171)
* add reduce_mean and gather

* fix format

* add kunlun allreduce and cmakefile

* add kunlun allreduce and cmakefile

* deltete cmake opt

* fix format

* fix makefile

* add DIST option in Makefile

* add xpu allgather

* delete xpu_wait()

* add xpu allgather

* delete specific compiler

* fix format

* fix gather

* add broadcast

* fix format

* fix

* fix xpu, add where operation, fix element-wise operation

* fix softmax

* fix softmax

* log internal input and output

* fix kunlun gather bugs

* update CMakeList.txt and Makefile

* fix some kunlun kernels

* fix Makefile

* fix Makefile

* set cmake version 3.12

* format

* fix where, gather and support gpt2

* "fix format"

* fix format

* copy onnx.py from master

* use KUNLUN_HOME instead of absolute path

* fix torchvision models

* support torchvison model-zoo

* fix format

* format fix, CMakeList fix

* fix review

* fix vecToString return value

* fix format

* delete  empty file

---------

Co-authored-by: wanghailu <wanghailu0717@163.com>
Co-authored-by: wanghailu <wanghailu@qiyuanlab.com>
Co-authored-by: Haojie Wang <haojie0429@gmail.com>
2024-02-29 11:48:35 +08:00
zhangyunze b15c4979fa
fix Issue-189 question 1-15 (#195)
* fix: fix nativecpu elementwise only support 4d tensor

* fix format

---------

Co-authored-by: Haojie Wang <haojie0429@gmail.com>
2024-01-05 08:40:18 +08:00
Chenjie Duan 3f34372012
- modify error info when kernel not found (#191)
* - modify error info when kernel not found

* - modify code as reviewer suggested

---------

Co-authored-by: Haojie Wang <haojie0429@gmail.com>
2023-12-27 09:43:57 +08:00
xgqdut2016 a7293c12ba
Add layer normalization (#181)
* - add layernorm kernel

* success:add layernorm kernel and test

* fix: remove unusalble comments

* fix: modify code as reviewer suggested

* debug,modified .cu and test

* optional bias support

* overloading function

* fix bug after merging; remove time constrain in conv test

---------

Co-authored-by: kilinchange <kilinchange@163.com>
Co-authored-by: Haojie Wang <haojie0429@gmail.com>
2023-11-24 15:15:14 +08:00
zhangyunze ef672894d0
support mixed dtype (#102)
* feat: support mixed dtype

* feat: support cast op

* test: add test for cast op

* feat: support datatype BFloat16

* feat: support data convert fp32 <-> bfp16

* fix: fix all op's infershape func

* fix as review comment
2023-08-16 21:49:43 +08:00