zhangyue
|
00e6cc2587
|
XCCL support (#171)
* add reduce_mean and gather
* fix format
* add kunlun allreduce and cmakefile
* add kunlun allreduce and cmakefile
* delete cmake opt
* fix format
* fix makefile
* add DIST option in Makefile
* add xpu allgather
* delete xpu_wait()
* add xpu allgather
* delete specific compiler
* fix format
* fix gather
* add broadcast
* fix format
* fix
* fix xpu, add where operation, fix element-wise operation
* fix softmax
* fix softmax
* log internal input and output
* fix kunlun gather bugs
* update CMakeLists.txt and Makefile
* fix some kunlun kernels
* fix Makefile
* fix Makefile
* set cmake version 3.12
* format
* fix where, gather and support gpt2
* fix format
* fix format
* copy onnx.py from master
* use KUNLUN_HOME instead of absolute path
* fix torchvision models
* support torchvision model-zoo
* fix format
* format fix, CMakeLists fix
* fix review
* fix vecToString return value
* fix format
* delete empty file
---------
Co-authored-by: wanghailu <wanghailu0717@163.com>
Co-authored-by: wanghailu <wanghailu@qiyuanlab.com>
Co-authored-by: Haojie Wang <haojie0429@gmail.com>
|
2024-02-29 11:48:35 +08:00 |