* MLU CNCL base
* add FindCNCL.cmake, not find -lcncl
* bangPrintFloat not find
* docker:make sucessful, test error
* delete net file and onnxtest.py
* init
* fix cncl
* format
* fix
* format
* fix cncl
* run dist gpt2 on mlu
* format
* fix import error on mlu docker
* run llama single card
* run distributed llama2
* add test for slice/reduce on mlu
* fix cncl related test
* fix format
* format
* delete comments
* change GPU to MLU
* MLU CNCL base
* add FindCNCL.cmake, not find -lcncl
* bangPrintFloat not find
* docker:make sucessful, test error
* delete net file and onnxtest.py
* init
* fix cncl
* format
* fix
* format
* fix cncl
* run dist gpt2 on mlu
* format
* fix import error on mlu docker
* run llama single card
* run distributed llama2
* add test for slice/reduce on mlu
* fix cncl related test
* fix format
* format
* delete comments
* change GPU to MLU
* modify launch script
* fix name
* fix format
* fix gather
* format python script
---------
Co-authored-by: xgqdut2016 <kenan_gewei@163.com>
Co-authored-by: Bolun <chamberlain0w0@gmail.com>
Co-authored-by: Bolun Zhang <48948016+Chamberlain0w0@users.noreply.github.com>
* support matmul
* add matmul
* add matmul
* add code for cnnl matmul operation and test
* add conv
* add code for conv test on mlu
* add code for test cnnl conv on mlu
* add code for perf conv and matmul on mlu
* clang format
* fix convolution operation
* fxi cmaklist
* code format
* fix code
* code format
---------
Co-authored-by: wanghailu <wanghailu@qiyuanlab.com>
Co-authored-by: wanghailu <wanghailu0717@163.com>
* fix a little bug which found by new verison CMake
* add code for support BangC language kernel , just like Cuda kernel, not
library
* add bangc kernel
* support BangC kernel
* add code for support BangC kernel
* support bangc kernel
* fix some code from reviewer
* fix code of template fumction
* add code for support bangc kernel
* fix bangc format
Co-authored-by: wanghailu <wanghailu@qiyuanlab.com>
Co-authored-by: Haojie Wang <haojie0429@gmail.com>
* add code for cambricon mlu, bang, cnnl
* add code for support cambricon mlu,cnnl,cnrt
* add code for support mlu
* add code for support cambricon cnnl
* add code for support mlu
* add code for mlu
* add code for mlu
`
* Update CMakeLists.txt
Co-authored-by: wanghailu <wanghailu@qiyuanlab.com>
Co-authored-by: zhengly123 <zhengly123@outlook.com>