InfiniTensor

zhangyue 00e6cc2587 XCCL support (#171)	2024-02-29 11:48:35 +08:00
* add reduce_mean and gather
* fix format
* add kunlun allreduce and cmakefile
* add kunlun allreduce and cmakefile
* deltete cmake opt
* fix format
* fix makefile
* add DIST option in Makefile
* add xpu allgather
* delete xpu_wait()
* add xpu allgather
* delete specific compiler
* fix format
* fix gather
* add broadcast
* fix format
* fix
* fix xpu, add where operation, fix element-wise operation
* fix softmax
* fix softmax
* log internal input and output
* fix kunlun gather bugs
* update CMakeList.txt and Makefile
* fix some kunlun kernels
* fix Makefile
* fix Makefile
* set cmake version 3.12
* format
* fix where, gather and support gpt2
* "fix format"
* fix format
* copy onnx.py from master
* use KUNLUN_HOME instead of absolute path
* fix torchvision models
* support torchvison model-zoo
* fix format
* format fix, CMakeList fix
* fix review
* fix vecToString return value
* fix format
* delete empty file
---------
Co-authored-by: wanghailu <wanghailu0717@163.com>
Co-authored-by: wanghailu <wanghailu@qiyuanlab.com>
Co-authored-by: Haojie Wang <haojie0429@gmail.com>
..
bang	fix mlu some kernel registration & gather op (#210)	2024-02-01 15:02:02 +08:00
cuda	add test for rotary embedding cuda kernel	2024-02-04 10:24:20 +08:00
intelcpu	Modify kernel registration & support fp16 (#205)	2024-01-15 11:02:13 +08:00
kunlun	XCCL support (#171)	2024-02-29 11:48:35 +08:00
nativecpu	feat: add reshape/identity/squeeze/flatten/unsqueeze op cpu kernel (#213)	2024-01-30 10:29:59 +08:00