InfiniTensor/src
xgqdut2016 d3e7543291
Cuda softmax (#129)
* Add softmax .cu, .cc, and .h files

* Modify CUDA softmax

* Modify the introductory comment of softmax.cu

* Format cuda_softmax.h

* Modify where.cc (.cu, .h) and softmax.cu

* Fix formatting

* Fix CPU softmax kernel

* Modify the // introductory comment of softmax.cu

* Modify softmax.cu to use a 1D block

* Modify softmax.cu, fix formatting, and use a 1D block

* Introduce shared memory to speed up softmax (see the sketches after this list)

* Reduce the number of function parameters

* Fix formatting

* Rework the 2D-block softmax

* Rework the 1D-block softmax

* Modify the shared-memory usage

* Add warp-level reduction (first sketch after this list)

* Resolve merge conflicts (round two)

* Remove an extra blank line

* Address review comments
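
For readers skimming the log: the "warp reduce" step refers to a warp-level reduction done with CUDA shuffle intrinsics. Below is a minimal sketch under that assumption; the helper names `warpReduceMax`/`warpReduceSum` are hypothetical, and the actual code in softmax.cu may differ.

```cuda
// Hypothetical warp-level reduction helpers (not the repo's exact code).
// __shfl_xor_sync performs a butterfly exchange: after log2(32) steps,
// every lane in the warp holds the warp-wide result.
__device__ __forceinline__ float warpReduceMax(float v) {
    for (int offset = 16; offset > 0; offset >>= 1)
        v = fmaxf(v, __shfl_xor_sync(0xffffffffu, v, offset));
    return v;
}

__device__ __forceinline__ float warpReduceSum(float v) {
    for (int offset = 16; offset > 0; offset >>= 1)
        v += __shfl_xor_sync(0xffffffffu, v, offset);
    return v;
}
```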
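
The 1D-block, shared-memory variant mentioned above could then look like the sketch below: one block per row, threads stride across the row, per-warp partials are merged through shared memory, and the usual max-then-sum passes keep the exponentials numerically stable. `rowSoftmax` and its assumptions (row-major float input, blockDim.x a multiple of 32, at most 32 warps per block) are illustrative, not the repo's exact kernel.

```cuda
#include <cuda_runtime.h>
#include <math.h>

// Illustrative 1D-block softmax using the helpers sketched above.
// Launch with one block per row, e.g. rowSoftmax<<<nrows, 256>>>(...).
__global__ void rowSoftmax(const float *in, float *out, int ncols) {
    const float *row = in + (size_t)blockIdx.x * ncols;
    float *orow = out + (size_t)blockIdx.x * ncols;

    __shared__ float smem[32];      // one partial result per warp
    const int lane = threadIdx.x % 32;
    const int warp = threadIdx.x / 32;
    const int nwarps = blockDim.x / 32;

    // Pass 1: block-wide max, for numerical stability.
    float m = -INFINITY;
    for (int i = threadIdx.x; i < ncols; i += blockDim.x)
        m = fmaxf(m, row[i]);
    m = warpReduceMax(m);           // per-warp max
    if (lane == 0) smem[warp] = m;
    __syncthreads();
    m = (lane < nwarps) ? smem[lane] : -INFINITY;
    m = warpReduceMax(m);           // block max, now in every lane
    __syncthreads();                // smem is reused below

    // Pass 2: block-wide sum of exp(x - max).
    float s = 0.0f;
    for (int i = threadIdx.x; i < ncols; i += blockDim.x)
        s += expf(row[i] - m);
    s = warpReduceSum(s);
    if (lane == 0) smem[warp] = s;
    __syncthreads();
    s = (lane < nwarps) ? smem[lane] : 0.0f;
    s = warpReduceSum(s);           // block sum, now in every lane

    // Pass 3: normalize.
    for (int i = threadIdx.x; i < ncols; i += blockDim.x)
        orow[i] = expf(row[i] - m) / s;
}
```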

---------

Co-authored-by: Haojie Wang <haojie0429@gmail.com>
Co-authored-by: panzezhong <panzezhong@qiyuanlab.com>
2023-11-06 08:56:23 +08:00
Directory   Last commit                                                                     Date
bang        fix: fix cuda conv_fp16 run fail (#105)                                        2023-08-10 15:22:18 +08:00
core        Xpu (#82)                                                                      2023-10-16 10:57:08 +08:00
cuda        tensor parallel for transformer (#125)                                         2023-09-14 14:19:45 +08:00
ffi         Xpu (#82)                                                                      2023-10-16 10:57:08 +08:00
intelcpu    Cpu backend2 (#77)                                                             2023-04-17 12:15:23 +08:00
kernels     Cuda softmax (#129)                                                            2023-11-06 08:56:23 +08:00
kunlun      Xpu (#82)                                                                      2023-10-16 10:57:08 +08:00
nnet        test: support compiling einnet unit tests, but not all tests pass yet (#174)   2023-11-03 13:21:49 +08:00
operators   fix tensor parallel for llama (#159)                                           2023-10-30 15:04:16 +08:00
utils       tensor parallel for transformer (#125)                                         2023-09-14 14:19:45 +08:00