InfiniTensor

xgqdut2016 d3e7543291 Cuda softmax (#129)	2023-11-06 08:56:23 +08:00
* "add softmax.cu,.cc,.h"
* Modify cuda softmax
* "modified the introduction of softmax.cu"
* "add format of cuda_softmax.h"
* "modified where.cc(.cu,.h) and softmax.cu"
* "modified format"
* Fix cpu softmax kernel
* "modified the // introduction of softmax.cu"
* "modified softmax.cu and use 1D block"
* "modified softmax.cu,format, and use 1D block"
* "introduce share mem to speed softmax"
* "reduce the input of function"
* modified the format
* remodify 2D block softmax
* remodify 1D block softmax
* modified the share memory
* add warp reduce
* conflict solve two
* remove extra space line
* solve comment
---------
Co-authored-by: Haojie Wang <haojie0429@gmail.com>
Co-authored-by: panzezhong <panzezhong@qiyuanlab.com>
bang	fix bang runtime bug after merging distributed branch (#137)	2023-09-19 14:10:39 +08:00
core	fix tensor parallel for llama (#159)	2023-10-30 15:04:16 +08:00
cuda	Cuda softmax (#129)	2023-11-06 08:56:23 +08:00
ffi	Add TVM codegen for MemboundOp (#35)	2022-09-22 18:06:45 +08:00
intelcpu	Cpu backend2 (#77)	2023-04-17 12:15:23 +08:00
kunlun	Xpu (#82)	2023-10-16 10:57:08 +08:00
nnet	test: support compiling the einnet unit tests, though not all tests pass (#174)	2023-11-03 13:21:49 +08:00
operators	add transpose, concat and split for native cpu (#158)	2023-10-12 10:14:28 +08:00
utils	tensor parallel for transformer (#125)	2023-09-14 14:19:45 +08:00
test.h	Add python interface for CUDA operator evaluation (#42)	2022-09-27 10:41:12 +08:00