Chenjie Duan
51086d2b8d
Modify kernel registration & support fp16 ( #205 )
...
* - Remove dataType from the kernel registration.
* - support fp16 for conv
* - cpu kernel: adapt the new registration mechanism
* modified all register kernel
* add where fp16
* add layernorm fp16
* add split_concat fp16
* - element_wise support fp16
* feat: support transpose fp16
* feat: support sliceOp fp16
* - unary support fp16
* - feat: support reduceOp fp16
* feat: support matmulOp/expandOp fp16
* feat: support powOp int8
* add cuda cast & support half-precision for gather
* style: fix style
* feat:support int8 for gather
* style:fix style
* modified test_cuda_conv_transposed
* fix: fix dist code to support fp16
* fix(graph.cc): fix topo_sort
* fix: fix recv and send kernel registration
* feat: add field tensors for stub
* refactor(frontend): 先排序后构图
Signed-off-by: YdrMaster <ydrml@hotmail.com>
* fix: 为中间结果提供tensor到node的mapping
* fix (slice): add guard for area out of range
* fix: fix matmul fp16
* fix: fix re-dataMalloc for weight tensor and use of naive allocator
* feat: add dataType filter for cuda kernel
* feat: bang kernel adapt the new registration mechanism
* fix: fix some error on mlu
* feat: intelcpu kernel adapt the new registration mechanism
* feat: modify kernel registration on kunlun
* fix intelcpu compiler bug
* feat: bang reshape support all dataType
* fix: fix bang reduce
* fix(all_reduce.cc): fix as reviewer suggessted
* fix: fix style and restore unary test codes
---------
Signed-off-by: YdrMaster <ydrml@hotmail.com>
Co-authored-by: xgqdut2016 <kenan_gewei@163.com>
Co-authored-by: xgqdut2016 <140036308+xgqdut2016@users.noreply.github.com>
Co-authored-by: zhangyunze <z13785159769@163.com>
Co-authored-by: OdinaryWord <sx-hz@163.com>
Co-authored-by: YdrMaster <ydrml@hotmail.com>
Co-authored-by: panzezhong <panzezhong@qiyuanlab.com>
2024-01-15 11:02:13 +08:00
ChengXiang Qi
7f16fa353e
【Hackathon No.108】Add Gelu operator, ffi, kernel for cpu and gpu. ( #148 )
...
feat: Add Gelu kernel, operator, ffi.
2023-10-10 15:21:13 +08:00
zhengly123
2f8f706f1c
Fix CMake USE_CUDA ( #36 )
...
* Fix: build lib without cuda
* Chore: rename GBMM and G2BMM files
* Fix: seperate CUDA tests from operator tests
* Fix: CMake CMP0104
* Chore: fix typo
* Chore: remove unused headers
Co-authored-by: Liyan Zheng <liyan-zheng@outlook.com>
2022-09-21 12:28:00 +08:00
Hardy
6ac106cba4
Add activation operators and kernels
...
* add code for activation operation
* add code for activation operation on GPU
* add test code for activation operation
* add code for activation operation
* add code for activation on gpu ,use cudnn
* add code for activation on GPU use cudnn
* Chore: add constants.h and remove comments
Co-authored-by: wanghailu <wanghailu@qiyuanlab.com>
Co-authored-by: Liyan Zheng <liyan-zheng@outlook.com>
2022-09-16 13:58:57 +08:00