InfiniTensor/include/core
xiaonans a98573990b
Accelerate llama (#219)
* [feature] add cudagraph support

* modify code to pass the cuda_all_reduce test

* modify rope op

* support rmsnorm

* add fp16 support to silu cuda op

* fix bugs in rmsnorm op

* uncomment simplify in onnx.py

---------

Co-authored-by: Haojie Wang <haojie0429@gmail.com>
2024-04-01 08:46:05 +08:00
blob.h Support bang c kernel wanghailu 0927 (#43) 2022-09-30 11:01:52 +08:00
common.h XCCL support (#171) 2024-02-29 11:48:35 +08:00
communicator.h impl distributed launch with NCCL (#106) 2023-09-05 09:47:35 +08:00
constants.h Dev for 202303ddl (#66) 2023-04-18 15:10:33 +08:00
data_type.h Framework support for bert/gpt2 model graph construction (#94) 2023-08-29 16:06:52 +08:00
dummy_mutator.h Add search engine (#64) 2023-02-12 18:27:52 +08:00
graph.h support Dynamic tensor infer shape and fix memory pool (#176) 2023-11-23 13:11:50 +08:00
graph_handler.h Accelerate llama (#219) 2024-04-01 08:46:05 +08:00
graph_match.h ADD: sub graph replacement. (#56) 2023-04-17 13:09:07 +08:00
hash.h Dev for 202303ddl (#66) 2023-04-18 15:10:33 +08:00
kernel.h Modify kernel registration & support fp16 (#205) 2024-01-15 11:02:13 +08:00
lazy_allocator.h support Dynamic tensor infer shape and fix memory pool (#176) 2023-11-23 13:11:50 +08:00
mutator.h ADD: add mkl runtime for intel cpu , and add mkl kernel for matmul/conv/convtransposed. (#61) 2023-03-27 21:28:49 +08:00
object.h Dev for 202303ddl (#66) 2023-04-18 15:10:33 +08:00
op_type.h Accelerate llama (#219) 2024-04-01 08:46:05 +08:00
operator.h Modify kernel registration & support fp16 (#205) 2024-01-15 11:02:13 +08:00
perf_engine.h Dev for 202303ddl (#66) 2023-04-18 15:10:33 +08:00
ref.h Dev for 202303ddl (#66) 2023-04-18 15:10:33 +08:00
runtime.h XCCL support (#171) 2024-02-29 11:48:35 +08:00
search_engine.h Add search engine (#64) 2023-02-12 18:27:52 +08:00
tensor.h XCCL support (#171) 2024-02-29 11:48:35 +08:00
tensor_base.h Copyout numpy interface (#135) 2023-09-15 16:40:44 +08:00
tensor_type.h Support kvcache (#134) 2023-09-18 14:17:02 +08:00
workspace.h XCCL support (#171) 2024-02-29 11:48:35 +08:00