InfiniTensor

xiaonans a98573990b Accelerate llama (#219)	2024-04-01 08:46:05 +08:00
* [feature] add cudagraph support
* modify code to pass the cuda_all_reduce test
* modify rope op
* support rmsnorm
* add fp16 support to silu cuda op
* fix bugs in rmsnorm op
* uncomment simplify in onnx.py
Co-authored-by: Haojie Wang <haojie0429@gmail.com>
blob.cc	memory_allocator (#103)	2023-08-13 13:39:35 +08:00
common.cc	add code for backtrace (#21)	2022-09-01 20:30:12 +08:00
data_type.cc	Framework support for building bert/gpt2 model graphs (#94)	2023-08-29 16:06:52 +08:00
dummy_mutator.cc	refactor(core): add new `OpType` definition (#99)	2023-08-07 11:17:05 +08:00
graph.cc	Modify kernel registration & support fp16 (#205)	2024-01-15 11:02:13 +08:00
graph_handler.cc	Accelerate llama (#219)	2024-04-01 08:46:05 +08:00
graph_match.cc	ADD: sub graph replacement. (#56)	2023-04-17 13:09:07 +08:00
lazy_allocator.cc	support Dynamic tensor infer shape and fix memory pool (#176)	2023-11-23 13:11:50 +08:00
op_type.cc	[Hackathon No.108] Add Gelu operator, ffi, kernel for cpu and gpu. (#148)	2023-10-10 15:21:13 +08:00
operator.cc	use workspace to optimize kvcache attention	2024-01-25 10:33:01 +08:00
perf_engine.cc	Support fp16 dtype (#96)	2023-08-02 16:38:16 +08:00
runtime.cc	Modify kernel registration & support fp16 (#205)	2024-01-15 11:02:13 +08:00
search_engine.cc	refactor(core): add new `OpType` definition (#99)	2023-08-07 11:17:05 +08:00
tensor.cc	XCCL support (#171)	2024-02-29 11:48:35 +08:00
tensor_base.cc	refactor: consolidate the methods for manipulating tensor data	2023-03-21 14:00:04 +08:00