Commit Graph

246 Commits

Author SHA1 Message Date
kilinchange dc6befb549 fix: fix re-dataMalloc for weight tensor and use of naive allocator 2023-12-29 17:27:36 +08:00
zhangyunze 935b465cf2 fix: fix matmul fp16 2023-12-29 16:55:38 +08:00
panzezhong a91ed84354 fix (slice): add guard for area out of range 2023-12-28 16:35:47 +08:00
kilinchange e5ca66db66 feat: support int8 llama 2023-12-27 15:28:05 +08:00
panzezhong 85de28ef1e fix: provide tensor-to-node mapping for intermediate results 2023-12-27 10:48:06 +08:00
YdrMaster c34946a0d8 refactor(frontend): sort before building the graph
Signed-off-by: YdrMaster <ydrml@hotmail.com>
2023-12-25 17:58:29 +08:00
xgqdut2016 ce23b8356f modified dynamic_quantize_linear 2023-12-25 17:33:43 +08:00
kilinchange 7b48b93fb3 feat: add field tensors for stub 2023-12-25 15:49:21 +08:00
kilinchange 8d901ba7aa fix: run int8 llama but has nan output 2023-12-19 17:17:22 +08:00
OdinaryWord 8ae5958b29 feat: add support for dynamic_quantize_linear 2023-12-19 16:41:17 +08:00
kilinchange 0e75f99e7e feat: add dynamic quantize linear kernel 2023-12-19 14:46:14 +08:00
zhangyunze 97e3377ca5 feat: add matmulinteger op 2023-12-19 14:14:33 +08:00
xgqdut2016 9c82936386 add half and float dequantizeLinear 2023-12-18 17:47:53 +08:00
kilinchange 03ed8c4de7 feat: support unary int8 2023-12-18 17:32:22 +08:00
kilinchange c63ed4326d feat: add frontend DynamicQuantizeLinear and DequantizeLinear kernels 2023-12-18 13:58:20 +08:00
xgqdut2016 f51ce3231a where support int8 2023-12-15 15:40:30 +08:00
kilinchange 9d9e996713 fix(graph.cc): fix topo_sort 2023-12-15 10:13:18 +08:00
xgqdut2016 c859e655d3 Merge branch 'master' into support_fp16 2023-12-15 10:02:03 +08:00
learner2468 9a9587556c
Add examples: inference of Paddle models (#192)
* Add paddle model and infer with InfiniTensor

* Remove unused import

---------

Co-authored-by: kilinchange <44265800+kilinchange@users.noreply.github.com>

[Hackathon No.106] Add paddle model and infer with InfiniTensor
2023-12-14 19:42:43 +08:00
zhangyunze e66f1c0421 fix: fix dist code to support fp16 2023-12-14 18:02:08 +08:00
xgqdut2016 a3929c25f8
Add send and recv operators based on NCCL (#182)
* baseline sendrecv, bug

* success sendrecv

* get rank from comm

* set output shape

* successful: set output shape equal to input shape

* shape as attribute

* success:shape as attribute

* success send recv, output 0

* add onnx test

* split send and recv

* success split send and recv

* test-onnx bug

* success test-onnx

* modified onnx.py

* solve review
2023-12-14 16:38:03 +08:00
xgqdut2016 ff98241db7 modified test_cuda_conv_transposed 2023-12-14 14:44:28 +08:00
OdinaryWord 046b2d68d8 style: fix style 2023-12-14 13:35:08 +08:00
OdinaryWord 2af4c1276b feat: support int8 for gather 2023-12-14 13:28:41 +08:00
OdinaryWord db8c3eec15 style: fix style 2023-12-14 11:32:07 +08:00
OdinaryWord c29dcf1e6d add cuda cast & support half-precision for gather 2023-12-14 11:24:25 +08:00
zhangyunze 5ed7db1506 feat: support powOp int8 2023-12-14 11:15:28 +08:00
zhangyunze bdb8d8d65f feat: support matmulOp/expandOp fp16 2023-12-14 11:07:45 +08:00
kilinchange cbdeb73e86 - feat: support reduceOp fp16 2023-12-13 17:39:39 +08:00
kilinchange 5af7f1e753 - unary support fp16 2023-12-13 17:06:27 +08:00
zhangyunze ee4ecd27e2 feat: support sliceOp fp16 2023-12-13 16:55:16 +08:00
zhangyunze d5e775397d feat: support transpose fp16 2023-12-13 16:36:37 +08:00
kilinchange 4b02de7e17 - element_wise support fp16 2023-12-13 15:57:25 +08:00
xgqdut2016 e07516ebe9
Merge branch 'master' into support_fp16 2023-12-11 16:50:02 +08:00
xgqdut2016 dd4a90fb5e add split_concat fp16 2023-12-11 16:45:16 +08:00
xgqdut2016 fda0a5f982 add layernorm fp16 2023-12-11 15:05:34 +08:00
Derui Yang c143eebdf7
Model storage independent of onnx models (#196)
Signed-off-by: YdrMaster <ydrml@hotmail.com>
2023-12-11 10:44:06 +08:00
xgqdut2016 8b2e3b8e19 add where fp16 2023-12-08 16:57:49 +08:00
xgqdut2016 a000cb0304 modified all register kernel 2023-12-07 17:53:28 +08:00
kilinchange c587901586 - cpu kernel: adapt the new registration mechanism 2023-12-07 13:43:40 +08:00
kilinchange c19256bca6 - support fp16 for conv 2023-12-04 16:56:16 +08:00
kilinchange 4db6699e09 - Remove dataType from the kernel registration. 2023-11-30 13:51:24 +08:00
Hardy 67974aee8a
Fix https://github.com/InfiniTensor/InfiniTensor/pull/160 (#185)
Co-authored-by: wanghailu <wanghailu0717@163.com>
2023-11-27 14:18:12 +08:00
Hardy 3ead20a23a
Fix workspace & bang conv (#183)
* fix bang workspace

* fix convbpdata

* fix code

* add code

* fix

* fix

* fix conv

* fix test conv

---------

Co-authored-by: wanghailu <wanghailu0717@163.com>
Co-authored-by: Haojie Wang <haojie0429@gmail.com>
2023-11-24 15:16:25 +08:00
xgqdut2016 a7293c12ba
Add layer normalization (#181)
* - add layernorm kernel

* success: add layernorm kernel and test

* fix: remove unusable comments

* fix: modify code as reviewer suggested

* debug, modified .cu and test

* optional bias support

* overloading function

* fix bug after merging; remove time constraint in conv test

---------

Co-authored-by: kilinchange <kilinchange@163.com>
Co-authored-by: Haojie Wang <haojie0429@gmail.com>
2023-11-24 15:15:14 +08:00
PanZezhong1725 6ece3f4a77
Add ReduceSum op and kernel (#160)
* Add reduceSum op and kernel

* fix merge and format

* Reduce: reuse cat macro, add doc string

---------

Co-authored-by: Haojie Wang <haojie0429@gmail.com>
2023-11-24 09:29:58 +08:00
xgqdut2016 595a9906d2
add infer index function (#175)
Co-authored-by: Haojie Wang <haojie0429@gmail.com>
2023-11-24 09:24:25 +08:00
zhangyunze 331f7ab2b8
support Dynamic tensor infer shape and fix memory pool (#176)
* feat: support dynamic tensor part1

* feat: support dynamic-tensor part2

* feat: support dynamic tensor part 3

* fix: fix some ..

* - add kvcache example

* feat: support concat to identity kernel

* add a simple memory pool for allocator

* fix: rebase to master

* fix bug after merging

* - remove outdated script

* fix: fix as review

---------

Co-authored-by: kilinchange <kilinchange@163.com>
Co-authored-by: Haojie Wang <haojie0429@gmail.com>
2023-11-23 13:11:50 +08:00
xiaonans 965df4e294
[feature] add fused attention_kvcache operator support (#179)
* [feature] add fused attention_kvcache operator support

* add test to attention_kvcache op

* Add space line at EOF

---------

Co-authored-by: Haojie Wang <haojie0429@gmail.com>
2023-11-14 23:44:22 +08:00
Hardy f22fa2766e
add reduce_mean and gather on bang (#167)
* add code

* fix reduce_mean

* add softmax on BANG

* fix gather

* fix broadcast on elementwise kernel when dim size is zero

* add where kernel and fix softmax kernel

* fix convbpdata bug

* fix format

---------

Co-authored-by: wanghailu <wanghailu@qiyuanlab.com>
Co-authored-by: Haojie Wang <haojie0429@gmail.com>
2023-11-10 18:02:44 +08:00