Commit Graph

  • f6176124ec add softmax/element_wise kernel OdinaryWord 2024-01-26 15:40:21 +0800
  • 030e5ca9c1 Merge branch 'master' of github.com:InfiniTensor/InfiniTensor into rope_and_silu xiaonans 2024-01-26 10:07:53 +0800
  • e8d111ef5d add rope and silu support xiaonans 2024-01-11 15:44:07 +0800
  • d1a90ba3e2 [feature] support kvcache with static graph (#209) xiaonans 2024-01-25 14:20:43 +0800
  • afed5d3c3d use workspace to optimize kvcache attention xiaonans 2024-01-25 09:08:25 +0800
  • a5062f3f89 Update README.md Haojie Wang 2024-01-24 22:16:48 +0800
  • 09b2ecf98a support more data type on mlu (#211) Hardy 2024-01-24 13:33:33 +0800
  • c970c93ba1 Merge branch 'master' into ascend OdinaryWord 2024-01-18 15:23:47 +0800
  • dcbbc82d5b Merge branch 'master' into ascend OdinaryWord 2024-01-18 15:15:55 +0800
  • 70950e3fbb fix concat&pooling test code OdinaryWord 2024-01-18 14:52:36 +0800
  • 6a1bfd6c45 [feature] support kvcache with static graph xiaonans 2024-01-17 11:26:05 +0800
  • e33131ce5c fix comment dropout Haojie Wang 2024-01-17 02:57:44 +0000
  • 886c5e5bd2 add hack comment Haojie Wang 2024-01-16 06:33:03 +0000
  • 9b63a62b70 fix test-onnx kilinchange 2024-01-15 06:55:59 +0000
  • 8baa34a1d2 fix wanghailu 2024-01-15 05:21:38 +0000
  • 19d3e831f9 Merge branch 'master' into dropout wanghailu 2024-01-15 03:39:26 +0000
  • 51086d2b8d Modify kernel registration & support fp16 (#205) Chenjie Duan 2024-01-15 11:02:13 +0800
  • 58993d4339 Remove the frontend's dependency on onnx infershape (#206) zhangyunze 2024-01-12 14:54:27 +0800
  • b3b1d3c2bf Merge branch 'master' into dropout Derui Yang 2024-01-05 18:36:47 +0800
  • 3b5dd7d28c Merge branch 'master' into update_pybind11 update_pybind11 Haojie Wang 2024-01-05 09:20:33 +0800
  • 46e61a5bd4 Fix out-of-bounds memory access in Slice (#204) PanZezhong1725 2024-01-05 09:19:50 +0800
  • b15c4979fa fix Issue-189 question 1-15 (#195) zhangyunze 2024-01-05 08:40:18 +0800
  • 42032356fb Bang cncl (#163) Hardy 2024-01-03 13:28:03 +0800
  • 689f8c6a5d fix wanghailu 2024-01-02 05:52:04 +0000
  • e3a2e65c47 fix wanghailu 2024-01-02 05:50:42 +0000
  • dc6befb549 fix: fix re-dataMalloc for weight tensor and use of naive allocator support_fp16 kilinchange 2023-12-29 17:27:36 +0800
  • 935b465cf2 fix: fix matmul fp16 zhangyunze 2023-12-29 16:55:16 +0800
  • 83f1de93d0 add frontend resize kernel (#194) Chenjie Duan 2023-12-29 13:32:56 +0800
  • 3967b437c8 fix Issue 187 split infershape wrong (#197) zhangyunze 2023-12-28 21:39:24 +0800
  • 6e7bd6ca0c fix(perf.py): change NNmodel commit to fix perf.py (#203) Chenjie Duan 2023-12-28 21:31:39 +0800
  • 06f5e82d8b add dropout Baoming Li 2023-12-28 17:36:51 +0800
  • a91ed84354 fix (slice): add guard for area out of range panzezhong 2023-12-28 16:16:06 +0800
  • cada8ec6c8 add dropout wanghailu 2023-12-28 08:15:21 +0000
  • 5ac0ab442f Fix bang (#198) Hardy 2023-12-28 13:44:10 +0800
  • e5ca66db66 feat: support int8 llama kilinchange 2023-12-27 15:28:05 +0800
  • 85de28ef1e fix: provide tensor-to-node mapping for intermediate results panzezhong 2023-12-27 10:48:06 +0800
  • 3f34372012 - modify error info when kernel not found (#191) Chenjie Duan 2023-12-27 09:43:57 +0800
  • 1c7d011634 update pybind11 wanghailu 2023-12-26 16:11:55 +0800
  • c34946a0d8 refactor(frontend): sort first, then build the graph YdrMaster 2023-12-25 17:58:29 +0800
  • ce23b8356f modified dynamic_quantize_linear xgqdut2016 2023-12-25 17:33:43 +0800
  • 7b48b93fb3 feat: add field tensors for stub kilinchange 2023-12-25 15:49:21 +0800
  • 8d901ba7aa fix: run int8 llama but has nan output kilinchange 2023-12-19 17:17:22 +0800
  • 8ae5958b29 feat: add support for dynamic_quantize_linear OdinaryWord 2023-12-19 16:41:17 +0800
  • 0e75f99e7e feat: add dynamic quantize linear kernel kilinchange 2023-12-19 14:40:50 +0800
  • 97e3377ca5 feat: add matmulinteger op zhangyunze 2023-12-19 09:57:31 +0800
  • 9c82936386 add half and float dequantizeLinear xgqdut2016 2023-12-18 17:47:53 +0800
  • 03ed8c4de7 feat: support unary int8 kilinchange 2023-12-18 15:02:44 +0800
  • c63ed4326d feat: add frontend DynamicQuantizeLinear and DequantizeLinear kernels kilinchange 2023-12-18 13:58:20 +0800
  • f51ce3231a where support int8 xgqdut2016 2023-12-15 15:40:30 +0800
  • 9d9e996713 fix(graph.cc): fix topo_sort kilinchange 2023-12-15 10:13:18 +0800
  • c859e655d3 Merge branch 'master' into support_fp16 xgqdut2016 2023-12-15 09:55:50 +0800
  • 9a9587556c Add examples: inference of Paddle models (#192) learner2468 2023-12-14 19:42:43 +0800
  • e66f1c0421 fix: fix dist code to support fp16 zhangyunze 2023-12-14 18:01:03 +0800
  • a3929c25f8 Add send and recv operators based on NCCL (#182) xgqdut2016 2023-12-14 16:38:03 +0800
  • ff98241db7 modified test_cuda_conv_transposed xgqdut2016 2023-12-14 14:44:28 +0800
  • 046b2d68d8 style:fix style OdinaryWord 2023-12-14 13:35:08 +0800
  • 2af4c1276b feat:support int8 for gather OdinaryWord 2023-12-14 13:28:41 +0800
  • db8c3eec15 style: fix style OdinaryWord 2023-12-14 11:32:07 +0800
  • c29dcf1e6d add cuda cast & support half-precision for gather OdinaryWord 2023-12-14 11:24:25 +0800
  • 5ed7db1506 feat: support powOp int8 zhangyunze 2023-12-14 11:15:28 +0800
  • bdb8d8d65f feat: support matmulOp/expandOp fp16 zhangyunze 2023-12-14 11:01:46 +0800
  • cbdeb73e86 - feat: support reduceOp fp16 kilinchange 2023-12-13 17:39:39 +0800
  • 5af7f1e753 - unary support fp16 kilinchange 2023-12-13 17:05:17 +0800
  • ee4ecd27e2 feat: support sliceOp fp16 zhangyunze 2023-12-13 16:55:16 +0800
  • d5e775397d feat: support transpose fp16 zhangyunze 2023-12-13 16:36:20 +0800
  • 4b02de7e17 - element_wise support fp16 kilinchange 2023-12-12 15:15:05 +0800
  • e07516ebe9 Merge branch 'master' into support_fp16 xgqdut2016 2023-12-11 16:50:02 +0800
  • dd4a90fb5e add split_concat fp16 xgqdut2016 2023-12-11 16:45:16 +0800
  • fda0a5f982 add layernorm fp16 xgqdut2016 2023-12-11 15:05:34 +0800
  • c143eebdf7 Model storage without depending on onnx models (#196) Derui Yang 2023-12-11 10:44:06 +0800
  • 8b2e3b8e19 add where fp16 xgqdut2016 2023-12-08 16:57:49 +0800
  • a000cb0304 modified all register kernel xgqdut2016 2023-12-07 17:53:28 +0800
  • c587901586 - cpu kernel: adapt the new registration mechanism kilinchange 2023-12-05 10:49:28 +0800
  • a68ac10107 Enrich dev doc add_paddle_model learner2468 2023-12-05 17:14:28 +0800
  • 6d62350631 Change function name and add dev doc learner2468 2023-12-05 17:10:46 +0800
  • 57954fd523 Add paddle model and use InfiniTensor to infer learner2468 2023-12-05 16:53:28 +0800
  • c19256bca6 - support fp16 for conv kilinchange 2023-11-30 14:36:43 +0800
  • 4db6699e09 - Remove dataType from the kernel registration. kilinchange 2023-11-30 13:51:24 +0800
  • 67974aee8a Fix https://github.com/InfiniTensor/InfiniTensor/pull/160 (#185) test-models Hardy 2023-11-27 14:18:12 +0800
  • 3ead20a23a Fix workspace & bang conv (#183) Hardy 2023-11-24 15:16:25 +0800
  • a7293c12ba Add layer normalization (#181) xgqdut2016 2023-11-24 15:15:14 +0800
  • 6ece3f4a77 Add ReduceSum op and kernel (#160) PanZezhong1725 2023-11-24 09:29:58 +0800
  • 595a9906d2 add infer index function (#175) xgqdut2016 2023-11-24 09:24:25 +0800
  • 331f7ab2b8 support Dynamic tensor infer shape and fix memory pool (#176) zhangyunze 2023-11-23 13:11:50 +0800
  • 54f4265296 modified logic xgqdut2016 2023-11-17 17:43:52 +0800
  • 965df4e294 [feature] add fused attention_kvcache operator support (#179) point2point xiaonans 2023-11-14 23:44:22 +0800
  • 0a5d273130 Add: print derivation steps for conv2gemm NNET_231111_from_master wanghailu0717 2023-11-10 23:16:10 +0800
  • 295450e5f4 Add: show conv2gemm derivation NNET_231111 Liyan Zheng 2023-11-10 22:49:07 +0800
  • f22fa2766e add reduce_mean and gather on bang (#167) Hardy 2023-11-10 18:02:44 +0800
  • 50862df765 [Kunlun & CUDA & BANG] add depth2space operator (#178) Hardy 2023-11-10 17:58:26 +0800
  • 1ea450882b add reduce_mean and gather on kunlun (#169) Hardy 2023-11-10 17:52:09 +0800
  • d3e7543291 Cuda softmax (#129) xgqdut2016 2023-11-06 08:56:23 +0800
  • 39484e0cc4 add kernels OdinaryWord 2023-11-03 14:43:21 +0800
  • 1a6fccccbe test: support compiling einnet unit tests, but not all tests pass (#174) Derui Yang 2023-11-03 13:21:49 +0800
  • ec3adf6fa7 support 8D tensor, add test example (#170) xgqdut2016 2023-10-31 10:47:36 +0800
  • 23b825efc4 Xpu task4 support: add softmax (#172) Bolun Zhang 2023-10-30 16:01:05 +0800
  • feccd4f318 fix tensor parallel for llama (#159) constroy Li 2023-10-30 15:04:16 +0800
  • a9bd73528d more Unary OdinaryWord 2023-10-30 11:24:53 +0800
  • 95ee579338 addAbs OdinaryWord 2023-10-26 16:37:03 +0800
  • 11e2b08be3 fix wanghailu0717 2023-10-26 10:18:15 +0800