Commit Graph

  • 825b0170c0 fix broken link in docs fix_broken_doc_links baominghelly 2024-02-19 17:13:04 +0800
  • 936797b960 support rmsnorm accelerate_llama xiaonans 2024-02-08 10:40:03 +0800
  • 17bd98d453 modify rope op xiaonans 2024-02-06 17:04:05 +0800
  • 8cc6af0a83 modify code to pass the cuda_all_reduce test cudagraph xiaonans 2024-02-06 10:41:23 +0800
  • c04910f118 [feature] add cudagraph support xiaonans 2024-02-05 16:19:58 +0800
  • 1b9ef0f0ef Merge branch 'master' into xpu_xccl xpu_xccl Haojie Wang 2024-02-05 09:35:57 +0800
  • 0e95689ea9 fix format zhangyue 2024-02-04 11:41:59 +0800
  • 05ce239a08 support torchvision model-zoo zhangyue 2024-02-04 11:29:07 +0800
  • 900d8e58e3 Rope and silu (#214) master xiaonans 2024-02-04 11:05:27 +0800
  • babb853726 fix torchvision models zhangyue 2024-02-04 11:02:49 +0800
  • b0876a13ce Merge branch 'master' into rope_and_silu xiaonans 2024-02-04 10:57:36 +0800
  • ae9f61de5a add comment for rope operator xiaonans 2024-02-04 10:40:25 +0800
  • 9a3c0f11f6 add test for rotary embedding cuda kernel xiaonans 2024-01-30 15:27:04 +0800
  • 67b2bcb7d5 fix some mlu kernel registration & gather op (#210) zhangyunze 2024-02-01 15:02:02 +0800
  • 956ce37458 add unittest of silu kernel xiaonans 2024-01-30 10:40:13 +0800
  • 4813204a36 feat: add reshape/identity/squeeze/flatten/unsqueeze op cpu kernel (#213) zhangyunze 2024-01-30 10:29:59 +0800
  • e7d34badfb fix format ascend OdinaryWord 2024-01-26 16:11:30 +0800
  • f6176124ec add softmax/element_wise kernel OdinaryWord 2024-01-26 15:40:21 +0800
  • 030e5ca9c1 Merge branch 'master' of github.com:InfiniTensor/InfiniTensor into rope_and_silu xiaonans 2024-01-26 10:07:53 +0800
  • e8d111ef5d add rope and silu support xiaonans 2024-01-11 15:44:07 +0800
  • d1a90ba3e2 [feature] support kvcache with static graph (#209) xiaonans 2024-01-25 14:20:43 +0800
  • afed5d3c3d use workspace to optimize kvcache attention xiaonans 2024-01-25 09:08:25 +0800
  • a5062f3f89 Update README.md Haojie Wang 2024-01-24 22:16:48 +0800
  • 09b2ecf98a support more data types on mlu (#211) Hardy 2024-01-24 13:33:33 +0800
  • e78953ba92 Merge branch 'master' of github.com:InfiniTensor/InfiniTensor into xpu_xccl zhangyue 2024-01-22 13:38:41 +0800
  • c970c93ba1 Merge branch 'master' into ascend OdinaryWord 2024-01-18 15:23:47 +0800
  • dcbbc82d5b Merge branch 'master' into ascend OdinaryWord 2024-01-18 15:15:55 +0800
  • 70950e3fbb fix concat&pooling test code OdinaryWord 2024-01-18 14:52:36 +0800
  • 6a1bfd6c45 [feature] support kvcache with static graph xiaonans 2024-01-17 11:26:05 +0800
  • e33131ce5c fix comment dropout Haojie Wang 2024-01-17 02:57:44 +0000
  • 886c5e5bd2 add hack comment Haojie Wang 2024-01-16 06:33:03 +0000
  • 9b63a62b70 fix test-onnx kilinchange 2024-01-15 06:55:59 +0000
  • 8baa34a1d2 fix wanghailu 2024-01-15 05:21:38 +0000
  • 19d3e831f9 Merge branch 'master' into dropout wanghailu 2024-01-15 03:39:26 +0000
  • 51086d2b8d Modify kernel registration & support fp16 (#205) Chenjie Duan 2024-01-15 11:02:13 +0800
  • 58993d4339 Remove the frontend's dependency on onnx infershape (#206) zhangyunze 2024-01-12 14:54:27 +0800
  • 3d98e53b12 add rope and silu support kvcache_9G xiaonans 2024-01-11 15:44:07 +0800
  • 014e51e304 use KUNLUN_HOME instead of absolute path zhangyue 2024-01-11 11:43:27 +0800
  • 0ac4a58b2b copy onnx.py from master zhangyue 2024-01-11 11:18:39 +0800
  • fbfc7fe0e3 fix format zhangyue 2024-01-11 10:53:17 +0800
  • 5decb265f8 "fix format" zhangyue 2024-01-11 10:46:50 +0800
  • 9873394943 fix where, gather and support gpt2 zhangyue 2024-01-11 10:29:11 +0800
  • b3b1d3c2bf Merge branch 'master' into dropout Derui Yang 2024-01-05 18:36:47 +0800
  • addcdd1b00 fix env.sh zhangyue 2024-01-05 18:16:17 +0800
  • 554794647e Merge branch 'master' of github.com:InfiniTensor/InfiniTensor zhangyue 2024-01-05 18:10:12 +0800
  • 080c2cb504 format zhangyue 2024-01-05 17:58:49 +0800
  • ca5feb75b7 set cmake version 3.12 zhangyue 2024-01-05 17:51:01 +0800
  • d4c773d771 fix Makefile zhangyue 2024-01-05 17:15:29 +0800
  • c8cc073717 fix Makefile zhangyue 2024-01-05 17:12:41 +0800
  • 7c9285807c Merge branch 'master' of github.com:InfiniTensor/InfiniTensor into xpu_xccl zhangyue 2024-01-05 17:10:36 +0800
  • 6385e6ca10 fix some kunlun kernels zhangyue 2024-01-05 17:07:51 +0800
  • 3b5dd7d28c Merge branch 'master' into update_pybind11 update_pybind11 Haojie Wang 2024-01-05 09:20:33 +0800
  • 46e61a5bd4 Fix out-of-bounds memory access in Slice (#204) PanZezhong1725 2024-01-05 09:19:50 +0800
  • b15c4979fa fix Issue-189 question 1-15 (#195) zhangyunze 2024-01-05 08:40:18 +0800
  • 6a855085d2 modified kernel kvcache_fp16 xgqdut2016 2024-01-04 16:52:33 +0800
  • 42032356fb Bang cncl (#163) Hardy 2024-01-03 13:28:03 +0800
  • e44329c19e Merge branch 'master' of github.com:InfiniTensor/InfiniTensor zhangyue 2024-01-02 17:34:54 +0800
  • 272bfc6d48 update CMakeList.txt and Makefile zhangyue 2024-01-02 17:27:41 +0800
  • 689f8c6a5d fix wanghailu 2024-01-02 05:52:04 +0000
  • e3a2e65c47 fix wanghailu 2024-01-02 05:50:42 +0000
  • 331e787a92 fix (slice): add guard for area out of range panzezhong 2023-12-28 16:16:06 +0800
  • dc6befb549 fix: fix re-dataMalloc for weight tensor and use of naive allocator support_fp16 kilinchange 2023-12-29 17:27:36 +0800
  • 935b465cf2 fix: fix matmul fp16 zhangyunze 2023-12-29 16:55:16 +0800
  • 83f1de93d0 add frontend resize kernel (#194) Chenjie Duan 2023-12-29 13:32:56 +0800
  • 3967b437c8 fix Issue 187: wrong split infershape (#197) zhangyunze 2023-12-28 21:39:24 +0800
  • 6e7bd6ca0c fix(perf.py): change NNmodel commit to fix perf.py (#203) Chenjie Duan 2023-12-28 21:31:39 +0800
  • 06f5e82d8b add dropout Baoming Li 2023-12-28 17:36:51 +0800
  • a91ed84354 fix (slice): add guard for area out of range panzezhong 2023-12-28 16:16:06 +0800
  • cada8ec6c8 add dropout wanghailu 2023-12-28 08:15:21 +0000
  • 5ac0ab442f Fix bang (#198) Hardy 2023-12-28 13:44:10 +0800
  • e5ca66db66 feat: support int8 llama kilinchange 2023-12-27 15:28:05 +0800
  • 430f297801 support cublaslt xiaonans 2023-12-27 11:08:13 +0800
  • 85de28ef1e fix: provide tensor-to-node mapping for intermediate results panzezhong 2023-12-27 10:48:06 +0800
  • 3f34372012 modify error info when kernel not found (#191) Chenjie Duan 2023-12-27 09:43:57 +0800
  • 1c7d011634 update pybind11 wanghailu 2023-12-26 16:11:55 +0800
  • c34946a0d8 refactor(frontend): sort first, then build the graph YdrMaster 2023-12-25 17:58:29 +0800
  • ce23b8356f modified dynamic_quantize_linear xgqdut2016 2023-12-25 17:33:43 +0800
  • 7b48b93fb3 feat: add field tensors for stub kilinchange 2023-12-25 15:49:21 +0800
  • bc872f3e0f cudagraph support xiaonans 2023-12-22 16:34:18 +0800
  • cb43dfbde7 cleaning xiaonans 2023-11-28 16:29:48 +0800
  • 0e79a5cce5 gemv2N to gemv2T xiaonans 2023-11-27 16:19:01 +0800
  • 1d97b9aa29 remove cudamalloc in attention op xiaonans 2023-11-24 13:14:06 +0800
  • d68353fd9a kvcache_attention support reduce intra blocks xiaonans 2023-11-21 17:30:21 +0800
  • 8ab4c87145 add test to attention_kvcache op xiaonans 2023-11-14 10:21:34 +0800
  • e9ae4566e3 [feature] add fused attention_kvcache operator support xiaonans 2023-11-10 10:51:44 +0800
  • 6458093da4 fix graph topo & add cublaslt support & others kvcache xiaonans 2023-12-20 16:33:49 +0800
  • 5c7b456c95 fix kunlun gather bugs zhangyue 2023-12-20 11:35:42 +0800
  • 8d901ba7aa fix: run int8 llama but has nan output kilinchange 2023-12-19 17:17:22 +0800
  • 8ae5958b29 feat: add support for dynamic_quantize_linear OdinaryWord 2023-12-19 16:41:17 +0800
  • 0e75f99e7e feat: add dynamic quantize linear kernel kilinchange 2023-12-19 14:40:50 +0800
  • 97e3377ca5 feat: add matmulinteger op zhangyunze 2023-12-19 09:57:31 +0800
  • 9c82936386 add half and float dequantizeLinear xgqdut2016 2023-12-18 17:47:53 +0800
  • 03ed8c4de7 feat: support unary int8 kilinchange 2023-12-18 15:02:44 +0800
  • c63ed4326d feat: add frontend DynamicQuantizeLinear and DequantizeLinear kernels kilinchange 2023-12-18 13:58:20 +0800
  • f51ce3231a where support int8 xgqdut2016 2023-12-15 15:40:30 +0800
  • 9d9e996713 fix(graph.cc): fix topo_sort kilinchange 2023-12-15 10:13:18 +0800
  • c859e655d3 Merge branch 'master' into support_fp16 xgqdut2016 2023-12-15 09:55:50 +0800
  • 9a9587556c Add examples: inference of Paddle models (#192) learner2468 2023-12-14 19:42:43 +0800
  • e66f1c0421 fix: fix dist code to support fp16 zhangyunze 2023-12-14 18:01:03 +0800
  • a3929c25f8 Add send and recv operators based on NCCL (#182) xgqdut2016 2023-12-14 16:38:03 +0800