Commit Graph

  • 1ea450882b
    add reduce_mean and gather on kunlun (#169) Hardy 2023-11-10 17:52:09 +0800
  • d3e7543291
    Cuda softmax (#129) xgqdut2016 2023-11-06 08:56:23 +0800
  • 39484e0cc4 add kernels OdinaryWord 2023-11-03 14:43:21 +0800
  • 1a6fccccbe
    test: support compiling einnet unit tests, though not all tests pass (#174) Derui Yang 2023-11-03 13:21:49 +0800
  • ec3adf6fa7
    support 8D tensor, add test example (#170) xgqdut2016 2023-10-31 10:47:36 +0800
  • 23b825efc4
    Xpu task4 support: add softmax (#172) Bolun Zhang 2023-10-30 16:01:05 +0800
  • feccd4f318
    fix tensor parallel for llama (#159) constroy Li 2023-10-30 15:04:16 +0800
  • a9bd73528d more Unary OdinaryWord 2023-10-30 11:24:53 +0800
  • 95ee579338 addAbs OdinaryWord 2023-10-26 16:37:03 +0800
  • 11e2b08be3 fix wanghailu0717 2023-10-26 10:18:15 +0800
  • cc057bcf80 fix wanghailu0717 2023-10-26 10:08:58 +0800
  • 7f5188bedd
    remove dimension limit of elementwise operators on xpu (#168) Haojie Wang 2023-10-25 14:38:47 +0800
  • 6b06ab0534 fix format wanghailu 2023-10-23 14:54:10 +0800
  • 412f301323 fix format wanghailu 2023-10-23 10:48:35 +0800
  • 07ef587c65
    Change onnx-simplifier to onnxsim to resolve build issue on xpu (#164) baominghelly 2023-10-21 02:58:32 +0800
  • b1bdbbf478
    Merge branch 'master' into ascend Haojie Wang 2023-10-21 02:57:51 +0800
  • 56634b3b19 fix code wanghailu 2023-10-20 15:06:08 +0800
  • b6ff4514fe fix wanghailu0717 2023-10-20 14:08:39 +0800
  • 9272d709da add a simple memory pool for allocator allocator_memPool kilinchange 2023-10-17 17:22:43 +0800
  • d0f9792613
    Fix: add building option for NNet (#162) Derui Yang 2023-10-16 19:53:28 +0800
  • 1184fa131f
    Xpu (#82) Hardy 2023-10-16 10:57:08 +0800
  • ee6dd3deac update test test_codegen Bolun 2023-10-12 14:07:23 +0800
  • 8e4d88fb9f
    add transpose, concat and split for native cpu (#158) Haojie Wang 2023-10-12 10:14:28 +0800
  • c774c9182d fix Bolun 2023-10-12 09:59:08 +0800
  • 36ae7b7fb6
    Add GatherElements op and cuda kernel (#149) PanZezhong1725 2023-10-12 09:18:12 +0800
  • 7c484d72b4
    Merge branch 'master' into change_path change_path Haojie Wang 2023-10-12 09:16:12 +0800
  • 764702beb2 format Bolun 2023-10-11 15:01:01 +0800
  • 3366cfa943 add cuda and bang test for codegen Bolun 2023-10-11 14:56:01 +0800
  • 5a74f8fa4b add test for codegen Bolun 2023-10-11 14:53:53 +0800
  • ed3034f878
    Add HardSigmoid and HardSwish (#156) PanZezhong1725 2023-10-10 22:41:06 +0800
  • 6319e12c75 merge master xpu_allreduce zhangyue 2023-10-10 17:04:09 +0800
  • 53bba11333 Merge branch 'master' of github.com:InfiniTensor/InfiniTensor into xpu_allreduce zhangyue 2023-10-10 16:59:15 +0800
  • 68deba42d3 Merge branch 'master' into xpu zhangyue 2023-10-10 16:53:49 +0800
  • 1151101fb9
    add naive allocator for debugging (#140) kilinchange 2023-10-10 16:42:23 +0800
  • 90b9a80f72
    add onnx simplify (#153) Haojie Wang 2023-10-10 15:45:27 +0800
  • c82d5fdc60
    Merge branch 'master' into xpu Haojie Wang 2023-10-10 15:32:37 +0800
  • 7f16fa353e
    【Hackathon No.108】Add Gelu operator, ffi, kernel for cpu and gpu. (#148) ChengXiang Qi 2023-10-10 15:21:13 +0800
  • 7600fe688c
    Add Neg operator and kernel (#152) PanZezhong1725 2023-10-10 10:54:56 +0800
  • 7a9fcd93b2
    Pooling ceil mode (#155) Haojie Wang 2023-10-09 20:51:39 +0800
  • 444bf53b41
    Merge branch 'master' into xpu Hardy 2023-10-09 14:08:06 +0800
  • 0294dca5c1 format wanghailu0717 2023-10-09 13:58:15 +0800
  • d4f8f55849 fix pooling test wanghailu 2023-10-09 13:47:06 +0800
  • 9d915938e8 fix wanghailu 2023-10-09 13:37:35 +0800
  • 93447fb83c fix wanghailu 2023-10-09 13:21:47 +0800
  • 785853b0a3
    Add erf kernel for cpu and gpu (#147) PanZezhong1725 2023-10-09 09:36:55 +0800
  • 33cae7fc41 Merge branch 'master' into xpu wanghailu 2023-10-09 09:30:01 +0800
  • c0ff584e04
    add constant op; fix concat bug (#151) Haojie Wang 2023-10-08 21:42:41 +0800
  • 3629881dfa blockreduce matrix times xgqdut2016 2023-10-08 15:15:39 +0800
  • 79dd3364df modified threadIdx.y to threadIdx.x xgqdut2016 2023-10-08 11:33:16 +0800
  • 56e2c87c9b modified reduce,8ms xgqdut2016 2023-10-07 18:14:12 +0800
  • de392beb87 build: update RefactorGraph dev-dynamic-graph YdrMaster 2023-10-06 15:57:51 +0800
  • 2fd02eec59 build: update RefactorGraph YdrMaster 2023-09-30 10:26:08 +0800
  • e254e07a4a Merge commit 'f25bcca076b9b479673b62bfe9a36eeb7f33c9de' into dev-dynamic-graph YdrMaster 2023-09-30 10:24:05 +0800
  • 819484eda2 matrix reduce,threadIdx.x=0,17ms xgqdut2016 2023-09-28 16:49:01 +0800
  • 8ba5815daa feat: update refactor graph commit zhangyunze 2023-09-28 15:14:08 +0800
  • ddbec7d60a BLOCK_DIM_x=1,num_block_x=N xgqdut2016 2023-09-28 12:37:10 +0800
  • f25bcca076
    add python examples (#143) Haojie Wang 2023-09-28 10:40:45 +0800
  • 2a9e3f19e9 feat: correctly set up and use Converter and the graph it constructs YdrMaster 2023-09-28 10:12:58 +0800
  • 7ed6c9b78c fmt YdrMaster 2023-09-27 18:13:56 +0800
  • ec4d12f5d4 feat: all data on gpu is managed by the cache and set to external YdrMaster 2023-09-27 18:10:08 +0800
  • af2b409878 Merge remote-tracking branch 'origin/dev-dynamic-graph-allocator' into dev-dynamic-graph YdrMaster 2023-09-27 17:47:05 +0800
  • d573455053 feat: support weight cache zhangyunze 2023-09-27 17:44:26 +0800
  • 206fbedbef -add external tag dev-dynamic-graph-allocator kilinchange 2023-09-27 13:51:50 +0800
  • 9e17481119 build: update RefactorGraph YdrMaster 2023-09-27 13:13:11 +0800
  • 877db21021
    Fix support kvcache (#142) kilinchange 2023-09-27 11:08:44 +0800
  • c5cc00a010
    Add doc support for xpu (#141) baominghelly 2023-09-27 08:55:01 +0800
  • 0fad4e16d8 refactor: do not lower tensors that the backend does not need YdrMaster 2023-09-26 15:32:19 +0800
  • b640ab1689 modified attention.cu,BLOCK_DIM_x must leq 32 xgqdut2016 2023-09-26 14:53:02 +0800
  • 4e481e72a5 change cmake option for kunlunxin xpu from 'xpu' to 'kunlun'; fix bug after merging distributed infrastructure whjthu 2023-09-26 14:10:22 +0800
  • 445442813d refactor: update RefactorGraph and reorder subprojects YdrMaster 2023-09-26 11:36:38 +0800
  • 66ef2a9c61 fix: support more numpy types YdrMaster 2023-09-25 11:34:35 +0800
  • ec391674ac 2D block, share S xgqdut2016 2023-09-25 13:02:25 +0800
  • 62be816f53
    Fix incorrect results from split and concat when dim=0 (#138) PanZezhong1725 2023-09-25 10:25:54 +0800
  • 8ae83caa7c build: update RefactorGraph to catch up with dev, remove par YdrMaster 2023-09-25 10:00:21 +0800
  • 024fc359e7 optimize: optimization tweaks YdrMaster 2023-09-22 19:50:18 +0800
  • 73356102b5 feat: use slice and range instead of vector to pass indices YdrMaster 2023-09-22 18:37:27 +0800
  • 60c5c9fd56 feat: optimize performance with move semantics and similar techniques YdrMaster 2023-09-22 18:17:07 +0800
  • f4f80c6f3b fix: set properties before malloc YdrMaster 2023-09-22 17:41:53 +0800
  • b7775b852a fix: tag all tensors YdrMaster 2023-09-22 17:08:35 +0800
  • f1dc440a3c 1D attention ,global S matrix xgqdut2016 2023-09-22 16:59:42 +0800
  • 1e05a159e1 build: update RefactorGraph YdrMaster 2023-09-22 16:33:50 +0800
  • 69df615c70 feat: support less kernel zhangyunze 2023-09-22 15:51:33 +0800
  • cbdb982006 feat: support less/expand/unsqueeze op convert zhangyunze 2023-09-22 15:23:36 +0800
  • 0512bfca63 build: update RefactorGraph YdrMaster 2023-09-22 12:58:38 +0800
  • 3efb0a9963 style: adjust usage YdrMaster 2023-09-22 09:00:26 +0800
  • b0d3edecac refactor: change the RefactorGraph namespace YdrMaster 2023-09-21 17:39:33 +0800
  • b1a2d91aba modified the format from master xgqdut2016 2023-09-21 15:03:57 +0800
  • 410844c058 modified error in onnx.py xgqdut2016 2023-09-21 14:59:12 +0800
  • 3f5178d069 the baseline of flash attention xgqdut2016 2023-09-21 14:31:43 +0800
  • d18b26c5f6 fix: inputs were never initialized YdrMaster 2023-09-21 11:36:48 +0800
  • 3e2904b18d refactor: adapt to the unnamed in-class enum DataType YdrMaster 2023-09-21 08:31:13 +0800
  • c2b11b999f feat: derive dependent variables for all edges YdrMaster 2023-09-21 05:59:35 +0800
  • e8f820f47b refactor: simplify the compiler's setInput YdrMaster 2023-09-21 03:40:49 +0800
  • 7edf1983dd feat: support setting inputs on the executor YdrMaster 2023-09-20 17:18:35 +0800
  • b80881f596 feat: split the frontend into compiler and executor YdrMaster 2023-09-20 17:07:20 +0800
  • 0174ab73bc fix: correct the name of run and re-export it YdrMaster 2023-09-20 15:52:09 +0800
  • cb8dee1a20 feat: change the definition of run to support constructing the runtime first and then passing it to run YdrMaster 2023-09-20 15:46:10 +0800
  • 184dd7968c Pass in const operator ref dump/init panzezhong 2023-09-20 14:07:33 +0800
  • bf5b372933 Move dump field from graph obj to runtime obj panzezhong 2023-09-20 13:45:33 +0800
  • d0e57dbcb1 Merge commit '8f2597a508a22456569b38e0a540cf9833492fb7' into dev-dynamic-graph YdrMaster 2023-09-20 13:45:28 +0800