InfiniTensor/include/operators
constroy Li f60767a770
impl distributed launch with NCCL (#106)
* add cmake bits about NCCL

* move example to examples/NNmodel

* impl NCCL communicator

* add comm related function to Runtime

* export runtime interface

* add launch.py

* use unique name to distinguish the NCCL ID file

* add timeout to communicator init

* expose communicator obj from runtime obj, add unit test for nccl communicator

* reformat files

* Add allReduce operator and cuda nccl allReduce kernel

* impl model parallel for resnet

* add allGather nccl kernel and operator

* Add allReduce/allGather operator tests; change allGather kernel to output a list of tensors; fix shape inference; handle nullptr output

* fix format of onnx.py

* use concat following AllGather

* get tensor parallel for resnet

* fix format of graph_handler.cc

* change BUILD_DIST default to OFF

* polish code of communicator

* update .gitignore

* Add broadcast operator and cuda kernel

* Add comments for operators

* remove const of class member

* move communicator to CudaRuntimeObj

* Add an empty line at EOF.
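The allReduce and allGather collectives added in this commit follow standard NCCL semantics: every rank contributes a tensor, and every rank receives the same result (the element-wise reduction, or the concatenation of all contributions). A framework-free sketch of those semantics, using hypothetical helper names rather than InfiniTensor or NCCL APIs:

```python
# Hypothetical sketch of collective semantics (not InfiniTensor/NCCL code).
# rank_tensors[i] is the tensor held by rank i, as a flat list of numbers.

def all_reduce(rank_tensors):
    """Element-wise sum across ranks; every rank receives the same result."""
    reduced = [sum(vals) for vals in zip(*rank_tensors)]
    return [list(reduced) for _ in rank_tensors]

def all_gather(rank_tensors):
    """Every rank receives the concatenation of all ranks' tensors
    (the commit follows this with a Concat over the gathered outputs)."""
    gathered = [x for tensor in rank_tensors for x in tensor]
    return [list(gathered) for _ in rank_tensors]

# Two ranks, two-element tensors:
print(all_reduce([[1.0, 2.0], [3.0, 4.0]]))  # both ranks hold [4.0, 6.0]
print(all_gather([[1.0, 2.0], [3.0, 4.0]]))  # both ranks hold [1.0, 2.0, 3.0, 4.0]
```

In the real implementation these run on GPU buffers via `ncclAllReduce`/`ncclAllGather` inside a communicator; the sketch only captures what each rank ends up holding.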

---------

Co-authored-by: panzezhong <panzezhong@qiyuanlab.com>
Co-authored-by: Haojie Wang <haojie0429@gmail.com>
2023-09-05 09:47:35 +08:00
G2BMM.h Add documentation for operators. 2023-02-13 22:51:15 +08:00
GBMM.h Add documentation for operators. 2023-02-13 22:51:15 +08:00
activation_backward.h Dev for 202303ddl (#66) 2023-04-18 15:10:33 +08:00
all_gather.h impl distributed launch with NCCL (#106) 2023-09-05 09:47:35 +08:00
all_reduce.h impl distributed launch with NCCL (#106) 2023-09-05 09:47:35 +08:00
batch_norm.h Dev for 202303ddl (#66) 2023-04-18 15:10:33 +08:00
broadcast.h impl distributed launch with NCCL (#106) 2023-09-05 09:47:35 +08:00
concat.h Add documentation for operators. 2023-02-13 22:51:15 +08:00
conv.h Dev for 202303ddl (#66) 2023-04-18 15:10:33 +08:00
det.h Dev for 202303ddl (#66) 2023-04-18 15:10:33 +08:00
dropout.h Dev for 202303ddl (#66) 2023-04-18 15:10:33 +08:00
element_wise.h refactor(core): add new `OpType` definitions (#99) 2023-08-07 11:17:05 +08:00
expand.h Framework support for bert/gpt2 model graph construction (#94) 2023-08-29 16:06:52 +08:00
extend.h Add documentation for operators. 2023-02-13 22:51:15 +08:00
gather.h feat: frontend support for gather with unit tests 2023-02-14 14:16:01 +08:00
matmul.h Dev for 202303ddl (#66) 2023-04-18 15:10:33 +08:00
membound.h NNET supports TVM backend and kernels (#78) 2023-04-18 00:26:36 +08:00
pad.h feat: frontend support for pad with unit tests 2023-02-15 11:41:06 +08:00
pooling.h refactor(core): add new `OpType` definitions (#99) 2023-08-07 11:17:05 +08:00
reduce_mean.h feat: export ReduceMean to onnx 2023-03-15 15:09:12 +08:00
reshape.h support fp16 dtype (#96) 2023-08-02 16:38:16 +08:00
resize.h Cpu backend2 (#77) 2023-04-17 12:15:23 +08:00
slice.h Dev for 202303ddl (#66) 2023-04-18 15:10:33 +08:00
softmax.h Cpu backend2 (#77) 2023-04-17 12:15:23 +08:00
split.h ADD: sub graph replacement. (#56) 2023-04-17 13:09:07 +08:00
transpose.h support mixed dtype (#102) 2023-08-16 21:49:43 +08:00
unary.h support mixed dtype (#102) 2023-08-16 21:49:43 +08:00
where.h Framework support for bert/gpt2 model graph construction (#94) 2023-08-29 16:06:52 +08:00