InfiniTensor/include/core
Hardy 1184fa131f
Xpu (#82)
* support kunlun xpu and add an operator named Add

* add sub, mul, div, pow, maximum, minimum

* add code

* add xpu code

* add code

* add matmul

* add transpose

* add unary operator

* add unary operator

* add some operator

* add code

* support run resnet18 on xpu

* add code

* add max pool2d

* fix xpu code so it can run

* Add XPU operators (#120)

* add floordiv for xpu

* add batchnorm for xpu

* add more cast types for xpu

* add conv_trans for xpu

* add pad for xpu

* add logical ops for xpu

* fix format for xpu src and include

* fix format for xpu test

* fix format for xpu src

---------

Co-authored-by: Bolun <bolunz@u.nus.edu>

* Xpu abs (#121)

* add: unary kernel for xpu

* formatting

* format

* format

* format

* fix: pointer jump

* fix optype comments

* fix bug introduced while resolving conflict

* change cmake option for kunlunxin xpu from 'xpu' to 'kunlun'; fix bug after merging distributed infrastructure

* Add doc support for xpu (#141)

* fix

* fix

* fix pooling test

* format

* format

* fix

* fix

* set cmake version requirement

* fix cmakelists

* rename xpu to kunlun

* fix

* fix format

* fix format

* fix format

* fix change name to kunlun

* format

* fix format

* clang format

* fix format

---------

Co-authored-by: root <root@localhost.localdomain>
Co-authored-by: wanghailu <wanghailu@qiyuanlab.com>
Co-authored-by: wanghailu <wanghailu0717@163.com>
Co-authored-by: Bolun Zhang <48948016+Chamberlain0w0@users.noreply.github.com>
Co-authored-by: Bolun <bolunz@u.nus.edu>
Co-authored-by: zhangyue207 <138768300+zhangyue207@users.noreply.github.com>
Co-authored-by: Haojie Wang <haojie0429@gmail.com>
Co-authored-by: baominghelly <41820386+baominghelly@users.noreply.github.com>
Co-authored-by: Bolun <chamberlain0w0@gmail.com>
2023-10-16 10:57:08 +08:00
blob.h Support bang c kernel wanghailu 0927 (#43) 2022-09-30 11:01:52 +08:00
common.h tensor parallel for transformer (#125) 2023-09-14 14:19:45 +08:00
communicator.h impl distributed launch with NCCL (#106) 2023-09-05 09:47:35 +08:00
constants.h Dev for 202303ddl (#66) 2023-04-18 15:10:33 +08:00
data_type.h Support graph construction for bert/gpt2 models in the framework (#94) 2023-08-29 16:06:52 +08:00
dummy_mutator.h Add search engine (#64) 2023-02-12 18:27:52 +08:00
graph.h add naive allocator for debugging (#140) 2023-10-10 16:42:23 +08:00
graph_handler.h Add GatherElements op and cuda kernel (#149) 2023-10-12 09:18:12 +08:00
graph_match.h ADD: sub graph replacement. (#56) 2023-04-17 13:09:07 +08:00
hash.h Dev for 202303ddl (#66) 2023-04-18 15:10:33 +08:00
kernel.h refactor(core): add new `OpType` definitions (#99) 2023-08-07 11:17:05 +08:00
lazy_allocator.h Support kvcache (#134) 2023-09-18 14:17:02 +08:00
mutator.h ADD: add mkl runtime for intel cpu, and add mkl kernel for matmul/conv/convtransposed. (#61) 2023-03-27 21:28:49 +08:00
object.h Dev for 202303ddl (#66) 2023-04-18 15:10:33 +08:00
op_type.h Xpu (#82) 2023-10-16 10:57:08 +08:00
operator.h refactor(core): add new `OpType` definitions (#99) 2023-08-07 11:17:05 +08:00
perf_engine.h Dev for 202303ddl (#66) 2023-04-18 15:10:33 +08:00
ref.h Dev for 202303ddl (#66) 2023-04-18 15:10:33 +08:00
runtime.h Xpu (#82) 2023-10-16 10:57:08 +08:00
search_engine.h Add search engine (#64) 2023-02-12 18:27:52 +08:00
tensor.h Xpu (#82) 2023-10-16 10:57:08 +08:00
tensor_base.h Copyout numpy interface (#135) 2023-09-15 16:40:44 +08:00
tensor_type.h Support kvcache (#134) 2023-09-18 14:17:02 +08:00