Commit Graph

3 Commits

Author SHA1 Message Date
zhangyue 985d0dee5f
Kunlun dist op (#225)
* kunlun dist inference fix

* kunlun distributed

* 添加昆仑芯分布式脚本以及解决运行llama遇到的问题

* set -j8

* format

* move run_pytorch.py int o cuda/

* update notes

---------

Co-authored-by: weijie01 <weijie01@baidu.com>
Co-authored-by: wanghailu <wanghailu0717@163.com>
Co-authored-by: Haojie Wang <haojie0429@gmail.com>
2024-04-23 15:46:25 +08:00
zhangyue 00e6cc2587
XCCL support (#171)
* add reduce_mean and gather

* fix format

* add kunlun allreduce and cmakefile

* add kunlun allreduce and cmakefile

* deltete cmake opt

* fix format

* fix makefile

* add DIST option in Makefile

* add xpu allgather

* delete xpu_wait()

* add xpu allgather

* delete specific compiler

* fix format

* fix gather

* add broadcast

* fix format

* fix

* fix xpu, add where operation, fix element-wise operation

* fix softmax

* fix softmax

* log internal input and output

* fix kunlun gather bugs

* update CMakeList.txt and Makefile

* fix some kunlun kernels

* fix Makefile

* fix Makefile

* set cmake version 3.12

* format

* fix where, gather and support gpt2

* "fix format"

* fix format

* copy onnx.py from master

* use KUNLUN_HOME instead of absolute path

* fix torchvision models

* support torchvison model-zoo

* fix format

* format fix, CMakeList fix

* fix review

* fix vecToString return value

* fix format

* delete  empty file

---------

Co-authored-by: wanghailu <wanghailu0717@163.com>
Co-authored-by: wanghailu <wanghailu@qiyuanlab.com>
Co-authored-by: Haojie Wang <haojie0429@gmail.com>
2024-02-29 11:48:35 +08:00
Hardy 1184fa131f
Xpu (#82)
* support kunlun xpu and add an operator named Add

* add sub, mul, div, pow, maximum, minimum

* add code

* add xpu code

* add code

* add matmul

* add transpose

* add unary operator

* add unary operator

* add some operator

* add code

* support run resnet18 on xpu

* add code

* add max pool2d

* fix xpu code, let it can run.

* 添加XPU算子 (#120)

* add floordiv for xpu

* add batchnorm for xpu

* add more cast types for xpu

* add conv_trans for xpu

* add pad for xpu

* add logical ops for xpu

* fix format for xpu src and include

* fix format for xpu test

* fix format for xpu src

---------

Co-authored-by: Bolun <bolunz@u.nus.edu>

* Xpu abs (#121)

* add: unary kernel for xpu

* formatting

* format

* format

* format

* fix: pointer jump

* fix optype comments

* fix bug introduced while resolving conflict

* change cmake option for kunlunxin xpu from 'xpu' to 'kunlun'; fix bug after merging distributed infrastructure

* Add doc support for xpu (#141)

* fix

* fix

* fix pooling test

* format

* format

* fix

* fix

* set cmake version requirement

* fix cmakelists

* rename xpu to kunlun

* fix

* fix format

* fix format

* fix format

* fix change name to kunlun

* format

* fix format

* clang format

* fix format

---------

Co-authored-by: root <root@localhost.localdomain>
Co-authored-by: wanghailu <wanghailu@qiyuanlab.com>
Co-authored-by: wanghailu <wanghailu0717@163.com>
Co-authored-by: Bolun Zhang <48948016+Chamberlain0w0@users.noreply.github.com>
Co-authored-by: Bolun <bolunz@u.nus.edu>
Co-authored-by: zhangyue207 <138768300+zhangyue207@users.noreply.github.com>
Co-authored-by: Haojie Wang <haojie0429@gmail.com>
Co-authored-by: baominghelly <41820386+baominghelly@users.noreply.github.com>
Co-authored-by: Bolun <chamberlain0w0@gmail.com>
2023-10-16 10:57:08 +08:00