InfiniTensor/pyinfinitensor
constroy Li feccd4f318
fix tensor parallel for llama (#159)
* fix Slice

* change default rounds of timeit to 10 to reduce time

* fix slice with large ends

* Reshape supports Int64

* support position_ids as input

* skip last MatMul in Llama

* skip infer_shapes to parse large models

* update launch.py

* fix split_concat_kernel

* print more messages in launch.py

* Reshape supports both Int32 and Int64

* try infer_shapes and warn on failure

* fix format

---------

Co-authored-by: whjthu <haojie0429@gmail.com>
2023-10-30 15:04:16 +08:00
docs feat: create pyinfinitensor frontend 2023-02-13 09:19:05 +08:00
src/pyinfinitensor fix tensor parallel for llama (#159) 2023-10-30 15:04:16 +08:00
tests Xpu (#82) 2023-10-16 10:57:08 +08:00
pyproject.toml Change onnx-simplifier to onnxsim to resolve build issue on xpu (#164) 2023-10-21 02:58:32 +08:00