InfiniTensor/pyinfinitensor/src/pyinfinitensor
constroy Li feccd4f318
fix tensor parallel for llama (#159)
* fix Slice

* change the default number of timeit rounds to 10 to reduce benchmarking time

* fix Slice with large ends values

* Reshape supports Int64

* support position_ids as input

* skip last MatMul in Llama

* skip infer_shapes when parsing large models

* update launch.py

* fix split_concat_kernel

* print more messages in launch.py

* Reshape supports both Int32 and Int64 (see the sketch at the end of this listing)

* try infer_shapes and warn about failure (see the sketch after the commit details)

* fix format

---------

Co-authored-by: whjthu <haojie0429@gmail.com>
2023-10-30 15:04:16 +08:00
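
The "try infer_shapes and warn about failure" item above treats ONNX shape inference as a best-effort step rather than a hard requirement. Below is a minimal sketch of that behavior, assuming a hypothetical load_with_optional_shape_inference helper and model_path argument; the actual logic lives in pyinfinitensor's onnx.py and may differ in detail.

```python
import warnings

import onnx
from onnx.shape_inference import infer_shapes


def load_with_optional_shape_inference(model_path: str) -> onnx.ModelProto:
    """Load an ONNX model and run shape inference as a best-effort step."""
    model = onnx.load(model_path)
    try:
        model = infer_shapes(model)
    except Exception as exc:  # e.g. very large exported Llama models
        # Keep the original model and only warn instead of aborting.
        warnings.warn(f"infer_shapes failed, continuing without it: {exc}")
    return model
```
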
__init__.py     style: use __path__ to import            2023-02-21 09:17:34 +08:00
onnx.py         fix tensor parallel for llama (#159)     2023-10-30 15:04:16 +08:00
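
For the "Reshape supports both Int32 and Int64" item in the commit above, here is a minimal sketch of reading a Reshape node's shape tensor regardless of whether the initializer is stored as INT32 or INT64. The reshape_dims helper is hypothetical; the actual importer in onnx.py may resolve the tensor differently.

```python
import numpy as np
import onnx
from onnx import numpy_helper


def reshape_dims(model: onnx.ModelProto, node: onnx.NodeProto) -> list[int]:
    """Return the target dims of a Reshape node as plain Python ints."""
    shape_name = node.input[1]  # the second input of Reshape is the shape tensor
    for init in model.graph.initializer:
        if init.name == shape_name:
            data = numpy_helper.to_array(init)
            # Accept both 32-bit and 64-bit integer shape tensors.
            if data.dtype not in (np.int32, np.int64):
                raise TypeError(f"unsupported shape dtype: {data.dtype}")
            return [int(d) for d in data]
    raise ValueError(f"shape initializer {shape_name!r} not found")
```
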