forked from jiuyuan/InfiniTensor
d1a90ba3e2
* [feature] support kvcache with static graph
* use workspace to optimize kvcache attention

Co-authored-by: Haojie Wang <haojie0429@gmail.com>

Name | ||
---|---|---|
llama_kvcache_inference.py | ||
onnx_inference.py | ||
paddle_densenet.py | ||
paddle_inception.py | ||
paddle_model_dev.md | ||
paddle_resnet.py | ||
resnet_inference.py |