Default Branch

5559536470 · add kunlun squeeze kernel (#229) · Updated 2024-04-28 11:28:28 +08:00

Branches

3b81100a46 · We handle null alpha when we load onnx model · Updated 2024-06-05 16:06:32 +08:00    p01753428

11
3

98b361442d · fix:test off compilation failure · Updated 2024-06-04 09:32:38 +08:00    p01753428

0
1

80cd1c951e · fast float4 · Updated 2024-05-29 15:21:54 +08:00    p01753428

70
16

bc2440cb98 · format · Updated 2024-05-28 16:03:42 +08:00    p01753428

0
49

a889527aa5 · add kunlun layernorm · Updated 2024-05-11 16:24:42 +08:00    p01753428

0
9

20f651b1d3 · implement instance norm in front · Updated 2024-05-08 17:44:54 +08:00    p01753428

0
2

7146294baa · memcopy instead of special kernel · Updated 2024-05-06 14:49:39 +08:00    p01753428

3
5

b0d030d0de · [fix] fix rope op test failing · Updated 2024-04-23 13:51:10 +08:00    p01753428

3
19

4a5b9572bb · add test scripts for llama2 and 9G models · Updated 2024-04-10 16:23:02 +08:00    p01753428

3
17

3b7b5740af · allocate workspace from allocator for kunlun runtime · Updated 2024-04-08 15:48:06 +08:00    p01753428

4
5

6c4dd7b28b · fix(front): change stub to accept GraphProto as input, eliminate the extra onnx file saved by the distributed script, use int64 as the index input type · Updated 2024-04-07 17:15:40 +08:00    p01753428

3
1

25a3cedeb0 · add pytorch bench · Updated 2024-03-21 10:27:32 +08:00    p01753428

8
1

a6c919b61d · stream kernel · Updated 2024-03-07 17:01:00 +08:00    p01753428

9
11

e33131ce5c · fix comment · Updated 2024-01-17 10:57:44 +08:00    p01753428

25
10

3b5dd7d28c · Merge branch 'master' into update_pybind11 · Updated 2024-01-05 09:20:33 +08:00    p01753428

27
2

dc6befb549 · fix: fix re-dataMalloc for weight tensor and use of naive allocator · Updated 2023-12-29 17:27:36 +08:00    p01753428

35
39

a68ac10107 · Enrich dev doc · Updated 2023-12-05 17:14:28 +08:00    p01753428

38
3

965df4e294 · [feature] add fused attention_kvcache operator support (#179) · Updated 2023-11-14 23:44:22 +08:00    p01753428

44
0
Included

0a5d273130 · Add: print derivation steps for conv2gemm · Updated 2023-11-10 23:16:44 +08:00    p01753428

45
1

295450e5f4 · Add: show conv2gemm derivation · Updated 2023-11-10 22:49:07 +08:00    p01753428

99
96