Commit 80f19b4

lhez and shawngu-quic authored
opencl: split ggml-opencl.cl into multiple files and cleanup (#12886)
* opencl: refactor - split the kernel files

---------

Co-authored-by: Shangqing Gu <[email protected]>

* opencl: split more kernels into separate files
* opencl: specify subgroup size instead of querying it
* opencl: refine Adreno cl compiler version parsing
* opencl: skip some kernels not used by Adreno on old compilers
* opencl: refine logic for selecting Adreno kernels
* opencl: refine Adreno cl compiler version
* opencl: cleanup preprocessor for kernels
* opencl: consider Adreno CL compiler on Windows
* opencl: add final newline for `mul_mv_f16_f16.cl`

---------

Co-authored-by: Shangqing Gu <[email protected]>
1 parent f8f820c commit 80f19b4

43 files changed: +5021 -4996 lines changed

ggml/src/ggml-opencl/CMakeLists.txt

+35 -10

@@ -54,16 +54,41 @@ function(ggml_opencl_add_kernel KNAME)
 endfunction()
 
 set(GGML_OPENCL_KERNELS
-    ggml-opencl
-    ggml-opencl_mm
-    ggml-opencl_cvt
-    ggml-opencl_gemv_noshuffle
-    ggml-opencl_gemv_noshuffle_general
-    ggml-opencl_mul_mat_Ab_Bi_8x4
-    ggml-opencl_transpose_16
-    ggml-opencl_transpose_32
-    ggml-opencl_transpose_32_16
-    ggml-opencl_im2col
+    add
+    clamp
+    cpy
+    cvt
+    diag_mask_inf
+    gelu
+    gemv_noshuffle_general
+    gemv_noshuffle
+    get_rows
+    im2col_f32
+    im2col_f16
+    mul_mat_Ab_Bi_8x4
+    mul_mv_f16_f16
+    mul_mv_f16_f32_1row
+    mul_mv_f16_f32_l4
+    mul_mv_f16_f32
+    mul_mv_f32_f32
+    mul_mv_q4_0_f32
+    mul_mv_q4_0_f32_v
+    mul_mv_q4_0_f32_8x_flat
+    mul_mv_q4_0_f32_1d_8x_flat
+    mul_mv_q4_0_f32_1d_16x_flat
+    mul_mv_q6_k
+    mul
+    norm
+    relu
+    rms_norm
+    rope
+    scale
+    silu
+    softmax_4_f32
+    softmax_4_f16
+    softmax_f32
+    softmax_f16
+    transpose
 )
 
 foreach (K ${GGML_OPENCL_KERNELS})
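
The hunk stops at the `foreach` line, so the loop body is not visible here. As a rough illustration of the pattern (one list entry per kernel file, each passed to a helper), the following is a minimal standalone sketch. It does not show the body of `ggml_opencl_add_kernel`. The helper `sketch_add_kernel`, the list `SKETCH_KERNELS`, and the `kernels/` directory layout are all hypothetical names chosen for this example.

```cmake
# Standalone sketch; run with: cmake -P sketch.cmake
# All names below are hypothetical and only illustrate the list-plus-loop pattern.
cmake_minimum_required(VERSION 3.10)

# Hypothetical stand-in for a per-kernel helper: maps a kernel name to a .cl file
# and reports whether that file is present.
function(sketch_add_kernel KNAME)
    # Assumed layout: one <name>.cl file per kernel under ./kernels
    set(KERN_SRC "${CMAKE_CURRENT_SOURCE_DIR}/kernels/${KNAME}.cl")
    if (NOT EXISTS "${KERN_SRC}")
        message(WARNING "kernel source not found: ${KERN_SRC}")
        return()
    endif()
    message(STATUS "would register kernel source: ${KERN_SRC}")
endfunction()

# A short subset of kernel names taken from the list in the diff above.
set(SKETCH_KERNELS
    add
    clamp
    cpy
)

# One helper call per list entry, mirroring the foreach in the diff.
foreach (K ${SKETCH_KERNELS})
    sketch_add_kernel(${K})
endforeach()
```

With this shape, adding a kernel only requires a new `.cl` file and one more entry in the list, which appears to be the motivation for replacing the ten monolithic `ggml-opencl_*` entries with per-operation names.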
