Commit b0c75ac

CANN: Optimize CANN buffer pool memory management (#12875)
Multiple optional memory pools are provided for CANN: a VMM pool, a priority-queue-based pool, and the traditional pool. The pool is selected as follows:
1. When the VMM pool is available and GGML_CANN_DISABLE_VMM_POOL is not defined, the VMM pool is selected by default.
2. Otherwise, if GGML_CANN_ENABLE_BUF_PRIO_POOL is defined, the priority-queue-based memory pool is used.
3. If neither condition is met, the default memory pool is used.
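The selection order in the commit message can be sketched as a small decision function. This is only an illustration of the three rules as stated. The enum `PoolKind`, the function `select_pool`, and its boolean parameters are hypothetical names, not identifiers from ggml-cann, and the sketch makes no assumption about how the two GGML_CANN_* switches are actually read.

```cpp
// Hypothetical names for illustration only; not the identifiers used in ggml-cann.
enum class PoolKind { Vmm, BufPrio, Buf };

// vmm_available:   whether the VMM pool can be used.
// disable_vmm:     whether GGML_CANN_DISABLE_VMM_POOL is defined.
// enable_buf_prio: whether GGML_CANN_ENABLE_BUF_PRIO_POOL is defined.
inline PoolKind select_pool(bool vmm_available, bool disable_vmm, bool enable_buf_prio) {
    // Rule 1: VMM pool is the default when available and not disabled.
    if (vmm_available && !disable_vmm) {
        return PoolKind::Vmm;
    }
    // Rule 2: otherwise, use the priority-queue-based pool if it is enabled.
    if (enable_buf_prio) {
        return PoolKind::BufPrio;
    }
    // Rule 3: fall back to the default (traditional) pool.
    return PoolKind::Buf;
}
```

Note that under this reading, enabling the priority-queue pool has no effect while the VMM pool is available and not disabled, because rule 1 is checked first.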
1 parent d6d2c2a commit b0c75ac

2 files changed: +335 -64 lines changed

ggml/src/ggml-cann/aclnn_ops.cpp

+1 -1
@@ -1783,7 +1783,7 @@ void ggml_cann_get_rows(ggml_backend_cann_context& ctx, ggml_tensor* dst) {
         src0->data, ACL_INT8, sizeof(int8_t), weight_ne, weight_nb,
         GGML_MAX_DIMS + 1);
     aclTensor* acl_scale_tensor = ggml_cann_create_tensor(
-        src0->data, ACL_FLOAT16, sizeof(float16_t), scale_ne, scale_nb,
+        src0->data, ACL_FLOAT16, sizeof(uint16_t), scale_ne, scale_nb,
         GGML_MAX_DIMS + 1, ACL_FORMAT_ND, scale_offset);
     aclTensor* dequant_tensor = ggml_cann_create_tensor(
         dequant_buffer_allocator.get(), ACL_FLOAT, sizeof(float_t),