[CANN] Optimize CANN buffer pool memory management #12875

bachelor-dou · 2025-04-10T11:23:35Z

Overview

Multiple optional memory pools are provided for CANN, including VMM, priority queue-based, and traditional memory pools.

When the VMM pool is available and GGML_CANN_DISABLE_VMM_POOL is not defined, the VMM pool is selected by default.
Otherwise, if GGML_CANN_ENABLE_BUF_PRIO_POOL is defined, the priority queue-based memory pool is used.
If neither condition is met, the default memory pool is used.

This improvement was originally implemented by Frank Mai in his llama-box repository; this PR applies its cann.patch to llama.cpp.

Environment

OS: Ubuntu 22.04
NPU: 910B3
CANN: 8.0.RC3

Benchmark

The benchmark uses the model qwen2.5:0.5b-instruct-fp16.gguf.

./llama.cpp/build/bin/llama-cli -m ds/qwen2.5\:0.5b-instruct-fp16 -p "Building a website can be done in 10 steps:" -ngl 32

Use VMM pool

If neither environment variable is set, the VMM pool is used by default.

Building a website can indeed be a complex process that involves several stages. Here’s a step-by-step breakdown of what you might consider in the 10-step process:

1. **Define Your Goals and Objectives**: Clearly define what you want to achieve with your website. Are you looking to sell products, offer services, or promote a brand? Knowing your goals will guide your website design and development.

2. **Choose a Web Development Platform**: Select the platform or framework you will use to build your website. Popular choices include WordPress, Wix, Squarespace, and Netlify.

3. **Create a Website Structure**: Decide on the layout, navigation, and content of your website. Decide on the hierarchy of your site to ensure a clear user flow.

4. **Design Your Website**: Use graphic design software to create a visually appealing design. Consider themes, color schemes, and typography to enhance the user experience.

5. **Choose a Theme or Template**: Select a template or theme that fits your design and your goals. This will help you create a cohesive and professional website.

6. **Develop the Website**: Start coding your website from scratch, or use a pre-built template. This involves coding the HTML, CSS, and JavaScript to create the user interface.

7. **Test Your Website**: Use tools like browser developer tools to test your website for compatibility, responsiveness, and security. This will help you identify and fix any issues before launch.

8. **Optimize Your Website**: Consider the SEO aspects of your website, including optimizing for search engines and ensuring your website is mobile-friendly.

9. **Publish Your Website**: Once your website is tested and optimized, you can publish it on a website hosting service like WordPress.com, Squarespace.com, or Wix.com.

10. **Maintain and Update Your Website**: Continuously monitor your website for updates and improvements based on user feedback and changing market demands.

Each step can be challenging and requires careful planning and attention to detail. It's important to start with a clear vision, choose a platform that suits your goals, and follow through with consistent quality control and maintenance.

>
llama_perf_sampler_print:    sampling time =     846.24 ms /   473 runs   (    1.79 ms per token,   558.94 tokens per second)
llama_perf_context_print:        load time =    4304.85 ms
llama_perf_context_print: prompt eval time =      22.32 ms /    41 tokens (    0.54 ms per token,  1837.25 tokens per second)
llama_perf_context_print:        eval time =    7556.07 ms /   431 runs   (   17.53 ms per token,    57.04 tokens per second)
llama_perf_context_print:       total time =  101961.78 ms /   472 tokens

Use BUF_PRIO_POOL

After running export GGML_CANN_DISABLE_VMM_POOL=1 && export GGML_CANN_ENABLE_BUF_PRIO_POOL=1, the priority queue-based memory pool is enabled.

Building a website can be done in 10 steps:
assistant
Certainly! Building a website is a comprehensive process that typically involves several steps:

1. **Planning and Research**: Define the purpose and target audience of your website. Gather information on your competitors, the platforms on which they operate, and their features.

2. **Design**: Choose a design system or choose your own design. Decide on the color scheme, typography, layout, and overall design of your website.

3. **Development**: Start coding the website's structure and content. This involves HTML, CSS (for styling), and JavaScript (for interactivity).

4. **Deployment**: Decide on where you will store your files (e.g., on a server, in a cloud service, or locally). Choose a deployment method, such as static site generation (SSG) for static websites or server-side rendering (SSR) for dynamic websites.

5. **Testing**: Test your website across different browsers and devices to ensure compatibility and functionality. Use tools like browser testers or a developer’s testing suite.

6. **Launch**: Once testing is complete, deploy your website to the internet. Make sure your server can handle the traffic and that you have backups.

7. **Maintenance**: Regularly update your website with new content, improve its performance, and fix any bugs or issues that arise.

8. **SEO Optimization**: Enhance your website’s visibility in search engine results. Use relevant keywords, meta descriptions, and titles, and consider using off-page SEO strategies like backlinking.

9. **Security**: Ensure that your website is secure. This includes protecting user data, preventing hacking, and maintaining compliance with data privacy laws.

10. **Analytics and Feedback**: Use tools like Google Analytics to track website performance. Collect feedback from users and incorporate it into future updates.

Building a website is a collaborative process involving many stakeholders, so it's crucial to communicate effectively with them throughout the development and deployment phases.

>
llama_perf_sampler_print:    sampling time =     750.06 ms /   424 runs   (    1.77 ms per token,   565.29 tokens per second)
llama_perf_context_print:        load time =    4422.03 ms
llama_perf_context_print: prompt eval time =      30.95 ms /    41 tokens (    0.75 ms per token,  1324.93 tokens per second)
llama_perf_context_print:        eval time =    9070.38 ms /   382 runs   (   23.74 ms per token,    42.12 tokens per second)
llama_perf_context_print:       total time =   99563.38 ms /   423 tokens

Use BUF_POOL

After running export GGML_CANN_DISABLE_VMM_POOL=1 (with GGML_CANN_ENABLE_BUF_PRIO_POOL left unset), the traditional buffer pool is enabled.

Building a website can indeed be a complex process, but I'll break down the 10 steps into simpler, more manageable parts:

1. **Define the Purpose and Audience**: Start by defining what your website is for and who your target audience is. This will help you determine the content and design elements you need.

2. **Choose a Platform**: Select the platform or framework you'll use to build your website. Popular options include WordPress, Joomla, WordPress with built-in themes, or custom development.

3. **Design the Website**: Use the platform to design your website. This step is crucial for ensuring that your website is visually appealing and user-friendly.

4. **Create Content**: Write content that is relevant and informative. This could include blog posts, infographics, videos, or other forms of media.

5. **Choose a Theme or Template**: Select a theme or template that matches your brand identity and helps guide user experience.

6. **Implement Search Engine Optimization (SEO)**: Use SEO tools and practices to improve your website's search engine ranking.

7. **Launch the Website**: Once everything is set up, launch your website and start promoting it to attract traffic.

8. **Maintain and Update the Site**: Regularly update your website with new content and improve it based on user feedback and analytics.

9. **SEO Optimization**: Continue to optimize your website for search engines to increase visibility and organic traffic.

10. **User Experience (UX)**: Focus on creating a seamless and user-friendly experience for visitors.

Each step builds upon the previous one, so don't get discouraged if you hit roadblocks. A good website is not just a physical space, but a living, breathing entity that provides value to its users.

This process, while complex, can be broken down into manageable steps, making it easier to start and maintain a website.

>
llama_perf_sampler_print:    sampling time =     719.61 ms /   416 runs   (    1.73 ms per token,   578.09 tokens per second)
llama_perf_context_print:        load time =    4279.12 ms
llama_perf_context_print: prompt eval time =      21.54 ms /    41 tokens (    0.53 ms per token,  1903.26 tokens per second)
llama_perf_context_print:        eval time =    6469.97 ms /   374 runs   (   17.30 ms per token,    57.81 tokens per second)
llama_perf_context_print:       total time =  266109.47 ms /   415 tokens

bachelor-dou · 2025-04-14T11:49:09Z

@thxCode Thank you very much for your contribution. Given the quality of the recent code changes, we have decided to merge them into the main repository. Please review this PR; we look forward to your response.

thxCode · 2025-04-14T11:55:40Z

@bachelor-dou thanks, I think all things look good, and I would appreciate it if you could optimize them.

github-actions bot added the ggml label (changes relating to the ggml tensor library for machine learning) Apr 10, 2025

bachelor-dou changed the title ~~feat: Increase the way memory allocation is managed~~ [CANN]feat: Increase the way memory allocation is managed Apr 10, 2025

hipudding self-requested a review April 11, 2025 01:16

hipudding assigned bachelor-dou Apr 11, 2025

hipudding added the Ascend NPU label (issues specific to Ascend NPUs) Apr 11, 2025

bachelor-dou force-pushed the test branch from bbfeb0e to ffa449e Compare April 11, 2025 09:28

bachelor-dou changed the title ~~[CANN]feat: Increase the way memory allocation is managed~~ [CANN] Optimize CANN buffer pool memory management mode, supporting mode switching via environment variables Apr 14, 2025

bachelor-dou marked this pull request as ready for review April 14, 2025 11:48

bachelor-dou force-pushed the test branch from 55c0ba9 to 1759ca5 Compare April 14, 2025 12:34

bachelor-dou added 3 commits April 14, 2025 12:37

feat: Increase the way memory allocation is managed

14ac344

update some changes

cc36575

fix some errors

c21bc52

bachelor-dou force-pushed the test branch from 1759ca5 to c21bc52 Compare April 14, 2025 12:38

Merge branch 'ggml-org:master' into test

52ea510

hipudding mentioned this pull request Apr 15, 2025

Compile bug: Cann x86_64 not building #12945

Closed

hipudding changed the title ~~[CANN] Optimize CANN buffer pool memory management mode, supporting mode switching via environment variables~~ [CANN] Optimize CANN buffer pool memory management Apr 15, 2025

hipudding approved these changes Apr 15, 2025

hipudding merged commit b0c75ac into ggml-org:master Apr 15, 2025
51 checks passed

bachelor-dou deleted the test branch April 15, 2025 03:52
