Add OneDNN or DirectML support #2303

@thewh1teagle

Currently the best results we can get with whisper.cpp are with CUDA (Nvidia) or Core ML (macOS).

On Windows there's only OpenBLAS, and it is slow: transcription takes roughly 2 times the duration of the audio (AMD Ryzen 5 4500U, medium model).
When using ctranslate2 on the same machine, transcription finishes 2-3 times faster than the audio duration, on CPU only!
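
To put both figures on one scale, here is a rough sketch that turns them into real-time factors (processing time divided by audio duration). The 60-second clip length and the function name `real_time_factor` are assumptions for illustration only; the ratios are the ones quoted above, not new measurements.

```python
def real_time_factor(processing_seconds: float, audio_seconds: float) -> float:
    """Processing time divided by audio duration; below 1.0 means faster than real time."""
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    return processing_seconds / audio_seconds


# Hypothetical clip length, used only to make the ratios concrete.
audio = 60.0

# whisper.cpp + OpenBLAS: takes about 2 times the audio duration.
rtf_openblas = real_time_factor(2 * audio, audio)

# ctranslate2: 2-3 times faster than the audio duration.
rtf_ct2_slowest = real_time_factor(audio / 2, audio)
rtf_ct2_fastest = real_time_factor(audio / 3, audio)

# Implied speedup of ctranslate2 over OpenBLAS for the same clip.
speedup_min = rtf_openblas / rtf_ct2_slowest
speedup_max = rtf_openblas / rtf_ct2_fastest

print(f"OpenBLAS RTF: {rtf_openblas:.2f}")
print(f"ctranslate2 RTF: {rtf_ct2_fastest:.2f} to {rtf_ct2_slowest:.2f}")
print(f"Implied speedup: {speedup_min:.0f}x to {speedup_max:.0f}x")
```

Read this way, the gap between the two backends on the same CPU is roughly 4x to 6x, assuming both figures refer to the same model size and audio.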

Since whisper.cpp recently removed support for OpenCL, I think it's important to have a good alternative for Windows users with Intel / AMD CPUs / TPUs.

There are a few different options that could be added:
oneDNN-ExecutionProvider.html
DirectML-ExecutionProvider.html

In addition, ctranslate2 uses ruy.

Related: ggml-org/ggml#406 (comment)
