MLX backend POC #1365

Open
wants to merge 56 commits into
base: main
56 commits
d25f214
mlx poc
williambdean Apr 11, 2025
edacc0e
add test for dot
williambdean Apr 11, 2025
052fdc2
restore pytorch
williambdean Apr 11, 2025
a9ecad0
wrap in mx.array
williambdean Apr 11, 2025
e690bff
modify the pytorch jit
williambdean Apr 11, 2025
ad29c17
move file
williambdean Apr 11, 2025
ba29b37
dont wrap
williambdean Apr 11, 2025
8716870
attempt to fix github action
williambdean Apr 11, 2025
9bf7edf
change the rtol
williambdean Apr 11, 2025
96ba116
add init file
williambdean Apr 11, 2025
e116fa1
skip if not installed
williambdean Apr 11, 2025
5d5f754
remove torch related code / comments
williambdean Apr 11, 2025
b8cee3f
simplify the fgraph_convert
williambdean Apr 12, 2025
d057453
assert type
williambdean Apr 12, 2025
ae202e6
simplify the internal
williambdean Apr 18, 2025
f1941fe
remove the language
williambdean Apr 18, 2025
7c8eae7
Adding operations in pytensor
cetagostini Apr 18, 2025
67a74fb
add extension
williambdean Apr 18, 2025
fb5eb52
make compare function
williambdean Apr 18, 2025
516b595
rename function
williambdean Apr 18, 2025
67bb8da
correct the function name
williambdean Apr 18, 2025
82bb964
tests for elemwise
williambdean Apr 18, 2025
877d79f
Changes
cetagostini Apr 18, 2025
fafedd6
Toma tu tomate William
cetagostini Apr 18, 2025
60acb8d
Pushing changes with the core shit.
cetagostini Apr 18, 2025
242aba7
add more tests
williambdean Apr 18, 2025
6cb47fc
additional tests
williambdean Apr 18, 2025
bc98e09
test for switch with mlx
williambdean Apr 18, 2025
4d5b34b
Pushing code
cetagostini Apr 18, 2025
5abd32d
Changes
cetagostini Apr 18, 2025
12daeac
A lot of new code
cetagostini Apr 18, 2025
ac93949
almost there baby william
cetagostini Apr 18, 2025
a19cbc8
Another push small
cetagostini Apr 18, 2025
5c97bc8
fix for all
williambdean Apr 18, 2025
2fc81bc
fix for carlos
williambdean Apr 18, 2025
e6437cc
just return the compiled func
williambdean Apr 19, 2025
c3a3e1a
A change for willy may!
cetagostini Apr 19, 2025
e7cf10e
FINALLY BABY LETS PARTY! (IF YOU ARE READING THIS MAKE MORE PRs)
cetagostini Apr 19, 2025
880dd5c
refactor to use getattr
williambdean Apr 19, 2025
1e6addd
bring argmax test
williambdean Apr 19, 2025
aabbb78
use deepcopy
williambdean Apr 19, 2025
0812c55
move some tests
williambdean Apr 19, 2025
294c271
THE SUPER BLOCKWISEE YA YA YA YA JUUUUU
cetagostini Apr 19, 2025
9d3eca8
Merge branch 'mlx-poc' of https://github.com/williambdean/pytensor in…
cetagostini Apr 19, 2025
9f31ab1
Guys, I'm getting sad. We need help yisus!!!!!
cetagostini Apr 19, 2025
37440ff
WILLIAM YOU NEED TO GO ANOTHER MILE! GO ON MY MATEEEEEEE, GO PHILLIES!
cetagostini Apr 19, 2025
4e4923f
RETURN, WHAT A SHAME! Sad times are coming.
cetagostini Apr 19, 2025
6b27dc4
AI COULD BE COOL? OR WE ARE JUST FUCKING AROUND?
cetagostini Apr 19, 2025
e308f83
AI RULES BABY MY MATE
cetagostini Apr 19, 2025
3744a18
test conv1d case
williambdean Apr 19, 2025
b41cab0
I'm going for pizzas, it was an incredible day!
cetagostini Apr 19, 2025
323fa9d
Merge branch 'mlx-poc' of https://github.com/williambdean/pytensor in…
cetagostini Apr 19, 2025
9766975
SUUUUUUUUU!!!!!! LIFE IS GOING WELL. MLX FOR MEDIA MIX MODELS BAY
cetagostini Apr 19, 2025
5ffc5ef
pre-commit
cetagostini Apr 19, 2025
597f84e
Almost working
cetagostini Apr 19, 2025
fb8fd2f
Last PR sampling working
cetagostini Apr 23, 2025
11 changes: 11 additions & 0 deletions .github/workflows/test.yml
Original file line number Diff line number Diff line change
@@ -82,6 +82,7 @@ jobs:
install-numba: [0]
install-jax: [0]
install-torch: [0]
install-mlx: [0]
part:
- "tests --ignore=tests/tensor --ignore=tests/scan --ignore=tests/sparse"
- "tests/scan"
@@ -115,6 +116,7 @@ jobs:
install-numba: 0
install-jax: 0
install-torch: 0
install-mlx: 0
- install-numba: 1
os: "ubuntu-latest"
python-version: "3.10"
@@ -150,6 +152,13 @@ jobs:
fast-compile: 0
float32: 0
part: "tests/link/pytorch"
- install-mlx: 1
os: "ubuntu-latest"
python-version: "3.10"
numpy-version: ">=2.0"
fast-compile: 0
float32: 0
part: "tests/link/mlx"
- os: macos-15
python-version: "3.13"
numpy-version: ">=2.0"
@@ -196,6 +205,7 @@ jobs:
if [[ $INSTALL_NUMBA == "1" ]]; then micromamba install --yes -q -c conda-forge "python~=${PYTHON_VERSION}" "numba>=0.57"; fi
if [[ $INSTALL_JAX == "1" ]]; then micromamba install --yes -q -c conda-forge "python~=${PYTHON_VERSION}" jax jaxlib numpyro && pip install tensorflow-probability; fi
if [[ $INSTALL_TORCH == "1" ]]; then micromamba install --yes -q -c conda-forge "python~=${PYTHON_VERSION}" pytorch pytorch-cuda=12.1 "mkl<=2024.0" -c pytorch -c nvidia; fi
if [[ $INSTALL_MLX == "1" ]]; then micromamba install --yes -q -c conda-forge "python~=${PYTHON_VERSION}" mlx; fi
pip install pytest-sphinx

pip install -e ./
@@ -212,6 +222,7 @@ jobs:
INSTALL_NUMBA: ${{ matrix.install-numba }}
INSTALL_JAX: ${{ matrix.install-jax }}
INSTALL_TORCH: ${{ matrix.install-torch}}
INSTALL_MLX: ${{ matrix.install-mlx }}
OS: ${{ matrix.os}}

- name: Run tests
1 change: 0 additions & 1 deletion .gitignore
@@ -27,7 +27,6 @@ __pycache__
\#*\#
build
compiled/*.cpp
core.*
cutils_ext.cpp
dist
doc/.build/
17 changes: 17 additions & 0 deletions pytensor/compile/mode.py
@@ -27,6 +27,7 @@
from pytensor.link.basic import Linker, PerformLinker
from pytensor.link.c.basic import CLinker, OpWiseCLinker
from pytensor.link.jax.linker import JAXLinker
from pytensor.link.mlx.linker import MLXLinker
from pytensor.link.numba.linker import NumbaLinker
from pytensor.link.pytorch.linker import PytorchLinker
from pytensor.link.vm import VMLinker
@@ -50,6 +51,7 @@
    "jax": JAXLinker(),
    "pytorch": PytorchLinker(),
    "numba": NumbaLinker(),
    "mlx": MLXLinker(),
}


@@ -494,13 +496,28 @@ def clone(self, link_kwargs=None, optimizer="", **kwargs):
),
)

MLX = Mode(
    MLXLinker(),
    RewriteDatabaseQuery(
        include=["fast_run"],
        exclude=[
            "cxx_only",
            "BlasOpt",
            "fusion",
            "inplace",
            "scan_save_mem_prealloc",
        ],
    ),
)


predefined_modes = {
    "FAST_COMPILE": FAST_COMPILE,
    "FAST_RUN": FAST_RUN,
    "JAX": JAX,
    "NUMBA": NUMBA,
    "PYTORCH": PYTORCH,
    "MLX": MLX,
}

_CACHED_RUNTIME_MODES: dict[str, Mode] = {}
1 change: 1 addition & 0 deletions pytensor/link/mlx/__init__.py
@@ -0,0 +1 @@
from pytensor.link.mlx.linker import MLXLinker
13 changes: 13 additions & 0 deletions pytensor/link/mlx/dispatch/__init__.py
@@ -0,0 +1,13 @@
# isort: off
from pytensor.link.mlx.dispatch.basic import mlx_funcify, mlx_typify

import pytensor.link.mlx.dispatch.math
import pytensor.link.mlx.dispatch.basic
import pytensor.link.mlx.dispatch.elemwise
import pytensor.link.mlx.dispatch.shape
import pytensor.link.mlx.dispatch.subtensor
import pytensor.link.mlx.dispatch.core
import pytensor.link.mlx.dispatch.signal
import pytensor.link.mlx.dispatch.signal.conv
import pytensor.link.mlx.dispatch.blockwise
# isort: on
78 changes: 78 additions & 0 deletions pytensor/link/mlx/dispatch/basic.py
@@ -0,0 +1,78 @@
import warnings
from copy import deepcopy
from functools import singledispatch
from types import NoneType

import mlx.core as mx
import numpy as np

from pytensor.compile.ops import DeepCopyOp
from pytensor.graph.fg import FunctionGraph
from pytensor.link.utils import fgraph_to_python
from pytensor.raise_op import Assert, CheckAndRaise


@singledispatch
def mlx_typify(data, **kwargs):
    raise NotImplementedError(f"mlx_typify is not implemented for {type(data)}")


@mlx_typify.register(np.ndarray)
@mlx_typify.register(mx.array)
def mlx_typify_tensor(data, dtype=None, **kwargs):
    return mx.array(data, dtype=dtype)


@mlx_typify.register(slice)
@mlx_typify.register(NoneType)
@mlx_typify.register(np.number)
def mlx_typify_no_conversion_needed(data, **kwargs):
    return data


@singledispatch
def mlx_funcify(op, node=None, storage_map=None, **kwargs):
    """Create an MLX-compatible function from a PyTensor `Op`."""
    raise NotImplementedError(
        f"No MLX conversion for the given `Op`: {op}.\nCheck out `https://github.com/pymc-devs/pytensor/issues/1350` for progress or to request we prioritize this operation"
    )


@mlx_funcify.register(FunctionGraph)
def mlx_funcify_FunctionGraph(
    fgraph,
    node=None,
    fgraph_name="mlx_funcified_fgraph",
    conversion_func=mlx_funcify,
    **kwargs,
):
    built_kwargs = {"conversion_func": conversion_func, **kwargs}
    return fgraph_to_python(
        fgraph,
        conversion_func,
        type_conversion_fn=mlx_typify,
        fgraph_name=fgraph_name,
        **built_kwargs,
    )


@mlx_funcify.register(DeepCopyOp)
def mlx_funcify_DeepCopyOp(op, **kwargs):
    def deepcopyop(x):
        return deepcopy(x)

    return deepcopyop


@mlx_funcify.register(Assert)
@mlx_funcify.register(CheckAndRaise)
def mlx_funcify_CheckAndRaise(op, **kwargs):
    warnings.warn(
        f"""Skipping `CheckAndRaise` Op (assertion: {op.msg}) as MLX tracing would remove it.""",
        stacklevel=2,
    )

    def assert_fn(x, *inputs):
        return x

    return assert_fn
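
The dispatch functions in this file follow the `functools.singledispatch` pattern: a fallback that raises `NotImplementedError`, plus one registered converter per `Op` type, where stacked `register` decorators attach the same converter to several types. The sketch below illustrates that pattern with the standard library only. All names in it (`ToyOp`, `ToyAdd`, `ToyAssert`, `ToyCheck`, `ToyUnsupported`, `toy_funcify`) are hypothetical stand-ins and are not part of PyTensor or MLX.

```python
from functools import singledispatch


class ToyOp:
    """Hypothetical stand-in for an Op base class (not PyTensor's)."""


class ToyAdd(ToyOp):
    pass


class ToyAssert(ToyOp):
    pass


class ToyCheck(ToyOp):
    pass


class ToyUnsupported(ToyOp):
    pass


@singledispatch
def toy_funcify(op, **kwargs):
    # Fallback used when no converter is registered for type(op).
    raise NotImplementedError(f"No conversion for the given Op: {op}")


@toy_funcify.register(ToyAdd)
def toy_funcify_add(op, **kwargs):
    def add(x, y):
        return x + y

    return add


# Stacked register calls attach one converter to several Op types.
@toy_funcify.register(ToyAssert)
@toy_funcify.register(ToyCheck)
def toy_funcify_passthrough(op, **kwargs):
    def passthrough(x, *conditions):
        # Return the first input unchanged and ignore the conditions.
        return x

    return passthrough


add_fn = toy_funcify(ToyAdd())
check_fn = toy_funcify(ToyCheck())
print(add_fn(2, 3))  # 5
print(check_fn("value", True))  # value
```

Calling `toy_funcify(ToyUnsupported())` falls through to the base implementation and raises `NotImplementedError`, which is the same behaviour the fallback `mlx_funcify` gives for an `Op` without a registered converter.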
99 changes: 99 additions & 0 deletions pytensor/link/mlx/dispatch/blockwise.py
@@ -0,0 +1,99 @@
import mlx.core as mx

from pytensor.link.mlx.dispatch import mlx_funcify
from pytensor.tensor.blockwise import Blockwise
from pytensor.tensor.signal.conv import Conv1d


def blockwise_conv1d(op, node, **kwargs):
    """
    Custom implementation of Blockwise.conv1d for MLX.
    """

    def batched_conv1d(
        x: mx.array,
        kernels: mx.array,
        mode: str = op.core_op.mode,
        stride: int = 1,
        dilation: int = 1,
    ) -> mx.array:
        """
        Apply B separate 1D convolutions (full or valid) to B sequences in parallel.

        Parameters
        ----------
        x : array of shape (B, T)
            B sequences of length T.
        kernels : array of shape (B, K)
            B kernels of length K.
        mode : {"valid", "full"}
            "valid" → no padding, output length = T - K + 1
            "full" → zero-pad so output length = T + K - 1
        stride : int, convolution stride (default=1)
        dilation : int, convolution dilation (default=1)

        Returns
        -------
        out : array of shape (B, L)
            where, for stride=1 and dilation=1, L =
            - T - K + 1 if mode="valid"
            - T + K - 1 if mode="full"
        """
        # --- 1) shape checks ---
        B, T = x.shape
        Bk, K = kernels.shape
        if B != Bk:
            raise ValueError(f"Batch mismatch: x has {B}, kernels has {Bk}")

        # --- 2) flip kernels for convolution ---
        kernels_flipped = kernels[:, ::-1]  # shape (B, K)

        # --- 3) decide padding ---
        if mode == "valid":
            pad = 0
        elif mode == "full":
            pad = (K - 1) * dilation
        else:
            raise ValueError(f"Unsupported mode {mode!r}: choose 'valid' or 'full'")

        # --- 4) reshape into MLX conv1d form ---
        # input: (N=1, H=T, C_in=B)
        x_in = x.T[None, :, :]

        # weight: (C_out=B, H_f=K, C_in=1)
        w = kernels_flipped[:, :, None]

        # --- 5) run grouped conv1d ---
        y = mx.conv1d(x_in, w, stride=stride, padding=pad, dilation=dilation, groups=B)
        # y shape: (1, H_out, B)

        # --- 6) return shape (B, H_out) ---
        return y[0].T

    return batched_conv1d


@mlx_funcify.register(Blockwise)
def funcify_Blockwise(op: Blockwise, node, **kwargs):
    # 1) If it's a Conv1d Blockwise, use the custom implementation
    if isinstance(op.core_op, Conv1d):
        return blockwise_conv1d(op, node, **kwargs)

    # 2) Otherwise, get the core python function for this Blockwise
    core_node = op._create_dummy_core_node(node.inputs)
    core_f = mlx_funcify(op.core_op, core_node)

    # 3) Get the number of batch dimensions of this node
    #    (`batch_ndim` counts dimensions, not inputs)
    n_batch = op.batch_ndim(node)

    # 4) Build in_axes: map axis 0 of the first n_batch inputs, leave the rest unmapped
    in_axes = tuple(0 if i < n_batch else None for i in range(len(node.inputs)))

    # 5) Vectorize (vmap) with in_axes
    blockwise_f = mx.vmap(core_f, in_axes=in_axes)

    # 6) Return the mapped function
    def blockwise_fun(*inputs):
        return blockwise_f(*inputs)

    return blockwise_fun
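
The kernel flip and padding choices in `batched_conv1d` can be checked against a plain-Python reference for a single sequence. The sketch below is not the PR's implementation and does not use MLX. It assumes stride 1 and dilation 1, and `correlate_valid` and `conv1d_reference` are hypothetical helper names introduced only for illustration. It shows that flipping the kernel and running an unpadded or zero-padded sliding dot product reproduces "valid" and "full" convolution with the output lengths stated in the docstring.

```python
def correlate_valid(x, w):
    """Sliding dot product without padding (cross-correlation)."""
    k = len(w)
    return [sum(x[i + j] * w[j] for j in range(k)) for i in range(len(x) - k + 1)]


def conv1d_reference(x, kernel, mode="full"):
    """1D convolution as flip + cross-correlation (stride 1, dilation 1)."""
    k = len(kernel)
    flipped = kernel[::-1]
    if mode == "valid":
        pad = 0
    elif mode == "full":
        pad = k - 1
    else:
        raise ValueError(f"Unsupported mode {mode!r}: choose 'valid' or 'full'")
    # Zero-pad both ends, then slide the flipped kernel over the result.
    padded = [0] * pad + list(x) + [0] * pad
    return correlate_valid(padded, flipped)


x = [1, 2, 3, 4]  # T = 4
kernel = [1, 0, -1]  # K = 3
full = conv1d_reference(x, kernel, "full")
valid = conv1d_reference(x, kernel, "valid")
print(full)  # [1, 2, 2, 2, -3, -4], length T + K - 1 = 6
print(valid)  # [2, 2], length T - K + 1 = 2
```

The "valid" result is the middle slice of the "full" result, which is a quick sanity check when comparing a backend against a reference.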