diff --git a/docs/source/learn/core_notebooks/dimensionality.ipynb b/docs/source/learn/core_notebooks/dimensionality.ipynb index f58b74b59b..8eba7adc4c 100644 --- a/docs/source/learn/core_notebooks/dimensionality.ipynb +++ b/docs/source/learn/core_notebooks/dimensionality.ipynb @@ -9,231 +9,1419 @@ }, "source": [ "(dimensionality)=\n", - "# PyMC Dimensionality\n", - "PyMC provides a number of ways to specify the dimensionality of its distributions. In this document we will not provide an exhaustive explanation but rather an overview and current best practices. \n", + "# Distribution Dimensionality\n", + "PyMC provides a number of ways to specify the dimensionality of its distributions. This document provides an overview, and offers some user tips.\n", "\n", "## Glossary\n", "In this document we'll be using the term dimensionality to refer to the idea of dimensions. Each of the terms below has a specific\n", "semantic and computational definition in PyMC. While we share them here they will make much more sense when viewed in the examples below.\n", "\n", - "+ *Implied dimensions* → dimensionality that follows from inputs to the RV\n", - "+ *Support dimensions* → dimensions you can NEVER get rid of\n", - "+ *`ndim_support`* → smallest shape that can result from a random draw. 
This is a fixed attribute in the distribution definition\n", - "+ *Shape* → final resulting tensor shape\n", - "+ *Size* → shape minus the support dimensions\n", - "+ *Dims* → An array of dimension names\n", - "+ *Coords* → A dictionary mapping dimension names to coordinate values\n", + "+ *Support dimensions* → The core dimensionality of a distribution\n", + "+ *Batch dimensions* → Extra dimensions beyond the support dimensionality of a distribution\n", + "+ *Implicit dimensions* → Dimensions that follow from the values or shapes of the distribution parameters\n", + "+ *Explicit dimensions* → Dimensions that are explicitly defined by one of the following arguments:\n", + " + *Shape* → Number of draws from a distribution\n", + " + *Dims* → An array of dimension names\n", + "+ *Coords* → A dictionary mapping dimension names to coordinate values" + ] + }, + { + "cell_type": "code", + "execution_count": 1, + "metadata": { + "pycharm": { + "name": "#%%\n" + } + }, + "outputs": [], + "source": [ + "from functools import partial\n", "\n", + "import pymc as pm\n", + "import numpy as np\n", + "import aesara.tensor as at" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "pycharm": { + "name": "#%% md\n" + } + }, + "source": [ + "## Univariate distribution example\n", + "We can start with the simplest case, a single Normal distribution. We use `.dist` to specify one outside of a PyMC Model." + ] + }, + { + "cell_type": "code", + "execution_count": 2, + "metadata": { + "pycharm": { + "name": "#%%\n" + } + }, + "outputs": [], + "source": [ + "normal_dist = pm.Normal.dist()" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "pycharm": { + "name": "#%% md\n" + } + }, + "source": [ + "We can then use the {func}`~pymc.draw` function to take a random draw from that distribution." 
+ ] + }, + { + "cell_type": "code", + "execution_count": 3, + "metadata": {}, + "outputs": [], + "source": [ + "# Just patching the draw function for reproducibility\n", + "rng = np.random.default_rng(seed=sum(map(ord, \"dimensionality\")))\n", + "draw = partial(pm.draw, random_seed=rng)" + ] + }, + { + "cell_type": "code", + "execution_count": 4, + "metadata": { + "pycharm": { + "name": "#%%\n" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "(array(0.80189558), 0)" + ] + }, + "execution_count": 4, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "normal_draw = draw(normal_dist)\n", + "normal_draw, normal_draw.ndim" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "pycharm": { + "name": "#%% md\n" + } + }, + "source": [ + "In this case we end up with a single scalar value. This means that a Normal distribution has a scalar support dimensionality, as the smallest random draw you can take is a scalar which has a dimension of zero. The support dimensionality of every distribution is hard-coded as a property." + ] + }, + { + "cell_type": "code", + "execution_count": 5, + "metadata": { + "pycharm": { + "name": "#%%\n" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "0" + ] + }, + "execution_count": 5, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "normal_dist.owner.op.ndim_supp" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "pycharm": { + "name": "#%% md\n" + } + }, + "source": [ + "### Explicit batch dimensions" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "pycharm": { + "name": "#%% md\n" + } + }, + "source": [ + "If one needs more than a single draw, a natural tendency would be to create multiple copies of the same variable and stack them together." 
+ ] + }, + { + "cell_type": "code", + "execution_count": 6, + "metadata": { + "pycharm": { + "name": "#%%\n" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "array([ 0.9434115 , -0.33327414, 0.83636296])" + ] + }, + "execution_count": 6, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "normal_dists = pm.math.stack([pm.Normal.dist() for _ in range(3)])\n", + "draw(normal_dists)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "pycharm": { + "name": "#%% md\n" + } + }, + "source": [ + "More simply, one can create a *batch* of independent draws from the same distribution family by using the shape argument." + ] + }, + { + "cell_type": "code", + "execution_count": 7, + "metadata": { + "pycharm": { + "name": "#%%\n" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "array([ 0.98810294, -0.07003785, -0.37962748])" + ] + }, + "execution_count": 7, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "normal_dists = pm.Normal.dist(shape=(3,))\n", + "draw(normal_dists)" + ] + }, + { + "cell_type": "code", + "execution_count": 8, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "array([[ 7.99932116e-04, -1.94407945e+00, 3.90451962e-01],\n", + " [ 1.10657367e+00, 6.49042149e-01, -1.09450185e+00],\n", + " [-2.96226305e-01, 1.41884595e+00, -1.31589441e+00],\n", + " [ 1.53109449e+00, -7.73771737e-01, 2.37418367e+00]])" + ] + }, + "execution_count": 8, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "normal_dists = pm.Normal.dist(shape=(4, 3))\n", + "draw(normal_dists)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "pycharm": { + "name": "#%% md\n" + } + }, + "source": [ + "Not only is this more succinct, but it produces much more efficient vectorized code. We rarely use the stack approach in PyMC, unless we need to combine draws from distinct distribution families."
+ ] + }, + { + "cell_type": "markdown", + "metadata": { + "pycharm": { + "name": "#%% md\n" + } + }, + "source": [ + "### Implicit batch dimensions" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "pycharm": { + "name": "#%% md\n" + } + }, + "source": [ + "It is also possible to create a batch of draws by passing parameters with higher dimensions, without having to specify shape." + ] + }, + { + "cell_type": "code", + "execution_count": 9, + "metadata": { + "pycharm": { + "name": "#%%\n" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "array([ 0.81833093, -0.2891973 , 1.2399946 ])" + ] + }, + "execution_count": 9, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "normal_dists = pm.Normal.dist(mu=np.array([0, 0, 0]), sigma=np.array([1, 1, 1]))\n", + "draw(normal_dists)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "pycharm": { + "name": "#%% md\n" + } + }, + "source": [ + "This is equivalent to the previous example with explicit shape, and we could have passed it explicitly here. Because we did not, we refer to these batch dimensions as being *implicit*.\n", "\n", - "## General Recommendations\n", - "### When prototyping implied and size are convenient\n", - "Implied dimensions are easy to specify and great for quickly expanding an existing RV. F\n", - "\n", - "### For reusable code we suggest dims\n", - "For any more important work, or reuable work we suggest dims and coords as the labels will be passed to {class}'arviz.InferenceData'. This is both best practice transparency and readability for others. It also is useful in single developer workflows, for example, in cases where there is a 3 dimensional or higher RV it'll help indiciate which dimension corresponds to which model concept.\n", - "\n", - "### Use shape if you'd like to be explicit\n", - "Use shape if you'd like to bypass any dimensionality calculations implicit in PyMC. 
This will strictly specify the dimensionality to Aesara\n", - "\n", - "### When debugging use unique prime numbers\n", - "By using prime numbers it will be easier to determine where how input dimensionalities are being converted to output dimensionalities.\n", - "Once confident with result then change the dimensionalities to match your data or modeling needs." + "Where this becomes very useful is when we want the parameters to vary across batch dimensions." + ] + }, + { + "cell_type": "code", + "execution_count": 10, + "metadata": { + "pycharm": { + "name": "#%%\n" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "array([ 0.99989975, 10.00009874, 100.00004215])" + ] + }, + "execution_count": 10, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "draw(pm.Normal.dist(mu=[1, 10, 100], sigma=0.0001))" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "pycharm": { + "name": "#%% md\n" + } + }, + "source": [ + "When the parameters don't have the same shapes, they are broadcasted, in a similar way to how NumPy works. In this case `sigma` was broadcasted to match the shape of `mu`." + ] + }, + { + "cell_type": "code", + "execution_count": 11, + "metadata": { + "pycharm": { + "name": "#%%\n" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "[array([ 1, 10, 100]), array([0.0001, 0.0001, 0.0001])]" + ] + }, + "execution_count": 11, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "np.broadcast_arrays([1, 10, 100], 0.0001)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "It's important to understand how NumPy {ref}`broadcasting ` works. When you attempt an operation that is not valid, you will run into errors like this one:" + ] + }, + { + "cell_type": "code", + "execution_count": 12, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "shape mismatch: objects cannot be broadcast to a single shape. 
Mismatch is between arg 0 with shape (3,) and arg 1 with shape (2,).\n", + "Apply node that caused the error: normal_rv{0, (0, 0), floatX, True}(RandomGeneratorSharedVariable(), TensorConstant{[]}, TensorConstant{11}, TensorConstant{[ 1 10 100]}, TensorConstant{(2,) of 0.1})\n", + "Toposort index: 0\n", + "Inputs types: [RandomGeneratorType, TensorType(int64, (0,)), TensorType(int64, ()), TensorType(int64, (3,)), TensorType(float64, (2,))]\n", + "Inputs shapes: ['No shapes', (0,), (), (3,), (2,)]\n", + "Inputs strides: ['No strides', (8,), (), (8,), (8,)]\n", + "Inputs values: [Generator(PCG64) at 0x7F7BE616D0E0, array([], dtype=int64), array(11), array([ 1, 10, 100]), array([0.1, 0.1])]\n", + "Outputs clients: [['output'], ['output']]\n", + "\n", + "HINT: Re-running with most Aesara optimizations disabled could provide a back-trace showing when this node was created. This can be done by setting the Aesara flag 'optimizer=fast_compile'. If that does not work, Aesara optimizations can be disabled with 'optimizer=None'.\n", + "HINT: Use the Aesara flag `exception_verbosity=high` for a debug print-out and storage map footprint of this Apply node.\n" + ] + } + ], + "source": [ + "try:\n", + " # shapes of (3,) and (2,) can't be broadcasted together\n", + " draw(pm.Normal.dist(mu=[1, 10, 100], sigma=[0.1, 0.1]))\n", + "except ValueError as error:\n", + " print(error)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Combining implicit and explicit batch dimensions" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "You can combine explicit shape dimensions with implicit batch dimensions. As mentioned above, they can provide the same information." 
+ ] + }, + { + "cell_type": "code", + "execution_count": 13, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "array([-0.49526775, -0.94608062, 1.66397913])" + ] + }, + "execution_count": 13, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "normal_dists = pm.Normal.dist(mu=np.array([0, 1, 2]), sigma=1, shape=(3,))\n", + "draw(normal_dists)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "But shape can also be used to extend beyond any implicit batch dimensions." + ] + }, + { + "cell_type": "code", + "execution_count": 14, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "array([[ 2.22626513, 2.12938134, 0.49074886],\n", + " [ 0.08312601, 1.05049093, 1.91718083],\n", + " [-0.68191815, 1.43771096, 1.76780399],\n", + " [-0.59883241, 0.26954893, 2.74319335]])" + ] + }, + "execution_count": 14, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "normal_dists = pm.Normal.dist(mu=np.array([0, 1, 2]), sigma=1, shape=(4, 3))\n", + "draw(normal_dists)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Note that, due to broadcasting rules, explicit batch dimensions must always \"go on the left\" of any implicit dimensions. So in the previous example `shape=(4, 3)` is valid, but `shape=(3, 4)` is not, because the `mu` parameter can be broadcasted to the first shape but not to the second." + ] + }, + { + "cell_type": "code", + "execution_count": 15, + "metadata": { + "scrolled": true + }, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "shape mismatch: objects cannot be broadcast to a single shape. 
Mismatch is between arg 0 with shape (3, 4) and arg 1 with shape (3,).\n", + "Apply node that caused the error: normal_rv{0, (0, 0), floatX, True}(RandomGeneratorSharedVariable(), TensorConstant{[3 4]}, TensorConstant{11}, TensorConstant{[0 1 2]}, TensorConstant{1.0})\n", + "Toposort index: 0\n", + "Inputs types: [RandomGeneratorType, TensorType(int64, (2,)), TensorType(int64, ()), TensorType(int64, (3,)), TensorType(float64, ())]\n", + "Inputs shapes: ['No shapes', (2,), (), (3,), ()]\n", + "Inputs strides: ['No strides', (8,), (), (8,), ()]\n", + "Inputs values: [Generator(PCG64) at 0x7F7BE616DEE0, array([3, 4]), array(11), array([0, 1, 2]), array(1.)]\n", + "Outputs clients: [['output'], ['output']]\n", + "\n", + "HINT: Re-running with most Aesara optimizations disabled could provide a back-trace showing when this node was created. This can be done by setting the Aesara flag 'optimizer=fast_compile'. If that does not work, Aesara optimizations can be disabled with 'optimizer=None'.\n", + "HINT: Use the Aesara flag `exception_verbosity=high` for a debug print-out and storage map footprint of this Apply node.\n" + ] + } + ], + "source": [ + "try:\n", + " draw(pm.Normal.dist(mu=np.array([0, 1, 2]), sigma=1, shape=(3, 4)))\n", + "except ValueError as error:\n", + " print(error)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "If you need the Normal variables to have `shape=(3, 4)`, you can define them with `shape=(4, 3)` and transpose the result."
+ ] + }, + { + "cell_type": "code", + "execution_count": 16, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "array([[-0.73397401, -0.18717845, -0.78548049, 1.64478883],\n", + " [ 3.54543846, 1.22954216, 2.13674063, 1.94194106],\n", + " [ 0.85294471, 3.52041332, 2.94428975, 3.25944187]])" + ] + }, + "execution_count": 16, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "transposed_normals = pm.Normal.dist(mu=np.array([0, 1, 2]), sigma=1, shape=(4, 3)).T\n", + "draw(transposed_normals)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + ":::{tip} It's important not to confuse dimensions set in the definition of a distribution versus those set in downstream manipulations like transposition, indexing or broadcasting. When sampling with PyMC (be it via forward sampling or MCMC), the random draws will always emanate from the distribution shape. Notice how in the following example, a different number of \"random\" draws were actually taken, despite the two variables having the same final shape.\n", + ":::" + ] + }, + { + "cell_type": "code", + "execution_count": 17, + "metadata": {}, + "outputs": [], + "source": [ + "vector_normals = pm.Normal.dist(size=(3,))\n", + "broadcasted_normal = at.broadcast_to(pm.Normal.dist(), (3,))" + ] + }, + { + "cell_type": "code", + "execution_count": 18, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "(array([-0.45755879, 1.59975702, 0.20546749]),\n", + " array([0.29866199, 0.29866199, 0.29866199]))" + ] + }, + "execution_count": 18, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "draw(vector_normals), draw(broadcasted_normal)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "pycharm": { + "name": "#%% md\n" + } + }, + "source": [ + "## Multivariate distribution example" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "pycharm": { + "name": "#%% md\n" + } + }, + "source": [ + "Some distributions 
by definition return more than one value when evaluated. This may be a vector of values or a matrix or an arbitrary multidimensional tensor. An example is the Multivariate Normal, which always returns a vector (an array with one dimension)." + ] + }, + { + "cell_type": "code", + "execution_count": 19, + "metadata": { + "pycharm": { + "name": "#%%\n" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "(array([0.55390975, 2.17440418, 1.83014764]), 1)" + ] + }, + "execution_count": 19, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "mvnormal_dist = pm.MvNormal.dist(mu=np.ones(3), cov=np.eye(3))\n", + "mvnormal_draw = draw(mvnormal_dist)\n", + "mvnormal_draw, mvnormal_draw.ndim" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "pycharm": { + "name": "#%% md\n" + } + }, + "source": [ + "As with any distribution, the support dimensionality is specified as a fixed property" + ] + }, + { + "cell_type": "code", + "execution_count": 20, + "metadata": { + "pycharm": { + "name": "#%%\n" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "1" + ] + }, + "execution_count": 20, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "mvnormal_dist.owner.op.ndim_supp" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "pycharm": { + "name": "#%% md\n" + } + }, + "source": [ + "Even if you specify a MvNormal with a single dimension, you get back a vector!" 
+ ] + }, + { + "cell_type": "code", + "execution_count": 21, + "metadata": { + "pycharm": { + "name": "#%%\n" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "(array([-0.68893796]), 1)" + ] + }, + "execution_count": 21, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "smallest_mvnormal_dist = pm.MvNormal.dist(mu=[1], cov=[[1]])\n", + "smallest_mvnormal_draw = draw(smallest_mvnormal_dist)\n", + "smallest_mvnormal_draw, smallest_mvnormal_draw.ndim" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "pycharm": { + "name": "#%% md\n" + } + }, + "source": [ + "### Implicit support dimensions" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "pycharm": { + "name": "#%% md\n" + } + }, + "source": [ + "In the MvNormal examples we just saw, the support dimension was actually implicit. Nowhere did we specify we wanted a vector of 3 or 1 draws. This was inferred from the shape of `mu` and `cov`. As such, we refer to it as being an *implicit support dimension*. We could be a bit more explicit by using shape." + ] + }, + { + "cell_type": "code", + "execution_count": 22, + "metadata": { + "pycharm": { + "name": "#%%\n" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "array([0.57262853, 0.34230354, 1.96818163])" + ] + }, + "execution_count": 22, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "explicit_mvnormal = pm.MvNormal.dist(mu=np.ones(3), cov=np.eye(3), shape=(3,))\n", + "draw(explicit_mvnormal)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + ":::{warning} However, note that at the time of writing shape is simply ignored for support dimensions. 
It serves merely as a \"type-hint\" for labeling the expected dimensions.\n", + ":::" + ] + }, + { + "cell_type": "code", + "execution_count": 23, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "array([1.0623799 , 0.84622693, 0.34046237])" + ] + }, + "execution_count": 23, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "ignored_shape_mvnormal = pm.MvNormal.dist(mu=np.ones(3), cov=np.eye(3), shape=(4,))\n", + "draw(ignored_shape_mvnormal)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "pycharm": { + "name": "#%% md\n" + } + }, + "source": [ + "### Explicit batch dimensions" ] }, { "cell_type": "markdown", - "metadata": {}, + "metadata": { + "pycharm": { + "name": "#%% md\n" + } + }, "source": [ - "## Code Examples" + "As with univariate distributions, we can add explicit batched dimensions. We will use another vector distribution to illustrate this: the Multinomial. The following snippet defines a matrix of five independent Multinomial distributions, each of which is a vector of size 3." 
] }, { "cell_type": "code", - "execution_count": 1, + "execution_count": 24, + "metadata": { + "pycharm": { + "name": "#%%\n" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "array([[2, 0, 3],\n", + " [1, 1, 3],\n", + " [0, 2, 3],\n", + " [0, 1, 4],\n", + " [1, 0, 4]])" + ] + }, + "execution_count": 24, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "draw(pm.Multinomial.dist(n=5, p=[0.1, 0.3, 0.6], shape=(5, 3)))" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "pycharm": { + "name": "#%% md\n" + } + }, + "source": [ + ":::{warning} Again, note that shape has no effect on the support dimensionality\n", + ":::" + ] + }, + { + "cell_type": "code", + "execution_count": 25, "metadata": {}, - "outputs": [], + "outputs": [ + { + "data": { + "text/plain": [ + "array([[0, 1, 4],\n", + " [0, 0, 5],\n", + " [3, 1, 1],\n", + " [0, 1, 4],\n", + " [0, 2, 3]])" + ] + }, + "execution_count": 25, + "metadata": {}, + "output_type": "execute_result" + } + ], "source": [ - "import pymc as pm\n", - "import numpy as np" + "draw(pm.Multinomial.dist(n=5, p=[0.1, 0.3, 0.6], shape=(5, 4)))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ - "## Scalar distribution example\n", - "We can start with the simplest case, a single Normal distribution. We specify one as shown below" + "For the same reason, you must always define explicit batched dimensions \"to the left\" of the support dimension. The following will not behave as expected." 
] }, { "cell_type": "code", - "execution_count": 2, + "execution_count": 26, "metadata": {}, - "outputs": [], + "outputs": [ + { + "data": { + "text/plain": [ + "array([[2, 0, 3],\n", + " [1, 3, 1],\n", + " [1, 1, 3]])" + ] + }, + "execution_count": 26, + "metadata": {}, + "output_type": "execute_result" + } + ], "source": [ - "normal_dist = pm.Normal.dist()" + "draw(pm.Multinomial.dist(n=5, p=[0.1, 0.3, 0.6], shape=(3, 5)))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ - "We can then take a random sample from that same distribution and print both the draw and shape" + "If you needed the Multinomial variables to have `shape=(3, 5)` you can transpose it after defining it." ] }, { "cell_type": "code", - "execution_count": 3, + "execution_count": 27, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "array([[0, 0, 0, 0, 0],\n", + " [2, 2, 1, 0, 3],\n", + " [3, 3, 4, 5, 2]])" + ] + }, + "execution_count": 27, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "transposed_multinomials = pm.Multinomial.dist(n=5, p=[0.1, 0.3, 0.6], shape=(5, 3)).T\n", + "draw(transposed_multinomials)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "pycharm": { + "name": "#%% md\n" + } + }, + "source": [ + "### Implicit batch dimensions" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "pycharm": { + "name": "#%% md\n" + } + }, + "source": [ + "As with univariate distributions, we can use different parameters for each batched dimension" + ] + }, + { + "cell_type": "code", + "execution_count": 28, + "metadata": { + "pycharm": { + "name": "#%%\n" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "array([[1, 2, 2],\n", + " [0, 3, 7]])" + ] + }, + "execution_count": 28, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "multinomial_dist = pm.Multinomial.dist(n=[5, 10], p=[0.1, 0.3, 0.6])\n", + "draw(multinomial_dist)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + 
"pycharm": { + "name": "#%% md\n" + } + }, + "source": [ + "Which is equivalent to the more verbose" + ] + }, + { + "cell_type": "code", + "execution_count": 29, + "metadata": { + "pycharm": { + "name": "#%%\n" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "array([[2, 2, 1],\n", + " [0, 3, 7]])" + ] + }, + "execution_count": 29, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "draw(pm.Multinomial.dist(n=[5, 10], p=[[0.1, 0.3, 0.6], [0.1, 0.3, 0.6]]))" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "pycharm": { + "name": "#%% md\n" + } + }, + "source": [ + "If you are familiar with NumPy broadcasting rules you may be curious of how does PyMC make this work. Naive broadcasting wouldn't work here" + ] + }, + { + "cell_type": "code", + "execution_count": 30, + "metadata": { + "pycharm": { + "name": "#%%\n" + } + }, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "shape mismatch: objects cannot be broadcast to a single shape. Mismatch is between arg 0 with shape (2,) and arg 1 with shape (3,).\n" + ] + } + ], + "source": [ + "try:\n", + " np.broadcast_arrays([5, 10], [0.1, 0.3, 0.6])\n", + "except ValueError as exc:\n", + " print(exc)" + ] + }, + { + "cell_type": "markdown", "metadata": {}, + "source": [ + "To understand what is going on, we need to introduce the concept of parameter core dimensions. The core dimensions of a distribution's parameter are the minimum number of dimensions the parameters need to have in order to define a distribution. In the Multinomial distribution, `n` must at least be an scalar integer, but `p` must be at least a vector that represents the probability of having an outcome on each category. So, for the Multinomial distribution, `n` has 0 core dimensions, and `p` has 1 core dimension. 
\n", + "\n", + "So if we have a vector of two `n`, we should actually broadcast the vector of `p` into a matrix with two such vectors, and pair each `n` with each broadcasted row of `p`. This works exactly like `np.vectorize`." + ] + }, + { + "cell_type": "code", + "execution_count": 31, + "metadata": { + "pycharm": { + "name": "#%%\n" + } + }, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + ">> 5 [0.1 0.3 0.6]\n", + ">> 10 [0.1 0.3 0.6]\n" + ] + }, + { + "data": { + "text/plain": [ + "array([[1, 0, 4],\n", + " [1, 2, 7]])" + ] + }, + "execution_count": 31, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "def core_multinomial(n, p):\n", + " print(\">>\", n, p)\n", + " return draw(pm.Multinomial.dist(n, p))\n", + "\n", + "\n", + "vectorized_multinomial = np.vectorize(core_multinomial, signature=\"(),(p)->(p)\")\n", + "vectorized_multinomial([5, 10], [0.1, 0.3, 0.6])" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "pycharm": { + "name": "#%% md\n" + } + }, + "source": [ + "The core dimensionality of each distribution parameter is also hard-coded as a property of each distribution" + ] + }, + { + "cell_type": "code", + "execution_count": 32, + "metadata": { + "pycharm": { + "name": "#%%\n" + } + }, "outputs": [ { "data": { "text/plain": [ - "(array(-1.11530499), ())" + "(0, 1)" ] }, - "execution_count": 3, + "execution_count": 32, "metadata": {}, "output_type": "execute_result" } ], "source": [ - "random_sample = normal_dist.eval()\n", - "random_sample, random_sample.shape" + "multinomial_dist.owner.op.ndims_params" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ - "In this case we end up with a single scalar value. This is consistent with the distributions `ndim_supp` as the smallest random draw dimension is a scalar which has a dimension of zero" + "Implicit batch dimensions must still respect broadcasting rules. 
The following example is not valid because `n` has batched dimensions of `shape=(2,)` and `p` has batched dimensions of `shape=(3,)` which cannot be broadcasted together." + ] + }, + { + "cell_type": "code", + "execution_count": 33, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "operands could not be broadcast together with remapped shapes [original->remapped]: (2,) and requested shape (3,)\n", + "Apply node that caused the error: multinomial_rv{1, (0, 1), int64, True}(RandomGeneratorSharedVariable(), TensorConstant{[]}, TensorConstant{4}, TensorConstant{[ 5 10]}, TensorConstant{[[0.1 0.3 .. 0.3 0.6]]})\n", + "Toposort index: 0\n", + "Inputs types: [RandomGeneratorType, TensorType(int64, (0,)), TensorType(int64, ()), TensorType(int64, (2,)), TensorType(float64, (3, 3))]\n", + "Inputs shapes: ['No shapes', (0,), (), (2,), (3, 3)]\n", + "Inputs strides: ['No strides', (8,), (), (8,), (24, 8)]\n", + "Inputs values: [Generator(PCG64) at 0x7F7BDDA904A0, array([], dtype=int64), array(4), array([ 5, 10]), 'not shown']\n", + "Outputs clients: [['output'], ['output']]\n", + "\n", + "HINT: Re-running with most Aesara optimizations disabled could provide a back-trace showing when this node was created. This can be done by setting the Aesara flag 'optimizer=fast_compile'. 
If that does not work, Aesara optimizations can be disabled with 'optimizer=None'.\n", + "HINT: Use the Aesara flag `exception_verbosity=high` for a debug print-out and storage map footprint of this Apply node.\n" + ] + } + ], + "source": [ + "try:\n", + " draw(pm.Multinomial.dist(n=[5, 10], p=[[0.1, 0.3, 0.6], [0.1, 0.3, 0.6], [0.1, 0.3, 0.6]]))\n", + "except ValueError as error:\n", + " print(error)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "pycharm": { + "name": "#%% md\n" + } + }, + "source": [ + "### Combining implicit and explicit batch dimensions" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "pycharm": { + "name": "#%% md\n" + } + }, + "source": [ + "You can and should combine implicit dimensions from multidimensional parameters with explicit shape information, which is easier to reason about." ] }, { "cell_type": "code", - "execution_count": 4, - "metadata": {}, + "execution_count": 34, + "metadata": { + "pycharm": { + "name": "#%%\n" + } + }, "outputs": [ { "data": { "text/plain": [ - "0" + "array([[0, 1, 4],\n", + " [4, 1, 5]])" ] }, - "execution_count": 4, + "execution_count": 34, "metadata": {}, "output_type": "execute_result" } ], "source": [ - "pm.Normal.rv_op.ndim_supp" + "draw(pm.Multinomial.dist(n=[5, 10], p=[0.1, 0.3, 0.6], shape=(2, 3)))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ - "### Implied Example" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "If we wanted three draws from differently centered Normals we instead could pass a vector to the parameters. When generating a random draw we would now expect a vector value, in this case a vector if size 3. This is a case of *implied dimensions*" + "Explicit batch dimensions can still extend beyond any implicit batch dimensions. Again, due to how broadcasting works, explicit batch dimensions must always \"go on the left\". 
The following case is invalid, because `n` has batched dimensions of `shape=(2,)`, which cannot be broadcasted to the explicit batch dimensions of `shape=(2, 4)`." ] }, { "cell_type": "code", - "execution_count": 5, + "execution_count": 35, "metadata": {}, "outputs": [ { - "data": { - "text/plain": [ - "(array([ 1.00002897, 9.9999175 , 99.99994224]), (3,))" - ] - }, - "execution_count": 5, - "metadata": {}, - "output_type": "execute_result" + "name": "stdout", + "output_type": "stream", + "text": [ + "shape mismatch: objects cannot be broadcast to a single shape\n", + "Apply node that caused the error: multinomial_rv{1, (0, 1), int64, True}(RandomGeneratorSharedVariable(), TensorConstant{[2 4]}, TensorConstant{4}, TensorConstant{[ 5 10]}, TensorConstant{[0.1 0.3 0.6]})\n", + "Toposort index: 0\n", + "Inputs types: [RandomGeneratorType, TensorType(int64, (2,)), TensorType(int64, ()), TensorType(int64, (2,)), TensorType(float64, (3,))]\n", + "Inputs shapes: ['No shapes', (2,), (), (2,), (3,)]\n", + "Inputs strides: ['No strides', (8,), (), (8,), (8,)]\n", + "Inputs values: [Generator(PCG64) at 0x7F7BDD9D1000, array([2, 4]), array(4), array([ 5, 10]), array([0.1, 0.3, 0.6])]\n", + "Outputs clients: [['output'], ['output']]\n", + "\n", + "HINT: Re-running with most Aesara optimizations disabled could provide a back-trace showing when this node was created. This can be done by setting the Aesara flag 'optimizer=fast_compile'. 
If that does not work, Aesara optimizations can be disabled with 'optimizer=None'.\n", + "HINT: Use the Aesara flag `exception_verbosity=high` for a debug print-out and storage map footprint of this Apply node.\n" + ] } ], "source": [ - "random_sample = pm.Normal.dist(mu=[1, 10, 100], sigma=0.0001).eval()\n", - "random_sample, random_sample.shape" + "try:\n", + " draw(pm.Multinomial.dist(n=[5, 10], p=[0.1, 0.3, 0.6], shape=(2, 4, 3)))\n", + "except ValueError as error:\n", + " print(error)" ] }, { "cell_type": "markdown", - "metadata": {}, + "metadata": { + "pycharm": { + "name": "#%% md\n" + } + }, "source": [ - "### Shape and Size" + "## Inspecting dimensionality with a model graph" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ - "Alternatively we may just want three draws from identical distributions. In this case we could use either `shape` or `size` to specify this" + "More often than not, distributions are used inside a PyMC model, and as such there are tools that facilitate reasoning about distribution shapes in that context." 
] }, { "cell_type": "code", - "execution_count": 6, - "metadata": {}, + "execution_count": 36, + "metadata": { + "pycharm": { + "name": "#%%\n" + } + }, "outputs": [ { - "data": { - "text/plain": [ - "(array([-0.56435014, 0.28613655, -0.92945242]), (3,))" - ] - }, - "execution_count": 6, - "metadata": {}, - "output_type": "execute_result" + "name": "stdout", + "output_type": "stream", + "text": [ + " x: shape=(3,)\n", + "sigma_log__: shape=()\n", + " sigma: shape=()\n", + " y: shape=(3,)\n" + ] } ], "source": [ - "random_sample = pm.Normal.dist(size=(3,)).eval()\n", - "random_sample, random_sample.shape" + "with pm.Model() as pmodel:\n", + " mu = pm.Normal(\"x\", mu=0, size=(3))\n", + " sigma = pm.HalfNormal(\"sigma\")\n", + " y = pm.Normal(\"y\", mu=mu, sigma=sigma)\n", + "\n", + "for rv, shape in pmodel.eval_rv_shapes().items():\n", + " print(f\"{rv:>11}: shape={shape}\")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "pycharm": { + "name": "#%% md\n" + } + }, + "source": [ + "An even more powerful tool to understand and debug dimensionality in PyMC is the {func}`~pymc.model_to_graphviz` function. Rather than inspecting array outputs we can instead read the Graphviz output to understand the dimensionality of the variables." 
] }, { "cell_type": "code", - "execution_count": 7, - "metadata": {}, + "execution_count": 37, + "metadata": { + "pycharm": { + "name": "#%%\n" + } + }, "outputs": [ { "data": { + "image/svg+xml": [ + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "%3\n", + "\n", + "\n", + "cluster3\n", + "\n", + "3\n", + "\n", + "\n", + "\n", + "x\n", + "\n", + "x\n", + "~\n", + "Normal\n", + "\n", + "\n", + "\n", + "y\n", + "\n", + "y\n", + "~\n", + "Normal\n", + "\n", + "\n", + "\n", + "x->y\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "sigma\n", + "\n", + "sigma\n", + "~\n", + "HalfNormal\n", + "\n", + "\n", + "\n", + "sigma->y\n", + "\n", + "\n", + "\n", + "\n", + "\n" + ], "text/plain": [ - "(array([ 0.23463317, -0.24455629, -2.23058663]), (3,))" + "" ] }, - "execution_count": 7, + "execution_count": 37, "metadata": {}, "output_type": "execute_result" } ], "source": [ - "random_sample = pm.Normal.dist(shape=(3,)).eval()\n", - "random_sample, random_sample.shape" + "pm.model_to_graphviz(pmodel)" ] }, { "cell_type": "markdown", - "metadata": {}, + "metadata": { + "pycharm": { + "name": "#%% md\n" + } + }, "source": [ - "### Inspecting dimensionality with a model graph\n", - "A powerful tool to understand and debug dimensionality in PyMC is the `pm.model_to_graphviz` functionality. Rather than inspecting array outputs we instead can read the Graphviz output to understand the dimensionality.\n", + "In the example above the number on the bottom left of each box (or plate) indicates the dimensionality of the distributions within. If a distribution is outside of any box with a number, it has a scalar shape.\n", "\n", - "In the example below the number on the bottom left of each box indicates the dimensionality of the Random Variable. 
With the scalar distribution it is implied to be one random draw of `ndim_support`" + "Let's use this tool to review implicit and explicit dimensions:" ] }, { "cell_type": "code", - "execution_count": 8, - "metadata": {}, + "execution_count": 38, + "metadata": { + "pycharm": { + "name": "#%%\n" + } + }, "outputs": [ { "data": { @@ -244,92 +1432,92 @@ "\n", "\n", - "\n", + "\n", "\n", "%3\n", - "\n", + "\n", "\n", "cluster3\n", - "\n", - "3\n", + "\n", + "3\n", "\n", "\n", "cluster4\n", - "\n", - "4\n", + "\n", + "4\n", "\n", - "\n", - "cluster5\n", - "\n", - "5\n", - "\n", - "\n", + "\n", "\n", - "scalar\n", - "\n", - "scalar\n", - "~\n", - "Normal\n", + "scalar (support)\n", + "\n", + "scalar (support)\n", + "~\n", + "Normal\n", "\n", - "\n", + "\n", "\n", - "vector (implied)\n", - "\n", - "vector (implied)\n", - "~\n", - "Normal\n", + "vector (implicit)\n", + "\n", + "vector (implicit)\n", + "~\n", + "Normal\n", "\n", - "\n", + "\n", "\n", - "vector (from shape)\n", - "\n", - "vector (from shape)\n", - "~\n", - "Normal\n", - "\n", - "\n", - "\n", - "vector (from size)\n", - "\n", - "vector (from size)\n", - "~\n", - "Normal\n", + "vector (explicit)\n", + "\n", + "vector (explicit)\n", + "~\n", + "Normal\n", "\n", "\n", "\n" ], "text/plain": [ - "" + "" ] }, - "execution_count": 8, + "execution_count": 38, "metadata": {}, "output_type": "execute_result" } ], "source": [ "with pm.Model() as pmodel:\n", - " pm.Normal(\"scalar\") # shape=()\n", - " pm.Normal(\"vector (implied)\", mu=[1, 2, 3])\n", - " pm.Normal(\"vector (from shape)\", shape=(4,))\n", - " pm.Normal(\"vector (from size)\", size=(5,))\n", + " pm.Normal(\"scalar (support)\")\n", + " pm.Normal(\"vector (implicit)\", mu=[1, 2, 3])\n", + " pm.Normal(\"vector (explicit)\", shape=(4,))\n", "\n", "pm.model_to_graphviz(pmodel)" ] }, + { + "cell_type": "markdown", + "metadata": { + "pycharm": { + "name": "#%% md\n" + } + }, + "source": [ + "## Dims" + ] + }, { "cell_type": "markdown", "metadata": {}, "source": 
[ - "## Dims\n", - "A new feature of PyMC is `dims` support. With many random variables it can become confusing which dimensionality corresponds to which \"real world\" idea, e.g. number of observations, number of treated units etc. The dims argument is an additional label to help." + "PyMC supports the concept of `dims`. With many random variables it can become confusing which dimensionality corresponds to which \"real world\" idea, e.g. number of observations, number of treated units etc. The `dims` argument is an additional human-readable label that can convey this meaning." ] }, { "cell_type": "code", - "execution_count": 9, - "metadata": {}, + "execution_count": 39, + "metadata": { + "pycharm": { + "name": "#%%\n" + } + }, "outputs": [ { "data": { @@ -340,69 +1528,69 @@ "\n", "\n", - "\n", + "\n", "\n", "%3\n", - "\n", + "\n", "\n", - "clusterB (2)\n", + "clustercolors (2)\n", "\n", - "B (2)\n", + "colors (2)\n", "\n", "\n", - "clusterDim_A (4)\n", - "\n", - "Dim_A (4)\n", + "clustergroup (4)\n", + "\n", + "group (4)\n", "\n", - "\n", + "\n", "\n", - "red\n", + "crayon\n", "\n", - "red\n", + "crayon\n", "~\n", "Normal\n", "\n", - "\n", + "\n", "\n", - "one\n", - "\n", - "one\n", - "~\n", - "Normal\n", + "prior\n", + "\n", + "prior\n", + "~\n", + "Normal\n", "\n", - "\n", + "\n", "\n", - "two\n", - "\n", - "two\n", - "~\n", - "Normal\n", + "hyperprior\n", + "\n", + "hyperprior\n", + "~\n", + "Normal\n", "\n", - "\n", + "\n", "\n", - "one->two\n", - "\n", - "\n", + "hyperprior->prior\n", + "\n", + "\n", "\n", "\n", "\n" ], "text/plain": [ - "" + "" ] }, - "execution_count": 9, + "execution_count": 39, "metadata": {}, "output_type": "execute_result" } ], "source": [ "with pm.Model() as pmodel:\n", - " pm.Normal(\"red\", size=2, dims=\"B\")\n", + " pm.Normal(\"crayon\", size=2, dims=\"colors\")\n", "\n", - " pm.Normal(\"one\", [1, 2, 3, 4], dims=\"Dim_A\") # (4,)\n", - " pm.Normal(\"two\", dims=\"Dim_A\")\n", + " hyperprior = pm.Normal(\"hyperprior\", [1, 2, 3, 
4], dims=\"group\")\n", + " pm.Normal(\"prior\", mu=hyperprior, dims=\"group\")\n", "\n", "\n", "pm.model_to_graphviz(pmodel)" @@ -410,106 +1598,106 @@ }, { "cell_type": "markdown", - "metadata": {}, - "source": [ - "Where dims can become increasingly powerful is with the use of `coords` specified in the model itself. With this it becomes easy to track. As an added bonus the coords and dims will also be present in the returned {class}'arviz.InferenceData' simplifying the entire workflow." - ] - }, - { - "cell_type": "code", - "execution_count": 10, - "metadata": {}, - "outputs": [], - "source": [ - "with pm.Model(\n", - " coords={\n", - " \"year\": [2020, 2021, 2022],\n", - " }\n", - ") as pmodel:\n", - "\n", - " pm.Normal(\"Normal_RV\", dims=\"year\")\n", - "\n", - " pm.model_to_graphviz(pmodel)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, + "metadata": { + "pycharm": { + "name": "#%% md\n" + } + }, "source": [ - "## Vector Distributions\n", - "Some distributions by definition cannot return scalar values as random samples, but instead will return an array as their result. An example is the Multivariate Normal. The simplest possible return shape can be verified using `ndim_supp`. The value here indicates the smallest shape that can be returned is a vector" + "Where `dims` can become increasingly powerful is with the use of `coords` specified in the model itself. This gives a unique label to each `dim` entry, rendering it much more meaningful." 
] }, { "cell_type": "code", - "execution_count": 11, - "metadata": {}, + "execution_count": 40, + "metadata": { + "pycharm": { + "name": "#%%\n" + } + }, "outputs": [ { "data": { + "image/svg+xml": [ + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "%3\n", + "\n", + "\n", + "clusteryear (3)\n", + "\n", + "year (3)\n", + "\n", + "\n", + "\n", + "profit\n", + "\n", + "profit\n", + "~\n", + "Normal\n", + "\n", + "\n", + "\n" + ], "text/plain": [ - "1" + "" ] }, - "execution_count": 11, + "execution_count": 40, "metadata": {}, "output_type": "execute_result" } ], "source": [ - "pm.MvNormal.rv_op.ndim_supp" + "with pm.Model(\n", + " coords={\n", + " \"year\": [2020, 2021, 2022],\n", + " }\n", + ") as pmodel:\n", + "\n", + " pm.Normal(\"profit\", dims=\"year\")\n", + "\n", + "pm.model_to_graphviz(pmodel)" ] }, { "cell_type": "markdown", - "metadata": {}, - "source": [ - "This can be verified with a random sample as well." - ] - }, - { - "cell_type": "code", - "execution_count": 23, - "metadata": {}, - "outputs": [ - { - "data": { - "text/plain": [ - "array([[1.00273693, 1.99583834, 2.99674157],\n", - " [4.01036637, 5.00775714, 5.97952974]])" - ] - }, - "execution_count": 23, - "metadata": {}, - "output_type": "execute_result" + "metadata": { + "pycharm": { + "name": "#%% md\n" } - ], + }, "source": [ - "pm.MvNormal.dist(mu=[[1, 2, 3], [4, 5, 6]], cov=np.eye(3) * 0.0001).eval()" + "Note that the dimensionality of the distribution was actually defined by the `dims` used. We did not pass `shape` or define implicit batched dimensions." ] }, { "cell_type": "markdown", - "metadata": {}, + "metadata": { + "pycharm": { + "name": "#%% md\n" + } + }, "source": [ - "Like scalar distributions we can also use all our dimensionality tools as well to specify a set of Multivariate normals" + "Let us review the different dimensionality flavours with a Multivariate Normal example." 
] }, { "cell_type": "code", - "execution_count": 33, - "metadata": {}, + "execution_count": 41, + "metadata": { + "pycharm": { + "name": "#%%\n" + } + }, "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "[3]\n", - "[3 2]\n", - "[3 2]\n", - "[3 2]\n" - ] - }, { "data": { "image/svg+xml": [ @@ -519,71 +1707,53 @@ "\n", "\n", - "\n", + "\n", "\n", "%3\n", - "\n", + "\n", "\n", - "cluster3\n", + "clustersupport (3)\n", "\n", - "3\n", + "support (3)\n", "\n", "\n", - "clusterrepeats (3) x implied (2)\n", - "\n", - "repeats (3) x implied (2)\n", + "clusterbatch (4) x support (3)\n", + "\n", + "batch (4) x support (3)\n", "\n", - "\n", - "clusterrepeats (3) x None (2)\n", - "\n", - "repeats (3) x None (2)\n", - "\n", - "\n", - "clusteryear (3) x None (2)\n", - "\n", - "year (3) x None (2)\n", - "\n", - "\n", + "\n", "\n", - "implied\n", + "vector\n", "\n", - "implied\n", + "vector\n", "~\n", "MvNormal\n", "\n", - "\n", + "\n", "\n", - "with size\n", - "\n", - "with size\n", - "~\n", - "MvNormal\n", + "matrix (explicit)\n", + "\n", + "matrix (explicit)\n", + "~\n", + "MvNormal\n", "\n", - "\n", + "\n", "\n", - "with shape\n", - "\n", - "with shape\n", - "~\n", - "MvNormal\n", - "\n", - "\n", - "\n", - "with coords\n", - "\n", - "with coords\n", - "~\n", - "MvNormal\n", + "matrix (implicit)\n", + "\n", + "matrix (implicit)\n", + "~\n", + "MvNormal\n", "\n", "\n", "\n" ], "text/plain": [ - "" + "" ] }, - "execution_count": 33, + "execution_count": 41, "metadata": {}, "output_type": "execute_result" } @@ -591,24 +1761,14 @@ "source": [ "with pm.Model(\n", " coords={\n", - " \"year\": [2020, 2021, 2022],\n", + " \"batch\": [0, 1, 2, 3],\n", " }\n", ") as pmodel:\n", - " mv = pm.MvNormal(\"implied\", mu=[0, 0, 0], cov=np.eye(3))\n", - " print(mv.shape.eval())\n", - "\n", - " # Multivariate RVs (ndim_supp > 0)\n", - " assert mv.ndim == 1\n", - "\n", - " mv = pm.MvNormal(\"with size\", mu=[0, 0], cov=np.eye(2), size=3, dims=(\"repeats\", 
\"implied\"))\n", - " print(mv.shape.eval())\n", - "\n", - " # ⚠ Size dims are always __prepended__\n", - " mv = pm.MvNormal(\"with shape\", mu=[0, 0], cov=np.eye(2), shape=(3, ...), dims=(\"repeats\", ...))\n", - " print(mv.shape.eval())\n", - "\n", - " mv = pm.MvNormal(\"with coords\", mu=[0, 0], cov=np.eye(2), dims=(\"year\", ...))\n", - " print(mv.shape.eval())\n", + " pm.MvNormal(\"vector\", mu=[0, 0, 0], cov=np.eye(3), dims=(\"support\",))\n", + " pm.MvNormal(\"matrix (implicit)\", mu=np.zeros((4, 3)), cov=np.eye(3), dims=(\"batch\", \"support\"))\n", + " pm.MvNormal(\n", + " \"matrix (explicit)\", mu=[0, 0, 0], cov=np.eye(3), shape=(4, 3), dims=(\"batch\", \"support\")\n", + " )\n", "\n", "pm.model_to_graphviz(pmodel)" ] @@ -617,29 +1777,58 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "### User caution and practical tips" + ":::{tip} For final model publication we suggest dims and coords as the labels will be passed to {class}`arviz.InferenceData`. This is both best practice transparency and readability for others. It also is useful in single developer workflows, for example, in cases where there is a 3 dimensional or higher distribution it'll help indiciate which dimension corresponds to which model concept.\n", + ":::" ] }, { "cell_type": "markdown", - "metadata": {}, + "metadata": { + "pycharm": { + "name": "#%% md\n" + } + }, + "source": [ + "## Tips for debugging shape issues" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "pycharm": { + "name": "#%% md\n" + } + }, "source": [ - "While we provide all these tools for convenience, and while PyMC does it best to understand user intent, the result of mixed dimensionality tools may not always result in the final dimensionality intended. Sometimes the model may not indicate an error until sampling, or not indicate an issue at all. 
When working with dimensionality, particular more complex ones we suggest\n", + "While we provide all these tools for convenience, and while PyMC does its best to understand user intent, mixing dimensionality tools may not always produce the final dimensionality intended. Sometimes the model may not indicate an error until sampling, or not indicate an issue at all. When working with dimensionality, particularly in more complex models, we suggest:\n", "\n", - "* Using GraphViz to visualize your model before sampling\n", - "* Using the prior predictive to catch errors early\n", - "* Inspecting the returned `az.InferenceData` object to ensure all array sizes are as intended" + "* Using `model_to_graphviz` to visualize your model before sampling\n", + "* Using `draw` or `sample_prior_predictive` to catch errors early\n", + "* Inspecting the returned `az.InferenceData` object to ensure all array sizes are as intended\n", + "* Defining shapes with prime numbers when tracking down errors, so that each dimension's length is easy to tell apart." ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "pycharm": { + "name": "#%%\n" + } + }, + "outputs": [], + "source": [] } ], "metadata": { + "hide_input": false, "interpreter": { "hash": "f574fac5b7e4a41f7640949d1e1759089329dd116ff7b389caa9cf21f93d872d" }, "kernelspec": { - "display_name": "Python 3 (ipykernel)", + "display_name": "pymc", "language": "python", - "name": "python3" + "name": "pymc" }, "language_info": { "codemirror_mode": { @@ -651,7 +1840,7 @@ "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", - "version": "3.9.10" + "version": "3.10.4" } }, "nbformat": 4,