Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add probability masking to space.sample #1310

Merged

Conversation

pseudo-rnd-thoughts
Copy link
Member

@pseudo-rnd-thoughts pseudo-rnd-thoughts commented Feb 19, 2025

Description

Adds a probability mask feature (probability) to the sample() method of all spaces. This allows you to specify the probability of choosing each action. Similarly to the mask parameter, the probability parameter is a numpy array with the same shape as n, the number of elements in the space. Each value in the array describes the probability of the corresponding value being chosen, with 0 meaning it will not be chosen and 1 meaning that it will be chosen. All of the values in the array must sum to 1. probability is unsupported for the Box space.

Motivation

This is helpful in instances where values need to be chosen at random, such that some values are given higher priority (a higher likelihood) over others. For example, when implementing an Ant Colony Optimization algorithm, each action needs to be assigned a different probability of being chosen.

Fixes #1255 and the completion of #1296

Type of change

  • New feature (non-breaking change which adds functionality)
  • This change requires a documentation update

Checklist:

  • I have run the pre-commit checks with pre-commit run --all-files (see CONTRIBUTING.md instructions to set it up)
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • My changes generate no new warnings
  • I have added tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass locally with my changes

@pseudo-rnd-thoughts
Copy link
Member Author

@mariojerez Could you quickly review this PR to check that it makes sense

Copy link
Contributor

@mariojerez mariojerez left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's looking good! I caught some things that I think should be fixed. I added comments to specific lines in the Files changed tab. I noticed we're missing tests for MultiBinary probability sampling. I also suggested some rephrasing of a doc string to improve clarity.


Returns:
Sampled values from space
"""
if mask is not None and probability is not None:
raise ValueError(
"Only one of `mask` or `probability` can be provided, and `probability` is currently unsupported"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should remove "and probability is currently unsupported"

@@ -59,19 +59,28 @@ def is_np_flattenable(self):
"""Checks whether this space can be flattened to a :class:`spaces.Box`."""
return True

def sample(self, mask: MaskNDArray | None = None) -> NDArray[np.int8]:
def sample(
self, mask: MaskNDArray | None = None, probability: None = None
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Think should be...
mask: MaskNDArray | None = None, probability: MaskNDArray | None = None

The expected mask shape is the space shape and mask dtype is ``np.int8``.
probability: An optional ``np.ndarray`` to mask samples with expected shape of ``space.shape`` where element
represent the probability of 1

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that the Args descriptions are a little confusing. I think something like this would make it more clear.

mask: An optional np.ndarray to mask samples with expected shape of space.shape.
When an element in mask is 0, then the corresponding element in the sample will be 0.
When an element in mask is 1, then the corresponding element in the sample will be 1.
When an element in mask is 2, then the corresponding element in the sample will be be randomly sampled.
The expected mask shape is the space shape and mask dtype is np.int8.

probability: An optional np.ndarray to mask samples with expected shape of space.shape where each element represents the probability of the corresponding sample element being 1.
The expected mask shape is the space shape and mask dtype is np.float64.


Returns:
A tuple of random length with random samples of elements from the :attr:`feature_space`.
"""
if mask is not None and probability is not None:
raise ValueError("Only one of `mask` or `probability` can be provided.")

if mask is not None:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not a big deal, but to be consistent with the rest of the file changes, should make this an elif

self.dtype
)
else:
return self.np_random.integers(low=0, high=2, size=self.n, dtype=self.dtype)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missing tests for probability sampling MultiBinary

@pseudo-rnd-thoughts
Copy link
Member Author

Thanks for the review @mariojerez, really helpful and if you are interested in helping out with any other PRs or farama projects, you would be very welcome

@pseudo-rnd-thoughts pseudo-rnd-thoughts merged commit e4c1f90 into Farama-Foundation:main Feb 21, 2025
12 checks passed
@mariojerez
Copy link
Contributor

Thank you! I'd be interested in helping out over the summer. I'll reach out.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Proposal] Probability Mask for sampling a space
2 participants