-
-
Notifications
You must be signed in to change notification settings - Fork 934
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add probability masking to space.sample
#1310
Add probability masking to space.sample
#1310
Conversation
…rove efficiency and readability
@mariojerez Could you quickly review this PR to check that it makes sense |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's looking good! I caught some things that I think should be fixed. I added comments to specific lines in the Files changed tab. I noticed we're missing tests for MultiBinary probability sampling. I also suggested some rephrasing of a doc string to improve clarity.
gymnasium/spaces/multi_binary.py
Outdated
|
||
Returns: | ||
Sampled values from space | ||
""" | ||
if mask is not None and probability is not None: | ||
raise ValueError( | ||
"Only one of `mask` or `probability` can be provided, and `probability` is currently unsupported" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should remove "and probability
is currently unsupported"
gymnasium/spaces/multi_binary.py
Outdated
@@ -59,19 +59,28 @@ def is_np_flattenable(self): | |||
"""Checks whether this space can be flattened to a :class:`spaces.Box`.""" | |||
return True | |||
|
|||
def sample(self, mask: MaskNDArray | None = None) -> NDArray[np.int8]: | |||
def sample( | |||
self, mask: MaskNDArray | None = None, probability: None = None |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Think should be...
mask: MaskNDArray | None = None, probability: MaskNDArray | None = None
The expected mask shape is the space shape and mask dtype is ``np.int8``. | ||
probability: An optional ``np.ndarray`` to mask samples with expected shape of ``space.shape`` where element | ||
represent the probability of 1 | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that the Args descriptions are a little confusing. I think something like this would make it more clear.
mask: An optional np.ndarray
to mask samples with expected shape of space.shape
.
When an element in mask
is 0
, then the corresponding element in the sample will be 0
.
When an element in mask
is 1
, then the corresponding element in the sample will be 1
.
When an element in mask
is 2
, then the corresponding element in the sample will be be randomly sampled.
The expected mask shape is the space shape and mask dtype is np.int8
.
probability: An optional np.ndarray
to mask samples with expected shape of space.shape
where each element represents the probability of the corresponding sample element being 1
.
The expected mask shape is the space shape and mask dtype is np.float64
.
gymnasium/spaces/sequence.py
Outdated
|
||
Returns: | ||
A tuple of random length with random samples of elements from the :attr:`feature_space`. | ||
""" | ||
if mask is not None and probability is not None: | ||
raise ValueError("Only one of `mask` or `probability` can be provided.") | ||
|
||
if mask is not None: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not a big deal, but to be consistent with the rest of the file changes, should make this an elif
self.dtype | ||
) | ||
else: | ||
return self.np_random.integers(low=0, high=2, size=self.n, dtype=self.dtype) | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Missing tests for probability sampling MultiBinary
Thanks for the review @mariojerez, really helpful and if you are interested in helping out with any other PRs or farama projects, you would be very welcome |
Thank you! I'd be interested in helping out over the summer. I'll reach out. |
Description
Adds a probability mask feature (
probability
) to thesample()
method of all spaces. This allows you to specify the probability of choosing each action. Similarly to themask
parameter, theprobability
parameter is a numpy array with the same shape asn
, the number of elements in the space. Each value in the array describes the probability of the corresponding value being chosen, with 0 meaning it will not be chosen and 1 meaning that it will be chosen. All of the values in the array must sum to 1.probability
is unsupported for the Box space.Motivation
This is helpful in instances where values need to be chosen at random, such that some values are given higher priority (a higher likelihood) over others. For example, when implementing an Ant Colony Optimization algorithm, each action needs to be assigned a different probability of being chosen.
Fixes #1255 and the completion of #1296
Type of change
Checklist:
pre-commit
checks withpre-commit run --all-files
(seeCONTRIBUTING.md
instructions to set it up)