Commit 00e365d

Merge pull request #9 from JuliaReinforcementLearning/refactor
Refactor to allow containers to be directly used as spaces
2 parents ddfd7ab + 9ee91b5 commit 00e365d

9 files changed, +457 -156 lines changed

README.md

Lines changed: 91 additions & 24 deletions
@@ -5,70 +5,137 @@
55
[![Code Style: Blue](https://img.shields.io/badge/code%20style-blue-4495d1.svg)](https://github.com/invenia/BlueStyle)
66
[![PkgEval](https://JuliaCI.github.io/NanosoldierReports/pkgeval_badges/C/CommonRLSpaces.svg)](https://JuliaCI.github.io/NanosoldierReports/pkgeval_badges/report.html)
77

8+
## Introduction
9+
10+
A space is simply a set of objects. In a reinforcement learning context, spaces define the sets of possible states, actions, and observations.
11+
12+
In Julia, spaces can be represented by a variety of objects. For instance, a small discrete action set might be represented with `["up", "left", "down", "right"]`, or an interval of real numbers might be represented with an object from the [`IntervalSets`](https://github.com/JuliaMath/IntervalSets.jl) package. In general, the space defined by any Julia object is the set of objects `x` for which `x in space` returns `true`.
13+
14+
In addition to establishing the definition above, this package provides three useful tools:
15+
16+
1. Traits to communicate about the properties of spaces, e.g. whether they are continuous or discrete, how many subspaces they have, and how to interact with them.
17+
2. Functions such as `product` for constructing more complex spaces.
18+
3. Constructors for spaces whose elements are arrays, such as `ArraySpace` and `Box`.
19+
20+
## Concepts and Interface
21+
22+
### Interface for all spaces
23+
24+
Since a space is simply a set of objects, a wide variety of common Julia types including `Vector`, `Set`, `Tuple`, and `Dict`<sup>1</sup> can represent a space.
25+
Because of this inclusive definition, there is a very minimal interface that all spaces are expected to implement. Specifically, it consists of
26+
- `in(x, space)`, which tests whether `x` is a member of the set `space` (this can also be called with the `x in space` syntax).
27+
- `rand(space)`, which returns a valid member of the set<sup>2</sup>.
28+
- `eltype(space)`, which returns the type of the elements in the space.
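
As a minimal sketch of this interface, the snippet below exercises `in`, `rand`, and `eltype` on plain Base containers, without loading CommonRLSpaces. The particular containers and the seeded `MersenneTwister` are illustrative assumptions, not part of the package.

```julia
using Random

# Illustrative containers only; any object supporting `in`, `rand`, and `eltype` would do.
spaces = (
    ["up", "left", "down", "right"],  # Vector of action names
    (:cat, :dog),                     # Tuple
    Set([1, 2, 3]),                   # Set
    Dict(:a => 1, :b => 2),           # Dict: its elements are key-value Pairs
)

rng = MersenneTwister(0)  # seeded only to make the sketch repeatable

for space in spaces
    x = rand(rng, space)              # draw a member of the set
    @assert x in space                # membership test
    @assert x isa eltype(space)       # the sample matches the element type
    println(typeof(space), ": sampled ", repr(x))
end
```

Note that the `Dict` case samples a `Pair`, which matches the footnote on `Dict`-backed spaces.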
29+
30+
In addition, the `SpaceStyle` trait is always defined. Calling `SpaceStyle(space)` will return either a `FiniteSpaceStyle`, `ContinuousSpaceStyle`, `HybridSpaceStyle`, or an `UnknownSpaceStyle` object.
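
The sketch below shows how a style trait of this kind is typically consumed: an algorithm calls the trait function and dispatches on the returned style object instead of on the container type. Every `Toy*` name and the `describe` function are hypothetical stand-ins written for this example; they are not the package's definitions.

```julia
# Hypothetical trait types for illustration; CommonRLSpaces defines its own style types.
abstract type ToyStyle end
struct ToyFinite <: ToyStyle end
struct ToyContinuous <: ToyStyle end
struct ToyUnknown <: ToyStyle end

# Trait function: map a space object to a style object.
toy_style(::Any) = ToyUnknown()  # fallback for anything unrecognized
toy_style(::Union{Tuple, AbstractVector, AbstractSet}) = ToyFinite()

# A hypothetical continuous space, so the continuous branch has something to dispatch on.
struct ToyInterval
    lo::Float64
    hi::Float64
end
toy_style(::ToyInterval) = ToyContinuous()

# Algorithms branch on the trait rather than on the concrete container type.
describe(space) = describe(toy_style(space), space)
describe(::ToyFinite, space) = "finite space with $(length(space)) elements"
describe(::ToyContinuous, space) = "continuous space"
describe(::ToyUnknown, space) = "space of unknown style"

println(describe((:cat, :dog)))           # finite space with 2 elements
println(describe(ToyInterval(0.0, 1.0)))  # continuous space
println(describe("not a known space"))    # space of unknown style
```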
31+
32+
### Finite discrete spaces
33+
34+
Spaces with a finite number of elements have `FiniteSpaceStyle`. These spaces are guaranteed to be iterable, implementing Julia's [iteration interface](https://docs.julialang.org/en/v1/manual/interfaces/). In particular `collect(space)` will return all elements in an array.
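
For instance, with a tuple standing in for a finite action space, enumeration might look like the following sketch. The action names and the `q_values` table are assumptions made up for illustration.

```julia
# A finite space given as a plain tuple; `collect` enumerates every element.
actions = (:up, :left, :down, :right)
all_actions = collect(actions)

# Illustrative use: one value estimate per action, keyed by the action itself.
q_values = Dict(a => 0.0 for a in actions)

@assert length(all_actions) == 4
@assert all(a in actions for a in all_actions)
println(all_actions)
```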
35+
36+
### Continuous spaces
37+
38+
Continuous spaces represent sets that have an uncountable number of elements; they have a `SpaceStyle` of type `ContinuousSpaceStyle`. CommonRLSpaces does not adopt a rigorous mathematical definition of a continuous set, but, roughly, elements in the interior of a continuous space have other elements very close to them.
39+
40+
Continuous spaces have some additional interface functions:
41+
42+
- `bounds(space)` returns lower and upper bounds in a tuple. For example, if `space` is a unit circle, `bounds(space)` will return `([-1.0, -1.0], [1.0, 1.0])`. This allows agents to choose policies that appropriately cover the space, e.g. a normal distribution with a mean of `mean(bounds(space))` and a standard deviation of half the distance between the bounds.
43+
- `clamp(x, space)` returns an element of `space` that is near `x`. For example, if `space` is a unit circle, `clamp([2.0, 0.0], space)` might return `[1.0, 0.0]`. This gives an agent a convenient way to find a valid action when it samples actions from a distribution that doesn't match the space exactly (e.g. a normal distribution).
44+
- `clamp!(x, space)` is similar to `clamp`, but clamps `x` in place.
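
The following sketch illustrates how `bounds` and `clamp` could work together in a Gaussian policy. `ToyBox` is a hypothetical axis-aligned space defined only for this example, and the local `bounds` function is a stand-in for the package function of the same name.

```julia
using Random

# Hypothetical stand-in for an axis-aligned continuous space; not the package's Box.
struct ToyBox
    lower::Vector{Float64}
    upper::Vector{Float64}
end

bounds(b::ToyBox) = (b.lower, b.upper)
Base.in(x::AbstractVector, b::ToyBox) = all(b.lower .<= x .<= b.upper)
Base.clamp(x::AbstractVector, b::ToyBox) = clamp.(x, b.lower, b.upper)

space = ToyBox([-1.0, 0.0], [1.0, 1.0])
lo, hi = bounds(space)

mu = (lo .+ hi) ./ 2     # midpoint of the bounds
sigma = (hi .- lo) ./ 2  # half the distance between the bounds

rng = MersenneTwister(1)  # seeded only for repeatability
raw = mu .+ sigma .* randn(rng, length(mu))  # Gaussian sample; may fall outside the space
action = clamp(raw, space)                   # nearest point inside the bounds

@assert action in space
```

Clamping element-wise is only appropriate for box-shaped spaces; a space such as a unit circle would need its own projection.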
45+
46+
### Hybrid spaces
47+
48+
The interface for hybrid continuous-discrete spaces is currently planned, but not yet defined. If the space style is not `FiniteSpaceStyle` or `ContinuousSpaceStyle`, it is `UnknownSpaceStyle`.
49+
50+
### Spaces of arrays
51+
52+
[need to figure this out, but I think `elsize(space)` should return the size of the arrays in the space]
53+
54+
### Cartesian products of spaces
55+
56+
The Cartesian product of two spaces `a` and `b` can be constructed with `c = product(a, b)`.
57+
58+
The exact form of the resulting space is unspecified and should be considered an implementation detail. The only guarantees are (1) that there will be one unique element of `c` for every combination of one object from `a` and one object from `b` and (2) that the resulting space conforms to the interface above.
59+
60+
The `TupleProduct` constructor provides a specialized Cartesian product where each element is a tuple, i.e. `TupleProduct(a, b)` has elements of type `Tuple{eltype(a), eltype(b)}`.
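
For finite spaces, the "one element per combination" guarantee can be illustrated with Base's `Iterators.product`. This is only an analogy: it is not the package's `product` function, whose return type is unspecified.

```julia
a = (:cat, :dog)
b = 1:3

# Base's lazy Cartesian product, used here only as an analogy for `product(a, b)`.
c = Iterators.product(a, b)

elements = vec(collect(c))  # all (a, b) combinations as a flat vector of tuples

@assert length(elements) == length(a) * length(b)  # one element per combination
@assert allunique(elements)                        # and no duplicates
@assert eltype(c) == Tuple{Symbol, Int}            # tuple-valued elements
@assert (:dog, 2) in c
```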
61+
62+
---
63+
64+
<sup>1</sup>Note: the elements of a space represented by a `Dict` are key-value `Pair`s.
65+
<sup>2</sup>[TODO: should we make any guarantees about whether `rand(space)` is drawn from a uniform distribution?]
66+
867
## Usage
968

1069
### Construction
1170

1271
|Category|Style|Example|
1372
|:---|:----|:-----|
14-
|Enumerable discrete space| `DiscreteSpaceStyle{()}()` | `Space((:cat, :dog))`, `Space(0:1)`, `Space(1:2)`, `Space(Bool)`|
15-
|Multi-dimensional discrete space| `DiscreteSpaceStyle{(3,4)}()` | `Space((:cat, :dog), 3, 4)`, `Space(0:1, 3, 4)`, `Space(1:2, 3, 4)`, `Space(Bool, 3, 4)`|
16-
|Multi-dimensional variable discrete space| `DiscreteSpaceStyle{(2,)}()` | `Space(SVector((:cat, :dog), (:litchi, :longan, :mango))`, `Space([-1:1, (false, true)])`|
17-
|Continuous space| `ContinuousSpaceStyle{()}()` | `Space(-1.2..3.3)`, `Space(Float32)`|
18-
|Multi-dimensional continuous space| `ContinuousSpaceStyle{(3,4)}()` | `Space(-1.2..3.3, 3, 4)`, `Space(Float32, 3, 4)`|
73+
|Enumerable discrete space| `FiniteSpaceStyle{()}()` | `(:cat, :dog)`, `0:1`, `["a","b","c"]` |
74+
|One-dimensional continuous space| `ContinuousSpaceStyle{()}()` | `-1.2..3.3`, `Interval(1.0, 2.0)` |
75+
|Multi-dimensional discrete space| `FiniteSpaceStyle{(3,4)}()` | `ArraySpace((:cat, :dog), 3, 4)`, `ArraySpace(0:1, 3, 4)`, `ArraySpace(1:2, 3, 4)`, `ArraySpace(Bool, 3, 4)`|
76+
|Multi-dimensional variable discrete space| `FiniteSpaceStyle{(2,)}()` | `product((:cat, :dog), (:litchi, :longan, :mango))`, `product(-1:1, (false, true))`|
77+
|Multi-dimensional continuous space| `ContinuousSpaceStyle{(2,)}()` or `ContinuousSpaceStyle{(3,4)}()` | `Box([-1.0, -2.0], [2.0, 4.0])`, `product(-1.2..3.3, -4.6..5.0)`, `ArraySpace(-1.2..3.3, 3, 4)`, `ArraySpace(Float32, 3, 4)` |
78+
|Multi-dimensional hybrid space| `HybridSpaceStyle{(2,),()}()` | `product(-1.2..3.3, -4.6..5.0, [:cat, :dog])`, `product(Box([-1.0, -2.0], [2.0, 4.0]), [1,2,3])`|
1979

2080
### API
2181

2282
```julia
2383
julia> using CommonRLSpaces
2484

25-
julia> s = Space((:litchi, :longan, :mango))
26-
Space{Tuple{Symbol, Symbol, Symbol}}((:litchi, :longan, :mango))
85+
julia> s = (:litchi, :longan, :mango)
2786

2887
julia> rand(s)
2988
:litchi
3089

3190
julia> rand(s) in s
3291
true
3392

34-
julia> size(s)
35-
()
93+
julia> length(s)
94+
3
3695
```
3796

3897
```julia
39-
julia> s = Space(UInt8, 2,3)
40-
Space{Matrix{UnitRange{UInt8}}}(UnitRange{UInt8}[0x00:0xff 0x00:0xff 0x00:0xff; 0x00:0xff 0x00:0xff 0x00:0xff])
98+
julia> s = ArraySpace(1:5, 2,3)
99+
CommonRLSpaces.RepeatedSpace{UnitRange{Int64}, Tuple{Int64, Int64}}(1:5, (2, 3))
41100

42101
julia> rand(s)
43-
2×3 Matrix{UInt8}:
44-
0x7b 0x38 0xf3
45-
0x6a 0xe1 0x28
102+
2×3 Matrix{Int64}:
103+
4 1 1
104+
3 2 2
46105

47106
julia> rand(s) in s
48107
true
49108

50109
julia> SpaceStyle(s)
51-
DiscreteSpaceStyle{(2, 3)}()
110+
FiniteSpaceStyle()
52111

53-
julia> size(s)
112+
julia> elsize(s)
54113
(2, 3)
55114
```
56115

57116
```julia
58-
julia> s = Space(SVector(-1..1, 0..1))
59-
Space{SVector{2, ClosedInterval{Int64}}}(ClosedInterval{Int64}[-1..1, 0..1])
117+
julia> s = product(-1..1, 0..1)
118+
Box{StaticArraysCore.SVector{2, Float64}}([-1.0, 0.0], [1.0, 1.0])
60119

61120
julia> rand(s)
62-
2-element SVector{2, Float64} with indices SOneTo(2):
63-
0.5563101538643473
64-
0.9227368869418011
121+
2-element StaticArraysCore.SVector{2, Float64} with indices SOneTo(2):
122+
0.03049072910834738
123+
0.6295234114874269
65124

66125
julia> rand(s) in s
67126
true
68127

69128
julia> SpaceStyle(s)
70-
ContinuousSpaceStyle{(2,)}()
129+
ContinuousSpaceStyle()
71130

72-
julia> size(s)
131+
julia> elsize(s)
73132
(2,)
74-
```
133+
134+
julia> bounds(s)
135+
([-1.0, 0.0], [1.0, 1.0])
136+
137+
julia> clamp([5, 5], s)
138+
2-element StaticArraysCore.SizedVector{2, Float64, Vector{Float64}} with indices SOneTo(2):
139+
1.0
140+
1.0
141+
```

src/CommonRLSpaces.jl

Lines changed: 26 additions & 3 deletions
@@ -2,11 +2,34 @@ module CommonRLSpaces
22

33
using Reexport
44

5-
@reexport using FillArrays
65
@reexport using IntervalSets
7-
@reexport using StaticArrays
8-
@reexport import Base: OneTo
6+
7+
using StaticArrays
8+
using FillArrays
9+
using Random
10+
import Base: clamp
11+
12+
export
13+
SpaceStyle,
14+
AbstractSpaceStyle,
15+
FiniteSpaceStyle,
16+
ContinuousSpaceStyle,
17+
UnknownSpaceStyle,
18+
bounds,
19+
elsize
920

1021
include("basic.jl")
1122

23+
export
24+
Box,
25+
ArraySpace
26+
27+
include("array.jl")
28+
29+
export
30+
product,
31+
TupleProduct
32+
33+
include("product.jl")
34+
1235
end

src/array.jl

Lines changed: 81 additions & 0 deletions
@@ -0,0 +1,81 @@
1+
abstract type AbstractArraySpace end
2+
# Maybe AbstractArraySpace should have an eltype parameter so that you could call convert(AbstractArraySpace{Float32}, space)
3+
4+
"""
5+
Box(lower, upper)
6+
7+
A Box represents a space of real-valued arrays bounded element-wise above by `upper` and below by `lower`, e.g. `Box([-1, -2], [3, 4])` represents the two-dimensional vector space that is the Cartesian product of the two closed sets: ``[-1, 3] \\times [-2, 4]``.
8+
9+
The elements of a Box are always `AbstractArray`s with `AbstractFloat` elements. `Box`es always have `ContinuousSpaceStyle`, and products of `Box`es with `Box`es or `ClosedInterval`s are `Box`es when the dimensions are compatible.
10+
"""
11+
struct Box{A<:AbstractArray{<:AbstractFloat}} <: AbstractArraySpace
12+
lower::A
13+
upper::A
14+
15+
Box{A}(lower, upper) where {A<:AbstractArray} = new(lower, upper)
16+
end
17+
18+
function Box(lower, upper; convert_to_static::Bool=false)
19+
@assert size(lower) == size(upper)
20+
sz = size(lower)
21+
continuous_lower = convert(AbstractArray{float(eltype(lower))}, lower)
22+
continuous_upper = convert(AbstractArray{float(eltype(upper))}, upper)
23+
if convert_to_static
24+
final_lower = SArray{Tuple{sz...}}(continuous_lower)
25+
final_upper = SArray{Tuple{sz...}}(continuous_upper)
26+
else
27+
final_lower, final_upper = promote(continuous_lower, continuous_upper)
28+
end
29+
return Box{typeof(final_lower)}(final_lower, final_upper)
30+
end
31+
32+
# By default, convert builtin arrays to static
33+
Box(lower::Array, upper::Array) = Box(lower, upper; convert_to_static=true)
34+
35+
SpaceStyle(::Box) = ContinuousSpaceStyle()
36+
37+
function Base.rand(rng::AbstractRNG, sp::Random.SamplerTrivial{<:Box})
38+
box = sp[]
39+
return box.lower + rand_similar(rng, box.lower) .* (box.upper-box.lower)
40+
end
41+
42+
rand_similar(rng::AbstractRNG, a::StaticArray) = rand(rng, typeof(a))
43+
rand_similar(rng::AbstractRNG, a::AbstractArray) = rand(rng, eltype(a), size(a)...)
44+
45+
Base.in(x::AbstractArray, b::Box) = all(b.lower .<= x .<= b.upper)
46+
47+
Base.eltype(::Box{A}) where A = A
48+
elsize(b::Box) = size(b.lower)
49+
50+
bounds(b::Box) = (b.lower, b.upper)
51+
Base.clamp(x::AbstractArray, b::Box) = clamp.(x, b.lower, b.upper)
52+
53+
Base.convert(t::Type{<:Box}, i::ClosedInterval) = t(SA[minimum(i)], SA[maximum(i)])
54+
55+
struct RepeatedSpace{B, S<:Tuple} <: AbstractArraySpace
56+
base_space::B
57+
elsize::S
58+
end
59+
60+
"""
61+
ArraySpace(base_space, size...)
62+
63+
Create a space of Arrays with shape `size`, where each element of the array is drawn from `base_space`.
64+
"""
65+
ArraySpace(base_space, size...) = RepeatedSpace(base_space, size)
66+
67+
SpaceStyle(s::RepeatedSpace) = SpaceStyle(s.base_space)
68+
69+
Base.rand(rng::AbstractRNG, sp::Random.SamplerTrivial{<:RepeatedSpace}) = rand(rng, sp[].base_space, sp[].elsize...)
70+
71+
Base.in(x::AbstractArray, s::RepeatedSpace) = all(entry in s.base_space for entry in x)
72+
Base.eltype(s::RepeatedSpace) = AbstractArray{eltype(s.base_space), length(s.elsize)}
73+
Base.eltype(s::RepeatedSpace{<:AbstractInterval}) = AbstractArray{Random.gentype(s.base_space), length(s.elsize)}
74+
elsize(s::RepeatedSpace) = s.elsize
75+
76+
function bounds(s::RepeatedSpace)
77+
bs = bounds(s.base_space)
78+
return (Fill(first(bs), s.elsize...), Fill(last(bs), s.elsize...))
79+
end
80+
81+
Base.clamp(x::AbstractArray, s::RepeatedSpace) = map(entry -> clamp(entry, s.base_space), x)
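
The two sampling rules in `array.jl` can be sketched with Base and `Random` alone. The snippet below mirrors the arithmetic of the `rand` methods for `Box` and `RepeatedSpace` but does not use those types; the variable names and seed are assumptions for illustration.

```julia
using Random

rng = MersenneTwister(42)  # seeded only for repeatability

# Box-style sampling: scale uniform [0, 1) noise element-wise into [lower, upper].
lower = [-1.0, 0.0]
upper = [1.0, 1.0]
x = lower .+ rand(rng, length(lower)) .* (upper .- lower)
@assert all(lower .<= x .<= upper)

# RepeatedSpace-style sampling: fill an array of the given shape from a base space.
base_space = 1:5
shape = (2, 3)
y = rand(rng, base_space, shape...)
@assert size(y) == shape
@assert all(entry in base_space for entry in y)
```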
