Our key proof of concept example is [4Hammer](https://github.com/rl-language/4Hammer), a never-before-implemented reinforcement learning environment with a huge number of user actions in only ~5k lines of code (including graphical code). It runs in the browser and on desktop, and all the features described in this section are present.

Zero mallocs unless explicitly requested by the user.
### Installation
Install rlc with:
```
rlc-probs file.rl net
```
It will learn to pass true to `win` to maximize `score`, as reported by the second command.
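
For a sense of what such a `file.rl` might contain (the real file is not shown here; the names below and the `frm` frame-variable keyword are assumptions on our part, see the tutorial and language reference below for the real syntax), a minimal one-decision environment could look roughly like this:
```
# a single-decision game: choosing do_it == true sets score to 1
@classes
act play() -> Game:
    # frame variable: lives in the serialized game state
    frm score = 0.0
    act win(Bool do_it)
    if do_it:
        score = 1.0
```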
### Documentation

[Project Rationale](./docs/where_we_are_going.md)

[Language Rationale](./docs/rationale.md)

[Tutorial](./docs/tutorial.md)

[Tutorial for GYM users](./docs/gym_tutorial.md)

[Paper for Reinforcement Learning users](https://arxiv.org/abs/2504.19625)

[Language reference and stdlib documentation](https://github.com/rl-language/rlc-stdlib-doc/tree/master)
### Contacts
[Discord](https://discord.gg/saSEj9PAt3)

Or mail us at massimo.fioravanti@polimi.it
### Example: tic tac toe
```
# declares the equivalent of a struct called Board.
# It contains the tic tac toe slots and the current player turn
# Methods omitted for brevity
cls Board:
    Int[9] slots
    Bool playerTurn

@classes
act play() -> TicTacToe:
    # allocates and initializes a board of type Board
    let board : Board
    while !full(board):

        # declares a suspension point of the simulation,
        # an action called mark that requires two ints to be performed.
        act mark(Int x, Int y) {
            # declares constraints about which inputs are valid
            x < 3,
            x >= 0,
            y < 3,
            y >= 0,
            board.get(x, y) == 0
        }

        # marks the board at the position provided
        board.set(x, y)

        # if the current player has three marks in a line
        # return
        if board.three_in_a_line():
            return

        board.change_current_player()

fun main() -> Int:
    # creates a new game
    let game = play()
    game.mark(0, 0)
    # X _ _
    # _ _ _
    # _ _ _
    game.mark(1, 0)
    # X O _
    # _ _ _
    # _ _ _
    game.mark(1, 1)
    # X O _
    # _ X _
    # _ _ _
    game.mark(2, 0)
    # X O O
    # _ X _
    # _ _ _
    game.mark(2, 2)
    # X O O
    # _ X _
    # _ _ X

    # returns 1 because player 1 indeed
    # had three marks in a line
    return int(game.board.three_in_a_line())
```
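
One way to try the example, assuming the snippet is saved as `tic_tac_toe.rl` (both the file name and the exact `rlc` invocation are assumptions on our part; check the language reference above for your installed version):
```
# compile the Rulebook file to a native executable and run it;
# main() returns 1, so the process exit code should be 1
rlc tic_tac_toe.rl -o tic_tac_toe
./tic_tac_toe
echo $?
```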
### FAQ:
#### I am a reinforcement learning engineer, what do I gain from using this?
By using RLC to write your environments, or to wrap previously existing environments, you obtain:
* the ability to automatically test those environments.
* configurable automatic textual and binary serialization and deserialization for those environments.
* configurable automatic textual and binary serialization and deserialization for sequences of actions instead of the state.
* configurable automatic serialization of the state to something that can be sent to the GPU for learning.
* the ability to reuse the environment code in production with no modification.

You can read more in the [Tutorial](./docs/tutorial.md).
#### I am a graphic engine programmer/game programmer, what do I gain from using this?
By writing state and state evolution code (not graphical code) in Rulebook you obtain:
* the ability to automatically serialize the state to disk in both textual and binary form.
* the ability to automatically test and stress code without running the whole engine, thus testing it in isolation.
* the ability to reuse state code independently from the engine.
* the ability to write graphical code however you wish.

You can check out an example where RLC is made interoperable with Godot [here](https://github.com/rl-language/4Hammer).
#### I can write the same tic tac toe example in python using python yields, what is the difference?
The difference is that when written in python:
* python coroutines lack a mechanism to express multiple possible resumption points.
* python coroutines allocate the coroutine state on the heap; RLC does not.
* you lose the ability to serialize and restore the execution of tic tac toe between player actions.
* you must use some special convention to extract the state of the board from the active coroutine, such as saving a reference to the board somewhere else.
* you must follow some special convention to express which values of x and y are valid and which are not, and such requirements cannot be expressed inline in the coroutine, defeating the advantage of using the coroutine.
* you must manually specify how to encode the suspended coroutine into something that can be delivered to machine learning components.

RLC does all of this automatically. You can read more about it [here](./docs/rationale.md).
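
For comparison, here is a rough Python sketch of the same suspension pattern written with a generator; the names and the input-validity convention are ours, purely to illustrate the points listed above (the board lives outside the coroutine, and valid inputs are checked by ad-hoc convention):
```
# a board held outside the coroutine, by convention, so callers can inspect it
board = [[0] * 3 for _ in range(3)]

def play():
    # each `yield` is the single resumption point; valid values of x and y
    # are only enforced here by hand, not expressed as declarative constraints
    player = 1
    while any(0 in row for row in board):
        x, y = yield "mark"        # the caller must know to send an (x, y) pair
        if not (0 <= x < 3 and 0 <= y < 3) or board[y][x] != 0:
            continue               # invalid input handled by ad-hoc convention
        board[y][x] = player
        player = 3 - player

game = play()
next(game)             # start the coroutine
game.send((0, 0))      # player 1 marks the top-left cell
```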
#### I have a previously existing code base, can I use this project?
Yes, at the moment Rulebook is compatible with python and C. You can use RLC as a build-only tool for testing purposes without affecting your users in any way.
#### I have performance constraints, is this fast?
We have performance comparable with C. Furthermore, you can write C code and invoke it from Rulebook if you need even more control over performance.
#### In practice, what happens to a project that wants to include Rulebook components?
Everything about Rulebook will be turned into a single native library that you will link into or deploy along with your previously existing artifacts. Nothing else.
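
As a rough sketch, assuming the Rulebook sources have already been compiled into a static library named `libgame.a` (a hypothetical name; see the language reference above for the exact compiler invocation), an existing C code base links it like any other native library:
```
# link the generated library together with the existing C sources
cc main.c libgame.a -o app
```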
This is the kind of output produced by the `rlc-probs` command shown earlier:

```
---------- 0 : p0 ------------
{resume_index: 1, score: 0.000000}
--------- probs --------------
0: win {do_it: true} 98.9385 %
1: win {do_it: false} 1.0615 %
------------------------------
{resume_index: -1, score: 1.000000}
```
Read a tutorial explaining how to play blackjack [here](./docs/tutorial.md).
### License
We wish for `RLC` to be usable by all as a compiler, for both commercial and non-commercial purposes, so it is released under the Apache license.
## Info for compiler developers
This section is dedicated to those who wish to build RLC itself from source, not to those who wish to use RLC as an off-the-shelf tool. At the moment we do not provide an off-the-shelf way of building RLC on Windows.
### Dependencies
Base:
* a C++17 compiler
* python
* CMake

Extra dependencies used by the setup script:
* Bash
* Ninja
* virtualenv
* lld
### Installation for compiler developers
Stop reading if you don't want to work on the compiler.
We provide a setup script that downloads the rlc repository and a setup script that will download and compile `LLVM` as well as `RLC`. As long as the dependencies listed above are met, you should just be able to run the following commands and everything should work. Installing and building LLVM in debug mode will take ~100 gigabytes of hard drive space and will require a large amount of time and RAM. This is only required when building from sources; PyPI packages are less than 100 MB on each OS.
Hard drive space can be reclaimed by deleting the `LLVM` build directory after it has been fully built.
Download the setup.sh file from the root of the repository and then run:
```
chmod +x setup.sh
source ./setup.sh # clones the RLC repo and initializes virtualenvs and submodules
python rlc/build.py # clones LLVM, builds it and builds RLC
```
On Mac and Windows, replace the last line with
```
python rlc/build.py --no-use-lld
```
If that script terminates successfully, you are fully set up to start working on `RLC`.
#### Building the pip packages
You can create a pip package in BUILDDIR/dist/ by running
```
ninja pip_package
```
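
To try the resulting package locally you can install it with pip; the wheel file name depends on your platform and version, so the glob below is only illustrative:
```
pip install BUILDDIR/dist/*.whl
```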
#### What to do if you run out of space or memory
Instead of the previous python command, you can build only the release LLVM version, which saves a great deal of space.
You need to use the flag --rlc-shared if you have built a shared LLVM.
### environment.sh
If you are using the default installation script (setup.sh), we provide a .sh file that configures your environment variables so that you can use python and rlc without installing anything on your actual machine.
When you open a shell to start working on RLC, run the following command.

If you use an editor such as code or clion, start it from that shell.
```
source environment.sh
```
To check that everything works correctly, run the following command.