evaluation/posts/2024_05_23/index.qmd
41 additions & 2 deletions
@@ -180,7 +180,7 @@ But can't decide how best to display it - or if it is worth trying to display it
180
180
181
181
Updated the files accordingly.
182
182
183
-
### 14.19- Reproduction
183
+
### 14.19-15.12 Reproduction
184
184
185
185
Original:
186
186
@@ -271,6 +271,42 @@ However, this definitely has **not** fixed the issue! Still varying - and I woul
271
271
272
272
<mark>do we need to focus on the interpretations and whether they hold? but we didn't really want to do that as that is not our focus? but in this case, is it, if we want to know if we've reproduced, but can't actually necessarily get the exact same results due to randomness?</mark>
273
273
274
+
<mark>look at Tom's more recent examples where he has added seeds</mark>
275
+
276
+
<mark>test! idea is to check that you are getting the same results between runs</mark>
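A minimal sketch of what such a test could look like, assuming the model can be wrapped in a function that takes a seed. `run_model` here is a hypothetical stand-in (not the study's code) that just samples from a seeded NumPy generator:

```python
import numpy as np


def run_model(seed):
    """Hypothetical stand-in for a single model run; returns a summary result."""
    rng = np.random.default_rng(seed)
    # Placeholder for the simulation: mean of 100 sampled durations
    return rng.uniform(3, 7, size=100).mean()


def test_same_seed_same_result():
    # Two runs with the same seed should match exactly
    assert run_model(seed=1) == run_model(seed=1)


def test_different_seed_different_result():
    # Sanity check that the seed is actually being used
    assert run_model(seed=1) != run_model(seed=2)
```

Written like this, the functions could be picked up by a test runner such as pytest, or just called directly.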
277
+
278
+
### 15.23- Reproduction
279
+
280
+
::: {.callout-tip}
281
+
## Random seeds
282
+
283
+
At the moment, I would describe this model as reusable but not reproducible. It was relatively quick to get the code up and running and see similar results to the paper. But in terms of getting it to match up to the paper, it is pretty much impossible, although I will try to get there by setting random seeds and then running it lots of times to try and get a close match.
284
+
285
+
This is important for STARS framework improvement - controlling randomness is important for reproducibility. It can also be handy for someone reusing a model, as they may wish to reproduce results just to verify that it's running properly for them.
286
+
287
+
And so for each of the studies, if this is a recurring thing that comes up, it's about seeing where and how to add random seeds in different models and languages, to enable reproducibility.
288
+
:::
289
+
290
+
From this [Stack Overflow post](https://stackoverflow.com/questions/59105921/why-is-numpy-random-seed-not-remaining-fixed-but-randomstate-is-when-run-in-para), I'm suspicious that perhaps the issue is that I am setting the random state as 1 and 2, which (a) would imply it's the same between each run, but (b) means everything is using the same stream in parallel processing. But it's set using `RandomState`.
291
+
292
+
Trying to google around use of seeds with parallel processing.
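One pattern that comes up for this is NumPy's `SeedSequence.spawn`, which derives independent child seeds from a single master seed, so each parallel run gets its own stream. A rough sketch, where `single_run` and `run_replications` are hypothetical names and a thread pool stands in for whatever parallel backend the model actually uses:

```python
from concurrent.futures import ThreadPoolExecutor

import numpy as np


def single_run(seed_seq):
    """Hypothetical replication: draws from its own independent stream."""
    rng = np.random.default_rng(seed_seq)
    return rng.uniform(0.0, 1.0, size=5).sum()


def run_replications(master_seed, n_runs):
    # Spawn one child seed sequence per replication,
    # so no two runs share a stream
    children = np.random.SeedSequence(master_seed).spawn(n_runs)
    with ThreadPoolExecutor() as pool:
        # map() returns results in the same order as the inputs
        return list(pool.map(single_run, children))


first = run_replications(master_seed=42, n_runs=4)
second = run_replications(master_seed=42, n_runs=4)
```

Here `first` and `second` should be identical, while the individual runs within each list differ from one another.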
293
+
294
+
Had a chat with Tom about it and he suggested:
295
+
296
+
* He pointed out that `NormalParams` is not being used, and that I would need to set a seed in the class `Normal()` when it is used in `Scenario` - e.g. as an extra parameter at the end of these -
297
+
    * `requiring_inpatient_random: Distribution = Uniform(0.0, 1.0)`
298
+
    * `time_pos_before_inpatient: Distribution = Uniform(3,7)`
299
+
* Good example of how to set this up, would recommend this - https://pythonhealthdatascience.github.io/stars-simpy-example-docs/content/02_model_code/04_model.html#distribution-classes
300
+
* LLM model generation of seeds
301
+
302
+
Need a separate random number stream each time a distribution is created for use.
303
+
304
+
I thought the best option is to switch to using it how it is used in the [treat-sim model docs](https://pythonhealthdatascience.github.io/stars-simpy-example-docs/content/02_model_code/04_model.html#distribution-classes), as the focus here is just modifying the code to allow it to reproduce each run.
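A rough sketch of that idea, assuming each distribution class takes an optional `random_seed` and owns its own generator. This is my own illustration, not copied from the treat-sim docs, and the seed values are arbitrary:

```python
import numpy as np


class Uniform:
    """Uniform distribution with its own random number stream."""

    def __init__(self, low, high, random_seed=None):
        self.low = low
        self.high = high
        # Each instance owns a separate generator, so sampling from one
        # distribution does not affect the sequence drawn from another
        self.rand = np.random.default_rng(seed=random_seed)

    def sample(self, size=None):
        return self.rand.uniform(low=self.low, high=self.high, size=size)


# Hypothetical seeds: one independent child seed per distribution
seeds = np.random.SeedSequence(1).spawn(2)
requiring_inpatient_random = Uniform(0.0, 1.0, random_seed=seeds[0])
time_pos_before_inpatient = Uniform(3, 7, random_seed=seeds[1])
```

With this set up, recreating the distributions from the same master seed should give the same samples on each run.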
305
+
306
+
So next things I did -
307
+
308
+
* Deleted the `NormalParams` and `UniformParams` classes as they were not used - checked whether it still ran fine, which it did.
* Still feels unclear when we are setting up the website (showing article, showing code). The decision I have made from trying to display the code is that, actually, the simplest and clearest thing is to let people explore the code themselves (just direct them to the right folder on the GitHub), whilst for the article, it takes one minute to embed the PDFs, so just have that step (plus adding a link to where the scripts are) when uploading the articles, and call it a day.
334
372
* To do: move download sources from logbook to original study page (and modify as appropriate in protocol)
335
373
* Add suggestion to save outputs as you go, as and when appropriate, as it's helpful to be able to include images in the logbook, for example. So perhaps, copying images from the output into the logbook images folder. **Yes.** I've started copying over and storing within the logbook folder, just focusing on e.g. the figure I was looking at and not copying over all the data associated.
374
+
* Can I ask for advice on issues with reproduction from the rest of the team? Would presume so, and that I include that in the timing and record what is discussed and said.