where you need to pass in a training dataset, along with your customized loss and metrics functions. The trained interventions can later be saved to disk. You can also use `intervenable.evaluate()` to evaluate your interventions against customized objectives.
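The workflow described above (train an intervention with a custom loss and metric, save it, then evaluate it) can be sketched in plain Python. This is a toy illustration only and does not use pyvene: the two-layer "model", the `AdditiveIntervention` class, and the `train` and `evaluate` helpers are all hypothetical names invented for this sketch, and the gradient is estimated by finite differences rather than autograd.

```python
import json
import os
import tempfile


def hidden(x):
    """Frozen first layer of a toy model (stands in for a real network)."""
    return 2.0 * x


def readout(h):
    """Frozen second layer of the toy model."""
    return 3.0 * h


class AdditiveIntervention:
    """Hypothetical trainable intervention: adds a learnable offset to the hidden value."""

    def __init__(self, offset=0.0):
        self.offset = offset

    def __call__(self, h):
        return h + self.offset

    def save(self, path):
        with open(path, "w") as f:
            json.dump({"offset": self.offset}, f)

    @classmethod
    def load(cls, path):
        with open(path) as f:
            return cls(json.load(f)["offset"])


def run(x, intervention):
    """Forward pass with the intervention applied to the hidden value."""
    return readout(intervention(hidden(x)))


def squared_error(pred, target):
    """Customized loss for a single example."""
    return (pred - target) ** 2


def mean_abs_error(preds, targets):
    """Customized metric over a whole dataset."""
    return sum(abs(p - t) for p, t in zip(preds, targets)) / len(preds)


def train(intervention, dataset, loss_fn, lr=0.01, epochs=100, eps=1e-4):
    """Fit only the intervention's offset; the model's layers stay frozen."""
    for _ in range(epochs):
        for x, target in dataset:
            center = intervention.offset
            # Central finite difference, so any scalar loss_fn can be plugged in.
            intervention.offset = center + eps
            loss_plus = loss_fn(run(x, intervention), target)
            intervention.offset = center - eps
            loss_minus = loss_fn(run(x, intervention), target)
            grad = (loss_plus - loss_minus) / (2 * eps)
            intervention.offset = center - lr * grad


def evaluate(intervention, dataset, metric_fn):
    """Score the intervened model on a dataset with a customized metric."""
    preds = [run(x, intervention) for x, _ in dataset]
    targets = [t for _, t in dataset]
    return metric_fn(preds, targets)


# Targets come from the same toy model with a hidden offset of 1.5.
dataset = [(x, 3.0 * (2.0 * x + 1.5)) for x in (0.0, 1.0, 2.0)]

intervention = AdditiveIntervention()
train(intervention, dataset, squared_error)

# Save the trained intervention to disk, reload it, and evaluate the copy.
path = os.path.join(tempfile.mkdtemp(), "intervention.json")
intervention.save(path)
restored = AdditiveIntervention.load(path)
score = evaluate(restored, dataset, mean_abs_error)
```

Passing the loss and metric as plain functions keeps the training loop independent of the objective, which mirrors the idea of supplying customized loss and metrics functions.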

+## Citation
+Library paper is forthcoming. For now, if you use this repository, please consider citing the relevant papers:
+```stex
+@article{wu2024pyvene,
+  title={pyvene: A Library for Understanding and Improving {P}y{T}orch Models via Interventions},
+  author={Wu, Zhengxuan and Geiger, Atticus and Arora, Aryaman and Huang, Jing and Wang, Zheng and Goodman, Noah D. and Manning, Christopher D. and Potts, Christopher},
+  journal={arXiv:2403.07809},
+  url={https://arxiv.org/abs/2403.07809},
+  year={2024}
+}
+```
+
## Related Works in Discovering Causal Mechanism of LLMs
If you would like to read more works on this area, here is a list of papers that try to align or discover the causal mechanisms of LLMs.
- [Causal Abstractions of Neural Networks](https://arxiv.org/abs/2106.02997): This paper introduces interchange intervention (a.k.a. activation patching or causal scrubbing). It tries to align a causal model with the model's representations.
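
As a rough illustration of an interchange intervention, the sketch below caches a hidden value from a run on a source input and patches it into a run on a base input, then compares the result with what a high-level causal model predicts. Everything here is a hypothetical toy in plain Python (the `low_level_forward` network, the `(a + b) * c` task, and all function names are assumptions made for this example); it is not the paper's code or pyvene's API.

```python
def low_level_forward(x, patch=None):
    """Toy 'network': hidden = [a + b, c], output = hidden[0] * hidden[1].

    If patch=(index, value) is given, that hidden unit is overwritten
    before the output is computed.
    """
    a, b, c = x
    h = [a + b, c]
    if patch is not None:
        index, value = patch
        h[index] = value
    return h, h[0] * h[1]


def interchange_intervention(base, source, index):
    """Run on `source`, cache hidden[index], and patch it into a run on `base`."""
    source_hidden, _ = low_level_forward(source)
    _, output = low_level_forward(base, patch=(index, source_hidden[index]))
    return output


def high_level_counterfactual(base, source):
    """High-level causal model: S = a + b, output = S * c, with S taken from `source`."""
    s = source[0] + source[1]
    return s * base[2]


base = (1, 2, 10)
source = (4, 5, 7)

_, base_output = low_level_forward(base)
patched_output = interchange_intervention(base, source, index=0)
predicted_output = high_level_counterfactual(base, source)

# If hidden[0] really plays the role of S, the two counterfactuals agree.
aligned = patched_output == predicted_output
```

Agreement between the patched low-level output and the high-level counterfactual, across many base and source pairs, is the kind of evidence used to claim that a hidden unit aligns with a causal variable.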
@@ -253,21 +265,3 @@ If you would like to read more works on this area, here is a list of papers that
## Star History
[Star History](https://star-history.com/#stanfordnlp/pyvene&Date)
-
-## Citation
-Library paper is forthcoming. For now, if you use this repository, please consider citing the relevant papers:
-```stex
-@article{geiger-etal-2023-DAS,
-  title={Finding Alignments Between Interpretable Causal Variables and Distributed Neural Representations},
-  author={Geiger, Atticus and Wu, Zhengxuan and Potts, Christopher and Icard, Thomas and Goodman, Noah},
-  year={2023},
-  booktitle={arXiv}
-}
-
-@article{wu-etal-2023-Boundless-DAS,
-  title={Interpretability at Scale: Identifying Causal Mechanisms in Alpaca},
-  author={Wu, Zhengxuan and Geiger, Atticus and Icard, Thomas and Potts, Christopher and Goodman, Noah},