Update app.py
app.py
CHANGED
@@ -131,10 +131,10 @@ with gr.Blocks(theme=gr.themes.Default(), css=css) as demo:
 gr.Markdown('''
 # Self-Interpreting Models
 
-**This space
+**This space is a simple introduction to the emerging trend of models interpreting their _own hidden states_ in free form natural language**!!
 This idea was explored in the paper **Patchscopes** ([Ghandeharioun et al., 2024](https://arxiv.org/abs/2401.06102)) and was later investigated further in **SelfIE** ([Chen et al., 2024](https://arxiv.org/abs/2403.10949)).
-An honorary mention
+An honorary mention of **Speaking Probes** ([Dar, 2023](https://towardsdatascience.com/speaking-probes-self-interpreting-models-7a3dc6cb33d6) -- my own work!! 🥳) which was a less mature but had the same idea in mind.
-We follow the SelfIE implementation in this space for concreteness. Patchscopes are so general that they encompass many other interpretation techniques too!!!
+We will follow the SelfIE implementation in this space for concreteness. Patchscopes are so general that they encompass many other interpretation techniques too!!!
 
 **The idea is really simple: models are able to understand their own hidden states by nature!**
 If I give a model a prompt of the form ``User: [X] Assistant: Sure'll I'll repeat your message`` and replace ``[X]`` *during computation* with the hidden state we want to understand,