Topic

#Mechanistic Interpretability

1 article on Mechanistic Interpretability — news, releases, guides and analysis from the SourceFeed engine.

Why Prompt Injection Works: The Role Confusion Theory
Article 1w ago 4

LLMs assign authority based on how text is written, not where it comes from, making role tags a leaky abstraction.

Rachel Goldstein