ARC's current research focus can be thought of as trying to combine mechanistic interpretability and formal verification. If we had a deep understanding of what was going on inside a neural
…
The Alignment Research Center’s Theory team is starting a new hiring round for researchers with a theoretical background. Please apply here.
Update January 2024: we have paused hiring and expect to reopen
…