## Prizes for matrix completion problems

Here are two self-contained algorithmic questions that have come up in our research. We're offering a bounty of $5k for a solution to either of them, whether an algorithm or a lower bound … »

This post is an elaboration on “tractability of discrimination” as introduced in section III of "Can we efficiently explain model behaviors?" For an overview of the general plan this fits into, see "Mechanistic anomaly detection" and "Finding gliders in the game of life". … »

Finding explanations is a relatively unambitious interpretability goal. If it is intractable then that’s an important obstacle to interpretability in general. If we formally define “explanations,” then finding them is a well-posed search problem and there is a plausible argument for tractability. … »

ARC’s current approach to ELK is to point to latent structure within a model by searching for the “reason” for particular correlations in the model’s output. In this post we’ll walk through a very simple example of using this approach to identify gliders in the game of life. … »

An informal description of ARC’s current research approach, and a follow-up to Eliciting Latent Knowledge … »

ARC has released a paper on Formalizing the presumption of independence, an open problem currently central to our approach to Eliciting Latent Knowledge (ELK). … »

From January to February, the Alignment Research Center offered prizes for proposed algorithms for eliciting latent knowledge. In total we received 197 proposals and are awarding 32 prizes of $5k-20k. We are also giving 24 proposals honorable mentions of $1k, for a total of $274,000. … »