BluelightAI is launching a research fellowship program focused on mechanistic interpretability and topological data analysis (TDA) applied to large language models. The initiative aims to support students, postdocs, and researchers globally in investigating how AI models function internally.
Research Areas
The organization is actively exploring:
- Cross-layer transcoders (CLTs) for Qwen 3
- CLT and sparse autoencoder (SAE) features for training interpretable classifiers
- SAE features for analyzing model performance patterns
Fellowship Details
Duration & Commitment: Research projects run 1–3 months and require 10–20 hours per week.
Compensation & Resources:
- $5,000 one-time stipend
- Weekly 1-on-1 mentorship
- Compute resource access
- Early access to Cobalt (TDA toolkit) and interpretability tools
Expected Outcomes: Fellows publish blog posts on the BluelightAI site and LessWrong; many projects are expected to develop into peer-reviewed papers.
Application Requirements
Applicants must submit:
- CV/resume
- Personal statement
- 1–2 paragraph research proposal
- References to prior related work
Admissions are rolling, and the first batch will have 3–4 participants.
Project Ideas
Examples include feature taxonomy development, cross-LLM capability analysis, sparse autoencoder improvements, circuit tracing automation, and mechanistic investigations of specific model behaviors.
To apply or learn more, contact jakob@bluelightai.com.