BluelightAI has constructed Cross-Layer Transcoders (CLTs) for two models in the Qwen3 family: Qwen3-0.6B Base and Qwen3-1.7B Base. CLTs decompose the activation vectors of a large language model into more interpretable features. The resulting features can be explored at qwen3.bluelightai.com.
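To make the idea of decoding activations into features concrete, here is a minimal sketch of the general CLT forward pass as it is commonly described in published interpretability work: an encoder turns one layer's residual-stream activation into non-negative feature activations, and each feature then writes to the MLP outputs of that layer and later layers through per-layer decoders. The function names, the toy dimensions, the plain ReLU nonlinearity, and the random weights are all assumptions for illustration; this is not BluelightAI's architecture or trained weights.

```python
import numpy as np

def clt_encode(x, W_enc, b_enc):
    """Map a residual-stream activation to non-negative feature activations (ReLU encoder)."""
    return np.maximum(0.0, W_enc @ x + b_enc)

def clt_decode(features, W_dec_by_layer):
    """Return one contribution vector per target layer, each via its own decoder matrix."""
    return [W_dec @ features for W_dec in W_dec_by_layer]

rng = np.random.default_rng(0)
d_model, n_features, n_target_layers = 8, 32, 3  # toy sizes (assumed), not Qwen3's

x = rng.normal(size=d_model)                      # stand-in for a real activation vector
W_enc = rng.normal(size=(n_features, d_model))    # random, untrained encoder (assumed)
b_enc = -1.0 * np.ones(n_features)                # negative bias pushes more features to zero
W_dec_by_layer = [rng.normal(size=(d_model, n_features)) for _ in range(n_target_layers)]

features = clt_encode(x, W_enc, b_enc)
outputs = clt_decode(features, W_dec_by_layer)
active = np.flatnonzero(features)                 # indices of features that fired on x
print(f"{len(active)} of {n_features} features active; {len(outputs)} target layers written")
```

In a trained CLT the weights are learned with a sparsity penalty so that only a handful of features fire on any given token, which is what makes individual features candidates for interpretation.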
Key Findings
The analysis builds on work by Goodfire, Anthropic, DeepMind, and OpenAI in using Sparse Autoencoders (SAEs) and CLTs to understand LLM internals. BluelightAI reports finding "clearer and more conceptually abstract features" compared to other analyses.
Layer 20, Feature 847: Activates on meta-level judgments of conceptual phrases with evaluative language, particularly text that challenges common interpretations or labels.
Layer 20, Feature 179: Fires on phrases describing fulfillment criteria or conditions. Notable for multilingual activation across English, German, and Spanish examples.
The research also finds that many features activate specifically on stopwords and punctuation.
Topological Data Analysis Integration
The platform uses topological data analysis (TDA) to identify groups of features, on the premise that "ideas and concepts will be more precisely identified by groups of features." Its visual interface helps researchers:
- Identify clusters of related features through similarity measures
- Analyze connections between feature groups across layers
- Support circuit-tracing analysis in LLMs
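The first item in the list above, grouping features by similarity, can be sketched in a few lines. The example below links two features whenever the cosine similarity of their vectors meets a threshold and then takes connected components of the resulting graph. This is a deliberately simple stand-in, not the TDA method the platform uses; the function names, the threshold of 0.8, and the toy vectors are assumptions for illustration.

```python
import numpy as np

def cosine_similarity_matrix(vectors):
    """Pairwise cosine similarity between the rows of `vectors`."""
    norms = np.linalg.norm(vectors, axis=1, keepdims=True)
    unit = vectors / np.clip(norms, 1e-12, None)  # guard against zero-length rows
    return unit @ unit.T

def group_features(vectors, threshold=0.8):
    """Label each feature with the connected component it belongs to in the
    graph that links features whose cosine similarity is >= threshold."""
    sim = cosine_similarity_matrix(vectors)
    n = len(vectors)
    labels = [-1] * n
    current = 0
    for start in range(n):
        if labels[start] != -1:
            continue
        labels[start] = current
        stack = [start]
        while stack:  # depth-first traversal of one component
            i = stack.pop()
            for j in np.flatnonzero(sim[i] >= threshold):
                if labels[j] == -1:
                    labels[j] = current
                    stack.append(int(j))
        current += 1
    return labels

# Toy feature vectors (assumed): two near-duplicate pairs and one isolated feature.
vectors = np.array([
    [1.0, 0.0, 0.0],
    [0.9, 0.1, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.95, 0.05],
    [0.0, 0.0, 1.0],
])
labels = group_features(vectors, threshold=0.8)
print(labels)  # [0, 0, 1, 1, 2]
```

Connected components at a single threshold capture only the coarsest grouping. Varying the threshold and tracking how groups merge is one way to move toward the multi-scale view that TDA methods provide.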
Future Directions
The team plans to develop TDA-based approaches for circuit tracing, moving beyond manual individual feature analysis toward systematic group-level investigation.