Mapping Concept Evolution in Qwen3

This article explores how BluelightAI uses Topological Data Analysis (TDA) to understand the internal mechanisms of large language models, specifically the Qwen3 family.

Core Argument

The researchers address the "black box" nature of LLMs by combining Cross-Layer Transcoders (CLTs) with their proprietary Cobalt software. They argue that "understanding how models construct concepts is vital for control and diagnosis" and present their work as an advance in mechanistic interpretability.

Methodology

The team:

  1. Identified clusters of features within a single layer that show coactivation patterns
  2. Computed the mean encoder vector for each target cluster
  3. Scanned preceding layers for features that influence the target cluster
  4. Kept only features that activate frequently (more than once per 10,000 tokens)

Figure: Example feature coactivation graph produced by Cobalt, used to identify target feature clusters. Semantically similar features tend to coactivate and are grouped into nodes, with semantically adjacent features appearing in neighboring nodes. The graph was constructed using a subset of BluelightAI's Qwen3 CLT features. Node coloring indicates next-token activation frequency (log scale).
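
Steps 2 to 4 above can be sketched in a few lines of NumPy. This is a minimal illustration, not BluelightAI's implementation: the article does not say how influence is scored, so the dot product between an earlier-layer feature's decoder vector and the target cluster's mean encoder vector is an assumption, as are all function names, array layouts, and toy values.

```python
import numpy as np


def mean_encoder_vector(encoder, cluster_ids):
    """Step 2: average the encoder vectors of the features in a target cluster.

    encoder: array of shape (n_features, d_model), one row per feature (assumed layout).
    """
    return encoder[cluster_ids].mean(axis=0)


def frequent_features(activation_counts, n_tokens, min_rate=1e-4):
    """Step 4: indices of features that fire on more than 1 in 10,000 tokens."""
    rates = np.asarray(activation_counts) / n_tokens
    return np.flatnonzero(rates > min_rate)


def influential_features(decoder_prev, target_vec, candidate_ids, top_k=3):
    """Step 3 (assumed scoring): rank earlier-layer features by the dot product
    of their decoder vectors with the target direction, highest first."""
    candidate_ids = np.asarray(candidate_ids)
    scores = decoder_prev[candidate_ids] @ target_vec
    order = np.argsort(scores)[::-1][:top_k]
    return candidate_ids[order], scores[order]


# Toy data: 3 target-layer features and 4 earlier-layer features in a 2-d space.
encoder_target = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
decoder_prev = np.array([[5.0, 5.0], [1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]])
activation_counts = np.array([0, 2, 50, 0])  # firings per earlier-layer feature
n_tokens = 10_000

target = mean_encoder_vector(encoder_target, [0, 2])
candidates = frequent_features(activation_counts, n_tokens)
ids, scores = influential_features(decoder_prev, target, candidates)
print("target direction:", target)
print("influential earlier-layer features:", ids, scores)
```

Filtering by frequency before ranking keeps rarely firing features out of the candidate set, even when their decoder vectors align strongly with the target direction.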

Case Study 1: Software Exceptions

The team traced how "problem severity" concepts from a medical context evolve through validation and problem-fixing stages before materializing as software exception handling in code.

Case Study 2: Progress Metaphors

The team showed how physical movement features transform into abstract concepts such as "one step further" and "step in the right direction" by passing through intermediate layers that handle comparative analysis and pathfinding.

Key Insight

The authors argue that "topological approaches respect high-dimensional data shape" instead of forcing features into rigid clusters, which reveals transitional states and branching concept paths across model layers.

The authors released CLTs for the Qwen3-0.6B and Qwen3-1.7B models, along with an interactive explorer tool at qwen3.bluelightai.com.