Loading…

1 · Prompt

Encodes the prompt's last-token residual at the SAE's layer and ranks the firing features. TopK SAE has at most K=50 nonzero features per token, so request up to ~50.

2 · Generate

Runs model.generate twice: once with no hooks, once with all active sliders applied as residual-stream additions h ← h + α · W_dec[:, feat].

3 · Visualization

all 32K features top firing steered
Each point is one SAE feature; positions are the top-3 PCA components of W_enc. Click a point to add it as a steering target.

Baseline output (α=0)

(no run yet)

Steered output

(no run yet)

Per-sample top features

Encode a corpus to populate.

Signature heatmap / Compare results

Encode a corpus to populate.

Filtered docs

Encode + apply filter to populate.

Synthesis output

Run "Synthesize from corpus" to populate.