Encoder Parameters

Model: SigLIP-So400M
Input: 224 x 224 x 3
Patch Size: 14 x 14
Num Patches: 256
Embed Dim: 1152
Layers: 27
Weights: Frozen
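
The parameters above fix the token count: a 224 x 224 input cut into 14 x 14 patches gives a 16 x 16 grid, i.e. 256 patches, and each patch becomes one 1152-dimensional token. A minimal NumPy sketch of that patch-splitting step follows. The function name `patchify`, the random input image, and the random projection matrix are illustrative assumptions; the real encoder uses a learned patch embedding followed by its 27 transformer layers, which this sketch does not model.

```python
import numpy as np

# Values taken from the parameter list above.
IMAGE_SIZE = 224
PATCH_SIZE = 14
CHANNELS = 3
EMBED_DIM = 1152


def patchify(image, patch_size):
    """Split an (H, W, C) image into flattened, non-overlapping patches.

    Returns an array of shape (num_patches, patch_size * patch_size * C),
    with patches ordered row by row across the grid.
    """
    h, w, c = image.shape
    gh, gw = h // patch_size, w // patch_size
    # (gh, patch, gw, patch, C) -> (gh, gw, patch, patch, C)
    patches = image.reshape(gh, patch_size, gw, patch_size, c)
    patches = patches.transpose(0, 2, 1, 3, 4)
    return patches.reshape(gh * gw, patch_size * patch_size * c)


rng = np.random.default_rng(0)
# Stand-in for a camera frame (assumption: float pixels in [0, 1)).
image = rng.random((IMAGE_SIZE, IMAGE_SIZE, CHANNELS), dtype=np.float32)
patches = patchify(image, PATCH_SIZE)

# Hypothetical random projection standing in for the learned patch embedding.
projection = rng.standard_normal((patches.shape[1], EMBED_DIM)).astype(np.float32)
tokens = patches @ projection

print(patches.shape)  # (256, 588): 16 x 16 patches, each 14 * 14 * 3 values
print(tokens.shape)   # (256, 1152): one embedding per patch
```

Each flattened patch holds 14 x 14 x 3 = 588 pixel values, so the patch embedding here is a 588-to-1152 linear map applied independently to all 256 patches.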

Legend: Raw Image (Pixels), Patch Grid, SigLIP Encoder, Data Flow, Visual Tokens
The visualization shows how SigLIP converts a raw camera image into a sequence of visual tokens, the first step of the pi0 architecture.