pi0 Visual Encoding — SigLIP
Visual Encoding with SigLIP
pi0 Architecture — Step 1
Phase: - / 6
Encoder Parameters
Model: SigLIP-So400M
Input: 224 x 224 x 3
Patch Size: 14 x 14
Num Patches: 256
Embed Dim: 1152
Layers: 27
Weights: Frozen
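The patch count follows directly from the input and patch sizes listed above. The short check below is only an illustrative sketch; the helper name patch_grid is hypothetical and not part of pi0 or SigLIP.

```python
def patch_grid(image_size: int, patch_size: int) -> tuple[int, int]:
    """Return (patches per side, total patches) for a square image.

    Hypothetical helper for illustration; assumes non-overlapping patches.
    """
    if image_size % patch_size != 0:
        raise ValueError("image size must be divisible by patch size")
    per_side = image_size // patch_size
    return per_side, per_side * per_side


per_side, total = patch_grid(224, 14)
values_per_patch = 14 * 14 * 3  # pixels per patch times 3 color channels

print(per_side, total, values_per_patch)  # 16 256 588
```

A 224 x 224 input split into 14 x 14 patches gives a 16 x 16 grid, which matches the 256 patches listed. Each patch holds 588 raw values before it is mapped to the 1152-dimensional embedding.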
[Visualization: Encoding Progress; Visual Tokens (16 shown)]
Legend: Raw Image (Pixels), Patch Grid, SigLIP Encoder, Data Flow, Visual Tokens
Press Play to see how SigLIP converts a raw camera image into a sequence of visual tokens — the first step of the pi0 architecture.
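The image-to-token conversion described above can be sketched in NumPy under some loud assumptions: the projection matrix here is random rather than the learned SigLIP patch embedding, and the position embeddings and the 27 transformer layers of the real encoder are omitted. The names patchify and w_embed are hypothetical. The sketch only shows how the shapes move from pixels to a token sequence.

```python
import numpy as np

IMAGE_SIZE, PATCH_SIZE, CHANNELS, EMBED_DIM = 224, 14, 3, 1152


def patchify(image: np.ndarray, patch: int) -> np.ndarray:
    """Split an (H, W, C) image into flattened, non-overlapping patches."""
    h, w, c = image.shape
    gh, gw = h // patch, w // patch
    # (gh, patch, gw, patch, c) -> (gh, gw, patch, patch, c)
    x = image.reshape(gh, patch, gw, patch, c).transpose(0, 2, 1, 3, 4)
    # One row per patch, in row-major grid order.
    return x.reshape(gh * gw, patch * patch * c)


rng = np.random.default_rng(0)

# Stand-in for a raw camera image with values in [0, 1).
image = rng.random((IMAGE_SIZE, IMAGE_SIZE, CHANNELS), dtype=np.float32)

# Assumption: a random matrix in place of the learned patch embedding.
w_embed = 0.02 * rng.standard_normal(
    (PATCH_SIZE * PATCH_SIZE * CHANNELS, EMBED_DIM)
).astype(np.float32)

patches = patchify(image, PATCH_SIZE)  # shape (256, 588)
tokens = patches @ w_embed             # shape (256, 1152)

print(patches.shape, tokens.shape)
```

In the real encoder these projected patches would then pass through the transformer stack, so the output tokens keep the same (256, 1152) shape but carry contextual information from the whole image.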