DISK-LLM // TELEMETRY

SYSTEM_STATE: READY

DISK-LLM RUNTIME Qwen3.5-9B Adapter
This page is an illustrative telemetry surface, not the authoritative benchmark record. The corrected benchmark framing and archived Modal artifact live on the homepage and documentation pages.
METRIC_STREAM_01

LAYER LATENCY MAP

AVG: 198.5ms
TRANSFORMER_L1201.2ms
TRANSFORMER_L2185.1ms
TRANSFORMER_L3350.8ms
TRANSFORMER_L4190.4ms
TRANSFORMER_L5205.2ms
CAPACITY_GAUGE

MEMORY PRESSURE

70.4%
RAM_OCCUPANCY
ALLOCATED
504.4 GB
RESERVED
11.3 GB
SPATIAL_ALLOCATION

LIVE TENSOR MAP

MAPPED
UNMAPPED
FRAGMENTED
terminal RUNTIME_TERMINAL_OUTPUT // TTY1
[INFO] Initializing Disk-LLM runtime pipeline...
[INFO] Loading manifest: ./packed-qwen35/manifest.json
[INFO] Mapping tensors via numpy.memmap...
[INFO] Disk-swapping layer_012 weights (~430MB per layer block)
[WARN] High memory pressure detected — RSS pinned at 11264 MB
Processing batch_size=1, prompt_tokens=128
First token latency: 126.61s
[INFO] Latency spike in L3 attributed to sequential I/O page faults.
[INFO] Throughput: 0.016 tokens/sec
Generation complete: 2 tokens produced
_