Hacker News: Front Page
2026-06-16 16:12
GateGPT: 56k tokens per second Transformer (KV cache) on FPGA at 80 MHz
Article URL: https://twitter.com/fguzmanai/status/2065832668172845209
Comments URL: https://news.ycombinator.com/item?id=48557535
Points: 28
# Comments: 9
Real-time LLM Inference on Standard GPUs: 3k tokens/s per request Hacker News: Front Page • similarity 0.506
MiMo-v2.5-Pro-UltraSpeed: 1T model with 1000 tokens per second Hacker News: Front Page • similarity 0.480