LLM Serving
2 articles about LLM serving.
vLLM 0.8.2 Release Date, Changelog, and Upgrade Guide
6 min read
vLLM 0.8.2 was released on March 25, 2025. This post covers the full changelog, V1 engine memory fix, FP8 KV cache support, and how to upgrade safely.
vllm, inference, llm-serving, release-notes, open-source, fp8, kv-cache
vLLM Update April 2026: v0.18, v0.19, Gemma 4, and gRPC Serving
9 min read
Every major vLLM update from April 2026, covered in one place: from v0.18's gRPC serving and GPU speculative decoding to v0.19's Gemma 4 support, async scheduling, and critical security patches.
vllm, inference, llm-serving, april-2026, gemma-4, speculative-decoding, grpc, open-source
Browse by Topic
AI Agents (332), Automation (235), macOS (192), Productivity (190), AI Agent (182), Claude Code (160), Desktop Agent (119), Developer Tools (98), Open Source (88), Reliability (83), Accessibility API (79), MCP (76), Parallel Agents (75), Desktop Automation (68), Multi Agent (64), AI Coding (56), Security (54), Workflow (51), Claude (50), Architecture (50)