What to Do with Your Idle Custom PC - Convert It to an AI Agent Server
That gaming PC sitting under your desk collecting dust between sessions has serious compute potential. A mid-range gaming rig from the last few years - RTX 3080, 32GB RAM, fast NVMe - is more than enough to run local AI models and host always-on agent workflows.
Here is how to turn it into a productive AI agent server.
Why Proxmox
Proxmox gives you a hypervisor that runs VMs and containers from a web UI. You can allocate your GPU to a VM running Ollama for local inference while keeping separate containers for different agent workloads. If one agent crashes, it does not take down everything else.
Install Proxmox directly on the bare metal. It takes about 20 minutes. Your gaming setup becomes a proper server without buying any new hardware.
The Setup
A practical configuration looks like this:
- VM 1: Ollama + local models - Pass through the GPU. Run Llama 3, Mistral, or whatever fits your VRAM. This handles the cheap inference tasks your agents need.
- Container 1: Agent runtime - Your AI agent processes, scheduled tasks, and automation scripts.
- Container 2: Support services - Databases, log aggregation, monitoring dashboards.
With 32GB of system RAM, you can comfortably run all three with room to spare.
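To make the split concrete: an agent process in Container 1 talks to the Ollama VM over its HTTP API on port 11434. A minimal sketch, assuming Ollama's default `/api/generate` endpoint - the VM's IP address and the model name are placeholders for your own setup:

```python
import json
import urllib.request

# Placeholder address of the Ollama VM on your LAN; substitute your own.
OLLAMA_URL = "http://192.168.1.50:11434/api/generate"

def build_generate_payload(model: str, prompt: str) -> dict:
    """Assemble the JSON body for Ollama's /api/generate endpoint."""
    return {"model": model, "prompt": prompt, "stream": False}

def run_inference(model: str, prompt: str) -> str:
    """Send a prompt to the Ollama VM and return the generated text."""
    payload = json.dumps(build_generate_payload(model, prompt)).encode()
    req = urllib.request.Request(
        OLLAMA_URL,
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]
```

Because the agent container only needs a network route to the VM, a crashed or rebooted inference VM leaves the agent runtime itself untouched.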
GPU Passthrough
This is the most valuable part. Your RTX card sitting idle is wasted compute. With GPU passthrough to an Ollama VM, you get local inference fast enough for most agent workloads - without the per-token cost.
A 3080 with 10GB VRAM runs quantized 7B-13B models comfortably. A 4090 with 24GB handles 30B-class models at 4-bit; 70B models only fit with aggressive 2-3 bit quantization or partial CPU offload. Either way, it is free inference after the electricity bill.
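The VRAM math behind those claims is simple enough to sketch. This is a rough rule of thumb, not an exact formula - the function name, the fixed overhead figure, and the default bit width are assumptions for illustration:

```python
def quantized_vram_gb(params_billion: float, bits_per_weight: int = 4,
                      overhead_gb: float = 1.5) -> float:
    """Rough VRAM needed to load a quantized model: the weights
    themselves plus a fixed allowance for KV cache and runtime
    buffers (a ballpark estimate, not a guarantee)."""
    weight_gb = params_billion * bits_per_weight / 8  # GB for weights alone
    return round(weight_gb + overhead_gb, 1)

# 13B at 4-bit: about 8 GB -> fits a 10GB 3080
# 70B at 4-bit: about 36.5 GB -> does not fit a single 24GB 4090
```

Dropping to 2-bit halves the weight term, which is why heavily quantized 70B variants become borderline feasible on a 4090.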
Power and Noise
Gaming PCs are loud under load. Set a fan curve that keeps things quiet for sustained inference workloads - you do not need the same cooling profile as a gaming session. A typical setup draws 150-250W under AI workloads, which is roughly $15-25/month in electricity at typical US rates (around $0.14/kWh). Still cheaper than cloud inference.
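The arithmetic behind that range, assuming 24/7 uptime and a flat $0.14/kWh rate (both placeholders - check your own bill):

```python
def monthly_cost_usd(watts: float, rate_per_kwh: float = 0.14,
                     hours: float = 24 * 30) -> float:
    """Electricity cost for a box running around the clock for a month."""
    kwh = watts / 1000 * hours  # energy used over the month
    return round(kwh * rate_per_kwh, 2)

# 150W -> about $15/month; 250W -> about $25/month
```

If the machine idles at much lower draw between inference bursts, the real figure lands toward the bottom of the range.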
Your idle hardware is your cheapest server.
Fazm is an open source macOS AI agent, available on GitHub.