Building a Powerful AI/ML Workstation at Home
Table of Contents
- Introduction
- Goals
- Budget
- OS + Software
- Services
- Power Consumption
- LLM Performance
- Conclusion
Introduction
I have wanted to build a powerful machine for machine learning and generative AI for a while now. I finally decided to build one at home.
In this article, I will share my experience finding the right parts, building the workstation, the services I run on it, the power consumption, and how it performs.
Goals
I wanted to be able to:
- Train non-trivial ML models
- Do LLM inference using local models like Llama + DeepSeek
- Run Jupyter notebooks for exploratory data analysis
- Do image generation using Stable Diffusion / FLUX
Of course, many of these tasks can be done through free online services, but I often ended up getting rate-limited on my more ambitious projects.
I had many other reasons for wanting a home server, such as running Plex, storing backups, etc., but the goals above were the main ones for this build.
With these goals in mind, I started researching the best GPU for machine learning training and LLM inference.
This blog post from Tim Dettmers is fantastic for figuring out which GPU to get within your budget. The one update I'd make to his advice: where he recommends a 4090, get a 5090 instead, if you can find one to buy.

Given this, I settled on 2x 3090s, though I'm running a single card for now and will add the second later.
I also wanted to ensure that my build would support 4090s or 5090s in the future, so I made sure to get a motherboard that supports these cards.
Dual 3090s are quite tough to fit into most cases, but this video shows you can fit both into the Silverstone GD11B case.
To figure out which models would run in the 24 GB of VRAM on a 3090, shameless plug: I built Can I run this LLM? to show you the memory requirements at different quantization levels.
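As a rough sketch of the arithmetic behind that kind of tool: the weights take roughly parameters × quantization width, plus some headroom for the KV cache and activations. The 20% overhead factor below is a rule-of-thumb assumption, not an exact figure.

```python
def estimate_vram_gb(n_params_billion, bits_per_weight, overhead=1.2):
    """Rough VRAM estimate: weight bytes at the given quantization width,
    plus ~20% headroom for KV cache and activations (a rule of thumb)."""
    bytes_per_param = bits_per_weight / 8
    return n_params_billion * bytes_per_param * overhead

# A 14B model at 4-bit quantization fits comfortably in 24 GB:
estimate_vram_gb(14, 4)   # ~8.4 GB
# The same model at fp16 does not fit on a single 3090:
estimate_vram_gb(14, 16)  # ~33.6 GB
```

This is why the 14B models in the benchmark table below all run on one card at 4-bit, while their fp16 versions would need a second GPU.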
Budget
I had a max budget of around £2,500 for this build. I wanted to get as much power as possible within that budget.
With my goals in mind and the desire to expand this build in the future, I decided to get the following parts.
Components
Component | Model/Description | Link | Price |
---|---|---|---|
Case | Silverstone GD11B | Amazon | £160.00 |
CPU | AMD Ryzen 7 9700X (8 Core / 16 Thread) | Amazon | £318.99 |
NVMe | Samsung 990 2TB | Amazon | £145.55 |
CPU Fan | Noctua NH-L9a | Amazon | £39.95 |
CPU Fan 2(*) | Noctua NH-U9S | Amazon | £59.95 |
RAM | CORSAIR VENGEANCE RGB DDR5 RAM 64GB (4x16GB) 6000MHz | Amazon | £197.98 |
GPU 1 | Nvidia 3090 Founders Edition | Ebay | £605.00 |
MoBo | MSI MAG X870 Tomahawk WiFi, AMD X870, AM5, DDR5, PCIe 5.0 | Amazon | £279.98 |
PSU | NZXT C1200 - PA-2G1BB-EU - ATX 1200 Watt Gaming PC Power | Amazon | £141.11 |
Case Fan | 2 x Noctua NF-A12x25 PWM | Amazon | £57.90 |
Total | | | £2,006.41 |
(*) The first fan was much too small, which led to very high temperatures on the CPU when running at high load.



OS + Software
I tried using Windows on this build for a while, but it was using more power than I expected, and I was running everything in WSL, which crashed about once a week.
I ended up switching to Ubuntu 24.04 LTS, which has been much more stable and power-efficient (you can see power consumption graphs below comparing Ubuntu vs. Windows 11 consumption).

Generally speaking, I need the following features in an OS, all of which are easy to set up on Ubuntu:
- Startup tasks
- Task scheduler
- Remote access
- Network shares
- Docker
- Nvidia drivers
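The last two items on that list can be wired up roughly like this on Ubuntu 24.04. This is a sketch, not a full guide: it assumes NVIDIA's apt repository for the container toolkit is already configured, and the CUDA image tag in the final check is illustrative.

```shell
# Install the recommended NVIDIA driver for the detected GPU
sudo ubuntu-drivers install

# Install Docker and the NVIDIA Container Toolkit
# (the toolkit package assumes NVIDIA's apt repo is already added)
sudo apt-get update
sudo apt-get install -y docker.io nvidia-container-toolkit

# Register the NVIDIA runtime with Docker and restart it
sudo nvidia-ctk runtime configure --runtime=docker
sudo systemctl restart docker

# Sanity check: the container should see the 3090
docker run --rm --gpus all nvidia/cuda:12.4.1-base-ubuntu22.04 nvidia-smi
```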

Services
Service | Description | Link |
---|---|---|
Jupyter Lab | Notebook software for writing ad hoc Python code | Jupyter |
Tailscale | Zero Trust VPN. I connect all my machines to my Tailnet, and can access my workstation outside my house without exposing it outside the VPN. | Tailscale |
Airflow | Airflow is a job scheduler. I use it to pull information periodically, and you can chain jobs together. For example, I run a daily export of my power consumption data, load it into a database table, and then send myself a Telegram message with the summary. Airflow lets me manage these jobs programmatically. It is like cron, but better. | Airflow |
Ollama | Ollama lets me pull and run LLMs on my machine. This service exposes models through an API so you can connect other software, like a chat frontend, to it. | Ollama |
OpenWebUI | Chat frontend for LLMs. I connect this to Ollama, and I can use multiple LLM models at once. This has replaced ChatGPT for most usage. | OpenWebUI |
Postgres | Database. I use this for storing data from Airflow jobs that run periodically. | Postgres |
Continue | Like Copilot, but I can point it at Ollama instead of GitHub's Copilot servers. | Continue |
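The export → store → notify chain from the Airflow row can be sketched as plain Python. The function names, sample readings, and message format here are hypothetical; in Airflow, each function would become a task in a DAG, with the store and notify steps downstream of the export.

```python
def fetch_power_readings():
    """Export the day's power readings (hypothetical sample data;
    in practice this step would pull from the smart plug's export)."""
    return [("2025-06-01T00:00", 120.0), ("2025-06-01T01:00", 100.0)]

def store_readings(readings):
    """In the real pipeline this INSERTs rows into a Postgres table;
    here it just reports how many rows would be written."""
    return len(readings)

def format_telegram_summary(readings):
    """Build the daily summary message sent via Telegram."""
    avg_w = sum(watts for _, watts in readings) / len(readings)
    return f"Average draw: {avg_w:.1f} W over {len(readings)} readings"

readings = fetch_power_readings()
store_readings(readings)
print(format_telegram_summary(readings))
```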
Airflow

OpenWebUI

Jupyter

Monitoring
I use Glances for monitoring system performance, and I normally keep a terminal window open with it running at all times.

Power Consumption
I track energy consumption using a Tapo Smart Plug. I wrote about how I do this here.

This chart is a bit spiky because I sometimes turned the server off for a while. However, you can generally see that Windows consumes more energy on average than Linux.
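Turning the plug's wattage samples into daily kWh is simple arithmetic; the steady 80 W idle figure in the example below is purely illustrative, not a measurement from my build.

```python
def energy_kwh(watt_samples, interval_minutes=60):
    """Integrate evenly spaced wattage samples into kWh
    (rectangle rule: each sample covers one interval)."""
    hours_per_sample = interval_minutes / 60
    return sum(watt_samples) * hours_per_sample / 1000

# A day of hourly samples at a hypothetical steady 80 W idle:
energy_kwh([80] * 24)  # 1.92 kWh
```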
LLM Performance

Here are the average tokens per second on a single 3090 over 10 attempts for the prompt "Hi there":
Model | Average Tokens per Second (Over 10 Attempts) |
---|---|
Llama 3.1: 8B | 129 |
Phi 4 | 80 |
Qwen 2.5: 14B | 75 |
Deepseek R1 14B | 39 |
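Ollama's generate endpoint reports `eval_count` (tokens generated) and `eval_duration` (nanoseconds) in each response, so tokens per second falls out directly. A minimal benchmarking sketch (the endpoint is Ollama's default; the model name passed to `benchmark` is whatever you have pulled locally):

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default endpoint

def tokens_per_second(eval_count, eval_duration_ns):
    """Ollama reports generation time in nanoseconds."""
    return eval_count / (eval_duration_ns / 1e9)

def benchmark(model, prompt="Hi there"):
    """Run one non-streaming generation and return its tokens/sec."""
    payload = json.dumps(
        {"model": model, "prompt": prompt, "stream": False}
    ).encode()
    req = urllib.request.Request(
        OLLAMA_URL, data=payload, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    return tokens_per_second(body["eval_count"], body["eval_duration"])
```

Averaging ten calls to `benchmark(...)` per model reproduces the methodology behind the table above.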
Conclusion
I am happy with my build overall. I haven't found a bottleneck for any of the tasks I've thrown at it so far, including a fairly large Naive Bayes classifier that I was training on gigabytes of text.
I would recommend this build to anyone looking to get into AI/ML without breaking the bank, but I would probably not recommend the same case: it feels a bit cheap, and I've already lost some standoffs.
And remember:
