“Back in my day, we overclocked our Pentiums to run Quake faster. Now we’re doing it for language models.”
— Wing, mid-rant, 2022


In 2022, a new frontier opened up for hobbyists and developers alike: Large Language Models (LLMs) stopped being pure research tools and started showing up in the daily build cycle of curious engineers and indie coders. But with that shift came a new dilemma: do you run your AI workloads locally, or pipe them out to the cloud?

Let’s break it down like it’s 1999… but with GPUs.


🖥️ Local GPU: The Tactile Appeal of Metal Under Your Desk

For those of us raised in the era of beige towers and CRT hum, there’s something right about having your compute power in the same room. In 2022, high-end cards like the RTX 3090 and later the 4090 (24 GB of VRAM apiece) offered enough headroom to run decent-sized models locally: 7B-parameter LLMs, diffusion models, even audio transformers, all without renting compute by the hour.
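
A quick back-of-the-envelope check shows why 24 GB became the magic number. This is a rough sketch: it counts only the weights, and the precision byte-counts are the usual rule-of-thumb values, so real usage (activations, KV cache, framework overhead) runs a few GB higher.

```python
# Rough VRAM estimate for model weights alone (ignores activations,
# KV cache, and framework overhead, which add several GB in practice).
def weight_vram_gb(params_billions: float, bytes_per_param: float) -> float:
    return params_billions * 1e9 * bytes_per_param / (1024 ** 3)

for precision, nbytes in [("fp32", 4), ("fp16", 2), ("int8", 1), ("int4", 0.5)]:
    print(f"7B @ {precision}: ~{weight_vram_gb(7, nbytes):.1f} GB")

# fp16 (~13 GB) squeaks onto a 24 GB card; fp32 (~26 GB) does not.
```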

Pros:

  • One-time cost, no subscription
  • Full control over environment and files
  • Offline = private and secure
  • You can feel the heat. And that’s comforting.

Cons:

  • Huge upfront cost ($1,200+)
  • Noise, power, and heat (unless you’re into that)
  • Laptop users: tough luck
  • Driver and CUDA hell, a rite of passage (sanity check below)
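
For what it’s worth, most of that hell boils down to one question: does your framework actually see the card? A minimal sanity check, assuming a PyTorch install:

```python
# Minimal "did my drivers survive?" check; assumes PyTorch is installed.
import torch

print("CUDA available:", torch.cuda.is_available())
if torch.cuda.is_available():
    print("Device:", torch.cuda.get_device_name(0))
    print("CUDA runtime bundled with torch:", torch.version.cuda)
    free, total = torch.cuda.mem_get_info()  # returned in bytes
    print(f"VRAM free: {free / 1024**3:.1f} / {total / 1024**3:.1f} GB")
```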

☁️ Cloud GPU: Rent-A-Rig for the AI Gold Rush

Services like Google Colab, AWS, Lambda, and even niche startups offered “just enough GPU” to run training and inference. You could spin up a session, run your Hugging Face notebook, and be done in an hour—if you didn’t run into quota walls or timeouts.
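
The typical throwaway-notebook flow looked roughly like this. It’s a sketch, not a recipe: gpt2 is just a stand-in checkpoint, and any Hugging Face text-generation model works the same way.

```python
# Typical one-hour-session flow: grab a pipeline, point it at the GPU, go.
from transformers import pipeline

generator = pipeline("text-generation", model="gpt2", device=0)  # device=0 -> first GPU
out = generator("Back in my day, we overclocked our", max_new_tokens=40)
print(out[0]["generated_text"])
```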

Pros:

  • No upfront hardware cost
  • Great for bursty, one-off jobs
  • You can use a Chromebook and still look cool

Cons:

  • Pay-per-minute = 🩸
  • Data privacy? Hope you read the terms of service
  • Usage limits, cold boots, and queue hell
  • You’ll eventually want to build your own rig anyway

🎛️ The 2022 Reality

Most power users did both. Local rigs ran quantized 4-bit models for freewheeling experimentation, while the cloud handled training, fine-tuning, and the big crunch jobs.
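
In 2022 that usually meant GPTQ forks and hand-rolled conversion scripts; the sketch below uses today’s transformers + bitsandbytes convenience API instead, purely for illustration, with facebook/opt-6.7b standing in as a 2022-era ~7B model.

```python
# 4-bit loading sketch with the modern transformers + bitsandbytes stack
# (requires the bitsandbytes and accelerate packages alongside transformers).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "facebook/opt-6.7b"  # a 2022-era ~7B model; swap in your own
bnb_config = BitsAndBytesConfig(load_in_4bit=True, bnb_4bit_compute_dtype=torch.float16)

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    device_map="auto",  # needs accelerate; spills layers to CPU if VRAM runs out
)

inputs = tokenizer("The best GPU for a home rig is", return_tensors="pt").to(model.device)
print(tokenizer.decode(model.generate(**inputs, max_new_tokens=30)[0]))
```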

2022 wasn’t about choosing one over the other—it was about realizing that every tool has a vibe.

Cloud GPUs were like renting a hot rod for a weekend. Local GPUs were like restoring your own muscle car. Both were fast… but only one felt like it was yours.


🧠 TL;DR: Wing’s Retro Verdict

If you grew up flashing BIOS chips with jumpers, you’ll appreciate the control of local.
If you’re from the “web app everything” crowd, the cloud probably felt more natural.

But if you’re serious about AI long-term?
You’ll build your own box. Eventually. And name it something like SKYNET-DEV.


💬 Got your own rig story or cloud horror tale from 2022? Send it over.