vagor.one — Just a Curious Mind

I thought I had this figured out. Apple Silicon with unified memory, or AMD Strix Halo — both solid options, both reasonable, both things I could actually pull the trigger on without a second mortgage.

Then this... (Credit goes to the poster who did the research. Go to X and support Marfin: https://x.com/marfinxx?s=20)

THIS TEENAGER PLUGGED 3X RTX 4090S INTO ONE MOTHERBOARD AND MADE $16,500/MONTH RUNNING LOCAL AGENTS

hosting large open-source models for client automations requires massive VRAM. Apple Silicon UMA and AMD Strix Halo are great, but multiple GPUs in a custom system change the math… https://t.co/s6m4F5EVIn pic.twitter.com/UQJ0TC02C6
— marfin (@marfinxx) June 16, 2026

And just like that, the math changes. Again.
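To make "the math" concrete for myself, here is a rough back-of-the-envelope sketch in Python. It only estimates the memory needed for model weights and compares that against the pooled VRAM of three RTX 4090s (24 GB each). The 20% overhead for KV cache and activations is my own assumption, not a number from the post, and the function names are just illustrative. Real multi-GPU setups also lose some capacity to how the model is split across cards, so treat this as a sanity check rather than a sizing guide.

```python
# Rough VRAM estimate: does a model's weight footprint fit in pooled GPU memory?
# Assumptions (mine, not from the quoted post):
#   - 1 GB = 1e9 bytes
#   - a flat 20% overhead for KV cache, activations, and runtime buffers
#   - VRAM pools perfectly across GPUs (optimistic)

GPU_VRAM_GB = 24   # an RTX 4090 has 24 GB of VRAM
NUM_GPUS = 3

total_vram_gb = GPU_VRAM_GB * NUM_GPUS


def weights_gb(params_billions: float, bits_per_weight: int) -> float:
    """Approximate size of the model weights alone, in GB."""
    return params_billions * bits_per_weight / 8


def fits(params_billions: float, bits_per_weight: int,
         vram_gb: float, overhead_fraction: float = 0.2) -> bool:
    """True if weights plus the assumed overhead fit in the given VRAM."""
    needed = weights_gb(params_billions, bits_per_weight) * (1 + overhead_fraction)
    return needed <= vram_gb


if __name__ == "__main__":
    print(f"Pooled VRAM: {total_vram_gb} GB")
    for size in (8, 32, 70):          # model sizes in billions of parameters
        for bits in (16, 8, 4):       # precision / quantization level
            gb = weights_gb(size, bits)
            verdict = "fits" if fits(size, bits, total_vram_gb) else "does not fit"
            print(f"{size}B @ {bits}-bit: ~{gb:.0f} GB weights, {verdict}")
```

Under these assumptions, a 70B model only squeezes into the pooled 72 GB once it is quantized to around 4 bits per weight, which is the kind of trade-off I still need to dig into.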

As always, the ways people run local LLMs are changing rapidly. I actually don't think companies like NVIDIA or AMD are thinking this way... (well, maybe, but that's not the point).

Read the full article. It's really interesting:

https://t.co/fkcbKcA46P
— marfin (@marfinxx) June 16, 2026

When I think about the price of the NVIDIA, AMD, or Mac Studio options, this actually seems more reasonable, and it could even be the better choice over the long term. Time to dig a little more. Oh, and by the way: I don't yet understand the "make money" model, but I will figure that out as well.

Every Time I Think I've Decided on My LLM Server, This Happens