Every Time I Think I've Decided on My LLM Server, This Happens
June 17, 2026
I thought I had this figured out. Apple Silicon with unified memory, or AMD Strix Halo — both solid options, both reasonable, both things I could actually pull the trigger on without a second mortgage.
Then this showed up. (Credit goes to the poster who did the research. Go to X and support Marfin: https://x.com/marfinxx?s=20)
THIS TEENAGER PLUGGED 3X RTX 4090S INTO ONE MOTHERBOARD AND MADE $16,500/MONTH RUNNING LOCAL AGENTS
— marfin (@marfinxx) June 16, 2026
hosting large open-source models for client automations requires massive VRAM. Apple Silicon UMA and AMD Strix Halo are great, but multiple GPUs in a custom system change the math… https://t.co/s6m4F5EVIn pic.twitter.com/UQJ0TC02C6
And just like that, the math changes. Again.
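To make "the math" a bit more concrete, here is a rough back-of-the-envelope sketch of the VRAM side of it. The only hard number in it is the 24 GB of VRAM on an RTX 4090. Everything else is my own assumption, not something from the tweet: the 70B model size and the 4-bit and 16-bit precisions are illustrative picks, the function names are made up, and the estimate counts weights only, ignoring the KV cache, runtime overhead, and any inefficiency in splitting a model across cards.

```python
# Back-of-the-envelope check: do a model's weights fit in pooled VRAM?
# Assumptions (mine, not the tweet's): weights only, no KV cache or
# framework overhead, and the model splits cleanly across identical cards.

GB = 1e9  # decimal gigabytes, to keep the arithmetic simple


def weights_gb(params_billions: float, bits_per_param: int) -> float:
    """Approximate size of the model weights in GB."""
    return params_billions * 1e9 * bits_per_param / 8 / GB


def pooled_vram_gb(cards: int, gb_per_card: float) -> float:
    """Total VRAM across identical cards."""
    return cards * gb_per_card


def fits(params_billions: float, bits_per_param: int,
         cards: int, gb_per_card: float) -> bool:
    """True if the weights alone fit in the pooled VRAM."""
    needed = weights_gb(params_billions, bits_per_param)
    return needed <= pooled_vram_gb(cards, gb_per_card)


if __name__ == "__main__":
    total = pooled_vram_gb(cards=3, gb_per_card=24)  # 3x RTX 4090
    for bits in (4, 16):
        need = weights_gb(70, bits)  # hypothetical 70B-parameter model
        verdict = "fits" if fits(70, bits, 3, 24) else "does not fit"
        print(f"70B @ {bits}-bit: ~{need:.0f} GB of weights, "
              f"{verdict} in {total:.0f} GB")
```

Real-world headroom will be tighter than this suggests, since context length and serving overhead eat into the same memory, so treat it as a first filter rather than a buying guide.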
As always, how people run local LLMs keeps changing rapidly. I actually don't think companies like NVIDIA or AMD are thinking this way (well, maybe, but that's not the point).
Read the full article, it's really interesting
— marfin (@marfinxx) June 16, 2026
When I think about the price of the NVIDIA, AMD, or Mac Studio pro options, this actually seems more reasonable, and it could even be the better choice from a long-term perspective. Time to dig a little more. Oh, and by the way, I don't yet understand the "make money" model, but I will figure that out as well.