LLM

Self Hosted AI: Actually Running Local LLMs for a Multi-User Household

How self-hosted AI became the final piece of my homelab puzzle, delivering true parallel processing for multi-user setups and unlocking the real superpower of knowledge management.

Derek Armstrong - Staff Software Engineer and Solutions Architect

Running Qwen3.6 27B Locally on Dual RTX 3090s with vLLM v0.19

How I went from a blank Docker template to 116+ tok/s with speculative decoding, FlashInfer, and a 160k context window on dual 3090s.

Derek Armstrong

Qwen3.5 Showdown: 27B Q8 vs 35B-A3B Q8 — Real-World Testing for Local AI

A real-world comparison of Qwen3.5 27B Q8 and 35B-A3B Q8 running locally on a dual RTX 3090 homelab — which one actually belongs in your daily workflow?

Derek Armstrong

Pydantic and Pydantic-AI: Type Safety That Actually Earns Its Keep

Pydantic is one of those libraries I underestimated until the day it saved me four hours of debugging. Here's what it actually does, where it hurts, and why Pydantic-AI has me …

Derek Armstrong

Run your own AI LLM in two commands

Set up your own AI chatbot locally with Meta's Llama model and Docker in just two commands.

Derek Armstrong