Sam Stoelinga

Sam Stoelinga @samos123

Joined: Jan 1, 2020

articles - 5 total

Don't use a K8s Service for LLM Serving!

Relying solely on standard Kubernetes Services for load balancing can lead to suboptimal performance...

Mar 11

Tutorial: Deploying Llama 3.1 405B on GKE Autopilot with 8 x A100 80GB

Tutorial on how to deploy the Llama 3.1 405B model on GKE Autopilot with 8 x A100 80GB GPUs using...

Oct 8 '24

Infinity embeddings on Kubernetes with KubeAI

Just merged and released the Infinity support PR in KubeAI, adding Infinity as an embedding engine....

Sep 25 '24

Introducing KubeAI: Open AI Inference Operator

We recently launched KubeAI. The goal of KubeAI is to get LLMs, embedding models and Speech to text...

Sep 16 '24

How I deployed a global website speed checking service to 25 locations while costing less than $5/yr using GCP Cloud Run

I created https://websu.io, an open source webpage speed monitoring tool, and a key feature was the...

Feb 28 '22