I build and operate the infrastructure that powers AI, from bare-metal GPU clusters to high-speed InfiniBand fabrics. I set up the Slurm schedulers, Kubernetes control planes, RoCE-attached storage backends, and CUDA environments that let AI teams train and deploy models at scale. Currently running NVIDIA DGX, HGX H200, and A100 clusters in production, with Dragonfly, Elasticsearch, and Lustre storage under the hood.