Infrastructure · Automation · Reliability

Micheal Breedlove

Building resilient infrastructure, distributed automation systems, and AI-assisted operational platforms.

Infrastructure & automation engineer.
Designed a self-healing AI cluster orchestration platform with adaptive routing, shared memory, and automated disaster recovery.
Seeking: Security Engineer / DevSecOps / Platform / SRE roles.
...

About

Infrastructure engineer building reliable, automated, recovery-first systems.

I design and operate distributed infrastructure with an emphasis on automation, resilience, and operational discipline. My background includes military intelligence (former 35N SIGINT), operations leadership in high-pressure environments, and hands-on systems engineering. I am currently completing a B.S. in Cybersecurity at WGU.

  • Self-healing AI cluster orchestration platform with adaptive routing and shared memory
  • Distributed task queue with health-aware dispatch and automated failure recovery
  • Proxmox virtualization, OPNsense firewall, ZFS-backed storage with snapshot policies
  • SRE automation: SLO tracking, incident management, automated curation and DR drills
  • CompTIA Tech+ certified, targeting Security+ · B.S. Cybersecurity at WGU (in progress)

Featured Project

AI Cluster Orchestration Platform

A self-healing control plane that coordinates distributed worker nodes, manages shared operational knowledge, and automates recovery, routing, and validation workflows across a 4-node homelab cluster.

Adaptive Routing · Shared Memory · Self-Healing · Recovery-First

Jasper orchestrates health-aware job dispatch across Nova, Mira, and Orin based on role specialization, real-time health, and observed task performance. A shared Markdown memory corpus on ZFS-backed storage provides durable operational knowledge, while local semantic indexes keep each node fast and independent.

Jasper
Orchestrator · Inference

Nova
Automation · Monitoring

Mira
Network · Auditing

Orin
Analysis · Validation

NAS · ZFS · Snapshots

Five-plane platform architecture

Adaptive Routing

Tasks are assigned based on node health, role specialization, and observed success rates. Routing improves automatically as execution history builds.
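A minimal sketch of how this kind of routing could be scored, not the platform's actual code. The `NodeStats` record, the role names, the role bonus weight, and the smoothed success rate are all assumptions made for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class NodeStats:
    """Hypothetical per-node record; the fields are illustrative."""
    name: str
    roles: set[str]
    healthy: bool = True
    successes: int = 0
    failures: int = 0

    def success_rate(self) -> float:
        # Laplace smoothing so a node with no history starts at 0.5.
        return (self.successes + 1) / (self.successes + self.failures + 2)


def pick_node(nodes: list[NodeStats], task_role: str) -> NodeStats | None:
    """Return the healthy node with the best score, or None if none is healthy."""
    candidates = [n for n in nodes if n.healthy]
    if not candidates:
        return None

    def score(node: NodeStats) -> float:
        # Assumed weighting: a matching role counts as much as a perfect record.
        role_bonus = 1.0 if task_role in node.roles else 0.0
        return role_bonus + node.success_rate()

    return max(candidates, key=score)


nodes = [
    NodeStats("Nova", {"automation", "monitoring"}, successes=8, failures=2),
    NodeStats("Mira", {"network", "auditing"}, successes=5, failures=0),
    NodeStats("Orin", {"analysis", "validation"}, healthy=False),
]
chosen = pick_node(nodes, "monitoring")
```

Because the score reads the success and failure counters on every call, updating them after each job is enough for later routing decisions to reflect execution history.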

Shared Knowledge

Shared Markdown corpus on NAS with per-node daily notes. Nightly curation promotes durable facts and archives old observations automatically.
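The archiving half of nightly curation could look roughly like the sketch below. It assumes daily notes are named `YYYY-MM-DD.md` and uses a hypothetical 30-day retention window; promoting durable facts is not shown because the source does not describe how that decision is made.

```python
from __future__ import annotations

import shutil
from datetime import date, timedelta
from pathlib import Path


def archive_old_notes(
    notes_dir: Path, archive_dir: Path, today: date, keep_days: int = 30
) -> list[str]:
    """Move daily notes older than keep_days into archive_dir.

    Assumes notes are named YYYY-MM-DD.md; other files are left alone.
    """
    cutoff = today - timedelta(days=keep_days)
    archive_dir.mkdir(parents=True, exist_ok=True)
    moved = []
    for note in sorted(notes_dir.glob("*.md")):
        try:
            note_date = date.fromisoformat(note.stem)
        except ValueError:
            continue  # not a daily note, so curation does not touch it
        if note_date < cutoff:
            shutil.move(str(note), str(archive_dir / note.name))
            moved.append(note.name)
    return moved
```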

Self-Healing

Runtime watchdog detects degradation, quarantines bad state, and restores from verified backups. Preflight guards block startup on corrupted state.
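A preflight guard of this kind can be sketched as follows. The JSON state file with a `version` key, the `.sha256` sidecar next to the backup, and the function names are all assumptions for illustration.

```python
import hashlib
import json
import shutil
from pathlib import Path


def is_valid_state(path: Path) -> bool:
    """Assumed schema: the state file is a JSON object with a 'version' key."""
    try:
        data = json.loads(path.read_text())
    except (OSError, ValueError):
        return False  # missing, unreadable, or not valid JSON
    return isinstance(data, dict) and "version" in data


def backup_is_verified(backup: Path) -> bool:
    """Compare the backup against a checksum recorded in '<backup>.sha256'."""
    digest_file = backup.with_name(backup.name + ".sha256")
    if not (backup.is_file() and digest_file.is_file()):
        return False
    actual = hashlib.sha256(backup.read_bytes()).hexdigest()
    return actual == digest_file.read_text().strip()


def preflight(state: Path, backup: Path, quarantine: Path) -> str:
    """Return 'ok' or 'restored'; raise RuntimeError to block startup."""
    if is_valid_state(state):
        return "ok"
    if state.exists():
        # Keep the bad state for later inspection instead of deleting it.
        quarantine.mkdir(parents=True, exist_ok=True)
        shutil.move(str(state), str(quarantine / state.name))
    if backup_is_verified(backup):
        shutil.copy2(backup, state)
        return "restored"
    raise RuntimeError("no verified backup available; refusing to start")
```

Raising instead of starting with a best-effort state is the conservative choice here: a blocked startup is visible, while a service running on corrupted state may not be.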

Disaster Recovery

Portable recovery bundle rebuilds the entire orchestrator. Monthly DR drills validate the bundle in a sandboxed environment without touching live state.
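A sandboxed drill can be sketched as extracting the bundle into a throwaway directory and checking every file against a manifest. The tar format, the `manifest.json` name, and its path-to-SHA-256 layout are assumptions, since the source does not describe the bundle's format.

```python
import hashlib
import json
import tarfile
import tempfile
from pathlib import Path


def run_drill(bundle: Path) -> dict[str, bool]:
    """Extract the bundle into a temporary sandbox and verify its contents.

    Assumes the bundle is a tar archive with a top-level manifest.json that
    maps relative paths to SHA-256 digests. Live state is never touched.
    """
    results: dict[str, bool] = {}
    with tempfile.TemporaryDirectory() as sandbox:
        with tarfile.open(bundle) as tar:
            # Use the safer 'data' extraction filter where this Python has it.
            kwargs = {"filter": "data"} if hasattr(tarfile, "data_filter") else {}
            tar.extractall(sandbox, **kwargs)
        root = Path(sandbox)
        manifest = json.loads((root / "manifest.json").read_text())
        for rel_path, expected in manifest.items():
            candidate = root / rel_path
            results[rel_path] = (
                candidate.is_file()
                and hashlib.sha256(candidate.read_bytes()).hexdigest() == expected
            )
    return results
```

A drill passes only if every value in the returned mapping is `True`; the sandbox directory is removed when the function returns.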

Autonomous Operations

The orchestrator generates maintenance and remediation tasks automatically. Risky actions require approval; safe operations execute without intervention.
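The approval gate can be illustrated with a small allowlist check. The action names and the `Task` fields below are hypothetical; the one design point the sketch relies on is that anything not explicitly marked safe is treated as risky.

```python
from dataclasses import dataclass
from enum import Enum


class Risk(Enum):
    SAFE = "safe"
    RISKY = "risky"


# Assumed allowlist; the real classification is not given in the source.
SAFE_ACTIONS = {"rotate_logs", "refresh_index", "archive_notes"}


@dataclass
class Task:
    action: str
    target: str
    approved: bool = False


def classify(task: Task) -> Risk:
    # Unknown actions default to RISKY so new task types need approval.
    return Risk.SAFE if task.action in SAFE_ACTIONS else Risk.RISKY


def gate(task: Task) -> str:
    """Return 'execute' for safe or approved tasks, else 'await_approval'."""
    if classify(task) is Risk.SAFE or task.approved:
        return "execute"
    return "await_approval"
```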

Durable Job Queue

File-based queue on shared storage with dispatch, tracking, retry logic, and stale-job detection. No external database dependencies.
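A file-based queue along these lines can be built on atomic renames, as in the sketch below. The `pending/running/done` directory layout and the method names are assumptions. Rename is atomic on a local POSIX filesystem, but guarantees on network shares vary and would need to be checked for the storage in use.

```python
from __future__ import annotations

import json
import os
import time
import uuid
from pathlib import Path


class FileQueue:
    """Hypothetical layout: <root>/pending, <root>/running, <root>/done."""

    def __init__(self, root: Path) -> None:
        self.dirs = {s: root / s for s in ("pending", "running", "done")}
        for directory in self.dirs.values():
            directory.mkdir(parents=True, exist_ok=True)

    def enqueue(self, payload: dict) -> str:
        job_id = uuid.uuid4().hex
        tmp = self.dirs["pending"] / f".{job_id}.tmp"
        tmp.write_text(json.dumps(payload))
        # Rename into place so workers never see a half-written job file.
        os.replace(tmp, self.dirs["pending"] / f"{job_id}.json")
        return job_id

    def claim(self) -> str | None:
        """Claim any pending job (no ordering guarantee), or return None."""
        for path in sorted(self.dirs["pending"].glob("*.json")):
            target = self.dirs["running"] / path.name
            try:
                os.rename(path, target)
            except FileNotFoundError:
                continue  # another worker claimed this job first
            os.utime(target)  # record the claim time as the file's mtime
            return path.stem
        return None

    def complete(self, job_id: str) -> None:
        name = f"{job_id}.json"
        os.replace(self.dirs["running"] / name, self.dirs["done"] / name)

    def requeue_stale(self, max_age_s: float) -> list[str]:
        """Move jobs claimed more than max_age_s seconds ago back to pending."""
        now = time.time()
        stale = []
        for path in self.dirs["running"].glob("*.json"):
            if now - path.stat().st_mtime > max_age_s:
                os.replace(path, self.dirs["pending"] / path.name)
                stale.append(path.stem)
        return stale
```

Retry limits are left out for brevity; one option would be a retry counter stored in the job's JSON payload and checked before requeueing.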

Engineering Capabilities

Core strengths demonstrated through working systems, not just theory.

Infrastructure Automation

Ansible, Proxmox, systemd, scheduled tasks, and infrastructure-as-code with CI/CD pipelines and GitOps workflows.

Reliability Engineering

SLO tracking, burn-rate alerting, safety gates, incident management, and recovery-first system design.

Distributed Systems

Multi-node orchestration, a shared memory corpus with independent per-node indexes, file-based job queues, and health-aware task dispatch.

AI-Assisted Operations

Local LLM inference, autonomous task generation with approval gates, and AI-orchestrated cluster management.

Recovery & Resilience

ZFS snapshots, self-healing watchdogs, portable recovery bundles, monthly DR drills, and conservative safety policies.

Documentation Discipline

Automated knowledge curation, architecture documentation, operational runbooks, and clear rollback procedures.

What This Demonstrates

These projects show how I approach real infrastructure problems: design for failure, automate recovery, observe everything, and document decisions. I build systems that heal themselves, route work intelligently, and produce clear operational records. The same principles matter at production scale.