/infrastructure

Build infrastructure you can operate on a bad day.

Infrastructure is more than servers or a cloud account. It's the whole quiet collection of systems, access paths, dependencies, deployment procedures, state, monitoring, backups, and operational knowledge that keeps everything else standing.

A system is not dependable because it's online today. It's dependable when someone can understand it, change it deliberately, notice when it fails, and recover without archaeology.

This is practical infrastructure architecture and review work — not 24/7 administration, emergency support, a managed hosting service, a cloud provider, or an uptime guarantee.

Where it lives

The system beneath the system.

These are infrastructure concerns, not nine new service offerings. Most fragility hides in the seams between them — the parts nobody owns on purpose.

Ownership and inventory

The unglamorous foundation: knowing what exists before deciding whether it's safe.

What's actually running, and what it does
Who owns it and where it runs
What depends on it, and what it depends on
Whether anything has quietly become orphaned

Small business technology audit

DNS and edge

The front door. When DNS or a certificate lapses, everything downstream looks broken at once.

Registrar ownership and recoverable access
DNS records, redirects, and routing
Certificate and domain expiry
Email-related records where they matter

Infrastructure Sanity Pass

Networking and exposure

The boundary between intended and accidental — what the world can reach versus what you meant to share.

Ingress, remote access, and segmentation
VLANs, firewall rules, tunnels, and port forwards
Public versus private services
Intended exposure versus the accidental kind

Private Cloud / Homelab Review

Compute and runtime

Where workloads actually run, and how cleanly they're separated when one of them misbehaves.

Hosts, virtual machines, and containers
Process boundaries and privileged workloads
Resource isolation and blast radius
Runtime consistency across environments

Private Cloud / Homelab

Storage and state

Servers are replaceable; state usually isn't. This is the part worth being deliberate about.

Persistent data and database ownership
Storage capacity and snapshots
Stateful dependencies between services
Where the important data actually lives

Notes from a private cloud gremlin

Deployment and change

How code and configuration reach production — and whether a change is repeatable or a memory test.

Who can deploy, and by what path
Repeatability and environment separation
Approvals and rollback
Configuration drift over time

How I build

Monitoring and capacity

The difference between learning that a system is unhealthy from your dashboard and learning it from your users.

Service health, logs, and alerts
Alert ownership and what fails silently
Disk, memory, CPU, queue, and certificate pressure
What users notice first

The lab

Backups and recovery

A backup is a checkbox; a tested restore is a plan. Only one of them helps on a bad day.

Backup scope and off-system copies
Restore testing, not just backup success
Recovery order and ownership
Acceptable data loss and downtime, decided in advance

Backup & disaster recovery checklist

Documentation and handoff

Operational memory that lives outside one person's head — so the system survives a vacation.

System diagrams, runbooks, and service inventory
Access ownership and dependency notes
Incident history worth keeping
Future operation without one person's memory

Documentation is infrastructure

The operating pattern

Operate the consequences.

This is a general operating pattern, not an SRE framework, a formal audit method, or a claim that every system needs identical tooling. The exact shape follows the system, but the shape is always there.

Inventory

Identify systems, services, accounts, domains, data, dependencies, and owners.

Map dependencies

Show what relies on what, and where a single failure would spread.

Define intended state

Decide what should be public, private, running, backed up, monitored, and maintained.

Remove obvious fragility

Address stale access, accidental exposure, unsupported components, and hidden single points of failure.

Make changes repeatable

Use documented, reviewable deployment and configuration paths instead of memory-driven commands.

Add useful visibility

Monitor what matters and make alerts actionable rather than merely noisy.

Test failure and recovery

Validate backups, rollback, rebuild procedures, and service restoration.

Document operation

Record ownership, dependencies, safe procedures, and what to do when normal assumptions fail.

Revisit after change

Review the system after meaningful growth, migration, vendor, staffing, or architecture changes.

Principles

Boring is a feature.

The dependable system is usually the unexciting one — fewer surprises, clearer ownership, an obvious way back.

Understand before automating

Automation amplifies whatever it's pointed at, including the mistakes.

Repeatability beats memory

A documented path survives the day the one person who knew it is out.

Recovery beats backup checkmarks

A backup nobody has restored is a hope, not a plan.

State deserves deliberate ownership

Compute is replaceable; the data it holds usually isn't.

Single points of failure should be visible

You can accept a risk you can see; the hidden ones decide for you.

Fewer moving parts are often safer

Every component is something that can break and something to operate.

Infrastructure should expose useful state

A system that can't tell you how it's doing is one you operate blind.

Alerts need an owner and an action

An alert nobody owns and nobody can act on is just noise with anxiety.

Patching needs a routine

Updates that only happen in emergencies are themselves an emergency.

Capacity failures are still failures

A full disk takes a system down as completely as any crash.

Documentation is operational memory

It's part of the system, not a chore that comes after it.

Managed services are tools, not morals

Use them where they earn their place — not as a position to defend.

Self-hosting is a deliberate tradeoff

Worth it when ownership matters; costly when it's chosen by reflex.

Automation should reduce toil, not hide the system

If it makes the system harder to understand, it's a liability.

Removing complexity can be the safest improvement

Sometimes the best change is one fewer thing to operate.

Failure modes

What makes infrastructure quietly fragile.

Each failure mode gets paired with a proportionate, defensive response. Calm operations, not fear marketing.

Nobody knows what is running

The stack has grown organically and no current map of it exists.

Response

Maintain a current host, service, domain, account, and dependency inventory.
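An inventory does not need special tooling to be useful. As a minimal sketch, assuming a hand-maintained list of services, the structure below flags entries with no owner and dependencies that point at something the inventory does not list. The `Service` fields, the function names, and the sample entries are all hypothetical, chosen only for illustration.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Service:
    """One inventory entry. Field names are illustrative, not a standard."""
    name: str
    host: str
    owner: Optional[str] = None  # None means nobody has claimed it
    depends_on: list[str] = field(default_factory=list)


def find_orphans(inventory: list[Service]) -> list[str]:
    """Return the names of services with no recorded owner."""
    return [s.name for s in inventory if not s.owner]


def find_unknown_dependencies(inventory: list[Service]) -> dict[str, list[str]]:
    """Return dependencies that point at something missing from the inventory."""
    known = {s.name for s in inventory}
    gaps: dict[str, list[str]] = {}
    for s in inventory:
        missing = [d for d in s.depends_on if d not in known]
        if missing:
            gaps[s.name] = missing
    return gaps


# Hypothetical sample data.
inventory = [
    Service("website", host="vps-1", owner="alex", depends_on=["postgres", "dns"]),
    Service("postgres", host="vps-1", owner="alex"),
    Service("old-wiki", host="nas", depends_on=["postgres"]),
]

print("No owner:", find_orphans(inventory))
print("Not in inventory:", find_unknown_dependencies(inventory))
```

Here `old-wiki` has no owner, and `website` depends on a `dns` entry nobody has written down. Both are the kind of gap an inventory exists to surface.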

DNS or registrar access lives in one account

A single lapsed login could take the domain — and everything on it — with it.

Response

Document ownership, use recoverable access, and review renewal and administrative paths.

The deployment process exists only in shell history

Shipping a change depends on one person remembering the right commands.

Response

Create a repeatable, documented, reviewable path with rollback.

One host carries everything

A single box quietly became load-bearing without anyone deciding it should.

Response

Make the dependency explicit, understand the blast radius, and decide whether redundancy or a rebuild plan is justified.
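Blast radius can be estimated from a plain dependency map. The sketch below assumes a dictionary of "service depends on these things" and walks it in reverse to find everything that would be affected if one item failed. The function name and the sample map are hypothetical.

```python
from collections import deque


def blast_radius(depends_on: dict[str, list[str]], failed: str) -> set[str]:
    """Return every service that directly or indirectly depends on `failed`."""
    # Invert the map: for each item, which services rely on it?
    dependents: dict[str, set[str]] = {}
    for service, deps in depends_on.items():
        for dep in deps:
            dependents.setdefault(dep, set()).add(service)

    # Breadth-first walk outward from the failed item.
    affected: set[str] = set()
    queue = deque([failed])
    while queue:
        current = queue.popleft()
        for service in dependents.get(current, ()):
            if service not in affected:
                affected.add(service)
                queue.append(service)

    affected.discard(failed)  # only relevant if the map contains a cycle
    return affected


# Hypothetical map: each key depends on the items in its list.
depends_on = {
    "website": ["app", "dns"],
    "app": ["postgres"],
    "postgres": ["host-1"],
    "backups": ["host-1"],
    "dns": [],
}

print(sorted(blast_radius(depends_on, "host-1")))
print(sorted(blast_radius(depends_on, "dns")))
```

In this sample, losing `host-1` reaches four services while losing `dns` reaches one. That difference is the information needed to decide whether redundancy or a rebuild plan is worth the cost.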

Backups have never been restored

Backups run, but no one has confirmed they'd actually come back.

Response

Test representative restores and document recovery order and ownership.
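One way to make a restore test more than a visual check is to compare checksums between the source data and the restored copy. This is a minimal sketch, assuming both trees are reachable as local directories; real backup tools often ship their own verification, which should be preferred where it exists. The function names are hypothetical, and the demo builds throwaway directories rather than touching real data.

```python
import hashlib
import tempfile
from pathlib import Path


def tree_checksums(root: Path) -> dict[str, str]:
    """Map each file's path (relative to root) to its SHA-256 digest."""
    sums: dict[str, str] = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            # read_bytes loads the whole file; fine for samples, not for huge files
            digest = hashlib.sha256(path.read_bytes()).hexdigest()
            sums[path.relative_to(root).as_posix()] = digest
    return sums


def compare_restore(source: Path, restored: Path) -> dict[str, list[str]]:
    """Report files that are missing, unexpected, or different after a restore."""
    src, dst = tree_checksums(source), tree_checksums(restored)
    return {
        "missing": sorted(set(src) - set(dst)),
        "unexpected": sorted(set(dst) - set(src)),
        "changed": sorted(k for k in src.keys() & dst.keys() if src[k] != dst[k]),
    }


# Demo with temporary directories standing in for real data.
with tempfile.TemporaryDirectory() as tmp:
    source, restored = Path(tmp, "source"), Path(tmp, "restored")
    source.mkdir()
    restored.mkdir()
    (source / "db.dump").write_text("rows")
    (source / "notes.txt").write_text("v2")
    (restored / "db.dump").write_text("rows")
    (restored / "notes.txt").write_text("v1")  # a stale copy came back
    report = compare_restore(source, restored)

print(report)
```

A difference is not always a failure, since live data changes after the backup is taken. The point is that the comparison is written down and repeatable.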

Monitoring only says “down”

Alerts tell you something broke, but not enough to decide what to do.

Response

Add service, resource, dependency, and certificate signals that support an actual decision.

Alerts have no owner

Notifications fire into a channel everyone has muted.

Response

Define who receives them, what qualifies as urgent, and what action follows.
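Owner, urgency, and action can be recorded as data and checked like any other configuration. The sketch below assumes alert rules are kept in a simple list. The `AlertRule` fields, the sample rules, and the runbook reference are hypothetical and not tied to any particular monitoring product.

```python
from dataclasses import dataclass


@dataclass
class AlertRule:
    """One alert definition. Field names are illustrative."""
    name: str
    owner: str = ""       # person or rotation that receives the alert
    urgent: bool = False  # whether it should interrupt someone
    action: str = ""      # what to do next, for example a runbook step


def unactionable(rules: list[AlertRule]) -> dict[str, list[str]]:
    """Return, per alert, which of owner and action are missing."""
    problems: dict[str, list[str]] = {}
    for rule in rules:
        missing = []
        if not rule.owner.strip():
            missing.append("owner")
        if not rule.action.strip():
            missing.append("action")
        if missing:
            problems[rule.name] = missing
    return problems


# Hypothetical sample rules.
rules = [
    AlertRule("disk-85-percent", owner="ops-rotation", urgent=False,
              action="runbook: free space or expand volume"),
    AlertRule("cert-expiring", owner="ops-rotation", urgent=True),
    AlertRule("queue-depth"),
]

print(unactionable(rules))
```

Running a check like this when alert definitions change keeps new alerts from joining the muted channel.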

Certificates or domains expire silently

The first sign of an expiry is the outage it causes.

Response

Track expiry, automate renewal where appropriate, and alert before failure.
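Certificate expiry can be checked with Python's standard library alone. This is a minimal sketch, assuming the service is reachable over TLS from wherever the check runs. The function names and the 21-day warning window are assumptions to adjust, not recommendations, and domain registration expiry needs a separate check.

```python
import socket
import ssl
from datetime import datetime, timezone


def fetch_not_after(hostname: str, port: int = 443, timeout: float = 5.0) -> str:
    """Return the notAfter string of the certificate served by hostname."""
    context = ssl.create_default_context()
    with socket.create_connection((hostname, port), timeout=timeout) as sock:
        # An expired or untrusted certificate raises an ssl.SSLError here,
        # which a real check should treat as an alert in its own right.
        with context.wrap_socket(sock, server_hostname=hostname) as tls:
            return tls.getpeercert()["notAfter"]


def days_until(not_after: str, now: datetime) -> float:
    """Days from `now` (timezone-aware) until the certificate's notAfter time."""
    expires = ssl.cert_time_to_seconds(not_after)  # seconds since the epoch, UTC
    return (expires - now.timestamp()) / 86400


def needs_attention(not_after: str, now: datetime, warn_days: int = 21) -> bool:
    """True when the certificate expires within warn_days, or already has."""
    return days_until(not_after, now) < warn_days


# Offline demo with a fixed date and a notAfter string in the format
# that getpeercert() returns.
sample_not_after = "Jan  5 09:34:43 2018 GMT"
sample_now = datetime(2018, 1, 1, 9, 34, 43, tzinfo=timezone.utc)
print(days_until(sample_not_after, sample_now))
print(needs_attention(sample_not_after, sample_now))
```

For a live check, pass `fetch_not_after("example.com")` and `datetime.now(timezone.utc)` to `needs_attention`, substituting a hostname you operate.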

Storage fills without warning

Capacity creeps up until a disk hits 100% and takes services with it.

Response

Monitor capacity and growth, set thresholds, and know what can be removed or expanded safely.
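A capacity check has two parts: how full the disk is now, and how long until it fills at the current growth rate. The sketch below uses `shutil.disk_usage` for the first part and simple arithmetic for the second. The function names are hypothetical, and the growth rate is an input you would measure from your own history.

```python
import shutil
from typing import Optional


def percent_used(path: str = "/") -> float:
    """Percentage of the filesystem containing `path` that is in use."""
    usage = shutil.disk_usage(path)
    return 100 * usage.used / usage.total


def days_until_full(used_bytes: float, total_bytes: float,
                    growth_bytes_per_day: float) -> Optional[float]:
    """Estimate days until the disk fills, assuming steady linear growth.

    Returns None when usage is flat or shrinking, since there is no
    meaningful fill date to report.
    """
    if growth_bytes_per_day <= 0:
        return None
    return max(total_bytes - used_bytes, 0) / growth_bytes_per_day


print(f"Current usage: {percent_used('.'):.1f}%")

# Hypothetical numbers: 800 GB used of 1000 GB, growing 20 GB per day.
GB = 1024 ** 3
print(days_until_full(800 * GB, 1000 * GB, 20 * GB))
```

Linear growth is a simplifying assumption. Logs, uploads, and snapshots often grow in bursts, so the estimate is best treated as an early warning rather than a forecast.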

Updates happen only during emergencies

Patching is a reaction to incidents instead of a routine.

Response

Create a regular patch and upgrade routine with tested rollback or rebuild options.

Remote access grew organically

Tunnels, VPNs, and port forwards accumulated faster than anyone removed them.

Response

Inventory remote-access paths, credentials, and exposed services; remove what no longer has a purpose.

Only one person can recover the system

Recovery depends entirely on one person's memory being available.

Response

Create diagrams, runbooks, access records, and a tested handoff path.

Proof & material

The thinking and the receipts.

Verified projects, writing, and supporting pages that show the approach in practice — a learning lab and field notes, not customer case studies or production claims.

Project

Private Cloud / Homelab

A real learning and validation environment: a place to operate the consequences before they reach production. A lab, not a public cloud or a hosting claim.

Project

Duvall WiFi

An earlier chapter of independent consulting and infrastructure work — historical evidence of cross-layer systems thinking, not the current business identity.

Article

Homelabs Teach the Messy Parts

Why a lab hands you the operational failures polished dashboards hide — and why those messy parts are the lessons worth having.

Article

Notes From a Private Cloud Gremlin

Field notes on private cloud, containers, and the deliberate joy of making infrastructure actually yours.

Article

Small Business Backup and Disaster Recovery Checklist: Recover Without Panic

A practical checklist for backups that survive ransomware, restores that actually work, and recovery you can run without panic.

Article

Documentation Is Infrastructure

The case that docs aren't a side quest: for secure systems, automation, and homelabs, documentation is part of the system itself.

Article

Technical Documentation for Small Business: Runbooks, SOPs, and Knowledge Bases That Actually Help

How to write runbooks, SOPs, and operational notes that reduce risk and make a system easier to operate and hand off.

Article

Small Business Technology Audit: Find the Mess Before It Finds You

A practical sweep across tools, accounts, access, backups, and documentation to surface the mess before it surfaces itself.

Page

Uses

The tools and environment behind the work — what's in the kit and why.

Page

The Lab

Where experiments and homelab work happen, and failure modes get learned cheaply.

Page

Trust & boundaries

How access, credentials, automation, and production systems are handled on an engagement.

Resource

Resources

Free, lightweight checklists — including infrastructure-flavored sanity lists.


Starting points

What kind of infrastructure problem do you have?

Start from the situation that sounds like yours, not the service name.

The stack works, but it feels fragile

Get a second set of eyes on what's solid versus what's quietly holding on, then write down the parts that only live in someone's head.

We run a homelab, self-hosted environment, or private cloud

Review the environment for fragility, exposure, and hidden dependencies before it becomes something you'd genuinely miss.

We are planning a migration or rebuild

Pressure-test the architecture and the tradeoffs first — this is focused planning and review, not a large managed migration team.

We are unsure whether to self-host or use a managed service

The right choice depends on ownership, reliability, security, skill, cost, and recovery — not ideology. Talk it through against your actual constraints.

Backups exist, but nobody trusts recovery

Move from “backups run” to “recovery works” — test representative restores and write down the order and ownership.

Exposure, access, or secrets are the main concern

Look at what's reachable, who can reach it, and where credentials live: a review focused on security and access risk across the stack.

We are in an active outage

This site does not offer emergency 24/7 operations or an incident-response retainer. Use your existing runbook and engage the responsible hosting, network, platform, or incident-response provider.

We need ongoing administration

The work described here is focused architecture, review, planning, and bounded implementation — not continuous managed operations.

Fit

Useful when ownership matters.

This work pays off when you want to understand and operate a system better — not when you need someone to run it around the clock.

Good fit

Infrastructure that has grown organically
Dependencies that are unclear
A deployment path that depends on one person
Backups that need validation
Monitoring with blind spots
Self-hosted systems that are becoming important
A migration or redesign that needs practical tradeoff analysis
A team that wants prioritized improvements, not a giant rewrite
Situations where documentation and handoff matter

Not a fit

24/7 administration
Emergency outage response
A large managed-services contract
Formal compliance certification
A penetration test
Guaranteed uptime
A cloud reseller relationship
Requests to conceal unauthorized or insecure activity
A predetermined tool purchase seeking only a rubber stamp

“Not a fit” isn't a judgment — it just means a different provider or arrangement is the honest answer for that need.

Pakkit OS

See the whole Pakkit OS →

Boundaries

What this work is not.

Not a public cloud or hosting provider
Not a managed-service provider
Not 24/7 administration
Not emergency operations
Not guaranteed uptime
Not a penetration test
Not formal compliance certification
Not a vendor reseller
Not a claim that self-hosting is always better
Not a promise that every system needs Kubernetes, microservices, or a private cloud
Not an enterprise team disguised as one person

Even so, focused review, architecture, documentation, and bounded implementation can make a system substantially easier to operate.

Bring the system as it exists

Find the first fragile assumption.

The fastest way to start is to describe the system as it actually is:

What is running
Where it runs
Who depends on it
How changes deploy
What is monitored
What is backed up
How recovery currently works
What only one person knows
Which part feels fragile, expensive, or confusing

Please don't put any of these into the contact form:

Passwords
Private keys
API tokens
Recovery codes
Full configuration exports
Private network diagrams with sensitive details
Customer data
Confidential logs


One discipline across three modes.

Build

Create

Play