TerrellCloud.io

Five questions we ask before any cloud architecture review

2026-07-03T14:00:00+00:00

Cloud architecture reviews go wrong in a predictable way: they start with the diagram. A diagram tells you what someone intended to build. It says very little about what the system actually does under load, what it costs, or what breaks at 2 a.m.

Before we look at any client’s architecture diagram, we ask five questions. The answers reshape the review every time.

1. What does this system cost per month, and who looks at that number?

Not the budget — the actual bill, broken down by service. If nobody on the team can answer within twenty percent, cost isn’t being managed, it’s being absorbed. The follow-up (“who looks at it?”) matters more than the number: unowned costs grow, owned costs shrink. This is usually where we find the quickest wins — idle non-production environments, oversized instances, and data transfer patterns nobody chose on purpose.
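As a rough sketch of what "broken down by service" and "within twenty percent" can look like in practice, the snippet below totals a handful of made-up billing line items. The line items, environment names, and function names are all hypothetical; real figures would come from your provider's billing export.

```python
from collections import defaultdict

# Hypothetical line items: (service, environment, monthly_cost_usd).
# These numbers are invented for illustration only.
LINE_ITEMS = [
    ("compute", "prod", 4200.0),
    ("compute", "staging", 1800.0),
    ("database", "prod", 2500.0),
    ("data-transfer", "prod", 900.0),
    ("compute", "dev", 600.0),
]


def cost_by_service(items):
    """Sum monthly cost per service and return pairs sorted largest first."""
    totals = defaultdict(float)
    for service, _env, cost in items:
        totals[service] += cost
    return sorted(totals.items(), key=lambda kv: kv[1], reverse=True)


def within_tolerance(estimate, actual, tolerance=0.20):
    """True if a team's estimate is within `tolerance` of the actual bill."""
    return abs(estimate - actual) <= tolerance * actual


def non_prod_share(items):
    """Fraction of the total bill spent outside production."""
    total = sum(cost for _service, _env, cost in items)
    non_prod = sum(cost for _service, env, cost in items if env != "prod")
    return non_prod / total


if __name__ == "__main__":
    for service, cost in cost_by_service(LINE_ITEMS):
        print(f"{service:15s} {cost:10.2f}")
    print("estimate of 8500 close enough:", within_tolerance(8500, 10000))
    print(f"non-production share: {non_prod_share(LINE_ITEMS):.0%}")
```

The non-production share is worth computing early, since idle non-production environments are one of the quick wins named above.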

2. When did this system last fail, and how did you find out?

The answer distinguishes systems with operational maturity from systems with operational luck. “Last Tuesday, an alert paged us before customers noticed” is a healthy answer regardless of the failure. “It doesn’t really fail” means either nobody’s looking, or the system hasn’t been stressed yet — and both mean the reliability posture in the diagram is untested theory.

3. What’s the one change you’re afraid to make?

Every team has one: the database migration nobody wants to run, the IAM policy nobody fully understands, the service that only one engineer can deploy. That fear is a precise map of where the architecture’s real risk lives. Reviews that skip this question produce recommendations about the parts of the system that were already fine.

4. If this workload doubled next quarter, what breaks first?

The point isn’t the prediction — it’s watching where the team looks. Confident, specific answers (“the write path on the primary database”) indicate a team that knows its bottlenecks. Silence, or five different answers from five engineers, tells us the scaling story is aspiration rather than design.

5. What are you paying for that you no longer need?

Architectures accrete: the message queue added for a feature that was cancelled, the second region “for DR” that’s never been failed over to, the Kubernetes cluster running three pods. Subtraction is the most underrated architectural move in cloud systems, and people inside the team rarely have the standing to propose it. Outside reviewers do, which is half of why reviews are worth commissioning.

Only after these five answers do we open the diagram — because at that point, we know which parts of it are load-bearing and which are wishful thinking. If your team is due for that kind of review, the about page covers how we engage.

Designing GenAI systems: start with the workflow, not the model

2026-06-27T14:00:00+00:00

The most common failure mode we see in GenAI initiatives isn’t a bad model choice. It’s starting from the model at all.

Teams pick a model, wire it to a prompt, get an impressive demo in a week — and then stall for months, because nobody defined what the system is supposed to do, for whom, and how anyone would know it’s doing it well. The model was never the hard part.

Work backwards from the workflow

Before we talk about models, retrieval strategies, or agent frameworks, a GenAI design engagement starts with three questions:

What human workflow is this replacing or accelerating? “Answer support tickets” is not a workflow. “Draft a first response to tier-1 billing tickets, which an agent reviews before sending” is. The second version tells you the input distribution, the tolerance for error, and where a human sits in the loop.
What does a good output look like, concretely? If the team can’t produce ten examples of ideal outputs by hand, the system can’t be evaluated — and a GenAI system that can’t be evaluated can’t be improved, only vibed at.
What happens when it’s wrong? Every LLM system is wrong some percentage of the time. The design question is whether a wrong answer costs an awkward edit, a refund, or a regulatory incident. That answer drives the architecture far more than model benchmarks do.

The architecture falls out of the answers

Once the workflow is pinned down, most architectural decisions stop being debates:

Retrieval (RAG) vs. fine-tuning is usually settled by how fast the underlying knowledge changes and who needs to audit the sources — not by which technique is fashionable.
Agentic vs. single-shot is settled by whether the workflow has genuine intermediate decisions, or whether an “agent” is just a chain of prompts that would be simpler as a pipeline.
Human-in-the-loop placement is settled by the cost-of-error answer. High-stakes outputs get review gates; low-stakes ones get sampling and monitoring instead.
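One way to picture the human-in-the-loop rule is as a small routing function keyed on the cost-of-error answer. This is a minimal sketch, not a prescribed design: the three cost levels mirror the examples given earlier (an awkward edit, a refund, a regulatory incident), while the route names, the 5% sample rate, and the decision to gate everything above an awkward edit are assumptions made for illustration.

```python
import random
from enum import Enum


class ErrorCost(Enum):
    """What a wrong output costs, from cheapest to most expensive."""
    AWKWARD_EDIT = 1
    REFUND = 2
    REGULATORY_INCIDENT = 3


def review_route(error_cost, sample_rate=0.05, rng=random.random):
    """Decide what happens to one generated output.

    Returns "review_gate", "sampled_review", or "auto_send".
    `sample_rate` and the gating threshold are illustrative assumptions.
    """
    # Anything costlier than an awkward edit always waits for a human.
    if error_cost is not ErrorCost.AWKWARD_EDIT:
        return "review_gate"
    # Low-stakes outputs go out directly, with a random sample pulled
    # aside for monitoring.
    return "sampled_review" if rng() < sample_rate else "auto_send"


if __name__ == "__main__":
    print(review_route(ErrorCost.REGULATORY_INCIDENT))
    print(review_route(ErrorCost.AWKWARD_EDIT))
```

Passing `rng` as a parameter keeps the sampling decision testable without patching the `random` module.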

Evaluation is part of the system, not a phase

The deliverable that makes a GenAI system maintainable isn’t the prompt — it’s the evaluation harness. A small, versioned set of real inputs with graded expected outputs, run on every change, does for GenAI what a test suite does for conventional software. Without it, every prompt tweak and model upgrade is a leap of faith. We treat the eval set as a first-class artifact from week one, and it’s consistently the piece clients thank us for a year later.
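A minimal sketch of such a harness, under stated assumptions: the case IDs, inputs, version label, and the `fake_generate` stand-in are hypothetical, and grading by required phrases is a deliberately crude placeholder for whatever grading a real engagement would use (rubrics, human labels, or model-based grading).

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class EvalCase:
    case_id: str
    input_text: str
    must_include: tuple[str, ...]  # phrases a good output must contain


# Hypothetical eval set; in practice this lives in version control
# next to the prompts and changes through review like any other code.
EVAL_SET_VERSION = "v1"
EVAL_SET = [
    EvalCase("billing-001", "I was charged twice this month.", ("refund",)),
    EvalCase("billing-002", "Where can I find my invoice?", ("invoice",)),
]


def grade(output: str, case: EvalCase) -> bool:
    """Pass if every required phrase appears (case-insensitive)."""
    text = output.lower()
    return all(phrase.lower() in text for phrase in case.must_include)


def run_evals(generate: Callable[[str], str], cases):
    """Run every case through `generate`; return (pass_rate, per-case results)."""
    results = {c.case_id: grade(generate(c.input_text), c) for c in cases}
    pass_rate = sum(results.values()) / len(results)
    return pass_rate, results


def fake_generate(input_text: str) -> str:
    """Stand-in for the real model call, so the harness runs offline."""
    return "We have issued a refund for the duplicate charge."


if __name__ == "__main__":
    rate, results = run_evals(fake_generate, EVAL_SET)
    print(f"eval set {EVAL_SET_VERSION}: pass rate {rate:.0%}")
    for case_id, passed in results.items():
        print(f"  {case_id}: {'pass' if passed else 'FAIL'}")
```

Swapping `fake_generate` for the real generation function and running this on every prompt or model change is what turns the eval set into the equivalent of a test suite.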

Models will keep changing under everyone’s feet. Workflow definitions, cost-of-error decisions, and evaluation discipline are what carry a GenAI system through those changes, which is why the design work starts there.

Welcome to the TerrellCloud.io blog

2026-06-20T14:00:00+00:00

TerrellCloud.io has been helping clients design and implement cloud and generative AI systems since 2021. This blog is where we start writing that work down.

Consulting engagements generate a lot of hard-won, reusable knowledge: architecture patterns that survive contact with production, GenAI designs that hold up after the demo, cost lessons learned the expensive way. Most of it stays trapped in private decks and retro documents. We’d rather put the general lessons here, where clients and technical peers can use them.

What to expect

Posts will stay close to the work:

Cloud architecture — designs, reviews, migrations, and the trade-offs behind them.
Generative AI systems — RAG, agents, evaluation, and what it takes to run GenAI in production rather than in a notebook.
Consulting practice — how we scope engagements, ask questions, and decide what’s worth building.

No release-note filler, no thinly veiled product pitches. If a post wouldn’t be useful to a technical peer, it doesn’t ship.

If any of this maps to a problem you’re working on, the about page explains how we engage. Otherwise — welcome, and enjoy the archive as it grows.