Automating ops without building a rat's nest
Where to put logic, when to use a workflow tool versus code, and how to keep automations debuggable a year later.
Most ops automation projects start clean. One Zap, one trigger, one action. Six months later there are 47 Zaps, three Airtable bases, two Make scenarios, and a Slack bot nobody remembers building. When something breaks, nobody knows where to look.
That is the rat's nest. It is the predictable outcome of a stack that grew one quick win at a time without anyone holding the architecture in their head.
This is how we build ops automation that is debuggable a year later, by the next person, without an archeology dig.
The single rule
Every automation we ship follows one rule:
Logic lives in one place. The rest of the stack moves data in and out of that place.
In a healthy stack, the "one place" is your codebase or a single server function layer. In a no-code stack, the "one place" is a single workflow tool or a single database with triggers. Either is fine. What is fatal is logic spread across four tools.
When the rule breaks, debugging stops being "where is the bug" and becomes "which tool ran first." A five-minute fix turns into fifteen.
Workflow tool versus code: the cut
The honest answer:
- Use a workflow tool when the automation is one trigger, one or two transformations, and one action, and the inputs and outputs are stable third-party tools. Stripe to Slack. HubSpot to Notion. Calendly to email.
- Use code when the automation has branches, retries, dedupe logic, or talks to your own database. The moment "if this then that" turns into "if this and that but not when X," you need code.
The mistake is starting in a workflow tool and growing into code complexity. The Zap that started as "lead lands in HubSpot, ping Slack" becomes "lead lands in HubSpot, check if already a customer, look up the rep, score the company, decide which channel, format the message, optionally @-mention the rep." That Zap is now an 11-step rat's nest and should have been ten lines of code in the first place.
The reverse mistake is rarer: writing code for the truly simple two-tool integrations. A Zap is fine for those. It is faster to ship and faster to hand off.
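To make the "ten lines of code" claim concrete, here is a minimal sketch of the grown-up lead alert as one function. The `Lead` fields, the scoring rule, and the channel names are all hypothetical, made up for illustration; the lookups that fill in `is_customer` and `rep_slack_id` are assumed to happen before this function is called.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Lead:
    email: str
    company: str
    employee_count: int
    is_customer: bool             # hypothetical: looked up in your own database
    rep_slack_id: Optional[str]   # hypothetical: Slack member ID of the assigned rep


def score_company(lead: Lead) -> int:
    """Toy scoring rule, purely illustrative."""
    return 10 if lead.employee_count >= 200 else 3


def route_lead(lead: Lead) -> Optional[dict]:
    """Decide which channel gets the alert and what the message says."""
    if lead.is_customer:
        return None  # existing customers do not get a new-lead alert
    score = score_company(lead)
    channel = "#sales-enterprise" if score >= 10 else "#sales-inbound"
    # Only @-mention the rep for high-scoring leads that have a rep assigned.
    mention = f"<@{lead.rep_slack_id}> " if lead.rep_slack_id and score >= 10 else ""
    text = f"{mention}New lead: {lead.email} at {lead.company} (score {score})"
    return {"channel": channel, "text": text}
```

Every branch is visible in one screen, and the function can be unit tested without touching HubSpot or Slack.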
The seam test
Before we ship any automation, we run the seam test:
Can the next engineer find this automation by reading the codebase, without knowing what tools we used?
If the answer is no, the automation is invisible. Invisible automations are the rat's nest. They fire silently, they fail silently, and they confuse everyone.
The seam test is why we keep a registry. One markdown file in the repo lists every automation, where it lives, what triggers it, and what it does. Updated on every new automation. The file is the index. The tools are the implementation.
A registry sounds boring. It is the single highest-leverage discipline in ops automation. Every team we have worked with that maintains one has a maintainable stack. Every team that does not has a rat's nest.
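One way to keep the registry honest is a small check that fails when an automation exists in code but not in the file. The sketch below assumes the registry is a markdown table whose first column is the automation name; that layout, the column headers, and the example automation names are assumptions, not a required format.

```python
from typing import List, Set

# Hypothetical registry contents; in practice this would be read from the repo.
REGISTRY = """
| Automation | Lives in | Trigger | What it does |
|---|---|---|---|
| notify_sales_of_new_lead | automations/notify_sales_of_new_lead.py | HubSpot contact created | Posts a lead alert to Slack |
"""


def registered_names(registry_markdown: str) -> Set[str]:
    """Collect the first-column value of every data row in the table."""
    names = set()
    for line in registry_markdown.splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue
        first = line.strip("|").split("|")[0].strip()
        is_separator = set(first) <= set("-: ")
        if first and first != "Automation" and not is_separator:
            names.add(first)
    return names


def unregistered(automation_names: List[str], registry_markdown: str) -> List[str]:
    """Return the automations that exist in code but are missing from the registry."""
    known = registered_names(registry_markdown)
    return sorted(name for name in automation_names if name not in known)
```

Run in CI against the list of files in the automations folder, a non-empty result from `unregistered` would be the signal that someone shipped an automation without indexing it.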
Logging that survives
Every automation we ship logs three things:
| Field | What it captures |
|---|---|
| trigger_at | When the automation fired |
| input_hash | A hash of the input so duplicate fires are spottable |
| outcome | success, skipped, failed, with a short reason |
Three fields. One table. Every automation writes to it. When something breaks at 11pm, that table is where the on-call person looks first. Without it, the first hour of every incident is "did it even fire?"
This is the boring infrastructure that pays for itself the first time something goes wrong.
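A minimal sketch of that table, using SQLite from the Python standard library. The table name, the extra `automation` column (so one table can serve every automation), and the separate `reason` column are assumptions added for illustration; the three fields from the table above are kept as named.

```python
import hashlib
import json
import sqlite3
from datetime import datetime, timezone


def open_log(path: str = ":memory:") -> sqlite3.Connection:
    """Open the log database and create the table if it is missing."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS automation_log (
               automation TEXT NOT NULL,
               trigger_at TEXT NOT NULL,
               input_hash TEXT NOT NULL,
               outcome    TEXT NOT NULL,
               reason     TEXT NOT NULL DEFAULT ''
           )"""
    )
    return conn


def input_hash(payload: dict) -> str:
    """Hash a canonical JSON form so key order does not change the hash."""
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def log_run(conn: sqlite3.Connection, automation: str, payload: dict,
            outcome: str, reason: str = "") -> None:
    """Write one row per fire: when, which input, and what happened."""
    if outcome not in {"success", "skipped", "failed"}:
        raise ValueError(f"unknown outcome: {outcome}")
    conn.execute(
        "INSERT INTO automation_log VALUES (?, ?, ?, ?, ?)",
        (automation, datetime.now(timezone.utc).isoformat(),
         input_hash(payload), outcome, reason),
    )
    conn.commit()
```

Two rows with the same `input_hash` a few seconds apart are what a double-fire looks like in this table.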
Where to put the logic
A good default:
- Triggers in the source system. HubSpot fires when the contact is created. Stripe fires when payment succeeds. Do not poll for things the source system can push.
- Logic in a server function in your own stack. One function per automation, named after what it does, in a folder all the automations live in.
- Actions in the destination system, called by the server function with its own API.
Triggers in source. Logic in code. Actions out. That layering is what makes the stack debuggable.
The opposite pattern, which is what the rat's nest stacks look like, is logic spread across four workflow tools because each tool was the quickest way to add the next conditional. The quickest add becomes the most expensive remove.
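The layering can be sketched as a single dispatch table: the source system pushes an event, one named function holds the logic, and the destination call is passed in. The event labels, function names, and the `send` signature below are hypothetical placeholders, not the vendors' real event names or APIs.

```python
from typing import Callable, Dict

# Hypothetical destination call: (channel, text) -> None
Action = Callable[[str, str], None]


def notify_sales_of_new_contact(payload: dict, send: Action) -> None:
    """Logic for one automation, named after what it does."""
    email = payload.get("email", "unknown")
    send("#sales-inbound", f"New contact: {email}")


def announce_payment(payload: dict, send: Action) -> None:
    amount = payload["amount_cents"] / 100
    send("#revenue", f"Payment received: ${amount:.2f}")


# One entry per automation. The keys are made-up event labels.
AUTOMATIONS: Dict[str, Callable[[dict, Action], None]] = {
    "hubspot.contact.created": notify_sales_of_new_contact,
    "stripe.payment.succeeded": announce_payment,
}


def handle_webhook(event_type: str, payload: dict, send: Action) -> bool:
    """Route a pushed event to its automation. Returns False if none matches."""
    handler = AUTOMATIONS.get(event_type)
    if handler is None:
        return False
    handler(payload, send)
    return True
```

Passing `send` in as an argument keeps the logic testable with a fake destination, and the dispatch table doubles as a machine-readable list of what exists.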
Retries, dedupe, and idempotency
Every automation that touches money, customers, or external systems needs all three:
- Retries for transient failures. The destination is briefly down. Try again in a minute. After three failures, alert.
- Dedupe for double-fires. Webhooks fire twice far more often than anyone expects. Use the input hash.
- Idempotency in the destination call. Send the same external ID every time. If the action ran already, the destination treats it as a no-op.
Most no-code workflow tools handle some of this. None handle all of it well. When the automation matters, write the code.
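The three fit together in a few dozen lines. This is a sketch under stated assumptions: `send` and `alert` are hypothetical callables standing in for the destination API and your alerting channel, the input hash doubles as the idempotency key, and `seen` is an in-memory set where a real deployment would need a durable store.

```python
import hashlib
import json
import time
from typing import Callable, Set


class TransientError(Exception):
    """Raised by the destination call when a retry might succeed."""


def input_hash(payload: dict) -> str:
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def run_once(
    payload: dict,
    send: Callable[[dict, str], None],   # hypothetical: (payload, idempotency_key)
    seen: Set[str],                      # assumption: swap for a durable store
    alert: Callable[[str], None],
    max_attempts: int = 3,
    wait_seconds: float = 60.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Return 'success', 'skipped', or 'failed' for one incoming event."""
    key = input_hash(payload)
    if key in seen:
        return "skipped"  # dedupe: this exact input was already handled
    for attempt in range(1, max_attempts + 1):
        try:
            # Idempotency: the same key goes out on every attempt, so a
            # destination that honors it treats repeats as a no-op.
            send(payload, key)
            seen.add(key)
            return "success"
        except TransientError:
            if attempt < max_attempts:
                sleep(wait_seconds)  # retry: wait, then try again
    alert(f"automation failed after {max_attempts} attempts (input {key[:12]})")
    return "failed"
```

The returned string maps directly onto the `outcome` field of the log table, so the caller can log it without translation.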
The cuts we make on every project
Three patterns we delete on sight:
Multi-tool fan-out. "When X happens, fire a Zap, which fires a Make scenario, which writes to Airtable, which triggers an Airtable automation." Collapse to one tool or one function.
Polling. If the source supports webhooks, use webhooks. Polling wastes bandwidth and adds latency.
Untracked branches. Every "if this then that" without a logged outcome is invisible. Log the branch taken.
Each cut is small. Stacked, they turn a rat's nest into a stack that can be handed off.
How long does it take to clean up a rat's nest?
Less than you think, if you do it the right way.
The wrong way is "audit everything and rebuild." That takes a quarter. By the time you are done, half the audit is stale.
The right way is the strangler pattern. New automations go into the clean stack. Existing automations stay where they are until they break or change. When they do, they get rebuilt clean and the old version is killed. Within six months the rat's nest is gone, with no big rewrite. Within twelve months the team has forgotten it was ever there.
Want a rat's nest reviewed or rebuilt?
If your team has an automation stack nobody fully understands, the call is thirty minutes. We map what you have on a single page and tell you what should stay, what should move, and what to kill.
Book a discovery call and we will start with the messy version. The messy version is the version we want to see.
Frequently asked questions
Is Zapier a rat's nest by default?
No. Zapier is fine for simple two-tool integrations. It becomes a rat's nest when teams use it for logic that grew past one trigger and one or two transformations. Cap each Zap at three steps. Anything more goes into code.
What about Make or self-hosted n8n?
Better than Zapier for complex flows because the visual graph is the truth and the version control is real. Still not a replacement for code when the logic needs retries, dedupe, and idempotency. Use them in the same role as Zapier with a higher ceiling.
How do I know when to migrate from no-code to code?
When you have more than ten conditional branches across the stack, when an automation has failed silently more than once, or when onboarding a new engineer requires a full walkthrough of which tool does what. Any one of those is the trigger.
Where should the automation registry live?
In the same repo as the codebase. Markdown file, kept short, updated on every PR that adds or changes an automation. The registry is the README of your automation stack.