You ever have that moment where you change one single line of code, something totally innocuous like a string label or a padding value, and suddenly the entire login flow of your app just goes into a death spiral?
It is the ultimate developer jump-scare. You think you are just tidying up, and then the Slack notifications start screaming.
It is the worst. You are staring at the screen thinking, how did a CSS change break the database connection? It feels like haunted house logic. But usually, it is just because we are flying blind. We are changing things and then manually clicking around like it is nineteen ninety-nine, hoping we didn't break the world. Today's prompt from Daniel is about exactly that, or rather, the cure for it. He wants us to dive into unit testing, why it is a non-negotiable best practice in twenty twenty-six, and how to actually start if you have never written a single test in your life.
I love this topic because there is so much guilt around it. Every developer knows they should be doing it, but a lot of people feel like they missed the train or that it is too late to start. And by the way, today's episode is powered by Google Gemini three Flash. It is helping us map out this testing landscape.
Gemini is probably better at writing tests than I am, honestly. But look, let's start with the basics for anyone who has been avoiding this. When we say unit testing, we are not talking about some massive, complex end-to-end simulation of the entire internet. We are talking about testing the smallest possible "unit" of code, right?
That is the core of it. Think of a unit as a single function or a single component. It is about isolating that one piece of logic from everything else—the database, the network, the file system—and just asking it, "Hey, if I give you the number five, do you actually return ten?" It is a controlled experiment.
So it's essentially just a script that runs your code for you to make sure it's not lying to you.
Precisely. Well, I should say, it is a script that codifies your expectations. Most developers "test" by running the app, logging in, navigating to a page, and seeing if the button works. That is slow, it is manual, and it is prone to human error. A unit test does that in three milliseconds and does it the same way every single time.
I think the reason people skip it is the perceived "tax." Like, I have to write the code, and then I have to write more code to test the code? I've got features to ship, Herman. I don't have time to write poetry about my functions.
That is the big misconception. People see it as a tax on speed, but in reality, it is a massive investment in velocity. If you don't have tests, you actually move slower over time because you become afraid of your own codebase. You stop refactoring. You start "bolting on" new code because you are too scared to touch the old stuff.
It's the "if it ain't broke, don't touch it" philosophy, which eventually leads to a codebase that looks like a giant ball of yarn that's been through a vacuum cleaner.
And that is where the economics of this gets really interesting. There was a study by the Consortium for Information and Software Quality back in twenty twenty that estimated the cost of poor software quality in the U.S. alone was over two trillion dollars. A huge chunk of that is just technical debt and bugs that could have been caught early.
Two trillion. That is a lot of coffee and standing desks.
It is wild. And the reason that number is so high is the "cost of change" curve. If you find a bug while you are writing the code—because a test failed immediately—it costs you seconds to fix. If that bug makes it to a staging environment, it costs hours of coordination. If it hits production and starts corrupting user data? It can be a hundred times more expensive to fix. You are talking about emergency patches, PR damage control, and potential data recovery.
Plus the psychological cost of being the person who broke the build on a Friday afternoon. That is a heavy burden to carry into the weekend.
That is the "safety net" aspect. When I have a solid suite of unit tests, I can go into a complex calculation function, rip out the guts, replace it with a more efficient algorithm, and if the tests are still green, I know I haven't changed the behavior. I don't have to guess. I don't have to "feel" like it's okay. I have proof.
Okay, so if I'm sold on the "why," let's talk about the "how." For someone who has a project right now with zero tests, what is the actual anatomy of a test? I've heard people talk about "Arrange, Act, Assert." It sounds like a legal proceeding.
It is actually a very elegant way to structure your thoughts. Every good unit test follows those three steps. First, you "Arrange." You set up the conditions. If you are testing a calculator, you define your inputs—let's say X equals five and Y equals five.
Simple enough. I can handle the arrangement.
Then you "Act." You actually call the function you are testing with those inputs. Result equals add X and Y.
And then the "Assert" is the "I caught you" moment?
It is the verification. You assert that "Result" should equal ten. If it does, the test passes. If your function accidentally returns fifty because you used a multiplication sign instead of a plus sign, the assertion fails, the test runner turns red, and you caught the bug before it ever left your machine.
It sounds almost too simple. Like, why wouldn't I just know that five plus five is ten?
Because in real-world code, it is never just five plus five. It is "calculate the prorated discount for a user who subscribed on a Tuesday, lives in a specific tax jurisdiction, and has a referral code that is fifty percent expired." When you have twenty edge cases like that, your brain cannot hold all those permutations. But the computer can.
That makes sense. It's about offloading the mental gymnastics to the machine. But what about the setup? I think a lot of people get stuck on the "plumbing." Like, do I need a special server? Is this a cloud thing?
Not at all. In twenty twenty-six, the tooling is incredibly streamlined. If you are in JavaScript or TypeScript, you just run "npm install --save-dev jest" or use something like Vitest. If you are in Python, you just "pip install pytest". These are local tools. They run on your laptop. They don't need an internet connection. They just scan your folders for files that end in ".test.js" or something similar and run them.
So it's literally just another command in my terminal. No fancy infrastructure required.
And once you have that runner, you start seeing the second-order effects. This is the part that actually makes you a better developer, not just a more careful one. When you start trying to write tests, you quickly realize which parts of your code are "untestable."
Untestable? You mean like code that is just too cool for school?
More like code that is too "tangled." If you have a function that calculates a price, but inside that function, it also calls a database, sends an email, and checks the current weather via an API, that function is a nightmare to test. You'd have to set up a fake database and a fake weather service just to check if the math is right.
Ah, so the testability of the code is like a diagnostic tool for the quality of the architecture. If it's hard to test, it's probably bad code.
You nailed it. It forces you toward "Pure Functions." A pure function is the holy grail of unit testing. It takes an input, returns an output, and has zero side effects. It doesn't reach out to the world; it just lives in its own little logic bubble. The more of your app you can pull into pure functions, the easier it is to test and the more reliable it becomes.
So by trying to write a test, I might realize I need to break my giant, thousand-line function into ten smaller, focused functions. And suddenly, my code is more readable, more modular, and easier for the next person to understand.
It's a virtuous cycle. You start writing tests to catch bugs, and you end up with better-designed software as a byproduct. It's like how learning to cook doesn't just give you a meal; it teaches you about ingredients and timing and organization.
I like that. But let's get practical. If I'm sitting at my desk right now, listening to this, and I've got a utility file with some string helpers or some math logic, what is the first actual step?
Step one: pick one function. Just one. Don't try to "test the app." That is too overwhelming. Pick a function that does something clear, like "format currency" or "validate email."
Okay, I've got my "format currency" function. It takes a number and returns a string with a dollar sign and two decimals.
Perfect. Step two: install your runner. Let's say you're using Node. Install Jest. Step three: create a new file right next to your utility file called "format-currency.test.js".
And what goes in that file? Do I have to import the whole world?
Just the function. You import formatCurrency. Then you write a simple test block. In Jest, that's test("it should format numbers to two decimals") followed by an arrow function. Inside that function, you do your Arrange, Act, Assert. Arrange: const amount = 12.5. Act: const result = formatCurrency(amount). Assert: expect the result to be "$12.50".
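Written out as a file, the test described here might look like this. Jest normally provides "test" and "expect" as globals when you run "npx jest"; the two stand-ins at the top are only there so this sketch also runs in plain Node, and you would delete them in a real Jest project:

```javascript
// format-currency.test.js -- the test from the conversation, in Jest syntax.
// Minimal stand-ins for Jest's globals so the sketch runs with plain
// `node format-currency.test.js`; remove them when Jest is installed.
const test = (name, fn) => { fn(); console.log(`PASS ${name}`); };
const expect = (actual) => ({
  toBe(expected) {
    if (actual !== expected) throw new Error(`expected ${expected}, got ${actual}`);
  },
});

// The function under test (in a real project you would import it
// from the utility file sitting next to this one).
function formatCurrency(amount) {
  return `$${amount.toFixed(2)}`;
}

test("it should format numbers to two decimals", () => {
  // Arrange
  const amount = 12.5;
  // Act
  const result = formatCurrency(amount);
  // Assert
  expect(result).toBe("$12.50");
});
```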
And then I just type "npm test" in the console?
And you watch for that green checkmark. It is a tiny hit of dopamine. It feels great. But then—and this is the pro move—you try to break it.
Why would I want to break my own heart, Herman?
Because that is where the real testing happens. What happens if you pass "null" to your currency function? What if you pass a negative number? What if you pass a string that says "banana"?
My function would probably explode. It would return "$banana" or just crash the server.
So you write a test for the "banana" case. You assert that it should return "$0.00" or throw an error. Now, you've not only tested the "happy path," but you've also hardened your code against the weird stuff that users—or other developers—will inevitably throw at it.
It's like building a little fence around the edge cases. I can see how that would prevent a lot of those "how did this happen" bugs in production. But I can already hear the skeptics. They're saying, "Herman, I have a legacy codebase. It's ten years old, it's all coupled together, and there are no pure functions. Testing is impossible for me."
That is where "Mocks" and "Stubs" come in. You don't have to refactor the whole world on day one. If your function is tied to a database, you can use a library to create a "mock" database. It's a fake object that pretends to be the database but just returns whatever data you tell it to for that specific test.
So it's like a Hollywood movie set. The front of the building looks real, but there's nothing behind the door.
That is exactly what it is. It allows you to isolate the logic you care about without needing the entire infrastructure to be running. It is a bit more advanced, but it means you can start testing even in messy, older codebases.
You mentioned something earlier about "Test-Driven Development," or TDD. Is that the same thing, or is that like the black-belt version of this?
TDD is a methodology where you actually write the test before you write the code. It sounds crazy at first. How can you test something that doesn't exist?
It sounds like trying to check if a ghost is wearing a hat.
Kind of! The idea is you write a test for the behavior you want. The test fails because the code isn't there. Then you write the bare minimum code to make the test pass. Then you refactor. Red, Green, Refactor. It's a very disciplined way to work that ensures you never write more code than you need and that everything you write is tested from second one.
I can see the appeal, but for a beginner, that feels like a lot of pressure. I think just getting that first green checkmark on an existing function is probably the best way to break the ice.
I agree. Don't worry about TDD yet. Just worry about "characterization tests." Write tests for how your code works now, even if it's buggy. Just so you have a baseline. Then, when you go to fix the bug, you'll have a test that fails when the bug is present and passes when it's fixed. That's how you stop bugs from coming back.
The "regression" bug. The one you fixed three months ago that somehow crawled out of its grave and started biting people again.
We have all been there. A unit test is the silver bullet for regressions. Once a bug is covered by a test, it can never sneak back into the codebase unnoticed. If someone—including future you—accidentally re-introduces the bug, the test will scream immediately.
I'm starting to think of tests more like documentation that actually does something. Instead of a README file that nobody reads and is always out of date, the tests tell you exactly what the code is supposed to do, and they prove it every time you run them.
That is a great way to put it. Tests are "executable documentation." If I want to know how your complex discount logic works, I shouldn't have to read through five hundred lines of nested if-statements. I should just look at your test file. It should say: "it should apply ten percent for seniors," "it should stack with holiday coupons," "it should not exceed fifty percent total." It's a clear list of requirements.
Okay, so let's talk about the "trap" of code coverage. I've worked at places where management says, "We need eighty percent code coverage!" and everyone starts writing these useless tests just to make a number go up.
Oh, the vanity metrics. Code coverage is a tool, not a goal. You can have one hundred percent coverage and still have a buggy app if your tests aren't actually asserting anything meaningful. I've seen tests that just call a function and don't check the output. The coverage tool sees the lines were "hit," so it counts it as covered. But it's a useless test.
It's like saying a student "covered" the material because they sat in the classroom, even if they were asleep the whole time.
Focus on testing the "critical paths" and the "complex logic." You don't need to test a simple getter that just returns a variable. That is a waste of time. Focus on the parts of the code where bugs are likely to live—the math, the data transformations, the conditional branching.
That makes it feel much more manageable. It's not about a perfect score; it's about covering your assets on the stuff that actually matters.
And in twenty twenty-six, we have so many tools to make this easier. We mentioned the "Quash" blog in the notes Daniel sent over. They talk about using AI to help generate test cases. While we should be careful about letting AI do all the thinking, it acts as a great "boilerplate" generator. It can suggest edge cases you might have missed, like "hey, did you think about what happens if the user's name is an empty string?"
It's like having a very cynical, very thorough assistant who is always looking for ways to break your stuff.
We all need a cynical assistant in software development. The more you can adopt that "how can I break this" mindset, the more resilient your software becomes. And that pays off in ways that go beyond just fewer bugs. It affects your reputation. People trust developers who ship code that works.
And people trust teams that can move fast without breaking things. I think that's the real competitive advantage. If your team can deploy five times a day because you have a suite of tests giving you the "all clear," you are going to run circles around the team that only deploys once a month because they're terrified of a manual QA cycle.
That is the "Continuous Testing" part of the modern workflow. These tests don't just live on your machine. You hook them up to your GitHub or GitLab so that every time you push code, the tests run automatically in the cloud. If they fail, the code can't be merged. It is a hard gate that protects the production environment.
We actually talked about this a bit in a very early episode, number sixteen ninety-seven, about Git hooks. Using tests as a gate is the ultimate "last line of defense."
It really is. It turns "quality" from a department—like the QA team in the basement—into a shared responsibility of the entire engineering team.
So, to wrap this up for someone ready to take the plunge. What are the three things they should do when they finish this episode?
Number one: install a test runner. Don't think about it, just do it. "npm install jest" or "pip install pytest." Get the tool in your kit.
Step one, check. What's step two?
Number two: find the "purest" function in your codebase. The one with the least amount of "outside world" interference. Write one single test for it. Just a "happy path" test to see that green checkmark.
Get that dopamine hit. And number three?
Number three: write one "negative" test for that same function. Try to break it. Give it bad data and make sure it handles it gracefully. If you do those three things, you are officially a "testing developer." You've broken the seal.
I love it. It's not about being perfect; it's about being better than you were yesterday. And honestly, once you start, you'll wonder how you ever lived without it. It's like driving with a seatbelt. It feels weird at first, but once you're used to it, driving without one just feels reckless.
It really does. And the peace of mind you get when you hit "deploy" and you know that five hundred tests have already given you the thumbs up? That is worth its weight in gold. No more Sunday morning emergency calls.
That is the dream. No more "haunted" codebases. Just logical, verified systems. This has been a great deep dive. I think it's time we get back to our own testing suites.
I've got a few red tests waiting for me, actually. I should go fix those.
Better you than the users. Thanks as always to our producer, Hilbert Flumingtop, for keeping the show running smoothly. And a big thanks to Modal for providing the GPU credits that power this whole operation.
If you found this episode helpful, please do us a favor and leave a review on your favorite podcast app. It really helps other developers find the show and join the conversation.
You can find us at myweirdprompts dot com for the full archive and all the links we mentioned today.
This has been My Weird Prompts.
See ya.