Development · 3 min read

Two Months with an AI Code Reviewer

We added an AI code review tool to our team's PR workflow. Here's what two months looked like.

The Friday PR Curse

Our team has six people and averages about 8 PRs a day. From assigning a reviewer to actually getting the review done took a day and a half. PRs submitted on Friday didn't get reviewed until Monday. Friday afternoon PRs? Sometimes Tuesday. (We almost developed an unwritten rule against Friday afternoon PRs.)

So last November, we introduced an AI code review tool. When a PR goes up, automated review comments appear within 5 minutes.

The First Week, Everyone Was Impressed

"Wait, this actually works?" That was the team's reaction. The impressive catches included missed null checks, dependency array mistakes causing unnecessary re-renders, and dynamic queries vulnerable to SQL injection. When it accurately flagged an N+1 query pattern in a junior developer's PR, I was genuinely surprised. Even human reviewers miss that stuff sometimes.
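The SQL injection catches were the classic string-interpolation kind. A minimal sketch of the pattern (hypothetical names and an in-memory database for illustration, not our actual code):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER, name TEXT)")
conn.execute("INSERT INTO users VALUES (1, 'alice')")

def find_user_vulnerable(name: str):
    # The kind of thing the reviewer flags: user input interpolated
    # straight into the query string. name = "' OR '1'='1" returns every row.
    return conn.execute(f"SELECT id FROM users WHERE name = '{name}'").fetchall()

def find_user_safe(name: str):
    # The fix: a parameterized query -- the driver handles escaping.
    return conn.execute("SELECT id FROM users WHERE name = ?", (name,)).fetchall()

print(find_user_vulnerable("' OR '1'='1"))  # returns [(1,)] -- injection leaks the row
print(find_user_safe("' OR '1'='1"))        # returns [] -- no user has that literal name
```

Obvious in isolation, easy to miss in a 400-line diff at 5 p.m. on a Friday.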

A Month In, the Annoyance Kicked In

The problem was noise. The AI left at least five or six comments on every PR. Of those, maybe one or two were actually meaningful. The rest were style-guide-level stuff like "consider making this variable name more descriptive."

After a month, team members started scrolling right past the AI comments. Once the vibe becomes "oh, the AI is spouting nonsense again," the tool's value tanks. At that point I thought, "is this a failed experiment?"

Two Days Overhauling the Config

I turned off all style-related comments and configured it to focus only on security, performance, and potential bugs. I set up three severity levels and highlighted only Critical ones. Spending two days on this tuning felt wasteful at first, but the results were undeniable. Comments dropped to two or three per PR, and the ratio of meaningful ones jumped to what felt like over half.
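The shape of the config ended up looking roughly like this (a hypothetical sketch -- key names are invented for illustration, not any specific tool's actual schema):

```yaml
# Hypothetical config sketch, not a real tool's schema.
review:
  categories:
    style: off          # silenced the style-guide noise
    security: on
    performance: on
    bugs: on
  severity_levels: [info, warning, critical]
  highlight: [critical] # only Critical comments get surfaced prominently
```

The specific knobs will differ per tool; the point is cutting categories, not just dialing down volume.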

Human Reviews Didn't Decrease

Here's the important part: the AI didn't replace human reviews -- it complemented them. With the AI handling mechanical checks first, human reviewers could focus on architecture and business logic.

Review completion time dropped from a day and a half to roughly one day. But I think that's less about the AI itself and more about a psychological effect -- when the AI reviews quickly, human reviewers feel pressure to step up too. (Unexpected benefits of peer pressure?)

The Numbers After Two Months

Average time from PR to merge went from 36 hours to 19 hours. Production bugs that should've been caught in code review dropped from 3 per month to 1. Monthly cost is about 148,000 KRW for the team -- whether that's expensive or cheap depends on your perspective.

I'd Recommend It, but With Conditions

If your team has 3+ people, it's worth trying. But you absolutely have to invest time in initial tuning. If you use the default settings, the noise will burn out your team and you'll turn it off within a month. We almost did. AI code review is a tool for catching wrong code, not a tool for writing good code. The discussions about what makes code good -- that's still on us humans. That hasn't changed.
