Version 1.0 — Last updated: April 13, 2026

Half Your Code Was Written by a Machine. Nobody Checked.

The AI coding market just consolidated around three tools. The data on what they're producing is worse than you think.

I run all three. Copilot for autocomplete, Cursor for multi-file edits, Claude Code when I need something that actually reasons about architecture. That stack costs me $50 a month and has changed how I work more than any tool shift in the past decade. I'm not going to pretend otherwise.

But something happened this month that made me sit down and look at what this stack is actually producing — not in my projects, but across the industry. The numbers are bad. And the response from every company involved has been to ship faster.

The Market Picked Its Winners

JetBrains published their AI Pulse survey results two weeks ago. Over 10,000 professional developers, eight languages, globally representative. The headline: 74% of developers now use specialized AI coding tools at work. Not chatbots. Dedicated coding assistants, editors, agents.

The market share breakdown is clean. GitHub Copilot sits at 29%, still the largest installed base but no longer growing. Cursor holds 18%, also plateauing after its explosive 2025. Claude Code matches Cursor at 18% — and that number was 3% nine months ago. A 6x jump in under a year, with the highest satisfaction scores JetBrains has ever measured for a dev tool: 91% CSAT, +54 NPS.

That last number matters more than market share. Developers don't give a +54 NPS to something that sort of works. They give it to something that changed what they think is possible. I've seen this in my own workflow. Claude Code handles the kind of cross-codebase reasoning that Copilot can't touch and Cursor still struggles with on large projects.

The emerging pattern isn't one tool replacing another. It's three tools at three layers: Copilot at $10/month for the muscle memory stuff — tab completions, line suggestions, the things that used to be snippets. Cursor at $20/month for the multi-file coordination — refactoring across modules, design mode, the agent workflows that Cursor 3 bet the company on. Claude Code at $20/month for the hard problems — architecture decisions, complex debugging, the conversations where you need the model to actually understand what your system does.

Fifty dollars a month. Three subscriptions. One stack. And according to multiple sources, 51% of all code committed to GitHub is now AI-generated or substantially AI-assisted.

Read that again. More than half.

The Quality Nobody Wants to Talk About

Here's where I stopped scrolling and started digging.

CodeRabbit's 2026 State of Code report analyzed hundreds of thousands of pull requests across open-source and enterprise repos. The productivity story is real: PR count per author rose 20% year-over-year. Developers are shipping more. That part is true.

But incidents per PR increased 23.5%.

More code. More bugs. Not proportionally more — disproportionately more. The bug rate is growing faster than the output rate. If you're a manager looking at throughput dashboards and celebrating, you're looking at the wrong number.

A separate study — 304,362 AI-authored commits across 6,275 GitHub repositories — found that over 15% of commits from every major AI assistant introduced at least one quality issue. Not "could theoretically have an issue." Did have one. And 24.2% of those issues were never caught or fixed. They're in production right now.

SonarQube's developer survey puts it even more bluntly: 61% of developers agree that AI often produces code that looks correct but isn't reliable. That's not a fringe opinion. That's a supermajority of the profession saying the tools they use daily produce output they don't fully trust.

Stack Overflow's numbers track the same trajectory. Trust in AI-generated code accuracy: 29%, down from 40%. Positive favorability toward AI coding tools: 60%, down from 72%. Developers are using these tools more and trusting them less. That's not a paradox. That's a dependency.

I notice it in my own work. A Claude Code suggestion that compiles, passes the obvious tests, and makes a subtle locking mistake in Rust that only shows up three days later under load. A Cursor refactor that restructures a module cleanly but introduces a race condition that only manifests under production concurrency. The failure mode isn't "obviously wrong." The failure mode is "looks right, ships, breaks later." That's harder to catch than a syntax error. By a lot.
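The lost-update pattern behind that kind of race fits in a few lines. This is a hypothetical sketch, not code from either tool, with the thread interleaving written out by hand so the failure is deterministic:

```python
# Hypothetical lost-update race. Two workers each do an unsynchronized
# read-modify-write on a shared counter; the interleaving below is the
# one a scheduler can produce under load.
counter = 0

a_read = counter       # worker A reads 0
b_read = counter       # worker B reads 0 before A writes back
counter = a_read + 1   # A writes 1
counter = b_read + 1   # B also writes 1, silently losing A's increment

print(counter)  # 1, not the 2 a sequential test would predict
```

A unit test that runs the workers one after the other sees the correct count every time, which is exactly why this class of bug survives review and a green CI run.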

GitHub Just Made It Worse

On April 24 — eleven days from now — GitHub flips a switch. Every Copilot Free, Pro, and Pro+ user will have their interaction data used to train AI models. Opted in by default. The data collected includes code snippets, inputs, outputs, comments, file names, repository structure, and navigation patterns. From private repositories. While you're actively working in them.

Business and Enterprise tiers are excluded. Their contracts prohibit it.

I'll say that differently so the implication is clear: if you pay $10 or $20 a month, your code trains Microsoft's models. If your company pays $39 a month, it doesn't. The cheapest users subsidize the product with their data. The most expensive users are protected by legal agreements. That's not a privacy policy. That's a business model.

The GitHub community discussion has 232 downvotes and three rocket emojis. Out of 39 developer comments, exactly one person — a GitHub VP — endorsed the change. The rest range from frustrated to furious. The GDPR implications alone are non-trivial. European privacy law generally requires opt-in for data processing of this nature. GitHub is using American opt-out norms and hoping nobody in Brussels notices.

Here's the part that actually keeps me up: there's no per-repository toggle. You can't say "use my open-source work, leave my client projects alone." It's account-level. One toggle. All or nothing.

I opted out the day I read the announcement. But I know most developers won't. Most developers don't read privacy policy updates. That's what GitHub is counting on.

Cursor's Transparency Problem Didn't Go Away

I wrote about Cursor 3 when it launched on April 2. The product is impressive. The Agents Window, the Background Agents running on remote VMs, the rebuilt interface — serious engineering. I use it daily and I'll keep using it.

But the Composer 2 origin story still bothers me. Within 24 hours of launch, a developer found the internal model ID pointing to Moonshot AI's Kimi K2.5 — a one-trillion-parameter model from a Chinese AI lab. Moonshot's head of pretraining confirmed identical tokenizers. Cursor co-founder Aman Sanger called the omission a "miss." The modified MIT license on Kimi K2.5 explicitly requires prominent display of the Kimi name for products exceeding $20 million in monthly revenue. Cursor exceeds that by roughly 8x.

A $50 billion company used a model with specific attribution requirements and didn't mention it until someone caught them. That's not an oversight. That's a decision someone made.

Cursor's numbers are otherwise remarkable. Two billion dollars in annualized revenue. Over a million paying users. A 37% improvement on their internal benchmark with Composer 2. The product works. But trust is a separate axis from functionality, and right now Cursor is spending credibility faster than it's earning it.

What's Coming and Why It Matters

Google released Gemma 4 under Apache 2.0 — the first time the Gemma family has shipped under a true OSI-approved open-source license. Four model sizes, from 2B to 31B parameters. Over 400 million community downloads of Gemma models total. This is Google's play for the on-device AI layer, and the licensing decision is significant. It means anyone can build a coding assistant on Gemma without calling home to Google.

Google's Antigravity IDE — built on a VS Code fork after their $2.4 billion acquisition of Windsurf's team — scores 76.2% on SWE-bench Verified. That's the highest published benchmark for any coding agent currently shipping. And it's free.

Meanwhile, GitHub's new Copilot Coding Agent can autonomously pick up issues, create branches, write code, run tests, and open pull requests. No human in the loop until review. Cursor's Background Agents do the same thing on remote VMs. Claude Code's terminal-based workflow handles it natively.

The pattern is unmistakable. Every major tool is moving toward autonomous coding agents that operate without continuous developer supervision. And the quality data says we're not ready for that.

The Actual Point

We have a market that just consolidated around three tools, each doing different things well, together producing more than half of all committed code. The code quality data says bug rates are rising faster than output. The trust data says developers know it. And the companies building these tools are responding by making them more autonomous, not more careful.

GitHub is harvesting user code to train better models. Cursor shipped a model without crediting its origin. Copilot is launching agents that write code without developers watching. Every incentive in the system points toward more AI-generated code, faster, with less oversight.

I'm not going to do the thing where I pretend this means we should stop using these tools. I use them. They make me meaningfully more productive. That's real. But productivity measured in lines shipped is not the same as productivity measured in working software. And right now, the industry is measuring the first one and hoping the second one follows.

It won't. Not automatically. Not without the kind of review discipline that gets harder, not easier, when the code looks correct and the deadline is tomorrow.

Fifty-one percent of code on GitHub is AI-assisted. Twenty-four percent of quality issues in that code are never caught. Do the math on what's accumulating in production systems worldwide, right now, while everyone celebrates shipping faster.
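Taking those figures at face value, the arithmetic is simple. Treat this as a rough sketch, not a measurement: the three rates come from different studies and datasets, so multiplying them is an approximation.

```python
# Rough combination of the figures cited above. These rates come from
# separate studies, so the product is an estimate, not a measurement.
ai_share = 0.51      # share of committed code that is AI-assisted
issue_rate = 0.15    # AI commits that introduced at least one quality issue
never_fixed = 0.242  # share of those issues never caught or fixed

latent = ai_share * issue_rate * never_fixed
print(f"{latent:.1%}")  # about 1.9% of all commits
```

Call it nearly one commit in fifty carrying a latent, unfixed, AI-introduced issue. Small per commit. Enormous at the scale of everything pushed to GitHub in a year.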

That's not a tools problem. That's a culture problem wearing a tools costume.


Alexei Volkov

I build software for a living and write about tech on the side — because someone has to say what everyone else is thinking.