Version 1.0 — Last updated: April 9, 2026

Gemini Stopped Answering Questions. It Started Building Apps.

Google's Generative UI builds physics simulations instead of giving answers. That's either the future — or the fanciest dodge ever.

I've been using Claude's artifacts for months. I build small tools with them, prototype UI components, generate charts from data I'm too lazy to format myself. It's become part of how I work. So when Google announced that Gemini now generates entire interactive experiences instead of text responses — what they're calling "Generative UI" — I paid attention. Not because of the marketing. Because of what it means for the interface layer that sits between us and these models.

Here's what actually happened, what's interesting about it, and where I think Google is both right and wrong.

What They Shipped

Google rolled out two things on April 8. The first is a global expansion of Generative UI — a feature that launched in preview with Gemini 3 back in November but is now available to everyone. The second is NotebookLM integration directly inside the Gemini app.

Generative UI comes in two flavors. "Dynamic View" is the ambitious one: Gemini 3 Pro writes a complete web application from scratch — HTML, CSS, JavaScript — and renders it inline as your answer. You ask about Van Gogh's paintings, you get a scrollable gallery with context panels. You ask about mortgage rates, you get a calculator with adjustable inputs. You ask about molecular structures, you get a 3D model you can rotate.

"Visual Layout" is the lighter version: magazine-style presentations with photos pulled from the web, interactive modules, filters. Less impressive technically, but more consistently useful.

The NotebookLM piece is simpler to explain: there's now a "Notebooks" section in the Gemini sidebar. You can create persistent knowledge bases, add sources — PDFs, Google Docs, URLs, YouTube videos, previous chats — and Gemini will ground its answers in that material. It syncs bidirectionally with NotebookLM. Free users get 100 notebooks with 50 sources each. Paid tiers go up from there.

Why a Developer Should Care

I read the research paper. Google actually published a 22-page document describing the architecture, which is unusual for a consumer feature launch. The system isn't magic — it's Gemini 3 Pro generating standard web code (React, Tailwind, Recharts, Lucide icons) executed in a sandboxed browser environment. Three components: tool access for image generation and web search, carefully engineered system prompts, and post-processing to catch the mistakes the prompts can't prevent.
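To make the three-part shape concrete, here is a minimal sketch of that pipeline in Python. This is my reading of the paper's architecture, not Google's code; every function and parameter name here is hypothetical.

```python
# Hypothetical sketch of the three-stage pipeline described in the paper:
# tool access, an engineered system prompt, and a post-processing pass
# before sandboxed rendering. All names are made up for illustration.

def generate_ui(query: str, model, tools: dict, sandbox) -> str:
    """Turn a user query into renderable web code."""
    # 1. Tool access: let the model gather images / search results first.
    context = {name: tool(query) for name, tool in tools.items()}

    # 2. A carefully engineered system prompt steers the model toward a
    #    complete, self-contained app in the approved stack.
    system_prompt = (
        "Answer the query as a complete interactive web app. "
        "Use React, Tailwind, Recharts, and Lucide icons only."
    )
    code = model(system_prompt, query, context)

    # 3. Post-processing catches mistakes the prompt can't prevent
    #    before the code reaches the sandboxed browser environment.
    return sandbox.render(postprocess(code))


def postprocess(code: str) -> str:
    # Placeholder for real linting/rewriting; here we just strip a
    # trailing markdown fence the model sometimes appends.
    return code.removesuffix("```").strip()
```

The interesting design choice is that the sandbox is doing double duty: it is both the renderer and the last line of defense, which is why the post-processing stage exists at all.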

The interesting part is the evaluation. Google built a benchmark called PAGEN — human-designed reference websites for various queries — and found that users preferred Generative UI over top Google search results 90% of the time. Over plain text answers, 97%. Human designers still won, but only 56% to 43%. That margin is close enough to matter.

Here's my honest reaction: the concept is sound, but the execution has a gap that nobody at Google seems worried about. Dynamic View can take over 60 seconds for complex prompts. That's not a loading screen — that's a coffee break. Simpler UIs render in one to three seconds, which is fine, but the moment you need something genuinely complex — the use case Google keeps demoing — you're sitting there watching a spinner.

I tested Claude's artifacts side by side last week. A React component with a data table and a chart. Claude generated it in maybe four seconds, rendered it in the artifact panel, and I could iterate on it immediately. Gemini's Dynamic View would have built something more visually polished, probably. But I'd have been waiting. And when you're in flow, waiting is the cost you feel the most.

The Comparison Nobody Wants to Make Honestly

There's a useful way to think about the three approaches.

Gemini's Generative UI treats the entire response as an experience. The model decides not just what to tell you but how to present it — timeline versus calculator versus gallery versus simulation. The interface is the answer. It's the most consumer-friendly vision and the most unpredictable one.

Claude's artifacts are an execution environment bolted to the side of a conversation. You ask for something, it appears in a panel, you can run it, edit it, publish it, share it. It's developer-oriented. The chat is still the primary interface; the artifact is a tool.

ChatGPT's Canvas is a collaborative document. You write together. Track changes. Inline edits. It's the closest to how a human editor would work with you. OpenAI recently added around 70 interactive math and science modules — their first step toward generative UI — but it's curated, not open-ended.

Each approach reflects a different bet on what users actually want from AI. Google bets they want experiences. Anthropic bets they want tools. OpenAI bets they want a collaborator.

I think the honest answer is that all three are partially right and none of them has figured out the whole thing yet. I use artifacts because I'm a developer and I want to iterate on code. My partner would probably prefer Gemini's approach because she doesn't care about the code — she wants the result. Neither of us is wrong.

The NotebookLM Move Is Strategically Smarter Than It Looks

Here's where I think Google is being quietly clever. NotebookLM was always good — arguably the best product Google has launched in years, which is a weird thing to say about a tool most people have never heard of. The audio overview feature alone is worth the price of admission. But it was isolated. A separate app, separate context, separate everything.

Putting it inside Gemini changes the dynamics. Now your research lives where your conversations happen. You build up context over time instead of starting from zero every chat. You upload a hundred documents about your project and Gemini answers with that knowledge baked in.

Claude has memory now. ChatGPT has memory. But neither has anything like a structured, user-controlled knowledge base with source attribution that you can inspect and modify. That's what Notebooks provides. It's less magical and more practical — which, in my experience, is what actually matters when you're doing real work.

The catch is the tier structure. Google's marketing says "100 sources free" but the details tell a different story: free users get 100 notebooks with 50 sources each. You need a paid plan for 100 sources per notebook. It's not a dealbreaker but it's the kind of mismatch between announcement and reality that erodes trust over time.
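The arithmetic behind that mismatch is worth spelling out. A quick sketch, using the numbers as I read them from the announcement (the paid-tier figures are my interpretation of "goes up from there", not an official spec):

```python
# Notebook/source limits as I read the announcement. Treat these numbers
# as an interpretation, not official documentation; the paid tier here
# is an assumption based on "100 sources per notebook".
FREE = {"notebooks": 100, "sources_per_notebook": 50}
PAID = {"notebooks": 100, "sources_per_notebook": 100}  # assumed

def total_capacity(plan: dict) -> int:
    return plan["notebooks"] * plan["sources_per_notebook"]

# The free tier holds 5,000 sources in aggregate, which sounds generous,
# but any single research project living in one notebook still tops out
# at 50 sources -- half of what the headline implies.
print(total_capacity(FREE))           # 5000
print(FREE["sources_per_notebook"])   # 50
```

In other words, "100 sources free" is true only if you count across notebooks, which is not how anyone actually works.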

What the Community Actually Thinks

Jakob Nielsen — the usability researcher, not a random blogger — called Generative UI the most important aspect of the Gemini 3 launch. He predicts AI-generated interfaces will surpass human-designed ones by end of 2026. That's a bold claim from someone who's spent decades arguing that consistency and learnability are the foundations of good design.

Hacker News went the other direction, predictably. The top thread is full of developers arguing that personalized interfaces are inherently hostile: "I don't want anything automatically configured on my behalf." The logic is sound — people learn muscle memory, and a UI that redesigns itself every session destroys that. It's the same reason everyone hates it when their favorite app moves the settings button.

I'm somewhere in between. For repeated tasks — my daily work, my project dashboards, my code reviews — I want predictability. I want the button where I left it. But for one-off exploration — understanding a concept, planning a trip, visualizing data I'll never look at again — a custom-built experience is genuinely better than a wall of text. The question is whether Gemini can learn when to generate and when to just answer. Right now, it can't. It generates when it feels like it, and you can't reliably control that.

What This Actually Means for the Next Six Months

Google is making a bet that the text-based chat interface is a temporary artifact of LLM limitations, not a permanent design choice. That as models get better at generating code, the natural output isn't words — it's applications. Every prompt becomes a product.

Here's the counterargument, and it's not trivial: applications need maintenance. They need state management, error handling, accessibility, responsiveness across devices. A generated UI that works once is a demo. A generated UI that works reliably across a thousand different prompts from a thousand different users with a thousand different screen sizes is an engineering problem that Google hasn't solved yet. Their own documentation admits Dynamic View is desktop-only for now. That's a tell.

I'll keep using Claude for my development work. The artifact workflow fits how I think, and the code quality is better for my use cases. But I've started using Gemini's Dynamic View for the kind of stuff I used to google — visualizing concepts, exploring data, building quick tools I'll use once and throw away. For that, it's legitimately good. Not perfect. Not reliable enough to depend on. But good enough that I keep going back.

And the NotebookLM integration might end up being the bigger deal, even though it got less attention. Persistent context is the feature everyone needs and nobody has gotten right yet. Google just got closer than anyone else.

We'll see if they can keep it working. Their track record with product reliability isn't exactly reassuring.


Alexei Volkov

I build software for a living and write about tech on the side — because someone has to say what everyone else is thinking.