Product discovery with AI: practices that stick

Product team in an in-person workshop, working with colored sticky notes on a wall: the classic scene of a synthesis session after discovery interviews. The photo illustrates how, by 2026, generative AI has entered several concrete phases of the process but does not replace this human conversation between the people who design, research, and prioritize, where hypotheses are contrasted with real findings and the team decides which problems deserve product investment.

Product discovery, the continuous process by which a product team figures out which problems are worth solving and how to validate solutions before building them, was one of the territories where generative AI created the most expectation through 2023 and 2024. Two years later, with accumulated experience including failures, we can take a cooler-headed look at which practices have passed the test of time and which should be discarded without nostalgia.

What AI does well in discovery

Interview-transcript synthesis is probably the most useful and least controversial use case. A team doing ten user interviews per sprint generates dozens of hours of audio and corresponding transcripts, and processing that mass of text to extract patterns, relevant quotes, and recurring themes is exactly the kind of task where current models add value without high risk. Synthesis doesn’t replace the researcher; it frees them from tedious work to spend time on interpretation.
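One way to picture this synthesis workflow is a map step over transcript chunks followed by human review of the merged output. The sketch below is illustrative only: the chunk size, prompt wording, and function names are assumptions, not a prescribed tool; the prompts it produces would be sent to whatever model client the team uses.

```python
from textwrap import dedent

def chunk_transcript(text: str, max_chars: int = 6000) -> list[str]:
    """Split a transcript on paragraph boundaries into chunks that fit
    a model context window (6000 chars is an arbitrary example limit)."""
    paragraphs = text.split("\n\n")
    chunks, current = [], ""
    for p in paragraphs:
        if current and len(current) + len(p) > max_chars:
            chunks.append(current)
            current = ""
        current += p + "\n\n"
    if current:
        chunks.append(current)
    return chunks

def synthesis_prompts(transcript: str) -> list[str]:
    """Build one extraction prompt per chunk; the researcher reviews
    and merges the model's answers, keeping interpretation human."""
    template = dedent("""\
        Extract from this interview excerpt:
        1. Recurring themes
        2. Notable quotes, verbatim, with speaker
        3. Pain points the user states explicitly
        Do not infer needs the user did not express.

        Excerpt:
        {chunk}""")
    return [template.format(chunk=c) for c in chunk_transcript(transcript)]
```

The explicit "do not infer" instruction matters: it keeps the model in extraction mode and leaves interpretation, the part that shouldn't be delegated, to the researcher.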

Generating questions for interview guides also works well as a first approximation. Asking the model to propose twenty open questions about a concrete problem, with variants by user segment, produces a starting point the researcher refines and filters. The value isn't in accepting the questions as-is, but in not starting from zero and in having a wider set than one brain alone would have produced before the session.
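A guide-generation request of this kind can be captured in a small prompt builder. The function name and wording below are hypothetical; the point is encoding the guardrails (open, non-leading, no solution-pitching) once, so every request carries them.

```python
def guide_prompt(problem: str, segments: list[str], n: int = 20) -> str:
    """Build a prompt asking for open interview questions with
    per-segment variants. Output is raw material for the researcher
    to refine and filter, not a finished guide."""
    seg_lines = "\n".join(f"- {s}" for s in segments)
    return (
        f"Propose {n} open, non-leading interview questions about: {problem}.\n"
        "Avoid yes/no questions and avoid suggesting solutions.\n"
        f"For each question, add one variant for each of these segments:\n"
        f"{seg_lines}"
    )
```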

Exploring cross-industry analogies has been a pleasant surprise. Asking the model how unrelated sectors solve similar problems, or what product patterns have worked in comparable contexts, sometimes yields insights we wouldn't have found manually. It doesn't always work, but as a technique for widening the search space it performs acceptably, especially in the initial phases of a new problem when there's still no clarity on where to look.

Drafting specs, user stories, and acceptance criteria has also found its place. The model produces a reasonably structured first draft of an idea, the product manager edits it, discusses it with engineering and design, and the time from concept to discussable document shrinks. Note that the final document remains a human responsibility; the model just shortens the path to having something tangible on the table.

What AI does badly

Direct generation of product hypotheses without real data has been a repeated failure. Asking a model to suggest what problems a segment's users have, without feeding it real prior research, produces plausible but generic lists that sound like an external consultant voicing the obvious. The problem isn't linguistic quality; without data on the concrete segment, the model produces the internet's statistical consensus, not useful insight.

Automatic prioritization has disappointed in most cases. Several teams tried through 2024 and 2025 to use models to score backlogs against criteria like impact, effort, and risk. Results were superficially coherent but didn’t withstand detailed discussion with stakeholders, because the model lacked full context on internal constraints, strategic partnerships, key-customer timelines, or political motivations that weigh heavily on real decisions. Prioritization remains human conversation with analytical support.

User simulation, where the team asked the model to act as a specific persona and respond to prototypes, had mixed and mostly discouraging results. The model produces plausible answers but converges on a statistical average that doesn't reflect real user diversity, and it generates false positives about adoption because it tends to be agreeable toward proposed ideas. There are documented cases of teams prematurely validating directions that later failed on contact with real users, precisely because the model never objected with the force real humans do.

Detecting unexpressed needs is territory where the model usually falls short. Users often can't articulate what they need, and the good researcher's job is to listen between the lines, observe contradictions, and catch non-verbal signals. These capabilities, the core of serious qualitative discovery, cannot be delegated to transcripts or to the model. Any attempt to automate them produces polished summaries with the interesting observations eliminated in the process.

Practices that have matured

After two years of trial and error, some concrete practices have consolidated as stable. The first is using the model as an analysis partner after real interviews. The researcher transcribes, the model helps synthesize, the human team discusses findings. This pattern leverages AI’s good parts without falling into the failures of delegating core thinking.

The second consolidated practice is using the model to generate counterarguments. After formulating a hypothesis, the team asks the model to build the three best arguments against it, without being nice, as if it were a hostile external critic. This helps harden the hypothesis before investing time in validating it. It works much better than asking for validation, which tends to produce superficial approval.
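This counterargument request can be standardized so the hostile framing isn't lost between uses. A minimal sketch, with assumed wording; what matters is explicitly forbidding the agreeable default and asking for falsifying evidence, not just objections.

```python
def red_team_prompt(hypothesis: str) -> str:
    """Ask the model to attack a hypothesis instead of validating it.
    Framing it as a hostile critic counters the model's tendency
    to approve whatever it is shown."""
    return (
        "Act as a hostile external critic. Do not soften your answer "
        "and do not look for positives.\n"
        f"Hypothesis: {hypothesis}\n"
        "Give the three strongest arguments against this hypothesis, "
        "and for each one, the evidence that would confirm the objection."
    )
```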

The third is generating text variants to test, applied to copy, value-proposition phrasings, and marketing messages. The model produces twenty variants, the team picks five, they're tested with real users, and the learning feeds back into the next iteration. The cycle is fast, the cost is low, and the outcome is usually more refined text than what came out of purely internal marketing sessions.

The fourth mature practice is analyzing customer-support conversations as a discovery source. Support interactions contain a gold mine of information on real user frictions, and processing that corpus with the model's help surfaces patterns that would otherwise stay hidden. Companies with large ticket volumes have started to integrate this source as systematic input to discovery cycles, with concrete results.
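Before sending thousands of tickets to a model, a cheap deterministic pre-filter helps decide which clusters deserve deeper synthesis. The sketch below is one possible first pass under that assumption: the friction terms are illustrative, and in practice a team would curate its own list from known pain areas.

```python
from collections import Counter

def friction_candidates(tickets: list[str],
                        terms: list[str]) -> list[tuple[str, int]]:
    """Count how many support tickets mention each friction-related
    term. The top terms indicate which ticket clusters are worth
    sending to the model (and the team) for deeper analysis."""
    counts = Counter()
    for ticket in tickets:
        lower = ticket.lower()
        for term in terms:
            if term in lower:
                counts[term] += 1
    return counts.most_common()
```

A frequency count is obviously not synthesis; its role is only to rank where the expensive, human-reviewed model pass should look first.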

Common errors to avoid

Several recurring traps deserve explicit mention. The first is confusing speed with rigor. The model accelerates many discovery tasks but doesn’t make them better by itself; an accelerated cycle without adequate human controls quickly produces well-documented bad decisions. Speed must be earned by simplifying tedious tasks, not by eliminating reflective steps.

The second trap is assuming the model understands business context. Without explicit loading of segment information, metrics, partnerships, and constraints, model suggestions are so general they don’t help decide. Investing time in preparing well-curated context, reused across interactions, does more for answer quality than any advanced prompting technique.
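Curated context of this kind can live in a small, reusable template that is prepended to every discovery prompt. The structure below is a sketch under assumed field names; the real value is deciding once, as a team, which segment facts, metrics, and constraints the model must always see.

```python
def build_context(segment: str, metrics: dict[str, str],
                  constraints: list[str]) -> str:
    """Assemble a reusable business-context block to prepend to
    prompts. Fields are illustrative: curate the real ones once,
    then reuse the block across interactions."""
    metric_lines = "\n".join(f"- {k}: {v}" for k, v in metrics.items())
    constraint_lines = "\n".join(f"- {c}" for c in constraints)
    return (
        f"## Segment\n{segment}\n\n"
        f"## Key metrics\n{metric_lines}\n\n"
        f"## Constraints\n{constraint_lines}\n"
    )
```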

The third trap is excluding experienced researchers and product managers from model interaction. In several teams we’ve seen that when juniors use AI without senior supervision, the team loses the ability to detect when outputs are plausible but wrong. Senior knowledge remains the critical filter, and that filter requires being present in the process.

How to integrate AI without corrupting the process

For a product team wanting to incorporate AI into discovery without corrupting its fundamentals, the sensible sequence is gradual. Start with post-interview synthesis and drafting, which are lowest-risk and highest-immediate-return areas. Measure time saved and perceived quality of outputs over two or three sprints.

Move next to counterargument generation and analogy exploration, which are higher-leverage areas but require more maturity in critical use of the model. In this phase, spend time documenting which types of queries produce useful results and which don't, so the team develops a shared intuition about when to invoke AI and when not to.

Only afterwards, with months of accumulated experience, should you consider more ambitious integrations such as systematic support-conversation analysis or hybrid workflows where the model intervenes at multiple points in the cycle. By then, the team's error tolerance is realistic and its human-review processes are well established.

When it pays off

AI in product discovery pays off when the team already has a mature process without it. Introducing generative tools into a team that doesn't yet have the discipline of regular interviews, rigorous synthesis, and well-formulated hypotheses only accelerates the chaos. First, solid basic practices; then, tools that amplify them.

It also pays off when discovery-input volume justifies the investment in AI integration. A small team with one weekly interview and little material doesn’t benefit much; a team with five active researchers, multiple segments, and much accumulated material extracts significant returns. The inflection point is usually when manual synthesis becomes the bottleneck.

My reading

Product discovery with AI in 2026 is a mature area, with concrete practices that work and others that have been clearly discarded. The most important lesson is that AI amplifies good processes and accelerates bad ones toward faster, better-documented failures. Teams that were already doing serious discovery now do more, and better, with AI; those without a process who expected AI to replace one have learned it doesn't work that way.

The product manager's and researcher's role remains central, but its profile has changed: less time on mechanical synthesis and drafting tasks, more time on critical decisions, conversations with users, and critique of model outputs. It's a positive change for those who love the craft and a threat only to those who confused activity with value. The good news is that the bar for good discovery has risen without raising the cost for teams that practice it well; the bad news is that those who don't adapt fall behind those who have integrated the practices that work.
