User research has always been one of the most expensive disciplines in product teams. Recruiting participants, conducting interviews, transcribing, analyzing, and synthesizing takes weeks. Since 2023, generative AI has promised to shorten each of those phases, and by 2025 there is enough of a track record to see which parts deliver and which do not. The answer is not symmetric: some steps are a clear win for AI, while others carry a subtle risk of losing contact with the real user.
What AI does well in this field
Automatic transcription of interviews is probably the most mature and least controversial application. Tools like OpenAI’s Whisper, Deepgram, or the services integrated into Zoom produce transcripts accurate enough for analysis, including timestamps and speaker diarization. What used to cost an hour of work for every hour of interview now costs a few minutes of review. This saving is net: nothing is lost along the way.
Synthesis of field notes has also improved a lot. Models like Claude 3.5 or GPT-4o can summarize transcripts, group quotes by theme, and highlight contradictions between participants. The researcher still validates and refines, but the first step of qualitative analysis, which used to be tedious, now takes minutes. That frees time for the genuinely hard part: interpreting what the patterns found mean.
Finally, AI is useful in preparation. Generating interview guides from research objectives, proposing question variants, detecting bias in phrasing, and translating materials across languages are tasks where current models perform well. The tool does not replace the researcher’s judgment, but it accelerates the initial draft.
Where AI fails or misleads
The most dangerous area is the generation of synthetic personas. Several tools promise to create simulated users you can chat with to “understand” a segment without interviewing real people. The temptation is obvious: zero cost, instant response, round-the-clock availability. The problem is that what you get is not a person, it is the statistical average of the training corpus filtered by the prompt. A synthetic persona like “working mother, age 35, living in Madrid” produces plausible answers, but without any of the contradictions, surprises, or intuitions that make a real interview useful.
I have seen teams make product decisions based on conversations with synthetic personas and later discover that real users thought something completely different. The model does not know what is not on the internet: local cultural nuances, family habits, specific frustrations with particular interfaces. Using synthetic personas as a substitute for real research is a shortcut that looks like it saves money and ends up multiplying the cost.
Another problem area is automatic analysis of raw interviews. Models can summarize, but they also invent quotes or attribute statements to the wrong participant when the context is long. When analysis feeds important decisions, every quote that goes into a final document must be verified against the original transcript. AI is a good first pass; it is a bad single source of truth.
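That verification step can even be partially automated. Below is a minimal sketch, using only Python's standard library, that flags any quote in a draft that does not appear near-verbatim in the source transcript; the 0.9 similarity threshold and the sample transcript are assumptions for illustration, not values from any real project.

```python
from difflib import SequenceMatcher

def verify_quote(quote: str, transcript: str, threshold: float = 0.9) -> bool:
    """Return True if `quote` appears (near-)verbatim in `transcript`."""
    q = quote.lower().strip()
    t = transcript.lower()
    if q in t:
        return True  # exact substring match
    # Fuzzy pass: slide a window of the quote's length over the transcript
    # and keep the best similarity ratio found.
    step = max(1, len(q) // 4)
    best = 0.0
    for i in range(0, max(1, len(t) - len(q) + 1), step):
        best = max(best, SequenceMatcher(None, q, t[i:i + len(q)]).ratio())
    return best >= threshold

transcript = "I usually give up on the app when the login asks me twice."
print(verify_quote("the login asks me twice", transcript))  # True
print(verify_quote("I love the login flow", transcript))    # False
```

A script like this catches fabricated or paraphrased quotes before a deliverable ships; the human still decides what to do with each flag.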
The risk of misunderstood efficiency
There is a tendency in teams adopting AI for research to reduce the number of real participants because “AI fills in the gaps”. This reasoning is fallacious for a specific statistical reason. The value of an interview is not in the average data point it provides but in the possibility of surprise: each additional interview carries some probability of producing an insight that shifts the hypothesis. Halving the interviews does not just shrink the data; it shrinks the chance of discovering what you did not know you did not know.
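The intuition can be made concrete. If each interview independently has probability p of surfacing a hypothesis-shifting insight, the chance of at least one such insight across n interviews is 1 − (1 − p)^n. A quick sketch, where p = 0.15 is an illustrative assumption rather than an empirical estimate:

```python
def p_at_least_one_insight(p: float, n: int) -> float:
    """Probability of at least one surprising insight in n independent interviews."""
    return 1 - (1 - p) ** n

p = 0.15  # assumed per-interview chance of a hypothesis-shifting insight
for n in (6, 12, 24):
    print(n, round(p_at_least_one_insight(p, n), 2))
# 6  -> 0.62
# 12 -> 0.86
# 24 -> 0.98
```

Under this assumption, cutting from twelve interviews to six drops the chance of discovery from roughly 86% to 62%: the loss is not linear in the budget saved, and it falls entirely on the surprises.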
AI can interpolate between data you already have, but it cannot extrapolate beyond them. If your initial research had sampling bias, the model will amplify it. If the sample was too small to capture a minority but critical segment, AI will not warn you about the omission. These are blind spots that can only be detected by direct contact with real users.
A hybrid format that works
The workflow I have seen perform best combines both layers. Interviews are still real and numerous enough to capture variability, typically between ten and twenty-five depending on the objective. Transcription, initial coding, and pattern search use AI. Interpretation, prioritization, and decision-making stay with humans. This division takes advantage of what each does best without falling for the illusion of replacing the costly part.
In this format, AI also helps in later phases. Answering “what did participants say about X?” with semantic search over transcripts is fast and reliable. Generating initial reports with representative quotes also works well. Even translating findings across languages or adapting them to different internal audiences, from engineering to executives, is a task where current models perform well.
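Production setups typically use embedding models for that search; the idea, though, fits in a few lines. Here is a toy lexical stand-in, ranking transcript snippets by rare-word overlap with the query; the snippets and participant labels are invented for illustration.

```python
import math
from collections import Counter

def tokenize(text: str) -> list[str]:
    return [w.strip(".,?!:").lower() for w in text.split()]

def score(query: str, doc: str, idf: dict[str, float]) -> float:
    """Word-overlap score weighted by inverse document frequency."""
    q = set(tokenize(query))
    d = Counter(tokenize(doc))
    return sum(d[w] * idf.get(w, 0.0) for w in q)

def search(query: str, docs: list[str]) -> list[str]:
    # Inverse document frequency: rarer words carry more signal.
    n = len(docs)
    df = Counter(w for doc in docs for w in set(tokenize(doc)))
    idf = {w: math.log((n + 1) / (c + 1)) + 1 for w, c in df.items()}
    return sorted(docs, key=lambda d: score(query, d, idf), reverse=True)

snippets = [
    "P3: exporting the report takes too many clicks",
    "P1: I never found the export button on mobile",
    "P2: onboarding was smooth, the tutorial helped",
]
print(search("what did participants say about export?", snippets)[0])
```

An embedding-based version would also surface the P3 snippet, since “exporting” and “export” land close together in vector space; that semantic generalization is exactly what the lexical toy above cannot do.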
What matters is keeping discipline: every quote in a final deliverable is verified, every statement about patterns is backed by concrete evidence, every extrapolation declares its uncertainty. AI speeds up, it does not license skipping these steps.
My reading
Two years after ChatGPT, AI has reached user research the same way it has reached everywhere else: turning the tedious into the fast without changing what is fundamentally hard. Transcribing is no longer a problem. Summarizing is no longer a problem. Interpreting what users want and why remains just as hard as before, because the user is still human and the clues are still in their contradictions, not their averages.
Teams that tune their processes while preserving real contact with the user gain speed without losing quality. Those who use AI as an excuse to talk to fewer people end up with products that make sense on paper and fail in real hands. Ultimately, the question of how much to automate is not technical but methodological: which part of the work is learning and which is friction? AI eats friction, it should not touch the learning.
For product teams, the practical recommendation is simple. Adopt AI for transcription and first-pass synthesis without reservation. Adopt it for guide and material preparation, with review. Reject it as a substitute for real participants. Measure its help in time saved, not in interviews skipped. And above all, do not let the speed with which a model produces an answer make you forget that the best product ideas still come from talking to the person who uses the product.