Why Your AI Product Team Needs an AI Quality Lead

Jan 9, 2025

In a recent episode of the Deployed podcast, Freeplay customer Help Scout shared how they've transformed their customer service platform into an AI-native product company. They describe one of their most impactful decisions as appointing a new role they titled "AI Quality Lead." 

We’ve seen this exact role — or roles very much like it — make a big impact on several of our customers’ teams. It has consistently led to tighter feedback loops, more trust in performance metrics, and faster overall time to market.

We also get a lot of questions from engineering leaders about how to set up AI teams, so we thought it was worth sharing some learnings about this role.

What's an AI Quality Lead?

The AI Quality Lead role serves as a bridge between domain expertise and AI development. People in it tend to come from the domain expert side, but then learn generative AI tactics like prompt engineering and evals. We’ve seen people enter this role from customer service, solution engineering, financial analysis, and CPA backgrounds, and even MDs.

Generally they’re responsible for:

  • Reviewing and analyzing production data to spot quality issues

  • Defining evaluation criteria and (sometimes) writing evals to catch future issues (see the sketch after this list)

  • Managing test datasets to ensure comprehensive testing

  • Improving prompts based on deep understanding of customer needs, especially to make sure AI features maintain the right tone and quality

  • Helping engineering & PMs prioritize what issues to fix first and what to build next

  • Training others on the team to evaluate AI output quality
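
To make the "writing evals" responsibility concrete, here's a minimal sketch of what a hand-written eval for AI-generated support drafts might look like. Everything in it is hypothetical: the criteria, field names, and thresholds would come from your own production data and customer expectations, and in practice you'd often pair cheap checks like these with an LLM-as-judge eval for tone and helpfulness.

```python
from dataclasses import dataclass

@dataclass
class EvalResult:
    name: str
    passed: bool
    detail: str

# Hypothetical phrases the support team never wants in a customer-facing draft.
BANNED_PHRASES = ["as an AI", "I'm just a language model", "per my last email"]

def eval_support_draft(draft: str, customer_question: str) -> list[EvalResult]:
    """Run a few cheap, deterministic checks on an AI-generated support draft."""
    results = []

    # 1. Tone: the draft shouldn't expose the model or sound dismissive.
    found = [p for p in BANNED_PHRASES if p.lower() in draft.lower()]
    results.append(EvalResult("tone/no_banned_phrases", not found,
                              f"found: {found}" if found else "ok"))

    # 2. Length: overly long replies are a common quality complaint.
    words = len(draft.split())
    results.append(EvalResult("length_under_250_words", words <= 250, f"{words} words"))

    # 3. Relevance (rough proxy): the draft should echo key terms from the question.
    question_terms = {w.lower().strip("?.,!") for w in customer_question.split() if len(w) > 5}
    draft_terms = {w.lower().strip("?.,!") for w in draft.split()}
    overlap = question_terms & draft_terms
    results.append(EvalResult("echoes_question_terms",
                              bool(overlap) or not question_terms,
                              f"overlap: {sorted(overlap)}"))
    return results

if __name__ == "__main__":
    question = "How do I change the billing address on my invoices?"
    draft = "Thanks for reaching out! You can update your billing address under Settings > Billing."
    for r in eval_support_draft(draft, question):
        print(f"{'PASS' if r.passed else 'FAIL'}  {r.name}  ({r.detail})")
```

Checks like these won't catch everything, which is exactly why the role pairs them with regular human review of production conversations.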

In Help Scout’s case, this looked like plucking someone with deep experience on their customer success team who had solid product knowledge but no engineering background, and then empowering them to make sure AI features truly serve customer needs.

The result? This role has since become central to their AI development process. As Luis Morales, VP of Engineering at Help Scout, explains on the podcast: "Understanding the data, understanding the LLMs, understanding machine learning is not everything. We really need to understand the context of the things that we're doing."

Just a month after their AI Quality Lead started improving prompts, the customer team reported that AI-generated drafts "felt like they were written by one of our own."

Here’s a clip of Luis talking more about the role:

Why domain experts make great AI Quality Leads

One of the most exciting aspects of this role is that you don't need to start with someone who has generative AI expertise (especially since so few people actually do). A key assumption is that they’ll be paired with experienced engineers and data scientists who know the technical side well, but it’s a great way to get more people involved in AI development.

The learning curve for prompt engineering, writing evals, and running experiments is not as steep as many assume. As Help Scout described, it only took a month for this role to ramp up and start making a measurable impact.

Below are a few key attributes that make domain experts successful in this role:

  • They have a deep understanding of customer needs and use cases

  • They are able to recognize subtle quality issues that technical metrics might miss

  • They have a natural inclination to think systematically about categorizing and solving problems

  • They are eager to learn new technical skills

  • They are strong communicators, with a knack for bridging the gap between technical and non-technical teammates

Getting started

If you're running an AI team and considering creating an AI Quality Lead role, we've shared a template job description based on what we've seen work well across multiple companies. 

And for anyone taking on this role, here’s a breakdown of how to set yourself up to succeed:

  1. Learn the basics of prompt engineering (check out these guides from OpenAI and Anthropic, or our post here)

  2. Start dogfooding your team’s existing AI applications and digging into production data to understand current quality issues

  3. Dig in with your engineering team to understand existing evaluation criteria and testing processes, and map out future plans (primer on LLM evaluation & testing here)

  4. Build a systematic process you can use for reviewing issues and taking action, like a daily review of negative customer feedback (a sketch of what this might look like follows the list)

  5. Establish regular feedback loops with customers and the engineering team, like a weekly sync with engineering to review your learnings & suggestions
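
As an illustration of step 4, here's one way a daily review might be wired up, assuming you can export production feedback as a CSV with columns like rating, category, and a link back to the conversation. The column names and the export itself are hypothetical; the point is simply to make the review a repeatable habit rather than an ad hoc scan.

```python
import csv
from collections import Counter

def daily_review(path: str) -> None:
    """Summarize negative feedback by category so the biggest issues surface first."""
    with open(path, newline="") as f:
        negatives = [row for row in csv.DictReader(f) if row["rating"] == "negative"]

    print(f"{len(negatives)} negative ratings to review today")
    counts = Counter(row["category"] for row in negatives)
    for category, count in counts.most_common():
        print(f"  {count:>3}  {category}")

    # Link out to the worst-hit category; the real review happens by reading the transcripts.
    if counts:
        top_category, _ = counts.most_common(1)[0]
        print(f"\nConversations to read for '{top_category}':")
        for row in negatives:
            if row["category"] == top_category:
                print(f"  {row['conversation_url']}")

if __name__ == "__main__":
    daily_review("feedback_export.csv")  # hypothetical export path
```

The output of a review like this feeds directly into the weekly sync in step 5: which issue categories are growing, which prompts or evals need attention, and what to prioritize with engineering.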

Have questions? We’d love to help. We’re happy to share learnings and provide input for other leaders as they build out AI teams. Grab time to chat with our team here.

Additional resources

To learn more about building effective AI teams, feedback loops, and evaluation systems:
