Building great AI products requires constant iteration, which typically means teams spending hours analyzing logs, parsing customer feedback, and running experiment after experiment to refine their prompts and model behaviors. And every time a new model comes out, they have to repeat the process to adjust their prompts for the new model.
What if all that data you are already collecting in production could be used to automatically generate improvements?
We built our new automated prompt optimization features to do exactly that. Our goal is to turn your data into fuel for continuous, automated optimization — and give you the control to decide what you're optimizing for.
What’s special about how Freeplay treats this topic? There are plenty of magic “make this prompt better” buttons in playgrounds like Anthropic’s Workbench or OpenAI’s Playground, but Freeplay adds full production data integration and cross-model compatibility, which can make automated prompt optimization an “always on” part of your team’s workflow. At the same time, it's flexible enough to work with the data you have (instead of the well-labeled, well-structured datasets you might wish you had).
The result is automated prompt optimization that radically reduces time spent experimenting with prompt engineering tweaks.
Instead of spending hours sifting through data and trying to decide what changes to test, you can now leverage all the signal you’re already generating from evaluation results, customer feedback, and human annotations to surface opportunities for improvement and generate optimized prompts based on your real production data. And you can quickly run all your evals against a new version to compare it to prior versions, using your own custom eval scores.
How Automated Prompt Optimization Works
Getting started is as simple as clicking the Optimize button on any existing prompt in the Freeplay playground.
Here’s what happens next:
1. Context Collection: Our optimization agent automatically gathers context from signals in your Freeplay data.
Production evaluation results on your observability logs help show where your prompt succeeds and fails.
End-user feedback attached to those logs highlights real-world issues and adds further context.
Human annotations from your review workflows add nuance from your domain experts.
Model-specific prompting best practices drive provider- and model-specific customizations.

2. Agentic Reflection: Using this rich context, our optimization agent then analyzes your current prompt to identify specific improvement opportunities — whether that’s clearer instructions, better few-shot examples (which can be pulled in from real data), or more effective formatting structures for your specific model.
3. Optimized Prompt Generation: You’ll receive an optimized prompt recommendation in a diff view with a detailed explanation of each change, showing exactly how the optimizations address patterns in your data.

4. Side-by-Side Validation: Freeplay also automatically runs a test against your specified dataset, comparing the old version of a prompt to the new, optimized version so you can make a quantifiable decision about deploying the changes.

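The four steps above can be sketched as a simple loop. This is a hypothetical illustration, not Freeplay's actual API: every name here (`Context`, `reflect`, `generate_candidate`, `validate`, the eval names, and the stubbed rewrite logic) is an assumption standing in for what the real optimization agent and LLM calls would do.

```python
# Hypothetical sketch of the optimization loop (not Freeplay's actual API):
# 1) collect context signals, 2) reflect to find improvement opportunities,
# 3) generate a candidate prompt, 4) validate old vs. new on a dataset.
from dataclasses import dataclass

@dataclass
class Context:
    eval_results: dict       # eval name -> pass rate on production logs
    feedback: list           # end-user feedback snippets
    annotations: list        # human reviewer notes

def reflect(prompt: str, ctx: Context) -> list:
    """Step 2: derive improvement opportunities from the collected signals."""
    opportunities = []
    for name, pass_rate in ctx.eval_results.items():
        if pass_rate < 0.8:  # assumed threshold for "needs work"
            opportunities.append(f"improve behavior measured by '{name}' eval")
    if ctx.feedback:
        opportunities.append("address recurring end-user feedback")
    return opportunities

def generate_candidate(prompt: str, opportunities: list) -> str:
    """Step 3: in practice an LLM rewrites the prompt; stubbed here."""
    return prompt + "\n\n# Revised to: " + "; ".join(opportunities)

def validate(old_score: float, new_score: float) -> bool:
    """Step 4: deploy only if the candidate beats the old version on your evals."""
    return new_score > old_score

# Step 1: context gathered from production data (stub values for illustration).
ctx = Context(
    eval_results={"answer_groundedness": 0.72, "tone": 0.95},
    feedback=["response was too long"],
    annotations=["missing citation in 3 of 20 reviewed samples"],
)
opps = reflect("You are a helpful support agent.", ctx)
candidate = generate_candidate("You are a helpful support agent.", opps)
deploy = validate(old_score=0.72, new_score=0.86)
```

The key design point the real feature automates is the same one this stub makes explicit: the comparison in step 4 uses your own eval scores, so "better" is defined by your data rather than by a generic rubric.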
Why This Matters and Our Plans
Every day your AI product runs in production, it generates valuable signal about what’s working and what isn’t. Until now, most teams have tried to make sense of all that data manually and decide what to do with it.
We believe the future of generative AI product development — and certainly the future of the Freeplay product — will look like increasingly smart automation to address these common parts of the generative AI ops and agent development workflow. Agents are needed to evaluate and improve other agents.
Automated prompt optimization from Freeplay is one of these examples (alongside eval generation): it transforms passive data into active improvements, strengthening a key part of your data flywheel so your AI product gets better as it’s used.
Our Vision for the Future
This launch marks the start of a broader set of changes to how AI products are built on top of Freeplay. Coming soon you’ll see:
More granular control over optimization context and constraints
Deeper insight generation that flows directly into optimization suggestions
Tighter integration between your human review flows and targeted auto-optimization
Ready to try automated prompt optimization? It’s available now for all Freeplay customers. Open any prompt and click “Optimize” to get started.
Categories
Product
Authors

Jeremy Silva