In our latest episode of Deployed, we sat down with Sara Beykpour, co-founder and CEO of Particle. We talk about how they're using AI to transform how people consume news, and we get into lots of detail about her practical lessons learned with prompt engineering, evals, and more.
Particle is an iOS app that organizes news coverage across multiple sources into an easy-to-read, summarized, and personalized feed. It launched out of beta just a couple of weeks ago, after almost two years in development. Generative AI is used throughout Particle's system for things like:
Generating news summaries and headlines
Identifying topics and building a taxonomy for the news
Clustering related stories and identifying similarities between source articles
Assisting with content moderation
Answering user questions about stories
What makes Particle particularly interesting to us is how seamlessly they've integrated AI into the core news reading experience. We've all seen a lot of chatbots and more than a few hardware devices at this point, but this feels like a new class of "consumer AI apps" that we want to see more of. LLMs are integral to nearly every surface in the app, yet they're embedded so naturally that it doesn't feel like you're "using AI." Making LLMs work at consumer scale is hard, and we knew Sara would have good learnings to share.
In this conversation, Sara gets into the details of what it’s looked like to build AI products that actually work for customers, including:
How the Particle team approaches quality and builds trust when summarizing the news
Their practical process for developing and improving AI features
Key learnings about evaluation pipelines and prompt engineering
A few insights on building products and on working with publishers in the AI era
We dive into some key takeaways from the conversation below if you just want the highlights. Or check out the full episode here: YouTube, Spotify, or Apple Podcasts.
Building Customer Trust With Good Evals & Transparency
One of the most interesting aspects of Particle's approach is how they've tackled the challenge of trust. As Sara explains, "AI and trust in AI disappears when it's working. When it's working, you don't have to worry about the trust. You don't have to worry that it's written by AI."
Particle takes a few steps to deliver quality outputs from LLMs while giving users easy access to citations so they can check any claim:
They run multiple evaluations before any summary reaches users (practical because article summaries are generated asynchronously, ahead of anyone reading them)
They provide clear citations and sourcing for every claim so readers can inspect the sources themselves
This pattern feels broadly useful for other AI products, especially the practice of designing citations directly into the user experience.
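Particle hasn't published its implementation, but the shape of the pattern is easy to sketch. In this minimal, hypothetical Python example (every name here is ours, not Particle's), each claim in a pre-generated summary carries its citations, and nothing ships until every eval passes:

```python
from dataclasses import dataclass, field
from typing import Callable

@dataclass
class Claim:
    text: str
    source_urls: list[str] = field(default_factory=list)  # citations readers can inspect

@dataclass
class Summary:
    story_id: str
    claims: list[Claim]

# An "eval" here is any check that takes a Summary and passes or fails it.
EvalFn = Callable[[Summary], bool]

def every_claim_cited(summary: Summary) -> bool:
    """A claim without at least one citation never ships."""
    return all(claim.source_urls for claim in summary.claims)

def ready_to_publish(summary: Summary, evals: list[EvalFn]) -> bool:
    """Because summaries are generated ahead of time, every eval can run
    before anything reaches a reader's feed."""
    return all(check(summary) for check in evals)

summary = Summary(
    story_id="example-123",
    claims=[Claim("Officials confirmed the launch date.",
                  ["https://example.com/source-article"])],
)
print(ready_to_publish(summary, [every_claim_cited]))  # True
```

The key design choice is that publishing is gated on evals that run ahead of time, which a chat product answering live queries can't do as easily.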
Iterating on AI Features In Practice
We consistently hear from podcast listeners that they like learning how other teams build, and Sara shares lots of practical details in the podcast. Here are a few.
Going From Prototype To Production
Sara talks about her team's process for getting a "big prompt" (a couple hundred lines long) to work:
Testing first with a few real-world test cases (actual news stories) to get a feel for prompt behavior
Thinking through the most important dimensions to optimize for (5-6 total)
Expanding to more test cases to pressure-test those dimensions, including edge cases, jailbreak attempts, and other risk vectors
Running the new release candidate through an async eval pipeline with offline examples (see the sketch after this list)
Then finally running live content through it, reviewing outputs, and making incremental adjustments
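Particle's exact pipeline isn't public, but the offline eval step, scoring a release candidate against saved examples before it touches live content, might look roughly like this sketch. `call_model`, the dimension names, and the test-case shape are all illustrative assumptions:

```python
def call_model(prompt: str, article: str) -> str:
    """Stand-in for a real LLM call; echoes the first sentence as a
    fake summary so the sketch runs end to end."""
    return article.split(". ")[0] + "."

# The handful of dimensions chosen up front for optimization.
DIMENSIONS = ["accuracy", "neutrality", "brevity"]

def evaluate_candidate(prompt: str, test_cases: list[dict]) -> float:
    """Fraction of (test case, dimension) checks the candidate passes."""
    passed, total = 0, 0
    for case in test_cases:  # real stories plus edge cases and jailbreak attempts
        output = call_model(prompt, case["article"])
        for dim in DIMENSIONS:
            passed += case["checks"][dim](output)  # bool counts as 0 or 1
            total += 1
    return passed / total

cases = [{
    "article": "The council approved the budget. Critics called it rushed.",
    "checks": {
        "accuracy":   lambda out: "budget" in out,
        "neutrality": lambda out: "disaster" not in out.lower(),
        "brevity":    lambda out: len(out.split()) < 30,
    },
}]
print(evaluate_candidate("Summarize neutrally in one sentence.", cases))  # 1.0
```

A score below some threshold sends the candidate back for another round of prompt tweaks before it ever sees live content.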
On Prompt Engineering
Everyone's favorite art and pseudo-science… If you're looking for alpha on what works, here are a few of Sara's favorite tips learned in the trenches (with a toy example after the list):
Tell the model what it will be evaluated on
Be clear and direct (“as unambiguous as possible”)
Occasionally ask ChatGPT for feedback when a prompt isn't working right
And maybe… threaten its job ;)
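To make the first two tips concrete, here's a toy prompt skeleton that states the evaluation criteria up front and keeps the rules unambiguous. The wording is ours, not Particle's actual prompt:

```python
SUMMARY_PROMPT = (
    "You are summarizing a news story from the source articles below.\n\n"
    # Tip: tell the model what it will be evaluated on.
    "You will be evaluated on:\n"
    "1. Every claim is supported by at least one source article.\n"
    "2. The summary is neutral in tone.\n"
    "3. The summary is at most three sentences.\n\n"
    # Tip: be clear and direct, as unambiguous as possible.
    "Rules, to be followed exactly:\n"
    "- Do not include opinion or speculation.\n"
    "- Cite the article number after each claim, like [1].\n\n"
    "Source articles:\n{articles}\n"
)

print(SUMMARY_PROMPT.format(articles="[1] Example article text..."))
```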
On Evaluation Pipelines
Particle runs multiple evaluation steps before serving content to users, a benefit of their product shape: they can generate content in advance rather than waiting for users to request it.
Their pipeline includes (see the sketch after this list):
Running targeted evals to check that prompt instructions were followed
"Reality checks" to verify accuracy against source materials
Asking for reasoning/explanation from the AI
Human review of edge cases
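None of this code is public, but a pipeline with that shape is straightforward to sketch. In this hypothetical Python version, `ask_judge` stands in for an LLM judge call, and anything that fails a check gets routed to a person instead of the feed:

```python
def ask_judge(question: str) -> str:
    """Stand-in for an LLM judge call; a real judge would return its
    reasoning followed by a verdict line."""
    return "Reasoning: the claims line up with the sources.\nVerdict: PASS"

def passes(question: str) -> bool:
    # Asking for reasoning before the verdict tends to improve judge
    # quality and leaves an explanation to audit later (step 3 above).
    reply = ask_judge(question +
                      "\nExplain your reasoning, then end with "
                      "'Verdict: PASS' or 'Verdict: FAIL'.")
    return reply.strip().endswith("Verdict: PASS")

def eval_summary(summary: str, instructions: str, sources: str) -> str:
    # 1. Targeted eval: were the prompt's instructions followed?
    if not passes(f"Instructions:\n{instructions}\n\nSummary:\n{summary}\n\n"
                  "Were the instructions followed?"):
        return "human_review"  # 4. failures and edge cases go to people
    # 2. Reality check: is every claim supported by the sources?
    if not passes(f"Sources:\n{sources}\n\nSummary:\n{summary}\n\n"
                  "Is every claim supported by the sources?"):
        return "human_review"
    return "publish"

print(eval_summary("Officials confirmed the date.",
                   "Summarize in one sentence.",
                   "Officials confirmed the launch date on Monday."))  # publish
```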
Looking Ahead
Sara's parting advice for teams building AI products is to focus on user value above all: "Focus on user values. User value is agnostic to whether or not AI powered it or whether or not something else powered it... The value is what really matters."
We're loving Particle ourselves and are excited to see where it goes. Want to try it out? You can download Particle now on iPhone and iPad in the US here.
Check out the full conversation on YouTube, Spotify, or Apple Podcasts to hear more details about Particle's approach to building with AI. And don't forget to subscribe to get notified about the next episode of Deployed!