Building High-Performance AI Engineering Teams with Mike Conover, Co-founder & CEO of Brightwave

Sep 17, 2024

Continuing our Deployed podcast series, we recently sat down with Mike Conover, Co-founder & CEO of Brightwave, to learn about their experience building AI-powered tools for financial research.

Mike has been working as an ML engineer for over a decade at companies like LinkedIn, Workday, and most recently, Databricks. At Databricks, Mike led the work to create Dolly, the first instruction-tuned open-source LLM that could deliver ChatGPT-like performance. Mike has thought deeply about what it takes to get ML systems to perform well in production, and our conversation gets into what’s unique about building with LLMs specifically.

Now he’s building Brightwave to help asset managers and financial analysts discover new opportunities and make better decisions about equities, bonds, and the like. Brightwave’s product synthesizes disparate knowledge across large sets of documents into insightful summaries, helping people quickly spot things they might otherwise miss.

If you’ve ever asked ChatGPT or Claude to summarize a really long document for you, you probably appreciate how hard it is to get a model to surface the interesting bits. Lots of teams are using LLMs to summarize long content, and this conversation will be especially useful for anyone trying to create great summaries.

We get into the details of what it takes to make Brightwave work well, and lessons learned along the way including:

  • How Brightwave creates great summaries of long content (hint: it’s not long context windows)

  • How they continuously review data & evolve their eval suite through a practical process involving in-house finance experts, product, and engineering

  • Thoughts on staffing AI engineering teams, including what Mike has seen work to get strong software engineers up to speed working with LLMs

More on each of those points below! For anyone who wants to jump right in, here are the show notes and the full episode.

2:18 - Intro to Mike & Brightwave: Helping asset managers find insights that give them an advantage
6:13 - Biggest surprise building with LLMs? Large context windows aren't as useful as we would have hoped
9:47 - The solution: Systems engineering that connects multiple inference calls
14:37 - Proving the haters wrong – Brightwave’s summary of bearish Goldman Sachs AI report
16:52 - Defining "good" for AI products. What does it look like to define the right evals?
20:02 - Iterating toward quality: Reviewing data and tuning systems to get better results
25:30 - "Most everybody is learning about the applied uses of this technology at roughly the same rate" - Mike on the democratization of AI knowledge
29:47 - What it looks like to cross-pollinate expertise between ML engineers and systems engineers
32:26 - How to think about the future, especially as models evolve? (recorded BEFORE OpenAI o1 was released!)
36:08 - Mike's parting advice: "Just use the tool as much as humanly possible"

The Secret To Great Summaries Is Not Long Context Windows 

Anyone who’s followed LLMs over the last two years is familiar with the hope and hype around extending context windows. We’ve seen these explode from thousands of tokens to as many as 2M tokens today with Google’s Gemini.

Especially for teams building complex RAG systems, the hope has been that long context windows will do the hardest work and relieve the burden of investing in progressively better search and retrieval techniques.

So far at least, that hasn’t played out. Mike talks about how long context windows might be good at finding a needle in a haystack, but not at identifying lots of interesting needles and pulling them together into a coherent summary. Instead, he describes Brightwave’s approach: iterate over documents and “spearfish” for the interesting points, then build summaries from those pieces.
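
The episode doesn’t spell out Brightwave’s implementation, so here’s a minimal, hypothetical Python sketch of the general “spearfish, then synthesize” pattern: make many small extraction calls over chunks, then compose the summary from the hits. The `call_llm` helper, the chunking strategy, and the prompt wording are all placeholder assumptions, not Brightwave’s actual system.

```python
# Hypothetical sketch (not Brightwave's code): extract notable points chunk by
# chunk, then synthesize a summary from the extractions rather than from one
# giant long-context prompt.

from typing import List

def call_llm(prompt: str) -> str:
    """Assumed stand-in for whatever LLM inference client you use."""
    raise NotImplementedError("wire up your model provider here")

def chunk(text: str, max_chars: int = 8000) -> List[str]:
    """Naive fixed-size chunking; production systems split on document structure."""
    return [text[i:i + max_chars] for i in range(0, len(text), max_chars)]

def spearfish(document: str, question: str) -> List[str]:
    """Pass 1: pull individually interesting points out of each chunk."""
    findings = []
    for piece in chunk(document):
        response = call_llm(
            "List any facts in the text below that matter for the question: "
            f"{question}\nReply NONE if nothing qualifies.\n\n{piece}"
        )
        if response.strip() != "NONE":
            findings.append(response)
    return findings

def synthesize(findings: List[str], question: str) -> str:
    """Pass 2: compose the summary from the extracted points, not the raw text."""
    joined = "\n\n".join(findings)
    return call_llm(
        f"Synthesize these findings into a concise summary addressing: {question}\n\n{joined}"
    )
```

The design point is that the synthesis step only ever sees the extracted findings, so summary quality depends on the extraction pass rather than on how well a model attends across one enormous context.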

Iterating to a Strong Eval Suite

When it comes to getting ML systems to perform well, evals are key. But how do you decide which evals to run? And how do you learn to trust those evals?

At Brightwave, as with so many other strong generative AI product teams, it’s an iterative process.

Mike talks about the particular challenges of working with LLMs, where you often don’t have ground-truth data to compare against. Instead, they spend time reviewing lots of real-world examples from their systems, identifying issues, and figuring out how to measure them so they can be detected the next time. They then make improvements, push to production, and repeat the process. In Mike’s words: “It’s like tightening a ratchet.”

In Brightwave’s experience, a key to getting the details right is having finance experts at the table looking at the data alongside engineers or PMs. They’re able to spot nuances that others would miss, and help the team think about how to improve.

Or captured way more poetically by Mike: “There’s a tick-tock motion with regards to measurements and qualitatively basking in the gestalt of what you’ve built… It’s like another way to say, just look at the f*@!ing data.” 
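
The conversation stays at the level of process, but the ratchet is easy to picture in code. Here’s a minimal, hypothetical sketch of an accumulating eval suite in Python; the specific checks are invented placeholders, and the point is only that every issue spotted during data review becomes a permanent, named check that runs on each release.

```python
# Hypothetical sketch of the "ratchet": every issue found while reviewing real
# outputs becomes a named, permanent check. The specific checks below are
# invented examples, not Brightwave's actual evals.

from dataclasses import dataclass
from typing import Callable, List

@dataclass
class EvalCase:
    name: str                      # what failure mode this guards against
    check: Callable[[str], bool]   # True if a model output passes

# The suite only ever grows: each review cycle with the finance experts
# appends new cases for the nuances they caught.
EVAL_SUITE: List[EvalCase] = [
    EvalCase("stays_under_length_budget",
             lambda out: len(out.split()) < 400),
    EvalCase("includes_a_source_citation",
             lambda out: "[source:" in out),
]

def run_suite(outputs: List[str]) -> None:
    """Score a batch of real production outputs against every accumulated check."""
    for case in EVAL_SUITE:
        passed = sum(case.check(out) for out in outputs)
        print(f"{case.name}: {passed}/{len(outputs)} passed")
```

Because fixed problems stay in the suite forever, a regression shows up immediately in the next run; that’s the tightening of the ratchet.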

Building High-Performance AI Engineering Teams

“The fact of the matter is that most everybody is learning about the applied uses of this technology at about the same rate.”

Given that so few people had worked closely with LLMs before ChatGPT came onto the scene, it’s almost impossible to build generative AI features today by hiring experienced experts. Mike’s biggest learning, and single biggest suggestion to other founders and builders? The best way to gain expertise is simply to get the reps in: Use the tools, and look at tons of inferences.

Mike shared a few more tips on how to staff effective teams by combining people with ML experience and strong systems engineering backgrounds:

  • Why you want ML experience: It’s valuable to have people on the team who have worked with ML models before, for example on recommendation systems or search problems. That prior experience informs intuition about what to do when things aren’t working, and it’s often the source of insight on what to try next.

  • Why you want strong systems engineers: Traditional software engineering is still the majority of the work when building products around generative AI. The basics of working with LLMs are learnable for experienced engineers, especially with the right content and lots of repetitions using LLMs.

———

When it comes to working with LLMs, Mike is one of the more experienced people we know. It’s hopefully encouraging for others who are newer to building with LLMs to hear him say it’s possible to gain expertise relatively quickly by simply getting hands-on and doing the work.

Huge thanks to Mike and the team at Brightwave for sharing all these learnings! And PS, Brightwave is hiring for all roles: careers@brightwave.io   

Check out the episode and share your feedback with us! We’d love to hear from you.

We’ve got more good content coming soon, and you can subscribe on your favorite podcast service to catch future interviews like this one: Spotify, Apple Podcasts, and YouTube.
