A Conversation with Lillian Freiberg and Austin Ritzel
AI is moving faster than most financial firms can absorb it. This conversation covers what that actually means: for security, vendor strategy, and the people inside your organization quietly figuring it out without you.
Part 1: The Mythos Moment — When a Model Was Too Dangerous to Release
Lillian Freiberg: Hi everybody. Welcome. My name is Lillian Freiberg. I’m the Head of North America at Clarity AI. I am joined by my colleague Austin Ritzel, our Senior Manager of AI Strategy and Strategic Projects. Basically, the person I will bother when something in the AI world breaks and I need to understand how it’s going to affect my clients and me.
We’re going to cover a lot of ground. How fast this technology is actually moving. What that means for firms that are trying to keep up, and the strategic decisions that are going to separate the winners from the ones that are still untangling the choices they made two years ago.
Austin Ritzel: Thanks, Lillian. I can’t beat that intro. You’ve teed me up well. I know that we have a lot that we want to discuss today. There’s no shortage of fascinating topics in the world of AI right now.
Lillian Freiberg: Let’s start with something that genuinely rattled people a few months ago. And if it’s new to you, I’m going to give you the quick version.
Anthropic, the people behind Claude and one of the leading AI labs, unveiled a model called Mythos. AI releases happen constantly. But this one landed very differently. Mythos was built to be exceptional at coding. What nobody fully anticipated was that the same capabilities that made it great at writing code also made it essentially the most powerful security vulnerability scanner ever seen. Nearly 200 serious bugs were discovered in Firefox alone. For comparison, the model before it found about 20, just two months earlier.
Here’s the part that really got people’s attention. Anthropic voluntarily chose not to release it publicly. They gave access to roughly 40 institutions — major financial firms, tech companies, critical infrastructure — specifically so defenders could shore up before this capability got into the wild. That kind of voluntary restraint at that scale hadn’t really happened before. It set off alarm bells in Washington, alarm bells in Brussels, and it forced a real conversation about where the AI race actually stands.
My first read is that we’re nowhere near settled. What is yours?
“For the righteous actor, that’s a very robust shield. For nefarious actors, Mythos is a very powerful offensive weapon.”
Austin Ritzel: In a space where we seem to have consequential moments, unprecedented moments every other week, this was truly a very consequential moment for the AI landscape. And there were a couple of reasons for that.
First, as you mentioned, Mythos is uniquely potent in its ability to identify security vulnerabilities, and of course, that comes with dual-use concerns. On one hand, for the righteous actor, that’s a very robust shield. But for nefarious actors, Mythos is a very powerful offensive weapon — a sword, to close out the metaphor.
The second reason it was very consequential is this idea you touched on: models are grown, not built. This is an entirely different way of thinking about technology. It’s not an idea that I’ve invented. In fact, this is how Dario Amodei, the CEO of Anthropic, actually describes the process of creating a model. Mythos was not optimized specifically for cybersecurity scanning. It was optimized for coding, as many large language models are. But the same capabilities, the same features that made it really great at coding also made it absolutely prolific at vulnerability discovery.
“Mythos is a step change from what Opus 4.6 was able to do in terms of scanning for security vulnerabilities. Big change.”
There were some people who said at the beginning that this is like the greatest marketing hype ever — the model that’s too powerful to release. But we found out pretty soon thereafter that this was not in fact marketing hype. It’s a step change from what Opus 4.6, the previous model, was able to do in terms of scanning for security vulnerabilities. Big change.
As an aside, I do want to note that this idea of capabilities generalizing from one domain to another is something we’re seeing in finance. A lot of the elements that make large language models really great at coding, primarily logic, make them great at a lot of finance tasks as well.
The third reason this was such a big deal: Anthropic, of its own volition, held it back from public release and granted access to those 40 (I think now it might be 70) institutions to allow them to shore up their defenses. That was also due in part to some concerns about compute. Mythos is a very large model.
“The Trump administration has floated the idea of requiring AI labs to submit their models to the feds prior to public release. Like an FDA for the AI space.”
That spurred a very interesting response from the Trump administration. Ideologically, the administration has been opposed to regulation, particularly around AI — one of the first executive orders was to strike down a regulation put in place by the Biden administration. But now they’ve floated this idea of an executive order to require AI labs to submit their models to the feds prior to public release. It’s been described as an FDA for the AI space. A big change for the regulatory landscape.
It also sparked a response from the EU because they were left in the lurch — no EU-domiciled institution had access either. There are also questions about free and fair access by all governments to shore up defenses.
That said, on the first facet of your question: is the AI race far from settled? Absolutely. Experts generally posit, and this is coming from the CEOs of these companies, that US labs are only about 1 to 3 months in front of or behind each other these days. Chinese labs like DeepSeek are about 6 to 12 months behind. They’re very, very clustered. And I think we should expect as a result to see the field continuously shuffle and reshuffle.
Part 2: Why Financial Services Is Ground Zero
Lillian Freiberg: Interesting. So even as Anthropic is having this moment, this is not a winner-declared situation. The race is still very much on. And one of the places that race is playing out most visibly is within financial services, as you pointed out earlier. Every major AI provider is making a massive push into this industry. Would you say this is structural, or is this just where the money is?
Austin Ritzel: I think both. On the structural side, a lot of the features that make large language models really good at coding also make them really good at finance workflows, or a lot of finance workflows, certainly not all. Anthropic itself estimates that finance is just about 6 to 12 months behind where coding is right now on the automation curve. And anyone who’s been paying even light attention to the AI space knows what’s going on with coding these days.
It is also certainly where the money is. The major labs are unlocking a lot of revenue in finance. For Anthropic — and I reference them because they just released a whole lot of data on this — by revenue, the finance industry is number two among all industries. That trails only the technology industry, and 40% of their top 50 customers are financial institutions.
Lillian Freiberg: That makes a lot of sense. But six to 12 months behind coding… I feel like that’s genuinely closer than most people realize. And yet financial services are famous for being a very slow-moving industry. Procurement cycles, risk approvals, compliance sign-offs. Is the pace of AI development a temporary headache for the sector, or is there something more fundamentally mismatched here?
“The major labs are unlocking a lot of revenue in finance. By revenue, the finance industry is number two among all industries for Anthropic.”
Austin Ritzel: I think we need to consider it from first principles. Where does the friction come from? There seems to be both procedural and structural friction.
Procedural friction — documentation cycles, vendor onboarding, that sort of thing — AI could probably compress a lot of that. The structural friction is probably more durable. And by structural friction, I mean regulation, fiduciary duty, and personal accountability for decisions in highly regulated domains. These are probably best cast as features, not bugs, in the world of finance, particularly as that world is being challenged by AI where the cost of being wrong is very high.
The more potentially challenging structural constraint is an idea I’ve written about called the Silicon Ceiling: the limit of a firm’s capacity to absorb capability. Things like change management, filling skills gaps among employees, and training employees. There is a mismatch between the velocity of frontier capability and the velocity of organizational absorption. And that is not unique to financial institutions. Deloitte’s 2026 State of AI in the Enterprise survey found that the AI skills gap is perceived as the biggest barrier to enterprise integration. So that squishier side of the structural barrier is deeply important to understanding how fast all companies can move.
I would expect that gap to probably widen before it narrows. But you’ll see a divergence between those companies that are investing in absorption capacity — which really means investing in people and processes as a strategic asset — and those who are not. I expect the former to pull away.
“The more potentially challenging structural constraint is the so-called Silicon Ceiling: the limit of a firm’s capacity to absorb capability. Things like change management, filling skills gaps among employees, and training employees. There is a mismatch between the velocity of frontier capability and the velocity of organizational absorption.”
Lillian Freiberg: I like the squishy framing. And the Silicon Ceiling is such a useful frame because it’s not the technology that’s the problem necessarily. It’s the people, the organization, the integration. And that connects directly to something I want to make sure we touch on before we move on, because it’s so important and I get this question all the time. That’s cybersecurity. Mythos made it visceral. When models are being deployed faster, what actually happens to the threat surface for a financial institution?
Austin Ritzel: It’s a really important question and it actually casts that slowness of financial institutions in a very positive light. Instead of slowness, you read it as deliberateness.
The speed of model deployment is altering the threat surface via two primary vectors. The first is that there’s an asymmetry between offensive and defensive capabilities. The cycle from a capability emerging, to breaking onto the scene, to being in the wild is being compressed. And when that gap is compressed, it doesn’t just give defenders less time to patch the holes. It gives them less time to understand what they’re patching against. Anthropic’s decision to hold back Mythos is the best example we’ve had thus far of a company pulling the brake, trying to separate capability from deployment, and buying everyone a little prep time.
The second vector is that traditional security perimeters are changing, particularly as AI becomes agentic, as it starts to really do things. AI with hands is a very different concern than a brain in a jar. Every AI integration implies the deployment of a new perimeter. An agent with calendar access, email access, and file system access doesn’t cleanly fit into the least-privilege models we’ve relied on. Security practices probably need to change fundamentally to mirror that cadence of change. Rather than updating a security posture annually, firms need something that looks a lot more like a monthly cadence in order to try to keep up with the pace of AI development.
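Editor's sketch: one way to picture least privilege applied to an agent is a deny-by-default allowlist of (tool, action) pairs, with every attempt logged. This is an illustration under stated assumptions, not a description of any firm's controls. The names `AgentPolicy`, `call_tool`, and `PermissionDenied` are hypothetical, and the tool call itself is a stub.

```python
from dataclasses import dataclass, field


class PermissionDenied(Exception):
    """Raised when an agent requests a tool action outside its allowlist."""


@dataclass(frozen=True)
class AgentPolicy:
    """Hypothetical policy: the only (tool, action) pairs this agent may use."""
    agent_id: str
    allowed: frozenset = field(default_factory=frozenset)

    def permits(self, tool: str, action: str) -> bool:
        # Deny by default: anything not explicitly listed is refused.
        return (tool, action) in self.allowed


def call_tool(policy: AgentPolicy, tool: str, action: str, audit_log: list) -> str:
    """Check the policy, record the decision, then run a stubbed tool call."""
    allowed = policy.permits(tool, action)
    audit_log.append((policy.agent_id, tool, action, "allowed" if allowed else "denied"))
    if not allowed:
        raise PermissionDenied(f"{policy.agent_id} may not {action} {tool}")
    # A real integration would dispatch to the calendar, email, or file API here.
    return f"{tool}.{action} executed"


# Illustrative agent that can manage a calendar and read email, nothing else.
scheduler = AgentPolicy(
    agent_id="scheduler-agent",
    allowed=frozenset({("calendar", "read"), ("calendar", "write"), ("email", "read")}),
)
```

With this policy, `call_tool(scheduler, "calendar", "read", log)` succeeds, while `call_tool(scheduler, "filesystem", "write", log)` raises `PermissionDenied` and still leaves a "denied" entry in the log. Reviewing those allowlists on a monthly cadence is the kind of posture update described above.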
“Traditional security perimeters are changing, particularly as AI becomes agentic, as it starts to really do things. AI with hands is a very different concern than a brain in a jar.”
Part 3: The Wrong Decision at the Wrong Level
Lillian Freiberg: That’s going to be a huge adjustment for some firms. So let’s pivot a bit and talk about a problem that I suspect a lot of people are quietly living with right now.
Hypothetically: 18 months ago, your firm goes through the whole process: IT review, risk sign-off, compliance blessing, and approves a foundational model. Resources have gone in. And now the landscape has moved and you’re looking around thinking: did we back the right horse?
A big part of what made it worse is that a lot of these decisions got made at the wrong level. Risk or IT approved something before the business really understood what they were buying, how they were going to use it, or even who might be using it. Now you’ve got a contract, workflows built on top of it, and the rest of the world has moved on.
Austin, is this solvable? Or is the industry just going to have to keep carrying these early decisions for a while?
Austin Ritzel: When the ground is always shifting, the most appropriate response is to ensure that foundations are flexible enough to adapt. What we saw in the first wave of AI, back in say 2024, when the excitement was building but the technology was still pretty novel, is that non-business units (IT, risk, and procurement teams) were making a lot of these decisions on behalf of business users.
Business units probably deferred their influence because it was so novel. They hadn’t developed personal preferences yet. They hadn’t gotten familiar with what these tools were capable of. And frankly, the capabilities were very different. In 2024, we were talking about chatbots. Now we’re talking about people building agents in a couple of hours. It’s a huge change.
That said, I would expect model agnosticism to become the dominant approach, both amongst companies building specialized solutions and the parties buying those solutions. But model agnosticism is not just a decision you make. It’s not just saying “we’re model agnostic.” It requires a tremendous amount of architectural investment.
“When the ground is always shifting, the most appropriate response is to ensure that foundations are flexible enough to adapt.”
Clarity AI actually provides a useful example of this. We’ve had a model-agnostic structure since our start. The data extraction platform we’ve built, for internal use and for clients, allows for frictionless switching between models from Anthropic, OpenAI, Gemini, and many others. When you include open-weight models, we’re talking about access to more than a thousand foundational models. A new entrant can be embedded in just a couple of days. That ensures we’re never dependent on a single lab and never vulnerable to either being outmoded by a new release or being left in the lurch by an outage.
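Editor's sketch: the architectural investment behind model agnosticism usually comes down to a thin interface that every provider adapter implements, plus routing logic that can fall back when one provider fails. The code below is a minimal illustration of that idea, not Clarity AI's actual platform. `ModelRouter` and the `lab_a` and `lab_b` providers are hypothetical stand-ins, and the stubs return canned text instead of calling a real API.

```python
from typing import Callable, Dict, List, Tuple

# Every provider adapter exposes the same shape: prompt in, text out.
ModelFn = Callable[[str], str]


class ModelRouter:
    """Hypothetical registry that tries providers in order of preference."""

    def __init__(self) -> None:
        self._providers: Dict[str, ModelFn] = {}

    def register(self, name: str, fn: ModelFn) -> None:
        # Adding a new entrant means writing one adapter and registering it.
        self._providers[name] = fn

    def complete(self, prompt: str, preference: List[str]) -> Tuple[str, str]:
        """Return (provider_name, response) from the first provider that works."""
        errors = []
        for name in preference:
            fn = self._providers.get(name)
            if fn is None:
                errors.append(f"{name}: not registered")
                continue
            try:
                return name, fn(prompt)
            except Exception as exc:  # outage, rate limit, and so on
                errors.append(f"{name}: {exc}")
        raise RuntimeError("all providers failed: " + "; ".join(errors))


def lab_a(prompt: str) -> str:
    # Stub that simulates an outage at the preferred provider.
    raise ConnectionError("simulated outage")


def lab_b(prompt: str) -> str:
    # Stub that stands in for a working second provider.
    return f"lab_b answer to: {prompt}"


router = ModelRouter()
router.register("lab_a", lab_a)
router.register("lab_b", lab_b)
```

Calling `router.complete("Extract the reported emissions figure", ["lab_a", "lab_b"])` skips the failing provider and returns the response from `lab_b`. The workflow code never needs to know which lab answered, which is what keeps switching cheap.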
The buyers on the other side should look to an example recently set by Apple. Apple is going to use Gemini as a basis for its foundational models, but with the release of iOS 27 they’ll allow users to select the third-party provider they want — Gemini, OpenAI, Anthropic — to power features across their software. A Siri response generated via Claude will be different than one generated with Gemini. You could make that choice based on a preference for the tone or personality of the model, its reputation for specific tasks, your trust in the lab, or your familiarity with it from professional settings.
Apple isn’t betting on the horses. They’re building the racetrack. That’s the same approach we’ve taken at Clarity AI for years, and I expect it to be increasingly adopted by both builders and buyers as they realize that exclusivity carries enormous risk.
“Apple isn’t betting on the horses. They’re building the racetrack. That’s the same approach we’ve taken at Clarity AI for years.”
Lillian Freiberg: Build the racetrack, don’t bet on the horse. I love that, and not just because I love betting. It sounds obvious in hindsight, but clearly not everybody got there early. You mentioned decisions getting made at the wrong level. Does that pattern actually hold up across the market?
Austin Ritzel: I think so. And this isn’t guilt that sits exclusively with the finance industry. This is playing out across industries. Central functions (IT, risk, and procurement) got the first crack at evaluating AI providers, way back when it was extremely novel, and business units sort of deferred their influence. That’s how you end up with firms that approved one model in 2024 and are now locked in. And that’s what both of us hear from clients in our day-to-day.
The smart move was and will always be to go bottom-up. You need to understand the use case first, then move through vendor selection. You start with the functions closest to the work. If you’re thinking about onboarding an AI tool for software engineers, you’d better talk to the software engineers. If you’re thinking about onboarding a tool for relationship managers, you need to talk to relationship managers to understand which capabilities matter and which don’t. Move beyond pretty demos into actual value.
Central evaluation still needs to layer on top — IT, risk, procurement all need to put these tools through their paces from a security standpoint, as we’ve discussed. But they should be open to the influence of business units. Business unit users are becoming competent enough to flex influence over the evaluations of technical teams.
Lillian Freiberg: And that leads perfectly into a separate but related phenomenon. I think it’s way more widespread than firms want to admit: people inside these institutions know better tools exist and they’re quietly using them anyway, approved or not. What does that pressure eventually produce at the institutional level?
Austin Ritzel: I think this shadow usage is symptomatic of a very deep affliction. Ethan Mollick, from Wharton, who writes a lot on AI in enterprise settings, has dubbed this persona the Secret Cyborg. The Secret Cyborg is silent and fearful. It’s an employee who uses AI to outperform peers in the shadows but sees no upside, no incentive for disclosing that use. They fear judgment or fear that AI could replace them. Whatever their reason for keeping their AI use under the table, it’s problematic. And it indicates that the firm itself hasn’t built the right internal relationship with AI. That is the disease of which this persona is symptomatic.
The consequences are serious. There are issues like data exfiltration through prompt history — if you’re using unsecured non-enterprise versions of tools, you may be passing sensitive information into systems that could train on it, and that information could surface in answers provided to other users. It creates inconsistent outputs across teams. But maybe the most costly issue is that it creates capability gaps. It reduces the efficiency of your workforce and creates a growing gap between companies where these Secret Cyborgs are prevalent and those firms that have built permissive, well-governed AI environments where people use their tools without reprisals.
The solution is to ensure that you’re building governance around how — not whether — employees use AI, because they are going to use it. And reframe from the top down: AI is an accelerant of expert work, not a substitute for expert work. An employee empowered with AI does a lot more than an employee replaced by AI. The companies that do more with the same versus the same with less are really going to be the winners.
Lillian Freiberg: Got it. The Secret Cyborg. Silent and fearful. I can see a whole new therapy demographic opening up for them.
Austin Ritzel: Indeed. Or a sci-fi series.
Lillian Freiberg: We could sell it to Star Wars.
Austin Ritzel: That’s a very sad Star Wars entry, but yes.
“An employee empowered with AI does a lot more than an employee replaced by AI. The companies that do more with the same versus the same with less — they’re really going to be the winners.”
Part 4: Build or Buy — And What the White Elephant Actually Costs
Lillian Freiberg: For some firms watching all of this play out, they’ve landed on a response that sounds reasonable on the surface: “If we can’t find the right vendor, we’ll just build it ourselves.” Own the stack, control the roadmap, no dependency on an external provider. I’ve heard this argument made convincingly in a lot of boardrooms. But executing on it is very different from the pitch. How do you think about that tradeoff?
Austin Ritzel: There are two layers we need to differentiate when we talk about build versus buy. The first is foundation models: the things that Google’s DeepMind is building in Gemini, that Anthropic is building in Claude, that OpenAI is building in ChatGPT. Almost no financial institution is seriously entertaining building a foundation model. The talent gap is enormous, the capex involved is enormous, and the compute constraints are enormous. But the development of specialized tools and applications that leverage AI as a central capability — data extraction platforms, report generators, co-pilots — that’s a more interesting conversation, and that’s where the real tradeoffs live.
With today’s tools, you could theoretically build your own Gmail, your own Slack, your own HubSpot. Few are actually doing that at the enterprise level. Not because they technically couldn’t, but because the cost of getting it right — and then keeping it right, maintaining it — would dwarf any conceivable benefit of owning that full stack. It becomes a cursed white elephant. You’ve bought this elephant, and now you have to figure out how to feed it and keep it going.
There are three reasons that the in-house path tends to under-deliver:
- First, there’s a clear opportunity cost. Every engineer assigned to build an internal AI tool is an engineer not working on the core competencies of the institution.
- Second, specialized tools come with specialists: experts who are focused and native to the problem. At Clarity AI, we spent a decade collecting, validating, enriching, and curating a huge corpus of sustainability data across hundreds of clients with proprietary methodologies built only for this purpose. That expertise compounds. An internal team starting from zero will struggle to reproduce that depth.
- Third — and this is something our CTO really emphasizes — maintenance is silent but massively expensive. We’re keeping over 400 metric methodologies and 20 product-level methodologies current, we’re evaluating and integrating new models, and we are supporting the workflow as foundational models change and regulatory regimes shift. We can amortize that burden across hundreds of clients. An internal team carries that weight alone, forever.
“Few are building their own day-to-day tools at the enterprise level. Not because they technically couldn’t, but because the cost of getting it right, and then keeping it right, maintaining it, would dwarf any conceivable benefit of owning that full stack.”
JP Morgan is a useful proof point. They have more capital and more engineers than many tech companies. Jamie Dimon has said “We’re a tech company that just happens to dabble in finance.” And yet they’re partnering with Anthropic at the model layer and buying specialized vendors at the workflow and tool layers. Buy the core and build the edges.
Lillian Freiberg: There’s also the financial stability question lurking under all of this that I don’t think enough firms are factoring in. None of the major foundation model providers are profitable. Some are carrying significant debt. If one of them hits serious trouble, what happens to everyone that’s built on top of them? Is that a risk people are actually pricing in?
Austin Ritzel: It’s a really good question. And I want to be clear that Alphabet is a potential outlier here: Alphabet is hugely profitable. But if you draw a box around DeepMind, the internal lab that builds Gemini, DeepMind itself is not profitable. They probably look much the same as Anthropic or OpenAI on a fully loaded basis.
Being overcommitted to any single model is a risk, even setting aside financial concerns. I think the economic model is starting to work on the inference side. But training is very expensive, and those concerns are real. And none of these foundation labs are inevitable.
We had a nice example recently: Elon Musk leased out his largest data center outside of Memphis, the Colossus Data Center, to Anthropic. Meanwhile, Musk also has Grok. And it seems like he effectively abandoned the idea of Grok as a leading model by leasing out all of that compute capacity to Anthropic. Firms need to be resilient to any exogenous shock, whether that’s a bankruptcy or just someone falling behind.
That is really the importance of the open architecture approach. Building a foundation that allows you to navigate and reroute when someone falls behind or falls off the table altogether. At Clarity AI, our data extraction platform uses all of those models I mentioned, not just because different models are better suited for different tasks, but also because we don’t want the concentration risk that comes with committing to a single lab.
“The importance of the open architecture approach is building a foundation that allows you to navigate and reroute when someone falls behind or falls off the table altogether.”
There’s also a larger philosophical question I’ve been playing with. Is the dependency we’ve developed over the past few years on the major foundation labs sufficient that we would deem them “too big to fail”?
Lillian Freiberg: I can see many different angles. I can see a lot of people arguing that from both sides. But what is absolutely certain is that no financial institution — no enterprise — should put themselves in a position to personally experience the consequences of an answer that sounds like “no” to that question.
Part 5: Who Actually Wins
Lillian Freiberg: When we talk about all of this, particularly for financial firms — we’re talking about loss of competitive advantage, lost alpha. Which brings me to what I think is the thesis underneath this whole conversation. How does this market actually shake out? My instinct, based on what I hear from clients and thought leaders, is that the specialized players, the ones with genuine depth in a specific domain, who can operate across whatever model a client has already approved, are going to end up in the strongest position. Is that also how you see it?
Austin Ritzel: Yes. And I think the economics probably force that reality. The foundation layer probably commoditizes to a certain extent. Sam Altman himself has compared the end state of OpenAI to a utility. And as a consequence of that layer commoditizing, you probably see the differentiated value moving to layers in the stack that can do things a general model cannot.
Think of the LLM as the engine. The companies that pull away are those that can build a chassis around it: real domain expertise, curated knowledge, deterministic methodologies that transform what is fundamentally a probabilistic tool. Two people asking the same question need to get the same response. You need evaluations. You need human-in-the-loop expert reviews. The people who can build the chassis — the steering, the brakes, the body — are going to derive a very special type of value from this ecosystem.
Verification infrastructure is also going to be fundamental, specifically for financial institutions or any institution in a regulated domain. Explainability, provenance back to a curated database, traceability of transformation from raw data and methodology, all the way to output, that’s what makes these outputs actually usable in regulated contexts.
“Think of the LLM as the engine. The companies that pull away are those that can build a chassis around it: real domain expertise, curated knowledge, deterministic methodologies that transform what is fundamentally a probabilistic tool.”
Lillian Freiberg: Absolutely. And particularly for financial clients — traceability, trust, something that’s IC-ready, audit-ready — it’s paramount. So taking all of that into consideration, what actually separates the specialized vendors who come out stronger from the ones who don’t survive it?
Austin Ritzel: A few things. Architectural flexibility, model agnosticism. Real domain depth: bring proprietary data to the table, bring methodologies to the table to render probabilistic outcomes deterministic. Data is going to be a real moat going forward.
On top of that, specialized providers that pull away from the pack will also demand really robust auditability infrastructure. Not just page numbers but citation geometry: where exactly on the page, the exact position of the information. Source lineage. Auditable transformations: how did it go from raw data found on the page to a transformed output in an automatically generated report? How do you reduce the verification tax, the time it takes someone to confirm that the AI has produced something accurate, down to as close to zero as possible?
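Editor's sketch: "citation geometry" and "auditable transformations" can be pictured as a record that carries its source location and every step applied to it. The structure below is an assumption made for illustration, not any vendor's schema. The class names, the document ID, the bounding box, and the figures are all invented.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple


@dataclass(frozen=True)
class Citation:
    """Where the raw value was found: document, page, and position on the page."""
    document_id: str
    page: int
    bbox: Tuple[float, float, float, float]  # (x0, y0, x1, y1) in page coordinates
    quoted_text: str


@dataclass(frozen=True)
class TransformStep:
    """One recorded transformation from an input value to an output value."""
    description: str
    input_value: float
    output_value: float


@dataclass(frozen=True)
class AuditedValue:
    name: str
    value: float
    unit: str
    source: Citation
    steps: Tuple[TransformStep, ...]

    def lineage(self) -> List[str]:
        """Human-readable trail from the source on the page to the final output."""
        src = self.source
        lines = [f"source: {src.document_id} p.{src.page} bbox={src.bbox} text={src.quoted_text!r}"]
        for step in self.steps:
            lines.append(f"{step.description}: {step.input_value} -> {step.output_value}")
        lines.append(f"output: {self.name} = {self.value} {self.unit}")
        return lines


def apply_steps(
    name: str,
    raw: float,
    unit: str,
    source: Citation,
    transforms: Sequence[Tuple[str, Callable[[float], float]]],
) -> AuditedValue:
    """Apply named transformations in order, recording each one as it runs."""
    steps = []
    current = raw
    for description, fn in transforms:
        result = fn(current)
        steps.append(TransformStep(description, current, result))
        current = result
    return AuditedValue(name, current, unit, source, tuple(steps))


# Invented example: a figure reported in thousand tonnes, normalized to tonnes.
source = Citation("example-report-2024", 12, (72.0, 540.5, 310.0, 556.0), "1,250 thousand tonnes")
emissions = apply_steps(
    "reported_emissions", 1250.0, "tonnes", source,
    [("thousand tonnes to tonnes", lambda v: v * 1000)],
)
```

Printing `emissions.lineage()` gives a reviewer the page, the exact region, the quoted text, and each conversion in one place. That is the sense in which the verification tax shrinks: the reviewer checks a trail instead of hunting for the source.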
The wrappers — companies that are building pretty user interfaces on top of a general-purpose model with none of that domain depth — they’re probably going to get gobbled up. Where you separate yourself is curated or proprietary data, auditable and transparent outputs that reduce the verification tax, and doing what those general models just can’t do.
Lillian Freiberg: Thin wrappers getting gobbled up. I think that might be the line of the day. Austin, thank you. This was genuinely fantastic. You almost hit three in a row on “model agnosticism,” but you did make it all make sense, and that’s no small feat.
Austin Ritzel: Thank you, Lillian.





