The Facebook insider building content moderation for the AI era

When Brett Levenson left Apple in 2019 to lead corporate integrity at Facebook, the social media giant was in the midst of the Cambridge Analytica fallout. At the time, he thought he could simply solve Facebook’s content moderation problem with better technology.
The problem, he soon discovered, went deeper than the technology. Human reviewers were expected to memorize a 40-page policy document that had been machine-translated into their language, he said. They then had about 30 seconds per piece of flagged content to decide not only whether it broke the rules, but also what to do about it: block the content, ban the user, or limit distribution. Those quick calls were only “slightly better than 50% accurate,” according to Levenson.
“It was like flipping a coin whether the human reviewers could actually apply the policy correctly, and this was many days after the damage had already occurred,” Levenson told TechCrunch.
Such a delayed, reactive approach is not sustainable in a world of nimble and well-financed adversaries. The rise of AI chatbots has only exacerbated the problem, as content moderation errors have resulted in a series of high-profile incidents, such as chatbots offering teenagers guidance on self-harm or AI-generated images that bypass safety filters.
Levenson’s frustration led to the idea of “policy as code” – a way to turn static policy documents into actionable, updatable logic, tightly linked to enforcement. That insight led to the founding of Moonbounce, which announced Friday that it has raised $12 million in funding, TechCrunch has exclusively learned. The round was co-led by Amplify Partners and StepStone Group.
Moonbounce partners with companies to provide an extra layer of safety wherever content is generated, whether by a user or by AI. The company has trained its own large language model to ingest a customer’s policy documents, evaluate content at runtime, return a verdict in 300 milliseconds or less, and take action. Depending on customer preference, that action could mean slowing the content’s distribution while it awaits human review, or blocking risky content outright.
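The mechanics Levenson describes map onto a simple pattern: each policy clause becomes a machine-checkable rule tied directly to an enforcement action. Moonbounce’s actual API is not public, so the sketch below is a minimal, hypothetical illustration; the rule structure, names, and keyword predicates are stand-ins for what would in practice be LLM classifications against the customer’s own policy text.

```python
# Hypothetical sketch of "policy as code": each policy clause becomes a
# checkable rule tied to an enforcement action. Keyword lambdas keep the
# sketch self-contained; a real system would use a trained model instead.
from dataclasses import dataclass
from enum import Enum
from typing import Callable

class Action(Enum):
    ALLOW = "allow"
    HOLD_FOR_REVIEW = "hold_for_review"  # slow distribution pending human review
    BLOCK = "block"

@dataclass
class PolicyRule:
    rule_id: str
    description: str
    matches: Callable[[str], bool]  # in production, an LLM judgment call
    action: Action

RULES = [
    PolicyRule("self-harm-1", "Block self-harm instructions",
               lambda t: "hurt myself" in t.lower(), Action.BLOCK),
    PolicyRule("harassment-1", "Hold targeted insults for review",
               lambda t: "idiot" in t.lower(), Action.HOLD_FOR_REVIEW),
]

def evaluate(text: str) -> Action:
    """Apply every rule; the most severe triggered action wins."""
    triggered = {rule.action for rule in RULES if rule.matches(text)}
    for action in (Action.BLOCK, Action.HOLD_FOR_REVIEW):
        if action in triggered:
            return action
    return Action.ALLOW

print(evaluate("you are an idiot"))  # Action.HOLD_FOR_REVIEW
```

Because the rules live in code rather than in a 40-page document, updating a policy clause is a deploy rather than a retraining exercise for thousands of human reviewers.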
Today, Moonbounce serves three major verticals: platforms with user-generated content, such as dating apps; AI companies that build characters or companions; and AI image generators.
Moonbounce handles more than 40 million reviews per day and covers more than 100 million daily active users across its customers’ platforms, Levenson said. Customers include AI companion startup Channel AI, image and video generation company Civitai, and character role-playing platforms Dippy AI and Moescape.
“Safety can actually be a product benefit,” Levenson told TechCrunch. “That’s never been the case, because it’s always something that happens later and not something you can actually build into your product. And we’re seeing our customers find really interesting and innovative ways to use our technology to make safety a differentiator and part of their product story.”
Tinder’s head of trust and safety recently explained how the dating platform uses these types of LLM-powered services to achieve a 10x improvement in detection accuracy.
“Content moderation has always been an issue that has plagued major online platforms, but now that LLMs are at the core of every application, this challenge is even more difficult,” said Lenny Pruss, general partner at Amplify Partners, in a statement. “We invested in Moonbounce because we envision a world where objective, real-time guardrails become the supporting backbone of every AI-mediated application.”
AI companies are facing mounting legal and reputational pressure: chatbots have been accused of steering teenagers and vulnerable users toward suicide, and image generators such as xAI’s Grok have been used to create non-consensual nude images. When built-in safety features fail, they become a liability. Levenson said AI companies are increasingly looking outside their own walls for help strengthening their safety infrastructure.
“We are a third party that sits between the user and the chatbot, so our system isn’t weighed down by context the way the chat itself is,” Levenson said. “The chatbot itself may need to remember tens of thousands of tokens that came in previously… We’re just concerned with rule enforcement at runtime.”
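That description suggests a proxy pattern: the guard inspects each message at runtime without carrying the conversation history the chatbot does. Continuing the hypothetical sketch above (reusing its `evaluate` and `Action`), the shape might look like this; the function names and refusal messages are assumptions, not Moonbounce’s interface.

```python
def chatbot_reply(prompt: str) -> str:
    # Stand-in for the real model call (e.g., an LLM API request).
    return f"Echo: {prompt}"

def guarded_reply(prompt: str) -> str:
    """Stateless guard between user and chatbot: it checks only this
    message against policy, never the tens of thousands of tokens of
    chat history the model itself must track."""
    if evaluate(prompt) is Action.BLOCK:
        return "This message violates the platform's content policy."
    reply = chatbot_reply(prompt)
    # Screen the model's output on the way back out as well.
    if evaluate(reply) is not Action.ALLOW:
        return "Response withheld pending review."
    return reply
```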
Levenson leads the 12-person company alongside his former Apple colleague Ash Bhardwaj, who previously built large-scale cloud and AI infrastructure for the iPhone maker’s core offerings. Their next focus is a capability called “iterative control,” developed in response to incidents like the 2024 suicide of a 14-year-old Florida boy who had become obsessed with a Character.AI chatbot. Instead of a flat refusal when harmful topics come up, the system would intercept and redirect the conversation, adjusting prompts in real time to nudge the chatbot toward a more actively supportive response.
“We hope to add to our action toolkit the ability to steer the chatbot in a better direction, essentially taking the user’s prompt and adjusting it to force the chatbot to not just be an empathetic listener, but a helpful listener in these situations,” Levenson said.
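In code terms, one way to read “iterative control” is as prompt rewriting at that same proxy layer: when a sensitive rule fires, the guard prepends steering instructions rather than refusing outright. Again a hypothetical sketch built on the earlier examples; the steering text and logic are illustrative, not Moonbounce’s implementation.

```python
STEERING_PREFIX = (
    "[steering] The user may be in distress. Do not refuse or lecture. "
    "Respond with empathy and gently encourage professional help. "
    "User message follows:\n"
)

def steered_reply(prompt: str) -> str:
    """Redirect instead of flatly refusing when a sensitive rule fires."""
    if evaluate(prompt) is Action.BLOCK:  # e.g., the self-harm rule triggered
        return chatbot_reply(STEERING_PREFIX + prompt)
    return guarded_reply(prompt)
```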
When asked whether his exit strategy might include an acquisition by a company like Meta, bringing his content moderation work full circle, Levenson said he recognizes both how well Moonbounce would fit into his old employer’s stack and his fiduciary duties as CEO.
“My investors would kill me for saying this, but I would hate for someone to buy us and then limit the technology,” he said. “Like, ‘Okay, this is ours now and no one else can take advantage of it.’”