Inside LinkedIn’s generative AI cookbook: How it scaled people search to 1.3 billion users

LinkedIn is launching its new AI-powered people search system this week, after what seems like a very long wait for what should have been a natural offering for generative AI.

It comes three years after the launch of ChatGPT and six months after LinkedIn launched its AI offering for job search. For tech leaders, this timeline illustrates an important lesson for business: Deploying generative AI in real business environments is challenging, especially at the scale of 1.3 billion users. It’s a slow, brutal process of pragmatic optimization.

The following account is based on several exclusive interviews with the LinkedIn product and engineering team behind the launch.

First, here’s how the product works: A user can now type a natural language search query like: “Who has knowledge about curing cancer?” in the LinkedIn search bar.

LinkedIn’s old keyword search would have failed that query: it matched only literal references to “cancer.” A user who wanted to go deeper would have had to run separate, rigid keyword searches for “cancer” and then “oncology” and manually merge the results.

The new AI-powered system, however, understands the intent of the search because the LLM under the hood captures semantic meaning. For example, it recognizes that “cancer” is conceptually related to “oncology” and, even less directly, to “genomics research.” As a result, it returns a far more relevant list of people, including oncology leaders and researchers, even if their profiles never use the exact word “cancer.”

The system also balances that relevance against usefulness. Rather than simply surfacing the best oncologist in the world (who may be an unreachable third-degree connection), it also considers who in your immediate network – such as a first-degree connection – is “reasonably relevant” and can serve as a crucial bridge to that expert.

See the video below for an example.

But the most important lesson for business professionals is undoubtedly the “cookbook” that LinkedIn has developed: a replicable, multi-stage pipeline of distillation, co-design and ruthless optimization. LinkedIn had to perfect this on one product before trying it on another.

“Don’t try to do too much at once,” wrote Wenjing Zhang, LinkedIn’s VP of Engineering, in a post about the product launch; she also spoke to VentureBeat in an interview last week. She notes that an earlier “extensive ambition” to build a unified system for all of LinkedIn’s products “brought progress to a standstill.”

Instead, LinkedIn focused on winning one vertical first. The success of the previously launched AI job search – which made job seekers without a four-year degree 10% more likely to be hired, according to VP of Product Engineering Erran Berger – provided the blueprint.

Now the company is applying that blueprint to a much bigger challenge. “It’s one thing to be able to do this for tens of millions of jobs,” Berger told VentureBeat. “It’s a whole different thing to do this with over a billion members.”

For enterprise AI builders, LinkedIn’s journey provides a technical playbook for what it actually takes to move from a successful pilot to a billion-user product.

The new challenge: a graph with 1.3 billion members

The job search product created a robust recipe for the new people search product to build on, Berger explains.

The recipe started with a “golden data set” of just a few hundred to a thousand real query-profile pairs, meticulously scored against a detailed 20-to-30-page “product policy” document. To scale this up for training, LinkedIn used the small golden set to prompt a large foundation model to generate a massive volume of synthetic training data. That synthetic data was then used to train a 7-billion-parameter “Product Policy” model – a high-fidelity judge of relevance that was too slow for live production but perfect for teaching smaller models.
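The flow described above – a small human-scored golden set seeding an LLM judge that labels synthetic pairs at scale – can be sketched as follows. This is a toy illustration, not LinkedIn’s code: `judge_llm` is a trivial stand-in for the foundation-model judge (which would be prompted with the policy rubric and golden examples), and all names and data are hypothetical.

```python
import random

GOLDEN_SET = [
    # A few hundred human-scored (query, profile, relevance) examples in reality.
    {"query": "who has knowledge about curing cancer",
     "profile": "Oncology researcher at a university hospital", "score": 3},
    {"query": "who has knowledge about curing cancer",
     "profile": "Retail store manager", "score": 0},
]

def judge_llm(query: str, profile: str) -> int:
    """Stand-in for the foundation-model judge scoring against the
    product policy rubric. Here: a trivial keyword heuristic."""
    return 3 if "oncolog" in profile.lower() else 0

def synthesize(queries, profiles, n=1000, seed=0):
    """Generate synthetic (query, profile, score) training triples by
    sampling pairs and labeling each with the judge model."""
    rng = random.Random(seed)
    return [{"query": (q := rng.choice(queries)),
             "profile": (p := rng.choice(profiles)),
             "score": judge_llm(q, p)}
            for _ in range(n)]

synthetic = synthesize([ex["query"] for ex in GOLDEN_SET],
                       [ex["profile"] for ex in GOLDEN_SET])
```

The synthetic corpus, graded by the judge, is what makes it feasible to train the 7B policy model without hand-labeling millions of pairs.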

However, the team hit a wall early on. For six to nine months, they struggled to train a single model that could balance strict policy compliance (relevance) with user engagement signals. The “aha moment” came when they realized the problem had to be split apart. They distilled the 7B policy model into a 1.7B teacher model focused solely on relevance. They then paired it with separate teacher models trained to predict specific member actions, such as applying in the jobs product, or connecting and following in people search. This “multi-teacher” ensemble produced soft probability scores that the final student model learned to mimic via a KL divergence loss.
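The multi-teacher objective described above can be written down in a few lines. This is a generic sketch of weighted KL distillation against soft teacher distributions, not LinkedIn’s implementation; the teacher probabilities and weights below are made up for illustration.

```python
import math

def softmax(logits):
    """Convert raw student scores into a probability distribution."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def kl_div(p, q, eps=1e-9):
    """KL(p || q): how far the student distribution q is from teacher p."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))

def multi_teacher_loss(student_logits, teacher_probs_list, weights):
    """Weighted sum of KL terms, one per teacher (relevance teacher,
    engagement-prediction teachers, ...). Weights are hypothetical."""
    q = softmax(student_logits)
    return sum(w * kl_div(p, q) for w, p in zip(weights, teacher_probs_list))

# Two teachers scoring three candidate results:
relevance_teacher = [0.7, 0.2, 0.1]   # e.g. the policy-distilled 1.7B teacher
engagement_teacher = [0.5, 0.3, 0.2]  # e.g. an action-prediction teacher
loss = multi_teacher_loss([2.0, 0.5, 0.1],
                          [relevance_teacher, engagement_teacher],
                          weights=[0.5, 0.5])
```

Splitting relevance and engagement into separate teachers lets each be trained cleanly on its own signal, with the blend happening only in the student’s loss.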

The resulting architecture works as a two-stage pipeline. First, a larger 8B-parameter model performs broad retrieval, casting a wide net to pull candidates from the graph. Then the highly distilled student model takes over for fine-grained ranking. While the job search product successfully shipped a 0.6-billion (600 million) parameter student, the new people search product required even more aggressive compression. As Zhang notes, the team pruned their new student model from 440 million down to just 220 million parameters, achieving the speed needed for 1.3 billion users with less than 1% loss of relevance.
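The retrieve-then-rank shape of that pipeline is a standard pattern, and a toy version makes the division of labor clear. This sketch uses trivial set-overlap scoring as a stand-in for both models; in the real system, stage one is the 8B retrieval model and stage two is the 220M distilled ranker.

```python
# Stage 1: cheap, recall-oriented retrieval casts a wide net.
def retrieve(query_terms, members, k=100):
    """Keep any member sharing at least one term with the query."""
    return [m for m in members if query_terms & m["terms"]][:k]

# Stage 2: a small, precise ranker scores only the shortlist.
def rank(query_terms, candidates, top_n=10):
    """Order candidates by a (toy) relevance score: term overlap."""
    return sorted(candidates,
                  key=lambda m: len(query_terms & m["terms"]),
                  reverse=True)[:top_n]

members = [
    {"name": "A", "terms": {"oncology", "genomics"}},
    {"name": "B", "terms": {"oncology"}},
    {"name": "C", "terms": {"retail"}},
]
query = {"cancer", "oncology", "genomics"}
results = rank(query, retrieve(query, members))  # A ranks above B; C is filtered out
```

The design point is that the expensive ranking model never sees the full graph – only the shortlist the retriever hands it.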

But applying this to people search broke the old architecture. The new problem involved not just ranking but also retrieval.

“A billion records,” Berger said, is a “different beast.”

The team’s previous retrieval stack was built on CPUs. To meet the new scale and the latency demands of a “snappy” search experience, the team had to move indexing to GPU-based infrastructure – a fundamental architectural shift that the job search product had not required.

Organizationally, LinkedIn benefited from trying multiple approaches. For a while, two separate teams – job search and people search – worked the problem in parallel. But once the job search team broke through with the policy-driven distillation method, Berger and his leadership team stepped in. They moved the job search architects – product leader Rohan Rajiv and engineering leader Wenjing Zhang – over to transplant their “cookbook” directly to the new domain.

Distill for a 10x throughput gain

With the retrieval problem solved, the team faced the challenge of ranking efficiently. Here the cookbook was adapted with new, aggressive optimization techniques.

Zhang’s technical post (I’ll insert the link once it goes live) provides the specific details that our audience of AI engineers will appreciate. One of the most important optimizations was the input size.

To feed the model, the team trained another LLM with reinforcement learning (RL) for a single purpose: summarizing the input context. This “summarizer” model was able to reduce the model’s input size 20-fold with minimal information loss.

The combined result of the 220M-parameter model and the 20x input reduction? A 10x increase in ranking throughput, letting the team serve the model efficiently to its massive user base.
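A back-of-envelope check shows why those two optimizations compound. Assume, as a deliberate simplification, that ranking cost scales linearly with parameter count and input length (ignoring attention’s quadratic term, batching, and hardware effects); the token counts below are hypothetical.

```python
def relative_cost(params_m: float, input_tokens: float) -> float:
    """Naive cost model: cost proportional to parameters x input length."""
    return params_m * input_tokens

old = relative_cost(params_m=440, input_tokens=2000)       # pre-optimization (tokens hypothetical)
new = relative_cost(params_m=220, input_tokens=2000 / 20)  # pruned model + summarized input
speedup = old / new  # 40x in this naive model
```

The naive estimate (2x from pruning, 20x from input reduction) overshoots the reported 10x throughput gain, which is expected: real serving pipelines hit other bottlenecks – retrieval, network, batching overhead – that a pure compute model ignores.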

Pragmatism over hype: building tools, not agents

During our discussions, Berger was adamant about something else that might get people’s attention: the real value for today’s enterprises lies in perfecting recommendation systems, not in chasing “agentic hype.” He also declined to name the specific models the company used for the searches, suggesting it almost doesn’t matter: the company simply selects whichever model it finds most efficient for the task.

The new AI-powered people search is an expression of Berger’s philosophy that it is best to optimize the recommendation system first. The architecture includes a new “intelligent query routing layer,” as Berger explained, which is itself powered by an LLM. This router pragmatically decides whether a user’s query – such as “trust expert” – should go to the new semantic, natural-language stack or to the old, trusted lexical search.
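The router’s job reduces to a single classification decision per query. In production that decision is made by an LLM; in the sketch below a crude keyword heuristic stands in so the control flow is visible. The function names, stack labels, and routing rule are all hypothetical.

```python
def is_natural_language(query: str) -> bool:
    """Stand-in for the LLM router's judgment: treat multi-word,
    question-like queries as natural language."""
    q = query.lower().strip()
    return len(q.split()) > 2 or q.startswith(("who", "which", "find"))

def route(query: str) -> str:
    """Send semantic queries to the new stack, short keyword
    queries to the proven lexical index."""
    return "semantic_stack" if is_natural_language(query) else "lexical_stack"

routes = {q: route(q) for q in
          ["who has knowledge about curing cancer", "trust expert"]}
```

Keeping the old lexical path alive behind the router is itself a pragmatic choice: the new semantic stack only has to win on the queries it is actually good at.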

This entire, complex system is designed as a “tool” that a future agent will use, not as the agent itself.

“Agentic products are only as good as the tools they use to perform tasks for people,” Berger said. “You can have the best reasoning model in the world, and if you try to use an agent to find people, but the people search engine isn’t very good, then you’re not going to be able to make that happen.”

Now that the people search feature is available, Berger suggested that the company will one day offer agents that use it, though he gave no details on timing. He also said the recipe behind the job and people search products will spread across the company’s other offerings.

For companies building their own AI roadmaps, LinkedIn’s playbook is clear:

  1. Be pragmatic: Don’t try to boil the ocean. Win one vertical, even if it takes 18 months.

  2. Codify the “cookbook”: Turn that victory into a repeatable process (policy documents, distillation pipelines, co-design).

  3. Optimize ruthlessly: The real 10x gains come after the initial model – in pruning, distillation and creative optimizations such as an RL-trained summarizer.

LinkedIn’s journey shows that for true enterprise AI, the focus on specific models or cool agentic systems must take a back seat. The sustainable, strategic advantage comes from controlling the pipeline – the ‘AI-native’ cookbook of co-design, distillation and ruthless optimization.

(Editor’s note: We’ll soon publish a full podcast with LinkedIn’s Erran Berger, delving deeper into these technical details, on the VentureBeat podcast feed.)
