Pew Research Center Increased LLM Visibility and Used Agentic Development to Ship Its Largest Release Ever

In an AI-first world, content has to be more than text on a page.

Key results

  • 6 months → 1 month to ship a fully featured LLM-readiness build.
  • ChatGPT went from sending Pew Research effectively zero traffic to being its #2 referrer within thirty days.
  • 1.5M lines of code changed in Pew’s largest-ever release, the vast majority developed agentically.
  • ~3–4x ahead of roadmap, ~10x faster than conventional development.

When people ask an AI assistant for a number, trend, or finding, the answer depends on whether it can retrieve the right source and read it correctly.

For Pew Research Center, that created a discovery challenge. As audiences increasingly get information through AI assistants rather than traditional search, Pew needed to make sure its research was accessible to LLMs and structured clearly enough to preserve accuracy.

With support from a WordPress VIP Forward Deployed Engineer, the team turned what it had scoped as a six-month internal project into a fully launched initiative in a single month.

“To get our site LLM ready was probably gonna be a six-month project just on our own. Having this additional engineering support directly from WordPress VIP, we were able to plan it and execute it basically within two weeks, and we launched a fully featured product within a month.”

 — Seth Rubenstein, Head of Engineering, Pew Research Center

The Challenge

Discovery was changing, and accuracy mattered as much as reach

Pew Research Center has long depended on search, direct traffic, and sharing across the web to bring readers to its work. But today, AI assistants are becoming a new front door to information. For a research institution, that creates two risks.

The first is reach. If AI assistants become a primary path to information, publishers that are hard to crawl and interpret lose visibility.

The second is accuracy. An LLM will often attempt to summarize a study even when it cannot parse it cleanly, and it may return the wrong figure, the wrong topic, or the wrong conclusion.

Pew’s engineering team knew what it would take to improve that. Internally, they estimated the work at about six months. AI was changing audience behavior at the same time it was raising expectations for what engineering needed to deliver.

“AI has put a lot of pressure on our business as far as capturing audiences, but also pressure from up on high on our team of what they expect us to accomplish. We’re concerned mostly about how LLMs are going to factually ingest our data and get the topic correct.”

 — Seth Rubenstein, Head of Engineering, Pew Research Center

The Solution

A sprint to tackle LLM visibility with WordPress VIP

Rather than wait for internal capacity to open up or add headcount, Pew worked directly with a WordPress VIP Forward Deployed Engineer (FDE).

The engagement combined embedded engineering support with a practical implementation path. Pew already had the research content and the publishing foundation. They needed a faster way to make that content easier for AI systems to discover, ingest, and interpret correctly.

At a high level, the architecture looked like this:

  • WordPress VIP remained the publishing system and delivery layer for Pew’s research pages.
  • A new discovery and ingestion layer sat on top of that environment to improve how research content was exposed to AI crawlers.
  • Structured signals and page-level representation were refined so models were more likely to retrieve the correct topic, statistic, and attribution.

VIP handled the hard platform concerns in the background, including edge delivery, security, and scale under crawler load.

That mattered because it kept the team focused on the parts that were uniquely theirs to solve: how each study should be represented, which signals should travel with it, and how to reduce the odds of a model misreading a finding.
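The case study doesn’t publish implementation details, so the following is only a minimal sketch, assuming a hypothetical Study record: one way a discovery and ingestion layer can expose research to crawlers is to render each study as a clean, plain-text digest alongside the HTML page, free of navigation and page chrome.

```python
# Minimal sketch of an ingestion-layer renderer (hypothetical, not Pew's code).
# Given a study record, emit a plain-text digest so a crawler can ingest the
# topic, findings, and attribution without parsing page chrome.
from dataclasses import dataclass, field


@dataclass
class Study:
    title: str
    topic: str
    findings: list[str] = field(default_factory=list)
    published: str = ""       # ISO 8601 date
    canonical_url: str = ""


def render_digest(study: Study) -> str:
    """Render a study as a machine-friendly plain-text digest."""
    lines = [
        f"Title: {study.title}",
        f"Topic: {study.topic}",
        f"Published: {study.published}",
        f"Source: {study.canonical_url}",
        "Key findings:",
    ]
    lines.extend(f"- {finding}" for finding in study.findings)
    return "\n".join(lines)


# Example with placeholder values:
print(render_digest(Study(
    title="Illustrative study title",
    topic="Illustrative topic",
    findings=["Illustrative finding with its key statistic."],
    published="2025-01-15",
    canonical_url="https://www.pewresearch.org/illustrative-path/",
)))
```

The design point is the separation: the publishing system keeps serving readers, while the ingestion layer serves a representation tuned for machines.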

Two implementation details that mattered

The first was crawler handling and visibility. Before rollout, Pew saw essentially no ChatGPT crawler activity on its research pages. After deployment, that changed quickly. The team could see ChatGPT requesting pages it had not visited before, with referrals following within days.
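The article doesn’t say how Pew measured this, but as a sketch, crawler activity of this kind is visible in ordinary access logs: requests from OpenAI’s documented user agents (GPTBot, ChatGPT-User, OAI-SearchBot) can be counted per page. The log path and line format below are assumptions.

```python
# Hedged sketch: count requests from OpenAI's documented crawler user agents
# in a web server access log. Assumes a combined log format; the log path is
# a placeholder.
import re
from collections import Counter

AI_CRAWLERS = ("GPTBot", "ChatGPT-User", "OAI-SearchBot")

# Matches the request, status, size, referrer, and user-agent fields of a
# combined-format log line.
LINE = re.compile(
    r'"(?:GET|HEAD|POST) (?P<path>\S+) [^"]*" \d{3} \S+ "[^"]*" "(?P<ua>[^"]*)"'
)


def crawler_hits(log_path: str) -> Counter:
    """Return request counts keyed by (crawler, path)."""
    hits: Counter = Counter()
    with open(log_path, encoding="utf-8", errors="replace") as log:
        for raw in log:
            match = LINE.search(raw)
            if not match:
                continue
            for bot in AI_CRAWLERS:
                if bot in match.group("ua"):
                    hits[(bot, match.group("path"))] += 1
    return hits


# Example: the ten pages AI crawlers request most.
for (bot, path), count in crawler_hits("/var/log/nginx/access.log").most_common(10):
    print(f"{bot:14} {count:5}  {path}")
```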

The second was content representation for factual accuracy. This was not only about making pages crawlable. It was about making the core components of a study legible to machines: the topic, the finding, the statistic, and the attribution. That work helped reduce the risk that a model would summarize a page with the wrong number attached to the wrong conclusion.
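The case study doesn’t name the specific markup Pew used, but schema.org JSON-LD is one common way to make those components machine-legible. Here’s a hedged sketch; the field values are placeholders and study_jsonld is a hypothetical helper, not Pew’s implementation.

```python
# Hedged sketch: emit a schema.org JSON-LD block that pins a study's topic,
# key finding, date, and attribution to the page. Values are placeholders.
import json


def study_jsonld(title: str, topic: str, finding: str, url: str, published: str) -> str:
    """Build a <script> tag carrying schema.org Article metadata."""
    data = {
        "@context": "https://schema.org",
        "@type": "Article",
        "headline": title,
        "about": {"@type": "Thing", "name": topic},
        "abstract": finding,  # the key finding, stated plainly in one sentence
        "url": url,
        "datePublished": published,
        "publisher": {"@type": "Organization", "name": "Pew Research Center"},
    }
    return f'<script type="application/ld+json">\n{json.dumps(data, indent=2)}\n</script>'


print(study_jsonld(
    title="Illustrative study title",
    topic="Illustrative topic",
    finding="Illustrative one-sentence finding with its statistic.",
    url="https://www.pewresearch.org/illustrative-path/",
    published="2025-01-15",
))
```

Because the topic, finding, and publisher travel together in one block, a model that retrieves the page gets the attribution alongside the statistic rather than having to infer it from the layout.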

“He’s bringing in a force multiplier in a lot of these systems that he’s helping us stand up. Since we’ve deployed some of the systems that VIP has helped us with, we can see ChatGPT accessing these pages, and we know that they weren’t accessing them before.”

 — Seth Rubenstein, Head of Engineering, Pew Research Center

Where WordPress VIP made the hard parts easier

Pew did not need to spend the project solving for infrastructure first. WordPress VIP handled the platform work that could have consumed a quarter on its own: delivering pages globally, maintaining security, and absorbing crawler demand at scale.

That changed the economics of the project. Instead of spending engineering time on platform overhead, Pew and the FDE could focus on the higher-value work: shaping the retrieval layer, tuning how studies were expressed to AI systems, and shipping faster.

The engagement also influenced how the team built software more broadly. Under the FDE’s guidance, Pew adopted a more agentic development model, with AI agents handling substantial portions of implementation under engineering review. What started as an LLM-readiness initiative became a new delivery pattern for the team.

In the first quarter operating this way, Pew shipped the largest release in its history: about 1.5 million lines of code changed, the vast majority developed agentically. By the team’s own review, that put Pew roughly one-third of the way through its annual roadmap before the year was a quarter over.

The Results

Increased LLM visibility and Pew’s largest release ever

Thirty days after launch, Pew had early evidence that the initiative was working:

  • ChatGPT became Pew’s #2 referrer, up from effectively zero before deployment
  • Google Discover visibility increased, including placements the team had not earned before
  • Overall traffic increased, not just AI-referred sessions
  • Engineering velocity ran about 3–4x ahead of plan and roughly 10x ahead of conventional development, based on the team’s internal estimates
  • Pew shipped a fully featured product within a month, after initially scoping the work as a six-month internal project
  • The team delivered its largest single release ever, with approximately 1.5 million lines of code shipped predominantly through an agentic workflow

The initial rollout also had effects beyond the original scope. Systems built for AI visibility began carrying into other parts of the roadmap, turning a point solution into a broader operating model for the engineering organization.

For Pew, the outcome was bigger than a faster project. The team improved how its research shows up in AI-driven experiences, increased the odds that findings are represented accurately, and built a repeatable way to ship faster on the same foundation.

“Look, the money spent already on the FDE is paying off because we are three, four times ahead of where we should be and 10X ahead of where we would’ve been with just us developing by code. As an engineering leader, it still breaks my brain to think about.”

 — Seth Rubenstein, Head of Engineering, Pew Research Center