Your AI search strategy is someone else's scraping problem

Firms obsess over appearing in AI answers without understanding what makes a site legible to LLMs in the first place. This isn't an SEO tweak - it's a structural architecture decision.

Adam Looker

There is a new anxiety circulating in the boardrooms and partner meetings of professional services firms, and it sounds like this: “We need to appear in ChatGPT.” The phrasing varies. Sometimes it is Perplexity, sometimes it is Gemini, sometimes it is just “AI search” spoken in the same slightly uncertain tone people used to reserve for “the cloud.” The underlying concern is the same. A new channel exists. Clients are using it. And nobody in the firm is entirely sure how to make it work in their favour.

This anxiety is not unreasonable. A growing proportion of research queries - the kind that used to start on Google and end on your website - now start and end inside an AI interface. The user asks a question, the model assembles an answer from the sources it considers most relevant, and the user gets what they need without ever clicking through to a firm’s site. If your firm is not among the sources the model draws from, you are not part of that conversation. You are not even aware the conversation happened.

Where the anxiety goes wrong is in the response it produces. Most firms, when they decide to take AI search seriously, reach for the same playbook they used for traditional SEO. They hire a consultant, or they task their marketing team, or they engage an agency that has rebranded from “SEO specialists” to “AEO specialists” (answer engine optimisation) with impressive speed. The advice that comes back is usually about content. Write more articles. Target question-based queries. Produce long-form thought leadership that answers the things people are asking. In other words, do more of what you were already doing, but phrase it as though it is a new discipline.

This advice is not entirely wrong. Content matters. But it misunderstands what is actually happening when an AI model decides which sources to cite, and that misunderstanding leads to investment in the wrong layer of the problem.

How AI models actually find you

When ChatGPT or Perplexity assembles an answer to a question like “which law firms in Manchester specialise in commercial property disputes,” it is not reading your website the way a human reads it. It is not impressed by your photography, moved by your brand story, or persuaded by the warmth of your tone of voice. It is processing text and markup. It is looking for signals that tell it, with confidence, what your firm does, where you are, who your people are, and how those facts relate to each other.

The model’s ability to extract this information depends almost entirely on how your site is built - not on what your site says. This is the distinction that almost every AI search conversation in professional services is currently missing.

A site that uses proper Schema.org markup - structured data that explicitly labels your organisation, your people, your services, your locations, your credentials - gives the model clean, machine-readable facts to work with. A site that presents the same information as unstructured paragraphs of prose, however well written, forces the model to infer relationships that could have been stated directly. Inference is error-prone. Explicit structure leaves far less to chance.
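
As a rough illustration of what "stated directly" can mean, the Python sketch below builds a JSON-LD block for a hypothetical firm and wraps it in the script tag that would sit in the page's HTML. The firm name, domain, and `@id` value are placeholders invented for this example; `LegalService`, `PostalAddress`, and `knowsAbout` are Schema.org vocabulary.

```python
import json


def legal_service_jsonld() -> dict:
    """Describe a hypothetical firm using Schema.org vocabulary."""
    return {
        "@context": "https://schema.org",
        "@type": "LegalService",
        # Placeholder identifier: a stable URL that other markup can point at.
        "@id": "https://example-firm.co.uk/#organization",
        "name": "Example Firm LLP",
        "url": "https://example-firm.co.uk/",
        "address": {
            "@type": "PostalAddress",
            "addressLocality": "Manchester",
            "addressCountry": "GB",
        },
        "knowsAbout": ["Commercial property disputes"],
    }


def to_script_tag(data: dict) -> str:
    """Wrap the JSON-LD in the script element used to embed it in a page."""
    return (
        '<script type="application/ld+json">'
        + json.dumps(data, indent=2)
        + "</script>"
    )


print(to_script_tag(legal_service_jsonld()))
```

The point of the sketch is the shape of the data, not the tooling: the location and specialism are declared as labelled facts rather than left for a model to infer from a paragraph.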

Entity consistency matters enormously. If your site refers to “Jane Smith” on the team page, “J. Smith, Partner” on a case study, and “Jane” in a blog post, a human reader connects those references without thinking. A model processing your site at scale may or may not make the same connection. If it does not, the authority you have built around that individual fragments across what the model treats as separate entities. The expertise is real but the machine-legible signal is diluted.
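
One way to picture the problem is as an audit against a registry of canonical names. The sketch below is a minimal, assumed approach rather than an established tool: `PersonEntity`, `audit_references`, the URLs, and the page paths are all hypothetical, and a real site would pull the references from its content management system.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PersonEntity:
    entity_id: str            # stable identifier, e.g. the person's profile URL
    canonical_name: str       # the one form every page should use
    aliases: frozenset[str]   # known variants that refer to the same person


def audit_references(
    entity: PersonEntity, references: list[tuple[str, str]]
) -> list[tuple[str, str]]:
    """Return (page, text) pairs that name the entity by an alias
    instead of the canonical form."""
    return [
        (page, text)
        for page, text in references
        if text != entity.canonical_name and text in entity.aliases
    ]


jane = PersonEntity(
    entity_id="https://example-firm.co.uk/people/jane-smith#person",
    canonical_name="Jane Smith",
    aliases=frozenset({"J. Smith, Partner", "Jane"}),
)

references = [
    ("/team", "Jane Smith"),
    ("/case-studies/retail-lease", "J. Smith, Partner"),
    ("/blog/lease-renewals", "Jane"),
]

for page, text in audit_references(jane, references):
    print(f"{page}: '{text}' should read '{jane.canonical_name}'")
```

Run against the three references from the example above, the audit flags the case study and the blog post and leaves the team page alone.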

Page speed matters in ways that have nothing to do with user experience in the traditional sense. When a crawler working on behalf of an AI system hits your site, it has a time budget. A site that loads in under a second is likely to be crawled more deeply and more frequently than a site that takes four seconds to render. This is not a ranking factor in the way SEO professionals traditionally discuss ranking factors. It is a practical constraint on how much of your site the model ever sees. If your service pages take three seconds to load because of unoptimised images and render-blocking scripts, there is a real chance that significant parts of your site are simply never indexed by the systems that feed AI answers. You cannot appear in an answer assembled from content the model never read.

The content calendar fallacy

The most common recommendation from consultants selling AI search optimisation is to produce more content targeting the questions your prospective clients are asking. This is presented as the primary lever. Write the article, rank for the query, get cited in the AI answer.

The logic is appealing but it confuses cause and effect. The firms that are currently winning AI citations are not, in the main, the ones producing the highest volume of content. They are the ones whose content is most legible to machines. A single, well-structured service page with proper schema markup, clear heading hierarchy, consistent entity references, and sub-second load times will outperform a hundred blog posts that exist as walls of unstructured text on a slow, poorly marked-up site.

This is not a content problem. It is a plumbing problem. And the distinction matters because the investment required is entirely different. A content calendar costs time and creative resource, and it produces assets that may or may not be useful depending on whether the underlying infrastructure makes them visible. A structural architecture review and remediation costs development resource, and it makes everything on the site - existing content included - more legible to every AI system that encounters it.

The analogy that feels most accurate is plumbing versus interior design. You can spend a considerable amount on beautiful furnishings, but if the pipes are blocked, the house is not functional. Most firms are being sold curtains when the problem is drainage.

What machine legibility actually requires

The technical requirements for making a professional services site legible to AI models are not especially exotic, but they are specific, and they sit firmly in the domain of web architecture rather than marketing.

Schema.org markup needs to be comprehensive and accurate. This means Organization schema (the vocabulary uses the American spelling for the type name) that correctly identifies the firm, its locations, its contact details, and its service areas. It means Person schema for every fee-earner, linked to the organisation, with their credentials, specialisms, and roles explicitly declared. It means Service schema that maps your practice areas to recognised taxonomies. It means FAQPage schema where appropriate, Review schema where applicable, and Article schema on every piece of content you publish. This markup is invisible to human visitors. It exists purely for machines. And it is the single most direct way to tell an AI model what your firm is and what it does.
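
The "linked to the organisation" part is the piece most often missed, so here is a minimal sketch of it in Python. It assumes a hypothetical firm and fee-earner; `ORG_ID`, `person_node`, `firm_graph`, and the URLs are invented for illustration. The Schema.org terms (`Person`, `jobTitle`, `worksFor`, `knowsAbout`) and the JSON-LD keywords (`@id`, `@graph`) are real.

```python
import json

# Placeholder identifier for the firm; every Person node points back to it.
ORG_ID = "https://example-firm.co.uk/#organization"


def person_node(slug: str, name: str, job_title: str, specialisms: list[str]) -> dict:
    """Build a Person node whose worksFor references the firm by @id."""
    return {
        "@type": "Person",
        "@id": f"https://example-firm.co.uk/people/{slug}#person",
        "name": name,
        "jobTitle": job_title,
        "worksFor": {"@id": ORG_ID},
        "knowsAbout": specialisms,
    }


def firm_graph(people: list[dict]) -> dict:
    """Combine the firm and its people into a single JSON-LD graph."""
    firm = {"@type": "LegalService", "@id": ORG_ID, "name": "Example Firm LLP"}
    return {"@context": "https://schema.org", "@graph": [firm, *people]}


graph = firm_graph(
    [person_node("jane-smith", "Jane Smith", "Partner", ["Commercial lease disputes"])]
)
print(json.dumps(graph, indent=2))
```

Because the person references the firm by identifier rather than by repeating its name, the relationship stays intact even if the display name changes somewhere else on the site.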

The heading structure of every page needs to follow a logical hierarchy. An H1 that states the page’s primary topic. H2s that break the content into clearly defined subtopics. H3s where further subdivision is needed. This is not a design preference - it is the scaffolding that allows a model to parse a page’s content into discrete, citable facts. A page that uses heading tags for visual styling rather than semantic structure is a page that a model will struggle to decompose into useful information.
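
A hierarchy like this is easy to check mechanically. The sketch below uses Python's standard `html.parser` module to flag two common faults: a page without exactly one H1, and a heading that skips a level. The names `HeadingCollector` and `heading_issues` are made up for this example, and the two rules are a deliberately simple reading of "logical hierarchy", not a complete audit.

```python
from html.parser import HTMLParser


class HeadingCollector(HTMLParser):
    """Record the level of every h1-h6 tag in document order."""

    def __init__(self) -> None:
        super().__init__()
        self.levels: list[int] = []

    def handle_starttag(self, tag, attrs):
        if len(tag) == 2 and tag[0] == "h" and tag[1] in "123456":
            self.levels.append(int(tag[1]))


def heading_issues(html: str) -> list[str]:
    """Return human-readable problems with the page's heading structure."""
    collector = HeadingCollector()
    collector.feed(html)
    levels = collector.levels

    issues = []
    if levels.count(1) != 1:
        issues.append(f"expected exactly one h1, found {levels.count(1)}")
    # Moving deeper by more than one level at a time breaks the outline.
    for prev, cur in zip(levels, levels[1:]):
        if cur > prev + 1:
            issues.append(f"h{prev} followed by h{cur} skips a level")
    return issues


page = "<h1>Commercial lease disputes</h1><h4>Our approach</h4>"
print(heading_issues(page))
```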

Internal linking needs to create explicit relationships between entities. Your people pages should link to the service pages for their specialisms. Your service pages should link to relevant case studies. Your case studies should link back to the people and services involved. These links are not navigational conveniences for human users - they are the connective tissue that allows a model to build a graph of relationships within your firm. Without them, every page is an island, and the model has no basis for understanding that your commercial property team and your construction disputes team share expertise that is relevant to a developer client.
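
The same idea can be expressed as a small graph check. The following sketch assumes you already have a map of each page to the internal pages it links to (the paths here are invented), and reports pages with no internal links in or out, plus links that are not reciprocated. `find_islands` and `missing_backlinks` are hypothetical helper names.

```python
def find_islands(links: dict[str, set[str]]) -> list[str]:
    """Pages with no outbound internal links and no inbound ones."""
    linked_to = set().union(*links.values())
    return sorted(
        page for page, targets in links.items()
        if not targets and page not in linked_to
    )


def missing_backlinks(links: dict[str, set[str]]) -> list[tuple[str, str]]:
    """(source, target) pairs where the target does not link back to the source."""
    return sorted(
        (source, target)
        for source, targets in links.items()
        for target in targets
        if source not in links.get(target, set())
    )


links = {
    "/people/jane-smith": {"/services/commercial-property"},
    "/services/commercial-property": {"/case-studies/retail-lease"},
    "/case-studies/retail-lease": {"/people/jane-smith", "/services/commercial-property"},
    "/services/construction-disputes": set(),
}

print(find_islands(links))
print(missing_backlinks(links))
```

In this made-up site, the construction disputes page is an island, and the commercial property page never links back to the person who leads it.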

Content needs to be atomically structured. This means that each page should address a clearly defined topic and provide a substantive, self-contained treatment of it. Pages that try to cover everything - the classic “our services” mega-page that lists twenty practice areas in two sentences each - give the model nothing specific enough to cite. A page dedicated to “commercial lease disputes for retail tenants” is citable. A page that mentions commercial lease disputes as one item in a list of forty services is not.

And all of this needs to load fast. Not fast by the standards of 2018, when three seconds was considered acceptable. Fast by the standards of a crawler with a time budget measured in milliseconds per page. Sub-second fully rendered load times. Optimised images. Minimal render-blocking resources. Server-side rendering or static generation where possible. Edge caching. The technical performance of your site is not a nice-to-have that affects bounce rates by a fraction of a percent. It is a gating factor on whether AI systems ever process your content in the first place.
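
Of those items, render-blocking resources are among the easiest to spot from the HTML alone. The sketch below, again using Python's standard `html.parser`, lists external scripts in the document head that have neither `async` nor `defer` and are not ES modules. It is a narrow, assumed check (it ignores stylesheets and inline scripts), and `BlockingScriptFinder` and `blocking_scripts` are names invented for this example.

```python
from html.parser import HTMLParser


class BlockingScriptFinder(HTMLParser):
    """Collect external scripts in <head> that block HTML parsing."""

    def __init__(self) -> None:
        super().__init__()
        self.in_head = False
        self.blocking: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag == "head":
            self.in_head = True
        elif tag == "script" and self.in_head:
            attr = dict(attrs)  # boolean attributes arrive with a value of None
            non_blocking = (
                "async" in attr or "defer" in attr or attr.get("type") == "module"
            )
            if "src" in attr and not non_blocking:
                self.blocking.append(attr["src"])

    def handle_endtag(self, tag):
        if tag == "head":
            self.in_head = False


def blocking_scripts(html: str) -> list[str]:
    finder = BlockingScriptFinder()
    finder.feed(html)
    return finder.blocking


page = (
    "<html><head>"
    '<script src="/analytics.js"></script>'
    '<script src="/menu.js" defer></script>'
    '<script async src="/chat.js"></script>'
    "</head><body>"
    '<script src="/footer.js"></script>'
    "</body></html>"
)
print(blocking_scripts(page))
```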

Why this is an architecture decision, not a marketing one

The reason most professional services firms are approaching AI search incorrectly is that they have categorised it as a marketing problem. Marketing problems get solved by marketing people using marketing tools - content, campaigns, channels, messaging. AI search legibility is not a marketing problem. It is an infrastructure problem, and it requires infrastructure decisions made by people who understand web architecture.

This does not mean marketing is irrelevant. The content still matters. The messaging still matters. The quality of the thinking that a firm publishes still matters, arguably more than ever, because when a model cites your firm as a source it is lending its own credibility to your expertise. But the content is the second layer. The first layer is whether the content is legible to the systems that might cite it, and that layer is pure engineering.

The firms that will win the AI search transition are not the ones with the best content calendars. They are the ones whose technical teams - whether in-house or through a partner - have made deliberate architectural decisions to maximise machine legibility. Schema markup implemented comprehensively and maintained as the site evolves. Entity consistency enforced through a content model rather than left to the discretion of individual authors. Page performance treated as a first-class engineering concern rather than a nice-to-have that gets deprioritised when a new feature request arrives.

This is the work we do at TRUSTED MARKETING. Not because AI search optimisation is a fashionable service to offer, but because the underlying discipline - building websites that are structurally sound, semantically rich, and technically excellent - is what we have always done. The arrival of AI search has not changed what good web architecture looks like. It has made the consequences of bad web architecture more visible, more immediate, and more expensive.

The window is still open

There is a timing dimension to this that is worth stating directly. AI search is new enough that most professional services firms have not yet addressed it seriously. The competitive landscape is sparse. A firm that gets its architecture right now, while competitors are still debating whether to hire an AEO consultant or add it to the existing agency’s brief, will build a structural advantage that compounds over time.

The systems behind AI answers appear to learn which sources are reliable partly through repeated successful extraction. A site that consistently provides clean, structured, accurate information becomes a preferred source. A site that is difficult to parse gets deprioritised in favour of sources that are easier to work with. This creates a flywheel effect - early legibility leads to more frequent citation, which leads to more frequent crawling, which leads to more comprehensive indexing, which leads to even more citation. The firms that move first do not just get a temporary advantage. They get a compounding one.

The window will not stay open indefinitely. As the consultants and agencies catch up, as the advice improves, as the tools become more accessible, more firms will address their technical infrastructure. The advantage of moving early will diminish. But right now, for most professional services markets, the competitive bar for AI search legibility is remarkably low. The opportunity is not to do something clever. It is to do the structural work that almost nobody has done yet, and to do it properly.

The firms that are invisible to AI search today are not invisible because they lack expertise or reputation. They are invisible because their websites are built in a way that machines cannot easily read. That is not a content gap. It is an engineering gap. And it is eminently fixable - if you understand what needs fixing and why.
