Getting Agent-Ready with Symfony | Vindle

The ongoing large-scale adoption of AI is rapidly changing the way many people interact with the web, so websites and web apps must adapt to continue to thrive. In this article I will go over some improvements you can make to your Symfony application to start making it agent-ready.

Since I want to cover multiple things in this article, I will not go into great detail on each of them, so consider it an introduction to agent-readiness and these concepts. I did, however, write a small bundle for one of the concepts, and for the last one I wrote a basic implementation prompt, which can be found in the last chapter.

A great resource that will likely help you on this journey and going forward is isitagentready.com. This tool gives a good overview of the improvements you can make to your site and is in part what this article is based on.

SEO Fundamentals

Before getting into any agent-specific improvements, it is best to first glance over some SEO fundamentals.

The reason for this is that most things we consider SEO improvements are really machine-readability and discoverability improvements, and it just so happens that agents are machines too. That is why I would suggest getting your SEO fundamentals on point first: they not only help with SEO, they are also a nice baseline from which to start getting agent-ready.

There's a good chance you already have (most of) these in place, but if you don't, I'd highly recommend you look at them first before focusing on agents. I will not go into detail on these. If you do want more information on them, you can find plenty online, or ask your agent.

Security: make sure your site is secure and served over HTTPS.

Pages must be crawlable: requests should return the correct status codes and pages should not be set to noindex.

robots.txt: you should have this set up, including the Sitemap directive.

Sitemap: a proper XML sitemap that contains all the links you want to be crawled and indexed.

Inform indexers: set up Google Search Console and Bing Webmaster Tools and submit your sitemap. Also check these tools occasionally for any indexing warnings.

Canonical URLs: make sure each page specifies a canonical URL using rel="canonical".

Proper titles and descriptions: make sure each page has a proper title and description.

Good semantic HTML: use elements like nav, main, footer, h1 etc. where appropriate.

Structured data: implement structured data where it fits, for example BreadcrumbList, Product or FAQPage.

Fast, mobile-friendly and accessible: your site should be fast, accessible and work on all devices.

Content Signals

Content signals are the simplest improvement you can make. Content-Signal is a proposed robots.txt directive meant to communicate what you permit AI to do with content from your site. A simple example:

User-Agent: *
Content-Signal: ai-train=no, search=yes, ai-input=yes
Allow: /

What each of the categories means when set to yes:

ai-train: your content may be used for training or fine-tuning AI models.

search: your content may be used for building a search index and providing search results. This mostly relates to traditional search indexes.

ai-input: your content may be used as input for AI models. This mainly means things like AI web search tools and does not include training.

Generally you'll want search and ai-input to be yes for the purpose of indexing in search and for use by AI through web search tools. I would consider these two standard for most websites, even if you don't want AI to train on your content.

ai-train is more personal and project dependent. The obvious con is that AI may copy your ideas, content or style. But if your site is more commercial and features your brand name in the text, having AI train on it could be a positive.

It is also possible to apply specific content signals to specific URLs. More information on this is best found on contentsignals.org.

Markdown Negotiation

Markdown negotiation is the ability for agents to request your pages as text/markdown instead of HTML, using the Accept: text/markdown request header.
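To make the negotiation concrete, here is a minimal sketch of the check in plain PHP. The function name prefersMarkdown and the rule it applies (Markdown wins when it is listed explicitly and ranked at least as high as text/html) are my own assumptions, not part of any standard or of Symfony itself. Wildcards are ignored on purpose, so regular browsers keep getting HTML.

```php
<?php

declare(strict_types=1);

// Hypothetical helper: returns true when the Accept header explicitly lists
// text/markdown with a quality that is non-zero and at least as high as the
// quality given to text/html. Wildcard entries never count as a request
// for Markdown.
function prefersMarkdown(?string $acceptHeader): bool
{
    $quality = [];

    foreach (explode(',', $acceptHeader ?? '') as $part) {
        // Split "text/markdown;q=0.8" into the media type and its parameters.
        $params = array_map('trim', explode(';', $part));
        $type = strtolower((string) array_shift($params));
        if ($type === '') {
            continue;
        }

        $q = 1.0; // default quality when no q parameter is given
        foreach ($params as $param) {
            if (preg_match('/^q\s*=\s*([0-9.]+)$/i', $param, $m) === 1) {
                $q = (float) $m[1];
            }
        }

        $quality[$type] = $q;
    }

    $markdown = $quality['text/markdown'] ?? 0.0;
    $html = $quality['text/html'] ?? 0.0;

    return $markdown > 0.0 && $markdown >= $html;
}
```

In a Symfony controller you could pass it the raw header, for example prefersMarkdown($request->headers->get('Accept')).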

The reason we'd want to implement this is that agents speak Markdown very well, far better than HTML, and it is also much more token efficient. In a lot of cases the HTML you would otherwise return gets turned into Markdown anyway before an AI sees it. If you create the Markdown yourself, you have a lot more control over the result.

There are two general ways to implement this:

Explicit markdown responses: checking in each route if the request prefers markdown and returning an appropriate response.

Centralized HTML-to-Markdown conversion: turning the HTML returned by routes into Markdown centrally.

These two complement each other well. For the most control you'll generally want explicit Markdown responses on your important pages, and then fall back to centralized HTML-to-Markdown conversion for pages that are less important.
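As a rough sketch of how the two approaches can live together, the subscriber below converts HTML responses centrally, but leaves any response alone that a controller already returned as Markdown. Treat it as an illustration under assumptions rather than a finished implementation: the class name MarkdownFallbackSubscriber is hypothetical, it assumes Symfony 5.3+ on PHP 8, and it assumes the league/html-to-markdown package is installed for the conversion.

```php
<?php

declare(strict_types=1);

namespace App\EventSubscriber;

use League\HTMLToMarkdown\HtmlConverter;
use Symfony\Component\EventDispatcher\EventSubscriberInterface;
use Symfony\Component\HttpFoundation\AcceptHeader;
use Symfony\Component\HttpKernel\Event\ResponseEvent;
use Symfony\Component\HttpKernel\KernelEvents;

// Hypothetical subscriber: central HTML-to-Markdown fallback for agents.
final class MarkdownFallbackSubscriber implements EventSubscriberInterface
{
    public static function getSubscribedEvents(): array
    {
        return [KernelEvents::RESPONSE => 'onResponse'];
    }

    public function onResponse(ResponseEvent $event): void
    {
        if (!$event->isMainRequest()) {
            return;
        }

        $request = $event->getRequest();
        $response = $event->getResponse();

        // One URL can now have two representations, so caches must key on Accept.
        $response->setVary('Accept', false);

        // Only act when text/markdown is listed explicitly (not via a wildcard)
        // and ranked at least as high as text/html.
        $accept = AcceptHeader::fromString($request->headers->get('Accept'));
        if (!$accept->has('text/markdown')) {
            return;
        }
        $markdownQ = $accept->get('text/markdown')?->getQuality() ?? 0.0;
        $htmlQ = $accept->has('text/html') ? ($accept->get('text/html')?->getQuality() ?? 0.0) : 0.0;
        if ($markdownQ <= 0.0 || $markdownQ < $htmlQ) {
            return;
        }

        // Explicit Markdown responses and other non-HTML responses pass through untouched.
        // A missing Content-Type is treated as HTML, Symfony's default.
        $contentType = strtolower((string) $response->headers->get('Content-Type', 'text/html'));
        if (!str_starts_with($contentType, 'text/html')) {
            return;
        }

        // Skip error pages and streamed or file responses, which have no string content.
        $content = $response->getContent();
        if ($content === false || !$response->isSuccessful()) {
            return;
        }

        $converter = new HtmlConverter(['strip_tags' => true, 'remove_nodes' => 'script style']);
        $response->setContent($converter->convert($content));
        $response->headers->set('Content-Type', 'text/markdown; charset=UTF-8');
    }
}
```

With this in place, a controller that returns its own text/markdown response keeps full control over the output, while every other HTML page gets a best-effort conversion. Converting the full page also converts navigation and footers, so for important pages the explicit route will usually give cleaner results.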

Initially in this article i was going to...
