Tom Cranstoun on stage at CMS Summit 26 held in Frankfurt
Have you ever asked your favourite AI assistant for travel advice?
I did. I asked mine a simple thing: get me from Rovinj to Malaga on 16 June.
The answer was completely wrong.
It gave me a ferry that did not run on that date, a bus to the wrong airport, and a flight from an airport that the bus did not serve. The timings overlapped. The whole thing read like an itinerary. None of it would get me to Spain.
That failure, more than any pitch, explains a growing problem with the web. The information already existed. Ferry schedules existed. Flight schedules existed. Train schedules existed. Yet the system could not reliably assemble those facts into a journey that would actually work. The problem was not the absence of information. It was that the information was not available in a form that could be interpreted, verified, and trusted across systems.
As agents become readers of the web, that distinction matters. Content can no longer be published only for human eyes. It has to say what it means, where it comes from, how long it remains true, and whether it can be trusted.
The answer that could not happen
Look at what the plan actually stitched together. A ferry that was out of season. A bus that ended at one airport while the flight left from another. Connection times that ran backwards. Each fact, on its own, sounded plausible. Together they described a journey no passenger could take.
There was no obvious lie in it. The assistant combined fragments, inferred relationships, and presented the result fluently. The problem was not simply that the model made a mistake. The deeper problem was that the web gave it too much room to guess.
A transport network is not a list of facts. It is a system of constraints. Timetables, route limits, airport catchments, seasonal services, last-train cut-offs, and transfer windows all matter. A person reads "operates Monday, Wednesday, Friday" or "seasonal service from 20 June" and understands the consequence. A machine may read the same words without reliably knowing what action follows.
That is where web access alone is not enough. Search can retrieve fragments. Browsing can expose pages. Neither solves the problem when the meaning is implicit, visual, incomplete, or buried behind scripts and forms.
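To make "a system of constraints" concrete, here is a minimal sketch of the checks the broken itinerary would have failed. It assumes each leg arrives as a small record with an operating season, a place, and a time. The Leg fields, the check_itinerary function, the placeholder places (Port A, Airport X, Airport Y), and every time shown are invented for illustration and are not real schedules.

```python
from dataclasses import dataclass
from datetime import date, datetime, timedelta


@dataclass(frozen=True)
class Leg:
    """One leg of a journey. Every field name here is an illustrative assumption."""
    mode: str
    origin: str
    destination: str
    departs: datetime
    arrives: datetime
    season_start: date  # first day the service operates
    season_end: date    # last day the service operates


def check_itinerary(legs: list[Leg], min_transfer: timedelta) -> list[str]:
    """Return every constraint violation found; an empty list means none was found."""
    problems: list[str] = []

    # Each leg must run on the day it is used.
    for leg in legs:
        day = leg.departs.date()
        if not (leg.season_start <= day <= leg.season_end):
            problems.append(f"{leg.mode} does not run on {day.isoformat()}")

    # Each connection must join up in place and in time.
    for prev, nxt in zip(legs, legs[1:]):
        if prev.destination != nxt.origin:
            problems.append(
                f"{prev.mode} ends at {prev.destination}, {nxt.mode} leaves from {nxt.origin}"
            )
        elif nxt.departs - prev.arrives < min_transfer:
            problems.append(f"{nxt.mode} leaves before the {prev.mode} connection can be made")

    return problems


# Placeholder places and times, not real schedules.
all_year = (date(2026, 1, 1), date(2026, 12, 31))
legs = [
    Leg("ferry", "Rovinj", "Port A",
        datetime(2026, 6, 16, 8, 0), datetime(2026, 6, 16, 11, 0),
        date(2026, 6, 20), date(2026, 9, 30)),
    Leg("bus", "Port A", "Airport X",
        datetime(2026, 6, 16, 10, 30), datetime(2026, 6, 16, 12, 0), *all_year),
    Leg("flight", "Airport Y", "Malaga",
        datetime(2026, 6, 16, 14, 0), datetime(2026, 6, 16, 17, 0), *all_year),
]

problems = check_itinerary(legs, min_transfer=timedelta(minutes=30))
for problem in problems:
    print(problem)
```

None of these checks is clever. Each is only possible once the season, the place, and the time are declared values rather than words on a page.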
The harder question
Then I asked the harder one: get me home from Malaga to Leeds, all the way by rail in one day.
A clean answer exists, and it is not the one a model tends to reach for. You cannot do it in a day. Not in 2026, not with any plausible combination of high-speed services. The reasons are physical and operational.
The exit from Spain into France is a bottleneck. The fast trains north have a long leg to cover, and the last workable connections reach Paris too late for the final Eurostar services. Even if you leave Malaga on the first train at dawn, you reach the French connection too late to cross to London and continue to Leeds the same day.
So the fastest all-rail route is two days, not one.
Day one: Malaga north to Madrid, across to Barcelona, then the long leg to Paris, arriving late. Stay the night.
Day two: Paris to London in the morning, then London to Leeds.
Treat that outline as directional, not as a booking. The real times live with the operators. The issue is that today those times, constraints, validity periods, and connections are often not available in a form a machine can read and trust end to end.
Why bigger models will not fix this
Scaling the model does not fix the web it reads.
More parameters, better retrieval, more tool use, and more access to live pages can all help. They do not change the underlying content. If the facts are implicit, visual, inconsistent, or never declared, the system still has to infer too much.
That matters because inference is where the journey breaks. The model reaches past the evidence, connects incompatible facts, and fills the silence with something that sounds right. The result is not random nonsense. It is plausible nonsense.
The defect is not only in the model. It is also in the publishing environment around it.
The data already exists, but it does not travel
The facts the assistant needed were not exotic. Many are already described by open standards. Schema.org can express flights, trips, schedules, and locations. Public transport has used machine-readable timetable formats for years. The vocabulary to state that a service does not run on a given date, or that one leg does not connect with another, already exists.
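As a rough illustration of what "declared" can look like, the sketch below builds a Schema.org Flight description as JSON-LD and answers one question from it: which airport does this flight leave from? The type and property names (Flight, Airport, departureAirport, arrivalAirport, iataCode, departureTime, arrivalTime, flightNumber) come from Schema.org. The flight number, the departure airport, its code, and the times are placeholders, and the departs_from helper is a hypothetical name.

```python
import json

# Placeholder values throughout; only the Schema.org type and property names are real.
flight = {
    "@context": "https://schema.org",
    "@type": "Flight",
    "flightNumber": "XX123",
    "departureAirport": {"@type": "Airport", "name": "Example Airport", "iataCode": "XXX"},
    "arrivalAirport": {"@type": "Airport", "name": "Malaga Airport", "iataCode": "AGP"},
    "departureTime": "2026-06-16T14:00:00+02:00",
    "arrivalTime": "2026-06-16T17:00:00+02:00",
}


def departs_from(flight_doc: dict, iata_code: str) -> bool:
    """True if the declared departure airport matches the given IATA code."""
    return flight_doc["departureAirport"]["iataCode"] == iata_code


# Serialise as it might be published, then read it back as an agent would.
flight_jsonld = json.dumps(flight, indent=2)
published = json.loads(flight_jsonld)
print(departs_from(published, "XXX"))
print(departs_from(published, "YYY"))
```

With the airport stated as a code, "does the bus reach the airport the flight leaves from" becomes a comparison instead of a guess.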
What is missing is consistent declaration, provenance, and trust.
A schedule buried in a rendered table is not the same as a declared schedule. A claim with no source, validity period, or authority is not a trustworthy claim. A machine left to scrape and interpret has to hope that the page is current, complete, and meant to be used in the way it is using it.
This is the gap MX is designed to address. MX is metadata that records a file's provenance, context, and intended use, and travels with the file. Where Schema.org, transit feeds, and other standards already meet the need, MX should defer to them rather than duplicate them. The aim is not to reinvent the web's existing vocabularies. It is to make content state, plainly and portably, what it is and how it should be used.
Machine-readable content is one problem. Machine-trustworthy content is another.
A machine may be able to parse a date, a location, or a fare. It still needs to know whether that information is current, authoritative, and valid for the action being taken. REGINALD, the MX registry, is intended to address that second challenge: not just what a file says, but whether the system reading it has reason to trust it.
The page that outlived its service
Checking the ferries for myself, I opened one operator's own booking page. It was there. It looked live. I clicked "Book a trip", and the calendar opened with nothing in it. No sailings. No dates. An empty grid where a timetable should be.
So I went looking. The operator had gone bust. The company was gone, and the page it left behind still presented itself as a place where you could buy a ticket.
A person frowns at the empty calendar and works out what happened. A machine reads a booking page, sees a booking widget, and reports a ferry.
Nothing on that page said it was out of date. That is the whole defect. A timetable has an expiry. A price has an expiry. A booking page for a service that no longer runs has expired in the most complete way there is, and it announces none of it.
MX can carry that fact. A page, PDF, timetable, or file can state the date after which it should no longer be treated as authoritative. An agent that reads that signal can stop treating the schedule as current, rather than planning a sailing that will never depart.
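A minimal sketch of how an agent might act on such a signal follows. It assumes the file's metadata has already been parsed into a dictionary and that the expiry is an ISO date under a key called valid_until. That key name, the validity_status function, and the dates are illustrative assumptions, not MX field names.

```python
from datetime import date


def validity_status(metadata: dict, today: date) -> str:
    """Classify a file as 'current', 'expired', or 'undeclared'.

    'valid_until' is a hypothetical key used for illustration only.
    """
    raw = metadata.get("valid_until")
    if raw is None:
        # Nothing was declared, so the agent cannot assume the page is current.
        return "undeclared"
    expiry = date.fromisoformat(raw)
    return "current" if today <= expiry else "expired"


today = date(2026, 6, 16)
print(validity_status({"valid_until": "2025-10-31"}, today))
print(validity_status({"valid_until": "2026-09-30"}, today))
print(validity_status({}, today))
```

Keeping "undeclared" separate from "expired" is a deliberate choice in this sketch: the abandoned booking page falls into the first group, and an agent can treat that silence as a reason for caution rather than as permission.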
The information already exists
Here is the part that should frustrate anyone responsible for publishing information on the web: the information is often already there.
Modern content platforms routinely hold review dates, expiry dates, owners, workflow status, approval history, and unpublish schedules. Organisations maintain operational knowledge about what content is current, who owns it, when it should be reviewed, and when it should stop being trusted.
Then, too often, that knowledge stays inside the tool.
It does not reach the rendered page in a form a machine can read. It does not travel with the PDF. It does not tell an agent whether a page is current, reviewed, superseded, or expired.
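One way that knowledge could leave the tool is sketched below: a small function that copies review and expiry dates from a content record into a JSON-LD block for the rendered page. The Schema.org properties used (lastReviewed on WebPage, expires and dateModified on CreativeWork) are real. The record's field names, the page_jsonld function, the URL, and the dates are assumptions standing in for whatever a given platform actually stores.

```python
import json

# Hypothetical CMS field names mapped to real Schema.org properties.
FIELD_MAP = {
    "review_date": "lastReviewed",
    "expiry_date": "expires",
    "modified_date": "dateModified",
}


def page_jsonld(record: dict) -> str:
    """Render the dates a CMS record holds as a JSON-LD script element."""
    doc = {"@context": "https://schema.org", "@type": "WebPage", "url": record["url"]}
    for cms_field, schema_property in FIELD_MAP.items():
        value = record.get(cms_field)
        if value:  # emit only what the platform actually knows
            doc[schema_property] = value
    # Escape "</" so the JSON cannot close the script element early.
    body = json.dumps(doc).replace("</", "<\\/")
    return f'<script type="application/ld+json">{body}</script>'


record = {
    "url": "https://example.com/offers/spring",
    "review_date": "2026-03-01",
    "expiry_date": "2026-04-07",
}
snippet = page_jsonld(record)
print(snippet)
```

Nothing here asks editors for new information. It only exposes dates the platform already holds.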
That is no longer a small operational issue. If machines are becoming readers that act on what they find, then provenance, validity, and intended use become part of the content itself.
In practice, the problem is everywhere. An Easter offer is still live in June. A Christmas advert sits on a landing page two years later. A press release for an event that has been and gone remains indexed as if it were current. Sometimes the expiry date existed and nobody acted on it. Often, the date was never set. Either way, the page kept being served, and a machine reading it had no way to know it was out of season.
The discipline is old. What is new is the reason for making it machine-readable. The machine is no longer just indexing the page. It may be acting on it.
From guessing to reasoning
MX changes the assistant's job from guessing from prose to reasoning over declared facts.
With declared and verifiable constraints, the dates on which the ferry does not run are stated facts, not inferences. The last northbound train is a published constraint, not a gamble. The airport a flight leaves from is written down next to the bus that does, or does not, reach it.
The assistant stops stitching plausible fragments together because it no longer has to. It reads what the operator declared, checks whether the declaration can be trusted, and reasons from there.
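That sequence (read the declaration, check whether it can be trusted, then reason) can be sketched as a filter that runs before any planning. Everything named here is hypothetical: the DeclaredFact record, the usable_facts function, the publisher names, and the plain set of trusted publishers, which stands in for a registry lookup and does not describe how REGINALD works.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class DeclaredFact:
    """A statement plus the context needed to decide whether to rely on it."""
    statement: str
    publisher: str
    valid_until: date


def usable_facts(
    facts: list[DeclaredFact], trusted_publishers: set[str], today: date
) -> list[DeclaredFact]:
    """Keep only facts from a trusted publisher that are still within their validity."""
    return [
        fact
        for fact in facts
        if fact.publisher in trusted_publishers and today <= fact.valid_until
    ]


# Placeholder publishers and statements, not real operators or schedules.
facts = [
    DeclaredFact("Ferry runs daily", "defunct-ferries.example", date(2024, 9, 30)),
    DeclaredFact("Bus terminates at Airport X", "bus-operator.example", date(2026, 12, 31)),
    DeclaredFact("Flight departs Airport Y", "unknown-aggregator.example", date(2026, 12, 31)),
]
trusted = {"defunct-ferries.example", "bus-operator.example"}

usable = usable_facts(facts, trusted, today=date(2026, 6, 16))
for fact in usable:
    print(fact.statement)
```

Only what survives the filter reaches the planner, so an expired or unverifiable claim is dropped instead of being woven into a fluent itinerary.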
That is the difference between an agent guessing from text and an agent working from structured, verifiable, machine-declared facts.
The mess was a demonstration
The broken itinerary was not just an amusing AI failure. It was a live demonstration of what happens when a machine is handed a web built for human eyes and asked to act on it.
The model predicted, inferred, approximated, and filled the gaps confidently. The gaps were there because the web is still too implicit, too visual, and too inconsistent for systems that need to make decisions from it.
The web has spent thirty years optimising content for human readers. We are now entering a period in which machines are increasingly readers as well.
The itinerary that never existed suggests a simple question: can a machine understand what we publish, and can it trust what it finds?
