Optimisation strategies for ChatGPT and other LLMs
Remember. Rethink. Implement.
Can you influence whether your own company, its products or its services are mentioned by ChatGPT? Ideally in such a way that it generates more traffic for your company website? In other words, is SEO for ChatGPT possible?
Bear in mind that ChatGPT and comparable LLMs such as Claude are more than just search engines. Therefore, this type of optimisation could also be called something else to avoid confusion. Olaf Kopp, for example, calls it LLMO, i.e. Large Language Model Optimisation.
The general functionality of LLMs is well known. However, details about how particular systems handle search queries are not. For LLMO, it would be important to know exactly how the systems arrive at results that link to external websites, and how these results can be influenced.
There are currently no reliable answers. Nevertheless, it is worthwhile for companies to tackle the issue now. Our translation agency DialogTicket wants to help you get started by discussing possible approaches.
The most popular large language models
ChatGPT from OpenAI is currently the market leader, by a wide margin. However, this can change quickly. Unfortunately, there seems to be no reliable data on the number of requests each system receives, which is why some people try to measure popularity indirectly. For example, they check how often the login page of the respective system is searched for on Google. This is how the marketing agency Definition does it, arriving at the following results for May 2023 to April 2024:
| Rank | LLM Keyword | USA – Average monthly search volume |
|---|---|---|
| 1 | ChatGPT login | 823,000 |
| 2 | Gemini login | 4,100 |
| 3 | Copilot login | 880 |
| 4 | Claude login | 590 |
| 5 | Mistral login | 10 |
| 6 | Llama login | 10 |
| Rank | LLM Keyword | Germany – Average monthly search volume |
|---|---|---|
| 1 | ChatGPT login | 165,000 |
| 2 | Gemini login | 350 |
| 3 | Copilot login | 110 |
| 4 | Claude login | 30 |
| 5 | Llama login | 10 |
| 6 | Mistral login | 10 |
Three reasons why classic SEO alone is no longer enough
From unnecessary links to search queries without search engines.
SEO, i.e. the optimisation of content for search engines, has always been a cat-and-mouse game. Companies like Google originally wanted to provide searchers with exactly the information they were looking for. Marketers tried to design their content in such a way that it was more likely to be returned as a match for a search query.
This optimisation was not always in the interests of the searcher. Content not matching the search intent repeatedly found its way to the top of the SERPs (search engine results pages). This in turn prompted Google to expand the criteria for what constitutes good content and change its algorithms accordingly. Marketers adapted and the game started all over again.
This dynamic was first seriously shaken when Google and other search engines began to display the answers to search queries directly on the results page. Searchers had to click through to a website less and less often, and visitor numbers on many company websites declined. A problem for blogs as well as for content marketing.
Recently, Google seems to be increasingly listing organic results that are less relevant to the query – perhaps deliberately. Why? Perhaps because searchers are more likely to click on paid ads if they seem to match better than organic results. That would mean more revenue for Google, after all. Search engines may look more and more like advertising platforms while becoming less and less suitable for research.
Finally, the search landscape is more diverse today. Searches are increasingly being conducted on platforms such as Amazon, YouTube and TikTok. More recently, searches via large language models (e.g. ChatGPT Search) have been added, in some cases built directly into other applications (e.g. Copilot). True, only around 4.33% of all search queries currently run via ChatGPT, but if the current trend continues, it would overtake Google within four years.
Thus, it’s worth thinking about strategies for dealing with this change as effectively as possible. One problem: large language models are currently still a terra incognita for SEO. However, even though the systems are young and changing rapidly, we can reflect on the unknowns and give them some structure.
Cross-platform share of search volume
The following is an estimate of the share of search volume in October 2024 across a selection of platforms. Note that the figures for social media are understated, as the estimate is based solely on desktop and mobile browser usage and does not include search queries made within the respective apps. Keep in mind that ChatGPT has only had a search function sophisticated enough to seriously compete with Google since December 2024.
How to become part of the ChatGPT data set
Widely distributed and popular.
Many large language models were trained on data sets built around the so-called Common Crawl, a very large corpus of billions of crawled web pages, supplemented by books and articles. In order to improve the systems, they are retrained at regular intervals with ever larger data sets and ever more computing power.
Regularly, but not often. This is because training is very, very expensive. While GPT-3 cost just over four million dollars to train, the costs for GPT-4 already totalled over 60 million. It is estimated that GPT-5 will cost over a billion dollars – not including personnel and research costs.
One approach to LLMO (Large Language Model Optimisation) would be to ensure that your company, brand, product or service is mentioned on web pages that are very likely to be included in the next training data set. How? Some domains are known to be part of the Common Crawl data set. These include sites such as Wikipedia, YouTube and Medium. (By the way, it’s important to include transcriptions or subtitles for video content, as text files are more likely to be included than video files due to their smaller size.) However, due to the cost and effort of training LLMs and the time that passes between substantial updates, optimisations based on this approach must take a long-term view.
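If you want to check whether pages from your own domain already appear in a Common Crawl snapshot, the public CDX index at index.commoncrawl.org can be queried directly. Below is a minimal sketch in Python; the crawl label, domain and result limit are placeholders you would have to replace with a current crawl and your own site.

```python
import json
import requests

# Query the public Common Crawl CDX index for pages under a given domain.
# The crawl label below is only an example; the list of available crawls
# is published at https://index.commoncrawl.org/.
CRAWL = "CC-MAIN-2024-51"   # assumed crawl label, replace with a current one
DOMAIN = "example.com"      # replace with your own domain

def pages_in_common_crawl(domain: str, crawl: str, limit: int = 5) -> list[dict]:
    """Return up to `limit` index records for URLs under the given domain."""
    resp = requests.get(
        f"https://index.commoncrawl.org/{crawl}-index",
        params={"url": f"{domain}/*", "output": "json", "limit": limit},
        timeout=30,
    )
    if resp.status_code == 404:  # the index answers 404 if nothing was captured
        return []
    resp.raise_for_status()
    # The API returns one JSON object per line.
    return [json.loads(line) for line in resp.text.splitlines() if line]

if __name__ == "__main__":
    records = pages_in_common_crawl(DOMAIN, CRAWL)
    if not records:
        print(f"No captures of {DOMAIN} in {CRAWL}")
    for record in records:
        print(record.get("url"), record.get("status"))
```

Being present in one crawl is of course no guarantee of ending up in a training data set, but absence is a clear signal that there is groundwork to do.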
The good news is that traditional search engines also regard it as a sign of quality or trust if your brand, products or services are mentioned in many places. SEO and LLMO overlap at this point.
The bad news is that large language models operate, at their core, on statistical frequencies. Based on the words in the prompt, the systems calculate what the most likely next token or word is. The more frequently words appear close together in the data set, the more likely it is that they will appear together in the answer. However, nobody knows where the threshold lies. Is it enough if your brand or product is mentioned on two or three larger pages? Presumably, as in traditional SEO, it depends on how present your competitors are in the data set in similar contexts.
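To make the frequency argument concrete, here is a deliberately simplified bigram sketch in Python. Real LLMs use neural networks over enormous corpora rather than raw counts, and the competitor brand in the toy corpus is made up, but the principle is the same: the more often two words co-occur in the data, the more probable the continuation.

```python
from collections import Counter, defaultdict

# Toy illustration of the statistical principle described above: the more often
# two words appear next to each other in the text, the more probable the
# continuation becomes. Real LLMs use neural networks over huge corpora,
# not raw bigram counts, so this is only a conceptual sketch.
corpus = (
    "translation agency DialogTicket offers translations . "
    "translation agency DialogTicket offers proofreading . "
    "translation agency AcmeLingo offers translations ."  # AcmeLingo = invented competitor
).split()

bigram_counts: dict[str, Counter] = defaultdict(Counter)
for current_word, next_word in zip(corpus, corpus[1:]):
    bigram_counts[current_word][next_word] += 1

def next_word_probabilities(word: str) -> dict[str, float]:
    """Relative frequency of each word that follows `word` in the toy corpus."""
    counts = bigram_counts[word]
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()}

print(next_word_probabilities("agency"))
# {'DialogTicket': 0.67, 'AcmeLingo': 0.33}: the brand mentioned more often
# in the corpus is the more likely continuation.
```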
However, at DialogTicket we are more concerned about possible conflicts between SEO and LLMO. Duplicate content should be avoided in SEO: if the content on different pages is too similar, Google & Co. will penalise this. LLMs, on the other hand, respond positively to similarity in content and structure, as similar passages strengthen the statistical relationship between tokens. It is therefore unclear what a consistent SEO and LLMO strategy could look like within this approach, or whether one is possible at all.
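Neither Google nor any LLM provider publishes a concrete cut-off for "too similar". If you want at least a rough internal measure for your own pages, word-shingle overlap is a common heuristic; the sketch below and its example sentences are purely illustrative.

```python
import re

# A rough way to quantify how similar two pages are: Jaccard similarity over
# word 5-grams ("shingles"). The threshold you apply is up to you; no official
# duplicate-content limit is published anywhere.
def shingles(text: str, n: int = 5) -> set[tuple[str, ...]]:
    words = re.findall(r"\w+", text.lower())
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}

def jaccard_similarity(text_a: str, text_b: str) -> float:
    a, b = shingles(text_a), shingles(text_b)
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)

page_a = "DialogTicket is a translation agency offering certified translations."
page_b = "DialogTicket is a translation agency offering technical translations."
print(round(jaccard_similarity(page_a, page_b), 2))  # e.g. 0.33 for these two sentences
```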
ChatGPT Search
A first look.
ChatGPT has been able to access the internet since May 2023. This was important because it allowed the system to provide information more up-to-date than the data set it was trained on. Without this function, neither online shopping nor searching for the latest news would make sense. It was not until December 2024 that ChatGPT Search was officially integrated into ChatGPT. It answers search queries in a similar way to a traditional search engine, but enriches the output with additional information, much as Copilot does on the basis of Bing. In this video, Robert Leitinger takes a closer look at the new function. The video is in German, but you can switch on auto-subtitling in the language of your choice.
Alternative LLM optimisation strategies
Authoritative. Relevant. Clearly structured.
One approach to LLM optimisation is therefore to ensure that the desired company information is included in the training data set. We encountered three problems here:
- Updates are only possible at long and irregular intervals.
- It is unclear how wide-ranging the online presence needs to be for the system to include the company information in responses. The effort and costs are currently uncertain.
- This approach may only work at the expense of existing SEO strategies.
Fortunately, there are alternative approaches. Instead of targeting the data that plays a role in training, you could focus on the data that the system accesses in real time whenever online information is processed. What matters here is exactly how the large language model processes this data.
The simplest variant would be this: a) The LLM converts the prompt, which was formulated in normal everyday language, into a suitable keyword string. As the system is capable of learning, this list could differ from the one a user would have chosen themselves, and could therefore produce better results than if the searcher had used a traditional search engine directly. b) The LLM feeds the keywords into an existing search engine. c) It creates an answer from the search engine’s SERPs, including links for the user. At present, LLMs such as ChatGPT seem to concentrate primarily on the top entries. If this remains the case, SEO and LLMO strategies would be identical: as long as your websites rank highly, you don’t need to worry too much about large language models.
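Here is a purely structural sketch of this three-step variant in Python. All three functions are hypothetical stand-ins: a real system would call an LLM API for steps a) and c) and a search API for step b). The point is simply how strongly the final answer depends on the top SERP entries.

```python
# Structural sketch of variant a)-c) described above. Every function is a
# placeholder; a real implementation would use an actual LLM and search API.

def prompt_to_keywords(prompt: str) -> str:
    """Step a): rewrite an everyday-language prompt as a keyword query.
    Here just a crude stopword filter; an LLM could choose better keywords."""
    stopwords = {"please", "find", "me", "a", "an", "the", "for", "in"}
    return " ".join(w for w in prompt.lower().split() if w not in stopwords)

def search_serps(keywords: str, top_n: int = 3) -> list[dict]:
    """Step b): query a conventional search engine. Canned results for the sketch."""
    return [
        {"title": f"Result {i} for '{keywords}'", "url": f"https://example.com/{i}"}
        for i in range(1, top_n + 1)
    ]

def compose_answer(prompt: str, serps: list[dict]) -> str:
    """Step c): build an answer from the top entries, including source links."""
    sources = "\n".join(f"- {r['title']} ({r['url']})" for r in serps)
    return f"Answer to: {prompt}\nBased on the top results:\n{sources}"

if __name__ == "__main__":
    user_prompt = "Please find me a certified translation agency in Hamburg"
    keywords = prompt_to_keywords(user_prompt)
    print(compose_answer(user_prompt, search_serps(keywords)))
```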
However, it is also conceivable that the systems could learn to take later entries in the SERPs into account if they appear more relevant, or to modify the keywords if the first search query does not look promising. When making such judgements, LLMs have the advantage of context. They “know” more than just the user’s search history or which websites they visited earlier. Depending on the user’s settings, they could, in principle, take into account the entire course of the conversation or even all past interactions. An advantage for understanding user intent better. However, it is questionable whether optimisation for users in general would still be possible: the more individual or particular the context, the more difficult it is to identify general relevance criteria.
A final variant would be to retire keyword-based searches altogether and replace them with highly context-sensitive everyday-language searches. Instead of relying on conventional search engines, entirely new indexing and evaluation systems could be created.
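What such a new form of indexing could look like is, for example, vector-based retrieval: documents and queries are compared as vectors instead of being matched on keywords. The sketch below uses plain word-count vectors so it runs without any external model; a real system would use learned embeddings, and the URLs and texts are invented.

```python
import math
from collections import Counter

# Minimal vector-retrieval sketch: rank indexed pages by cosine similarity to
# an everyday-language query. Plain word counts stand in for learned embeddings.
def vectorise(text: str) -> Counter:
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

# Invented mini index: URL -> page text.
index = {
    "https://example.com/pricing": "prices for certified translations and proofreading",
    "https://example.com/blog": "how large language models change search behaviour",
}

query = "what does a certified translation cost"
ranking = sorted(
    index.items(),
    key=lambda item: cosine(vectorise(query), vectorise(item[1])),
    reverse=True,
)
for url, _ in ranking:
    print(url)  # the pricing page ranks first, despite sharing almost no exact keywords
```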
At DialogTicket, we don’t know where the journey is heading. Nor can we give any clear recommendation on what to do. But it seems that, if you are faced with many unknown variables, a promising course of action is to do what is hardest, as it is the least likely to be emulated: Create high-quality, clearly structured content that is new, brings real added value to the searcher and stands out from the competition. Ultimately, this is what reputable SEO experts recommend anyway. If, on top of that, you manage to get your brand on everyone’s lips, you may have created a solid basis for LLMO.