The Complete Guide to Filtering AI Traffic in Google Analytics 4

This article shows, step by step, how to filter traffic from AI tools such as OpenAI's ChatGPT in Google Analytics 4 using regular expressions.

In digital marketing, it pays to know exactly where your website traffic comes from. If part of that traffic originates from Artificial Intelligence tools (such as ChatGPT or Google Gemini, among others), filtering these sources in Google Analytics 4 (GA4) helps you understand their impact on your business.

In this article, I will show you how to use regular expressions (regex) in GA4 to segment traffic coming from AI tools. By following this step-by-step guide, you will be able to apply effective filters and analyze the behavior of users coming from these sources more accurately.

What is Regex?

Before you begin, it is important to understand what a regular expression (regex) is and how it can benefit the analysis of traffic from AI tools.

A regular expression is a sequence of characters that defines a search pattern. In the case of GA4, you can use regex to identify and filter session sources and mediums that contain specific terms, such as “openai”, “chatgpt”, “gpt”, among others.
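
As a minimal illustration of the idea, the Python sketch below searches a few Session source/medium values for the terms mentioned above. The sample values and the `AI_TERMS` name are hypothetical, and Python's `re` module only stands in for GA4's own regex engine, which may differ in details.

```python
import re

# Hypothetical pattern built from the example terms in the paragraph above.
AI_TERMS = re.compile(r"openai|chatgpt|gpt")

# Illustrative Session source/medium values (not real report data).
samples = [
    "chatgpt.com / referral",
    "chat.openai.com / referral",
    "google / organic",
]

# re.search looks for the pattern anywhere inside each value.
matches = [value for value in samples if AI_TERMS.search(value)]
print(matches)
```

Only the first two values contain one of the terms, so only those are kept.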

Step-by-step guide to filtering AI traffic in Google Analytics 4

1: Access your GA4 account

Open your Google Analytics 4 account and select the property you want to analyze.
From the side menu, select Reports and then Acquisition > Traffic Acquisition.

[Screenshot: Google Analytics navigation menu with the “Traffic acquisition” section selected.]

2: Add a filter to the traffic report

In the Traffic Acquisition report, click the funnel icon (at the top of the page) to add a filter.

[Screenshot: Header of the “Traffic acquisition: Session primary channel group” report with the option to add filters or comparisons.]

Under “Select dimension,” choose Session source/medium.

[Screenshot: “Segments” section in Google Analytics with the button to add a new segment.]


Why this dimension? It allows you to filter traffic according to the source and medium of the sessions, making it the ideal point to apply regular expressions (regex) that can identify specific sources, such as AI tools.

3: Choose the match type

In the filter, under “Match type,” select the option “matches regex.”

[Screenshot: Filter creation panel with the "Session source / medium" condition and regular expression matching.]

Importance of this choice: Regex matching offers the flexibility to use complex text patterns to capture specific sources, which simple filters do not allow. This is essential to accurately identify traffic from AI.

4: Paste the regular expression (Regex)

Here is the regex you can use to filter AI traffic sources:

^.*\.openai.*|.*copilot.*|.*chatgpt.*|.*gemini.*|.*gpt.*|.*neeva.*|.*writesonic.*|.*nimble.*|.*perplexity.*|.*google.*bard.*|.*bard.*google.*|.*bard.*|.*edgeservices.*|.*bnngpt.*|.*gemini.*google.*$

[Screenshot: “Build filter” panel in Google Analytics with the "Session source / medium" dimension selected and the match type not yet chosen.]

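This regular expression matches any Session source/medium value that contains a keyword associated with an AI tool, such as openai, chatgpt, gpt, or bard. Broad terms such as gpt already cover more specific ones such as chatgpt and bnngpt, so some alternatives in the pattern overlap.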
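
To sanity-check the pattern before pasting it into GA4, it can be run against a few values offline. The sketch below assumes that GA4's “matches regex” option requires the whole value to match, which is why `re.fullmatch` is used and why every term in the pattern is wrapped in `.*`. The sample values and the `is_ai_source` helper are invented for illustration, and Python's regex engine may not behave exactly like GA4's.

```python
import re

# The regex from step 4, copied verbatim (split across lines for readability).
AI_REGEX = re.compile(
    r"^.*\.openai.*|.*copilot.*|.*chatgpt.*|.*gemini.*|.*gpt.*|.*neeva.*"
    r"|.*writesonic.*|.*nimble.*|.*perplexity.*|.*google.*bard.*"
    r"|.*bard.*google.*|.*bard.*|.*edgeservices.*|.*bnngpt.*"
    r"|.*gemini.*google.*$"
)


def is_ai_source(source_medium: str) -> bool:
    """Return True when the whole value matches the pattern."""
    return AI_REGEX.fullmatch(source_medium) is not None


# Invented sample values, for illustration only.
samples = [
    "chatgpt.com / referral",
    "perplexity.ai / referral",
    "gemini.google.com / referral",
    "chat.openai.com / referral",
    "openai.com / referral",              # no dot before "openai"
    "google / organic",
    "bardstown-news.example / referral",  # unrelated, but contains "bard"
]

for value in samples:
    print(f"{value!r}: {is_ai_source(value)}")
```

Two details are worth checking against your own data: the `\.openai` alternative only matches when a dot precedes “openai”, and short terms such as bard or nimble can also match sources that have nothing to do with AI.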

5: Apply the filter and view the data

After pasting the regex, click Apply.
GA4 then restricts the report to sessions whose source/medium matches the pattern, so you can analyze in more detail how users from these sources behave.

6: Add AI sources one by one

After applying the filter and viewing the traffic results for the selected analysis period, the next step is to add Session source/medium as a dimension in the table, so that each AI source appears on its own row for a deeper investigation of the data.

How do we do this?

  1. Alongside the traffic source results, click the + on the right side.

[Screenshot: “Build filter” panel in Google Analytics with the "Session source / medium" dimension selected and the match type not yet chosen.]

  2. In the tab that opens when you click this +, search for Session source/medium.

[Screenshot: Dimension search field in Google Analytics showing results for "Session" in the traffic source category.]

  3. Next, the results for the respective sources will appear.

[Screenshot: Google Analytics table showing source channels such as "Referral", "Unassigned", and "Organic Search" with sources such as chatgpt.com, perplexity.ai, and gemini.google.com.]

7: Analyze the data

With these filters applied, you will be able to observe how AI tools are influencing user behavior, such as conversions and interactions with your website.
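
One way to carry out this comparison outside the GA4 interface is to export the filtered report and summarize it per source. The sketch below assumes a hypothetical list of rows with `source_medium`, `sessions`, and `conversions` fields; the field names and the figures are placeholders, not real data and not GA4's export format.

```python
from dataclasses import dataclass


@dataclass
class Row:
    source_medium: str
    sessions: int
    conversions: int


# Placeholder rows standing in for an exported report (numbers are made up).
rows = [
    Row("chatgpt.com / referral", 120, 6),
    Row("perplexity.ai / referral", 40, 1),
    Row("gemini.google.com / referral", 40, 3),
]


def conversion_rates(rows: list[Row]) -> dict[str, float]:
    """Conversions per session for each source/medium, skipping empty rows."""
    return {
        r.source_medium: r.conversions / r.sessions
        for r in rows
        if r.sessions > 0
    }


def session_share(rows: list[Row]) -> dict[str, float]:
    """Share of all AI sessions contributed by each source/medium."""
    total = sum(r.sessions for r in rows)
    if total == 0:
        return {}
    return {r.source_medium: r.sessions / total for r in rows}


for source, rate in conversion_rates(rows).items():
    print(f"{source}: {rate:.1%} conversion rate")
```

Comparing the two dictionaries side by side shows whether the AI source that sends the most sessions is also the one that converts best.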

Custom segments for a more detailed analysis

Although standard GA4 reports allow you to filter traffic with regex, you can also use the Explore functionality for a deeper analysis.

Creating custom session segments with regex allows you to further isolate traffic from AI tools and analyze interactions in greater detail.

Example of a custom segment:

  1. Access Explore and create a new Blank exploration.

[Screenshot: Explorations home page in Google Analytics with the "Blank" option for creating a new exploration.]

  2. Add the Dimensions and Metrics you wish to analyze on the left side of the screen.

[Screenshot: Section for adding dimensions and metrics in a custom Google Analytics exploration.]

Example:

  • Dimension: Session source/medium
  • Metric: Sessions

  3. Next, click Segments on the left side of the screen and then Create new segment in the top right corner.

[Screenshot: New segment creation screen in Google Analytics with options for "User segment", "Session segment", and "Event segment".]
[Screenshot: Google Analytics button labeled "Create a new segment" for starting a new segment.]

  4. In the panel that opens, click Session segment.

[Screenshot: Filter built with a regex to filter sessions by source/medium containing domains such as openai, copilot, and chatgpt.]

  5. Finally, in the new window, create a Session Segment with the condition “Session source/medium matches regex,” and paste the regex from step 4 of the previous section.

[Screenshot: Filter created for sessions whose source/medium matches a regex involving openai, copilot, chatgpt, and gemini.]

  6. Name the segment, apply it, and view the segmented data, adjusting the reports as needed.
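
For readers who would rather pull a comparable view programmatically, the same condition can be written as a report filter. The sketch below only builds a JSON body in the shape I understand the GA4 Data API `runReport` method to accept; the `sessionSourceMedium` dimension, the `sessions` metric, and the `FULL_REGEXP` match type are taken from that API as I recall it and should be verified against the current reference. Nothing is sent anywhere, `build_ai_report_request` is a hypothetical helper name, and a report filter is not identical to an Explore segment.

```python
import json

# Shortened pattern for readability; substitute the full regex from step 4.
AI_REGEX = r".*(chatgpt|perplexity|gemini).*"


def build_ai_report_request(
    regex: str, start: str = "28daysAgo", end: str = "today"
) -> dict:
    """Build a runReport-style body that filters sessions by source/medium."""
    return {
        "dateRanges": [{"startDate": start, "endDate": end}],
        "dimensions": [{"name": "sessionSourceMedium"}],
        "metrics": [{"name": "sessions"}],
        "dimensionFilter": {
            "filter": {
                "fieldName": "sessionSourceMedium",
                # Assumed to mirror the "matches regex" option in the UI.
                "stringFilter": {"matchType": "FULL_REGEXP", "value": regex},
            }
        },
    }


print(json.dumps(build_ai_report_request(AI_REGEX), indent=2))
```

Keeping the pattern in one variable makes it easier to reuse the same definition of “AI traffic” across reports.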

Conclusion

Filtering AI traffic in Google Analytics 4 with regex shows how much of your traffic arrives from Artificial Intelligence tools and how those visitors behave on your website.

Now that you know how to use regex matching in GA4, use this data to understand and grow the contribution of AI-driven traffic to your business.

For more up-to-date content about digital marketing, follow our blog.
