This article shows, in a practical way, how to filter traffic from AI tools such as ChatGPT and Google Gemini in Google Analytics 4 using regular expressions.
In digital marketing, it is crucial to have detailed insight into the traffic arriving at your website. If part of that traffic originates from Artificial Intelligence tools (such as ChatGPT, Google Gemini, or other OpenAI products), it can be useful to filter these sources in Google Analytics 4 (GA4) to better understand the impact of these tools on your business.
In this article, I will show you how to use regular expressions (regex) in GA4 to segment traffic coming from AI tools. With this step-by-step guide, you will be able to apply effective filters and analyze the behavior of users coming from these sources more accurately.
What is Regex?
Before you begin, it is important to understand what a regular expression (regex) is and how it can benefit the analysis of traffic from AI tools.
A regular expression is a sequence of characters that defines a search pattern. In GA4, you can use regex to identify and filter session sources and mediums that contain specific terms, such as “openai”, “chatgpt”, or “gpt”.
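To make the idea concrete, here is a minimal sketch in Python. It uses Python's re module rather than GA4 itself, and the source values are made up for illustration. It simply checks which values contain one of the terms mentioned above.

```python
import re

# Search pattern: any of the three terms mentioned above.
AI_TERMS = re.compile(r"openai|chatgpt|gpt")

# Hypothetical source values, for illustration only.
sources = ["chat.openai.com", "chatgpt.com", "google", "newsletter"]

# re.search looks for the pattern anywhere inside the string.
matches = [s for s in sources if AI_TERMS.search(s)]
print(matches)  # ['chat.openai.com', 'chatgpt.com']
```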
Step-by-step guide to filtering AI traffic in Google Analytics 4
1: Access your GA4 account
Open your Google Analytics 4 account and select the property you want to analyze.
From the side menu, select Reports and then Acquisition > Traffic Acquisition.
2: Add a filter to the traffic report
In the Traffic Acquisition report, click the funnel icon (at the top of the page) to add a filter.
Under “Select dimension,” choose Session source/medium.
Why this dimension? It allows you to filter traffic according to the source and medium of the sessions, making it the ideal point to apply regular expressions (regex) that can identify specific sources, such as AI tools.
3: Choose the match type
In the filter, under “Match type,” select the option “matches regex.”
Importance of this choice: Regex matching offers the flexibility to use complex text patterns to capture specific sources, which simple filters do not allow. This is essential to accurately identify traffic from AI.
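The sketch below illustrates why the match type matters. It assumes that “matches regex” requires the pattern to cover the entire dimension value (a full match), whereas a partial match only needs to find the pattern somewhere inside it. Python's re.fullmatch and re.search are used here as stand-ins for those two behaviors; GA4 runs its own regex engine, so treat this as an analogy rather than a description of GA4 internals. The sample value is hypothetical.

```python
import re

value = "chatgpt.com / referral"  # hypothetical Session source/medium value

# Full match: the pattern must cover the whole value.
full_bare = re.fullmatch(r"chatgpt", value) is not None
full_wrapped = re.fullmatch(r".*chatgpt.*", value) is not None

# Partial match: the pattern may match anywhere inside the value.
partial_bare = re.search(r"chatgpt", value) is not None

print(full_bare, full_wrapped, partial_bare)  # False True True
```

Under that assumption, a bare keyword is not enough for a full match, which is why a keyword is usually wrapped in .* on both sides.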
4: Paste the regular expression (Regex)
Here is the regex you can use to filter AI traffic sources:
^.*\.openai.*|.*copilot.*|.*chatgpt.*|.*gemini.*|.*gpt.*|.*neeva.*|.*writesonic.*|.*nimble.*|.*perplexity.*|.*google.*bard.*|.*bard.*google.*|.*bard.*|.*edgeservices.*|.*bnngpt.*|.*gemini.*google.*$
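This regular expression matches Session source/medium values that contain keywords related to AI tools, such as chatgpt, gpt, gemini, bard, or perplexity. Each keyword is wrapped in .* on both sides, so it can appear anywhere in the value. Note that the first alternative is written as \.openai, with a literal dot, so it matches domains such as chat.openai.com but not a source named only openai.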
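Before pasting the expression into GA4, it can help to try it against a few values locally. The following is a rough sketch using Python's re module, with re.fullmatch standing in for the “matches regex” option (an assumption, since GA4 uses its own engine). The sample Session source/medium values are invented for illustration and are not real report data.

```python
import re

# The regex from this step, split across lines only for readability.
AI_REGEX = re.compile(
    r"^.*\.openai.*|.*copilot.*|.*chatgpt.*|.*gemini.*|.*gpt.*|.*neeva.*"
    r"|.*writesonic.*|.*nimble.*|.*perplexity.*|.*google.*bard.*"
    r"|.*bard.*google.*|.*bard.*|.*edgeservices.*|.*bnngpt.*"
    r"|.*gemini.*google.*$"
)

# Hypothetical Session source/medium values.
samples = [
    "chat.openai.com / referral",
    "chatgpt.com / referral",
    "perplexity.ai / referral",
    "copilot.microsoft.com / referral",
    "google / organic",
    "(direct) / (none)",
    "openai / referral",             # no dot before "openai"
    "bardsongs.example / referral",  # unrelated site that contains "bard"
]

# re.fullmatch requires the pattern to cover the whole value.
results = {value: AI_REGEX.fullmatch(value) is not None for value in samples}

for value, matched in results.items():
    print(f"{matched!s:<5} {value}")
```

In this sketch the last two samples are worth a look: the value without a dot before "openai" is not matched, and the unrelated site is matched because it contains "bard". Broad keywords can therefore pull in sources that have nothing to do with AI, so it is worth reviewing the matched sources in your own report.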
5: Apply the filter and view the data
After pasting the regex, click Apply.
GA4 will now apply the filter, allowing you to view traffic data originating from AI in the reports. This will enable a more detailed analysis of user behavior from these sources.
6: Add AI sources one by one
After applying the filter and viewing the traffic results for the selected analysis period, the next step is to break those results down by individual AI source, so that each source appears on its own row for deeper investigation.
How do we do this?
- Alongside the traffic source results, click the + on the right side.
- In the panel that opens when you click this +, search for Session source/medium and select it.
- The table will then show the results for each individual source.
7: Analyze the data
With these filters applied, you will be able to observe how AI tools are influencing user behavior, such as conversions and interactions with your website.
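If you export the filtered rows, a small script can group them by tool. The sketch below is only an illustration: the tool labels, the keyword-to-tool mapping, and the session counts are all hypothetical, and the rows stand in for whatever you export from your own report.

```python
import re
from collections import defaultdict

# Hypothetical mapping from a tool label to the keywords that identify it.
TOOL_PATTERNS = {
    "ChatGPT / OpenAI": re.compile(r"openai|chatgpt"),
    "Perplexity": re.compile(r"perplexity"),
    "Gemini / Bard": re.compile(r"gemini|bard"),
    "Copilot": re.compile(r"copilot"),
}

# Made-up (Session source/medium, sessions) rows, for illustration only.
rows = [
    ("chatgpt.com / referral", 120),
    ("chat.openai.com / referral", 30),
    ("perplexity.ai / referral", 45),
    ("gemini.google.com / referral", 10),
    ("google / organic", 900),
]


def tool_for(source_medium):
    """Return the first tool label whose pattern appears in the value, else None."""
    for tool, pattern in TOOL_PATTERNS.items():
        if pattern.search(source_medium):
            return tool
    return None


# Sum sessions per tool, skipping rows that match no AI keyword.
sessions_by_tool = defaultdict(int)
for source_medium, sessions in rows:
    tool = tool_for(source_medium)
    if tool is not None:
        sessions_by_tool[tool] += sessions

print(dict(sessions_by_tool))
# {'ChatGPT / OpenAI': 150, 'Perplexity': 45, 'Gemini / Bard': 10}
```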
Custom segments for a more detailed analysis
Although standard GA4 reports allow you to filter traffic with regex, you can also use the Explore functionality for a deeper analysis.
Creating custom session segments with regex allows you to further isolate traffic from AI tools and analyze interactions in greater detail.
Example of a custom segment:
- Access Explore and create a new Blank exploration.
- Add the Dimensions and Metrics you wish to analyze on the left side of the screen.
Example:
- Dimension: Session source/medium
- Metric: Sessions
- Next, click Segments on the left side of the screen and Create new segment in the top right corner.
- In the panel that opens, click Session segment.
- Finally, in the new window, create a Session Segment with the condition “Session source/medium matches regex,” and paste the regex from step 4.
- Name the segment, apply it, and view the segmented data, adjusting the reports as needed.
Conclusion
Filtering AI traffic in Google Analytics 4 using regex is a powerful strategy to understand how Artificial Intelligence tools are influencing user behavior on your website.
Now that you know how to use regex matching in GA4, make the most of this data to maximize the impact of AI on your business.
For more up-to-date content about digital marketing, follow our blog.