Are You Already Fluent in RegEx?
We show you why learning the so-called regular expressions is worthwhile for online marketers and SEOs.
- What are RegEx and what are their advantages for your SEO?
- What does a RegEx syntax look like?
- Helpful RegEx SEO Use Cases
- Which RegEx testing tools can you use?
- Conclusion about RegEx
Would you rather not spend more time on collecting and processing data than necessary? In this article, you will find out why and, above all, how you can work more efficiently with regular expressions (RegEx), whether you are an SEO beginner or an advanced user.
What are RegEx and what are their advantages for your SEO?
You can think of RegEx as a kind of filter function, only better. RegEx are regular expressions that contain conditions. These conditions specify the criteria against which data is checked. If the conditions are met, the data records end up in your evaluation. Otherwise, they are left out.
Regular expressions originate in computer science. However, they are used not only in programming languages such as JavaScript and Python, but also in areas more commonly associated with online marketing. You may now be asking yourself: "Why shouldn't I just use the filter functions of Excel and Word?" The short answer is: RegEx can simply do more. With placeholders and modifiers, you can formulate search queries much more precisely than the filters in MS Office applications and even the best SEO tools allow.
Nowadays many online marketers and SEOs have recognised the value of regular expressions for their Search Engine Optimisation. Typical marketing use cases are:
- Checking URLs with the same directory
- Automatically redirecting URLs that follow a certain pattern in the .htaccess file
- Filtering of domains and URLs during website analysis
- Defining page types in A/B testing tools (Conversion Optimisation)
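The first use case, checking URLs that share a directory, can be sketched in a few lines of Python. The URLs and the /blog/ directory are invented for illustration and do not come from a real website.

```python
import re

# Hypothetical URLs, invented for this example.
urls = [
    "https://www.example.com/blog/regex-basics",
    "https://www.example.com/blog/seo-tips",
    "https://www.example.com/shop/blog-posters",
    "https://www.example.com/about",
]

# ^ anchors the pattern at the start of the string, and \. matches a
# literal dot, so only URLs inside the /blog/ directory are kept.
blog_pattern = re.compile(r"^https://www\.example\.com/blog/")

blog_urls = [url for url in urls if blog_pattern.search(url)]
print(blog_urls)
```

A simple "contains blog" filter would also have caught the /shop/blog-posters URL; the anchored pattern does not.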
You can use RegEx in many programs. In search engine optimisation, regular expressions are most frequently used in these applications:
- Web analysis tools (e.g. Google Analytics, Google Search Console)
- SEO tools (e.g. Seobility, Ryte)
- Website crawlers (e.g. Screaming Frog)
- Spreadsheet programs (e.g. Google Sheets)
- Text editors (e.g. Notepad++)
You may be wondering why you have never seen RegEx. The reason is that regular expressions are usually not visible to users. However, you have most likely used them many times without knowing it, for example when setting a new password. Software with password policies that require certain character combinations often relies on RegEx.
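As a rough idea of what such a password check might look like, here is a minimal Python sketch. The policy itself (at least eight characters, with one lowercase letter, one uppercase letter and one digit) and the function name are assumptions made up for this example, not the rules of any particular software.

```python
import re

# Hypothetical policy, assumed for illustration:
#   (?=.*[a-z])  at least one lowercase letter
#   (?=.*[A-Z])  at least one uppercase letter
#   (?=.*\d)     at least one digit
#   .{8,}        at least eight characters in total
PASSWORD_POLICY = re.compile(r"(?=.*[a-z])(?=.*[A-Z])(?=.*\d).{8,}")


def is_valid_password(candidate: str) -> bool:
    """Return True if the whole candidate satisfies the assumed policy."""
    return PASSWORD_POLICY.fullmatch(candidate) is not None


for candidate in ["Passw0rdX", "password1", "Sh0rt"]:
    print(candidate, is_valid_password(candidate))
```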
What does a RegEx syntax look like?
A RegEx consists of characters that each describe an individual function. Taken together, these characters are called a pattern. Patterns sift a text like a kitchen sieve: everything that does not fit is filtered out.
Patterns check whether data records meet corresponding criteria. Both positive and negative matches with your patterns can be linked to functions. Common functions are searching for and replacing certain characters.
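Searching and replacing can be tried out with Python's built-in re module. The page title below is a made-up example.

```python
import re

# Hypothetical page title with untidy spacing.
title = "10 SEO Tips   for  Beginners"

# Search: does the title contain at least one digit?
has_number = re.search(r"\d+", title) is not None

# Replace: collapse every run of whitespace into a single space.
clean_title = re.sub(r"\s+", " ", title)

print(has_number)
print(clean_title)
```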
You can link RegEx together as you like to form complex requirement chains. The connectors are called operators.
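The following sketch shows how several building blocks can be chained into one pattern. The paths and the pattern are illustrative assumptions, not rules from a real project.

```python
import re

# Building blocks linked into one pattern:
#   ^/blog/     the path must start with /blog/
#   (seo|sea)   alternation: either "seo" or "sea"
#   -[a-z-]+    a hyphen followed by lowercase letters or further hyphens
#   /?$         an optional trailing slash, then the end of the path
pattern = re.compile(r"^/blog/(seo|sea)-[a-z-]+/?$")

# Hypothetical paths, invented for this example.
paths = [
    "/blog/seo-basics",
    "/blog/sea-budget-tips/",
    "/blog/smm-basics",
    "/shop/seo-basics",
]

matching = [path for path in paths if pattern.search(path)]
print(matching)
```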
These are the most important operators:
These are the most important regular expressions (RegEx):
Helpful RegEx SEO Use Cases
You can use RegEx in various cases. We have listed the most common use cases for you:
1. Understand website structures and hierarchies
Source: Seobility
With the Apache module mod_rewrite, you can redirect incoming URL requests to a specific part of your website. The complete URL with its path components and parameters can be taken into account.
This module is extremely important for your SEO because it turns complex URLs into readable, so-called "speaking URLs". These help users and search engines understand the structure and hierarchy of your website.
Store this rule in the .htaccess file of your web server:
RewriteEngine on
RewriteRule (.*)\.html$ /cgi-bin/script.pl?var=$1
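To see what the pattern in this rule captures, you can imitate it in Python. This is only a simulation of the substitution, not of how Apache processes requests. The function name and the sample paths are assumptions; the paths are written without a leading slash because mod_rewrite typically matches .htaccess rules against the path in that form.

```python
import re

# Same pattern and target as the RewriteRule above. Python writes the
# backreference as \1 where mod_rewrite writes $1.
PATTERN = re.compile(r"(.*)\.html$")
TARGET = r"/cgi-bin/script.pl?var=\1"


def rewrite(path: str) -> str:
    """Return the rewritten path, or the path unchanged if the rule does not match."""
    if PATTERN.search(path):
        return PATTERN.sub(TARGET, path, count=1)
    return path


print(rewrite("products/shoes.html"))
print(rewrite("images/logo.png"))
```

Everything in front of ".html" is captured by the group (.*) and handed to the script as the value of var.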
2. Filter in Google Analytics
The simple filter functions in Google Analytics let you display individual pages in a report. With RegEx, you can display several pages at once, even if they do not share a characteristic such as the page path.
A hostname filter based on RegEx can protect your data from spam. You can also filter data views, for example for websites that run on several domains. This filter allows all hostnames that contain a certain domain name:
.*domain\..*
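Before saving such a filter, it can help to test it against a few hostnames. The hostnames below are invented, and the stricter variant is only one possible refinement that assumes the site runs on domain.com and domain.de.

```python
import re

# The filter from the article: any hostname containing "domain.".
hostname_filter = re.compile(r".*domain\..*")

# Assumed stricter variant: "domain" must be a complete label and the
# hostname must end in .com or .de.
strict_filter = re.compile(r"(^|\.)domain\.(com|de)$")

# Hypothetical hostnames, invented for this example.
hostnames = [
    "www.domain.com",
    "shop.domain.de",
    "mydomain.org",
    "spam-site.example",
]

allowed = [h for h in hostnames if hostname_filter.search(h)]
strict_allowed = [h for h in hostnames if strict_filter.search(h)]

print(allowed)
print(strict_allowed)
```

The simple filter also lets mydomain.org through because that hostname contains "domain." as well; the stricter variant does not.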
3. 301 redirects with the .htaccess
Source: SEO-Summary.de
If you have carried out a relaunch, you can use RegEx with Apache's RedirectMatch directive to redirect old URLs that follow a pattern to your new URL:
RedirectMatch 301 ^/jobs/?$ https://www.domain.de/karriere/
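You can check which paths the pattern ^/jobs/?$ covers with a short Python test before touching the server. The sample paths are assumptions made up for this example.

```python
import re

# ^ and $ anchor the pattern to the whole path; /? makes the trailing
# slash optional.
jobs_pattern = re.compile(r"^/jobs/?$")

# Hypothetical request paths.
paths = ["/jobs", "/jobs/", "/jobs/berlin", "/en/jobs"]

redirected = [path for path in paths if jobs_pattern.search(path)]
print(redirected)
```

Only the exact paths /jobs and /jobs/ are matched; subpages such as /jobs/berlin would need their own rule.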
4. Filter out keywords with low click rates
Source: Sistrix
SEO tools like Semrush and Sistrix can supply you with large amounts of relevant data in seconds. So that users are not overwhelmed, these tools already offer simple filter functions such as "Contains" or "Begins with".
You can refine your keyword research with RegEx: if you are analysing keyword data for relatively unknown brands, you often get search queries that contain brand names. You should filter these out, as their click rates are usually low.
Another typical beginner use case is to account for common misspellings. Using Edeka as an example, you can apply this RegEx:
(edeka|edekaa|edeek|edekka)
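Applied to an exported keyword list, the pattern can separate brand queries from the rest. The keywords are invented, and the case-insensitive flag is an assumption added here so that capitalised spellings are caught too.

```python
import re

# Pattern from the article; re.IGNORECASE is an addition for this sketch.
brand_pattern = re.compile(r"(edeka|edekaa|edeek|edekka)", re.IGNORECASE)

# Hypothetical keyword export.
keywords = [
    "edeka opening hours",
    "Edekka weekly offers",
    "supermarket near me",
    "edeek delivery",
    "buy organic vegetables",
]

brand_keywords = [k for k in keywords if brand_pattern.search(k)]
other_keywords = [k for k in keywords if not brand_pattern.search(k)]

print(brand_keywords)
print(other_keywords)
```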
5. Filter data from crawled websites
Crawlers like Screaming Frog are particularly helpful for your search engine optimisation. You can extract, among other things, page titles, headings and structured data from the source text.
To collect even more information, you can apply RegEx when filtering and extracting data from the crawled websites. For example, you can count how often a particular expression occurs and filter out the result with the extraction function. Examples of sensible use of RegEx are:
- Extract the Google Analytics ID with ["'](UA-.*?)["']
- Extract the Schema.org rating value from JSON-LD with "ratingValue": "(.*?)"
- Extract the Schema.org publication date from JSON-LD with ["']datePublished["']: *["'](.*?)["']
- Extract email addresses with [a-zA-Z0-9-_.]+@[a-zA-Z0-9-]+\.[a-zA-Z0-9-.]+
- Extract the Google Tag Manager ID with ["'](GTM-.*?)["']
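The extraction patterns from this list can be tried out on a piece of source code before you enter them into a crawler. The HTML snippet, the IDs and the helper function below are invented for illustration; a real crawler may apply the patterns slightly differently.

```python
import re

# Hypothetical page source with made-up IDs and values.
html = """
<script>ga('create', 'UA-1234567-1', 'auto');</script>
<script>(function(w,d,s,l,i){})(window,document,'script','dataLayer','GTM-ABC123');</script>
<script type="application/ld+json">
{"@type": "Product", "aggregateRating": {"ratingValue": "4.6"}, "datePublished": "2021-03-15"}
</script>
<a href="mailto:info@example.com">Contact</a>
"""


def first_match(pattern: str, text: str):
    """Return the first captured group (or the whole match if there is no group), else None."""
    match = re.search(pattern, text)
    if match is None:
        return None
    return match.group(1) if match.groups() else match.group(0)


analytics_id = first_match(r"""["'](UA-.*?)["']""", html)
tag_manager_id = first_match(r"""["'](GTM-.*?)["']""", html)
rating_value = first_match(r'"ratingValue": "(.*?)"', html)
date_published = first_match(r"""["']datePublished["']: *["'](.*?)["']""", html)
email = first_match(r"[a-zA-Z0-9-_.]+@[a-zA-Z0-9-]+\.[a-zA-Z0-9-.]+", html)

print(analytics_id, tag_manager_id, rating_value, date_published, email)
```

The parentheses in each pattern mark the part that is extracted. The lazy quantifier .*? stops at the first closing quotation mark instead of running on to the last one.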
6. Search for page titles
RegEx can also help you find page titles. For example, if you want to analyse guide articles, you can search for page titles that start with a number and contain terms like "tips", "tutorial" or "how to". Apply this regular expression:
^[0-9].*(tips|tutorial|how to)
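Here is a minimal Python sketch of this filter on a list of page titles. The titles are made up, and the case-insensitive flag is an assumption added so that capitalised words such as "Tips" are found.

```python
import re

# Pattern from the article; re.IGNORECASE is an addition for this sketch.
title_pattern = re.compile(r"^[0-9].*(tips|tutorial|how to)", re.IGNORECASE)

# Hypothetical page titles.
titles = [
    "7 Tips for Faster Pages",
    "10-Minute Regex Tutorial",
    "How to Write Meta Descriptions",
    "5 Mistakes to Avoid",
    "3 Ways to Learn How to Crawl",
]

guide_titles = [title for title in titles if title_pattern.search(title)]
print(guide_titles)
```

"How to Write Meta Descriptions" is not matched because it does not start with a number, and "5 Mistakes to Avoid" is not matched because it contains none of the three terms.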
Which RegEx testing tools can you use?
Applying RegEx is easier than it looks at first glance, provided you have some technical understanding and spend a little time on the subject. To get started, we recommend using RegEx testing tools. They warn you when errors slip into your rules or when you filter or replace too much data. Users have rated RegEx testing tools for you. You can find their advantages and disadvantages on OMR Reviews.
These RegEx testing tools are particularly popular with users:
- Regular expressions 101
- regexr
- RegEx Tester
- CyrilEx Regex Tester
- Freeformatter
- Site24x7 Regex Tester
- Java in use Online Regex Tester
- Debuggex
- RegexPlanet
- Coding.Tools Regex Tester
On the Internet, you will also find some videos that teach you the basics of RegEx. The following RegEx tutorials are particularly popular for getting started in the world of regular expressions:
- Programming Learning #71 - RegEx
- RegEx - Quick start
- Learn regular expressions in 10 minutes!
- Python Tutorial #41 - Regex
- Introduction to Regular Expressions
- Regular Expressions (Regex) with regex101.com
Conclusion about RegEx
Learning regular expressions (RegEx) is like learning a new foreign language. At first glance, they may seem completely alien to marketers and SEOs, but a closer look reveals that the vocabulary is limited. Whether you are a numbers person or not, getting into regular expressions is definitely manageable for every one of you.