Entities in SEO: How Semantic Search Works on Google

Olaf Kopp 2/1/2023

What you need to know about SEO entities and Google's evolution towards a semantic search engine

Table of contents
  1. Google's path to becoming a semantic search engine
  2. What is an entity?
  3. How are entities recognized and processed by search engines?
  4. How does Google identify relationships between entities through Natural-Language-Processing?
  5. Assessing Websites Based on E-E-A-T
  6. What are entities in relation to SEO?
  7. How does Google identify relationships between entities?
  8. How to use entities for your SEO strategy
  9. Are there certain industries or topics where SEO entities play a special role?
  10. You should avoid these mistakes when implementing SEO entities
  11. These tools can help you consider entities in SEO analyses
  12. Conclusion

Since 2012 at the latest, Google has been developing more and more into a semantic, entity-based search engine. There are many explanations of what semantic search is and what distinguishes it, but they are often imprecise and can easily lead to misunderstandings. This article therefore gives you an overview of the topic, so you can understand everything important about entities in SEO and semantic search.

Google's path to becoming a semantic search engine

Google's efforts to develop a semantic search engine go back as far as 1999 (see the article by SEO-thinker Bill Slawski). A semantic search engine accepts natural language as input and tries to grasp the semantics of a question.

Semantic search first became practical with the introduction of the Knowledge Graph in 2012 and the Hummingbird ranking algorithm update in 2013. Hummingbird brought the biggest change to the processing of search queries and the calculation of rankings that Google has ever undertaken. All other major innovations such as RankBrain, E-E-A-T, BERT or MUM were directly or indirectly related to Google's further development into a semantic search engine.

As early as 2013, Hummingbird had an impact on more than 90% of all search queries. For comparison: RankBrain only had an impact on about 15% of search queries when it was first introduced. So you can say that Hummingbird was a fundamental innovation for a large part of the existing ranking algorithms. In addition, it was Google's first semantics-based ranking system.

With the introduction of the Hummingbird algorithm, Google became able to take entities from the Knowledge Graph and other semantic databases, such as the Knowledge Vault, into account when processing search queries and when ranking and outputting search results.


With the use of Natural-Language-Processing in search, Google is evolving rapidly into a fully semantic search engine. As part of the MUM update in 2021, the company further optimized its use of this technology.

What is an entity?

Entity is a term from philosophy, semantics and computer science. It describes the "essence or identity of concrete or abstract objects of being". Entities can be uniquely identified, e.g. via an ID, which makes them unambiguous regardless of the entity name.

An example: The term "golf" has different meanings depending on context. It could mean a car model, a sport or (in German, where "Golf" also means "gulf") a body of water. The context in which the term or entity name is used makes its meaning uniquely identifiable.
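
The idea of one name mapping to several uniquely identified entities can be illustrated with a minimal Python sketch. The IDs, context terms and the overlap scoring are assumptions made up for this illustration; they are not Google's actual identifiers or method.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entity:
    entity_id: str            # hypothetical ID, not a real Knowledge Graph ID
    name: str                 # the (ambiguous) entity name
    entity_type: str
    context_terms: frozenset  # assumed words that typically appear near the entity


# Three entities share the name "golf" but remain distinct through their IDs.
CANDIDATES = [
    Entity("ent:001", "golf", "car model", frozenset({"volkswagen", "engine", "car", "drive"})),
    Entity("ent:002", "golf", "sport", frozenset({"club", "course", "hole", "tournament"})),
    Entity("ent:003", "golf", "body of water", frozenset({"sea", "coast", "bay", "water"})),
]


def disambiguate(query, candidates=CANDIDATES):
    """Return the candidate whose context terms overlap most with the query, or None."""
    words = set(query.lower().split())
    scored = [(len(words & c.context_terms), c) for c in candidates if c.name in words]
    if not scored:
        return None
    best_score, best = max(scored, key=lambda pair: pair[0])
    # Without any context word the name alone stays ambiguous.
    return best if best_score > 0 else None


print(disambiguate("golf tournament course"))
print(disambiguate("golf"))
```

The first call resolves to the sport because two context words match; the second returns `None` because the bare name carries no context.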

A distinction is also made between named entities and concepts: Named entities are concrete objects from the real world such as people, places, organizations, products, and events. They play a particularly important role in computer science and search engines. Concepts, on the other hand, are abstract entities like distances, quantities, emotions and social concepts like human rights or peace.

How are entities recognized and processed by search engines?

Search engines use entities both as a central element for the structured organization of information and for ranking. Semantic databases like the Google Knowledge Graph have the advantage over traditional databases that pieces of information can be related to each other.

In a graph index, the nodes are the entities and the edges connect them to each other, provided the entities are related to each other. Additional information can be organized around the nodes or entities.

Entities are core elements in the organization of semantic databases like the Google Knowledge Graph. In the organizational structure of an entity-based index, a distinction can be made between the entities themselves, entity types, attributes and other content and information related to the entity.

As a result, a semantic search engine can retrieve information that is not directly recognizable from the keywords in a search query.


Structure of a semantic database

For example, the entity Taylor Swift belongs to a hierarchy of different entity types (person, actress, singer, etc.):


Example of the organization of the two entities Joe Alwyn and Taylor Swift in a semantic database in graph structure

The search query "mother taylor swift", for example, returns a Knowledge Panel with information about Andrea Swift including pictures, relatives, social media profiles, etc.


SERP and Knowledge Panel for the search query "mother taylor swift"

Which information Google displays in a Knowledge Panel depends on the entity type and on the demand for certain information about an entity. Such answers to search queries were not possible before the Hummingbird update, because search engines then worked purely by keyword matching. This means that the content searched had to contain exactly the keywords appearing in the search query. Other spellings or synonyms were not detected.
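
The difference between pure keyword matching and a lookup in a graph of entities can be sketched in a few lines of Python. This is a toy model under stated assumptions: the node IDs, the `mother` edge label, the sample document and both function names are hypothetical and only illustrate the principle, not how the Knowledge Graph is actually stored or queried.

```python
# Nodes: entity ID -> name and entity types (all IDs are made up).
NODES = {
    "e1": {"name": "Taylor Swift", "types": ["person", "singer", "actress"]},
    "e2": {"name": "Andrea Swift", "types": ["person"]},
    "e3": {"name": "Joe Alwyn", "types": ["person", "actor"]},
}

# Edges: (subject ID, relation, object ID).
EDGES = [
    ("e1", "mother", "e2"),
]

DOCS = ["Andrea Swift is the parent of a famous singer"]


def keyword_match(query, documents):
    """Keyword-only behaviour: a document matches only if it contains every query word."""
    words = query.lower().split()
    return [doc for doc in documents if all(w in doc.lower().split() for w in words)]


def entity_lookup(query):
    """Resolve '<relation> <entity name>' against the graph; return the target name or None."""
    q = query.lower()
    for node_id, node in NODES.items():
        name = node["name"].lower()
        if name in q:
            # Whatever remains after removing the entity name is treated as the relation.
            relation = q.replace(name, "").strip()
            for subj, rel, obj in EDGES:
                if subj == node_id and rel == relation:
                    return NODES[obj]["name"]
    return None


print(keyword_match("mother taylor swift", DOCS))  # no document contains all three words
print(entity_lookup("mother taylor swift"))        # follows the "mother" edge in the graph
```

The keyword matcher finds nothing because the sample document never uses the word "mother", while the graph lookup answers the query by following an edge from one entity to another.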

How does Google identify relationships between entities through Natural-Language-Processing?

Nowadays, Google uses data mining to collect attributes and other information about entities, organize them, and relate them to one another. Natural-Language-Processing plays a big role here: Through so-called triples of subject, verb, and object, the search engine can identify attributes and relationships between entities even in unstructured content. The subjects and objects in a text or sentence are the potential entities.
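
How such triples might be pulled out of plain text can be shown with a deliberately naive Python sketch. It assumes a tiny hand-written list of relation phrases (`RELATION_PHRASES`, a hypothetical name) and simple string splitting; a production system would rely on trained language models and parsers instead.

```python
import re

# Tiny hand-picked list of relation phrases (an assumption for this sketch).
RELATION_PHRASES = ["is the mother of", "is located in", "sells"]


def extract_triples(text):
    """Return (subject, verb phrase, object) tuples for sentences matching a known phrase."""
    triples = []
    for sentence in re.split(r"[.!?]\s*", text):
        sentence = sentence.strip()
        for phrase in RELATION_PHRASES:
            parts = re.split(rf"\s+{re.escape(phrase)}\s+", sentence, maxsplit=1)
            if len(parts) == 2 and all(parts):
                # Text before the phrase is the subject, text after it the object.
                triples.append((parts[0], phrase, parts[1]))
                break
    return triples


text = "Andrea Swift is the mother of Taylor Swift. Zalando sells shoes."
for triple in extract_triples(text):
    print(triple)
```

Each printed tuple names two potential entities (subject and object) and the relationship that connects them, which is the raw material for edges in a graph.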


Data Mining via Natural-Language-Processing for a Knowledge Graph

Due to the rapid advancement of methods like Natural-Language-Processing and the ever-increasing computing power of systems, Google's potential to become a fully semantic search engine has grown enormously in recent years. The goal is to truly understand search queries and content in terms of contextual meaning, and not just to analyze keywords and their occurrences.

Currently, Google uses classic keyword matching for search queries in which the search engine recognizes no relation to entities. As soon as a relation to entities in the search query is recognized in the context of Search-Query-Processing, semantic databases come into play.


How Google Might Work Today

Assessing Websites Based on E-E-A-T

Organizing information around entities makes it possible to evaluate sources based on E-E-A-T, as described in Google's Search Quality Rater Guidelines. E-E-A-T stands for Experience, Expertise, Authoritativeness, and Trustworthiness and is a concept by which Google evaluates the credibility of a source. The additional E for Experience was added to E-A-T only at the end of 2022. Based on this, Google determines whether the content of a website is also based on personal experience. This can be, for example, first-hand reports on the use of certain software or products.

The E-E-A-T rating is particularly important for websites in sensitive industries such as health, medicine, and finance. They are also called YMYL industries ("Your Money or Your Life"). A detailed explanation of these industries and their relevance in terms of entities in SEO can be found further down in the article.

What are entities in relation to SEO?

From an SEO point of view, it is important to understand the basic principles of Google's semantic search. Entities serve to structure and thematically categorize information. They are also evaluated themselves, e.g. authors and companies together with their domains and content.

Content is published by people like authors and organizations like companies, associations, and government agencies. If these entities are trustworthy and have thematic authority, the published content is also rated as higher quality according to E-E-A-T. Thus, entities like organizations, products, and people play a special role in SEO since they can be evaluated based on characteristics like authority and trust (E-E-A-T).

Entities that belong to certain entity types such as persons or organizations can have digital representations. These can be, for example, the official website (domain), profiles on social media, images, and Wikipedia entries. While images usually represent entities visually (especially when it comes to persons or landmarks), a person's website or social media profile allows an assessment based on content. These digital counterparts are central points of reference that are closely associated with the entity and crucial for SEO.

Google can identify this connection between entities in different ways:

  • by analyzing the external linkage of the websites or profiles that contain the exact name of the entity
  • by examining the unique click behavior on a URL for search queries with navigational, brand, or person-related search intent

How does Google identify relationships between entities?

For the identification of a thematic brand, Google can use all information about an entity. The relationships between entities and topics are important for the search engine because this is how it algorithmically determines contextual relationships, the quality or strength of relationships, and thereby authority and expertise. Thus, the establishment of a brand plays a significant role in E-E-A-T for SEO.

Google can recognize which topics entities are associated with, based on the joint occurrence of entities and keywords. The more often these so-called Co-Occurrences occur, the greater the probability that a semantic relationship exists. Co-Occurrences can be determined from structured and unstructured information from website content and search terms.

If the entity Empire State Building is frequently mentioned together with the entity type skyscraper, a relationship exists. This is how Google determines the relationship between entities and entity types, topics, and keywords. The search engine can also determine the degree of the relationship based on the average proximity in texts and/or the frequency of joint occurrence.
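
Both signals mentioned here, frequency of joint occurrence and average proximity, can be approximated with a short Python sketch. The sample documents, the whitespace-style tokenization and the function name `cooccurrences` are assumptions for illustration; how Google actually measures these signals is not public.

```python
import re
from collections import defaultdict


def cooccurrences(documents, entity, terms):
    """For each term, return (documents shared with `entity`, mean token distance)."""
    # Collapse the multi-word entity name into one token so positions are comparable.
    marker = entity.lower().replace(" ", "_")
    distances = defaultdict(list)
    for doc in documents:
        text = doc.lower().replace(entity.lower(), marker)
        tokens = re.findall(r"[a-z_]+", text)
        entity_positions = [i for i, tok in enumerate(tokens) if tok == marker]
        if not entity_positions:
            continue
        for term in terms:
            term_positions = [i for i, tok in enumerate(tokens) if tok == term]
            if term_positions:
                # Closest distance between any mention of the entity and the term.
                distances[term].append(
                    min(abs(e - t) for e in entity_positions for t in term_positions)
                )
    return {term: (len(d), sum(d) / len(d)) for term, d in distances.items()}


docs = [
    "The Empire State Building is a skyscraper in New York.",
    "Many tourists visit the Empire State Building, a famous skyscraper.",
    "A museum sits next to the park.",
]
print(cooccurrences(docs, "Empire State Building", ["skyscraper", "museum"]))
```

In this sample, "skyscraper" shares two documents with the entity at a short distance, while "museum" never appears together with it and is therefore missing from the result.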

For example, Zalando is closely associated with other entities like fashion brands (e.g. Tom Tailor, Nike, Tommy Hilfiger, and Marc O'Polo) and product groups (shoes, dresses, bikinis):


Thematic relationships between entities, using the example of Zalando

Relationships between entities can vary in strength. Google can use the strength of these relationships to assess expertise and especially authority, and incorporate it into the E-E-A-T concept. Along with these quality characteristics, the popularity of a brand is also an important factor for Google's review and the impact on SEO.


Possible E-A-T evaluation of the entity "Zalando"

Possible signals that Google uses to evaluate an entity in terms of E-E-A-T could be, for example, co-occurrences of brand names and topically relevant keywords in search queries or content.


Possible factors for an E-E-A-T evaluation

How to use entities for your SEO strategy

Google's approach to evaluating content suggests several useful measures for working with entities in your SEO strategy:

  • Provide a sufficient amount of content that is relevant to your subject on your website: Demonstrate expertise by regularly publishing high-quality content on topics you want to be found for. This shows Google that you have expertise as a publisher. However, you should also beware of keyword cannibalization.
  • Make it easy for Google to correctly classify your on-page content:
    • Write in simple sentence structures, not in nested sentences.
    • Try to avoid personal pronouns in sentences.
    • Use adjectives and adverbs only if absolutely necessary.
    • Avoid waffle and focus on the essentials when writing.
    • Structure content with logical paragraphs and subheadings. Subheadings should have a clear reference to the topic even without the surrounding content. So, do not just write "Sources", but "Sources on topic X".
    • Use TF-IDF analyses (e.g. from Termlabs.io or the Seobility online tool) and integrate the relevant terms into text to improve the semantic context of the content.
  • Link the contents of the Main Content (MC) semantically with each other. Consider the possible user journey here: What are consumers interested in next or in addition?
  • Ensure transparency regarding authors, website operators, and website purpose:
    • An imprint (legal notice) is mandatory in some countries, such as Germany, for legal reasons alone.
    • Create an informative "About Us" page.
    • Create author profiles or provide information on the authors.
  • Work with recognized experts as authors, reviewers, co-authors, and influencers: Recognized means that people are already recognizable as experts in the Google search, for example, through online publications, Amazon author profiles, personal blogs and websites, social media profiles, profiles on university pages, etc. What is important here is that the authors already have Google-crawlable references in the respective topic area.
  • Link to other authoritative sources: This puts you in relation to an authoritative environment.
  • Make it easy for Google to recognize your entity and its digital mirrors and profiles:
    • Link/connect your entity's representations such as domains, apps, YouTube channels, Wikipedia entries, Wikidata entries, social media profiles, etc.
    • Ensure consistent information about yourself or your company in profiles across the web.
    • Link your representations with author profiles on e.g. Amazon and back.
    • Use link texts with your entity name to link to your representations.
    • Where appropriate, use the sameAs property from schema.org.
  • Regularly create co-occurrences through marketing and communication outside of your own website and position yourself as a recognizable brand for Google:
    • Link thematically relevant specialist publications from your own website so that Google can assign them faster.
    • Share content also via your social media channels so that Google can assign it faster.
    • Build links from semantically appropriate environments.
    • Run offline advertising that influences search patterns on Google or generates appropriate co-occurrences in search queries (TV advertising, flyers, ads, etc.). However, this should not be purely image advertising, but advertising that contributes to the positioning in a subject area.
    • Write thematically appropriate guest articles and link these contents with your own website and your social media profiles.
    • Give interviews.
    • Give presentations at professional events.
    • Host webinars.
    • Arrange cooperations (e.g. with suppliers and partners) that create appropriate co-occurrences.
    • Do PR that generates appropriate co-occurrences (but no pure image PR).
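
The sameAs measure from the list above can be made concrete with a small Python sketch that generates JSON-LD markup. `sameAs`, `Organization` and `Person` are real schema.org terms; the function name, the company name and all URLs below are placeholders invented for this example and must be replaced with your own data.

```python
import json


def build_same_as_markup(name, url, profiles, schema_type="Organization"):
    """Return a JSON-LD string linking an entity's website to its other profiles via sameAs."""
    data = {
        "@context": "https://schema.org",
        "@type": schema_type,       # e.g. "Organization" or "Person"
        "name": name,
        "url": url,
        "sameAs": list(profiles),   # other representations of the same entity
    }
    return json.dumps(data, indent=2)


# Placeholder values, not real profiles.
markup = build_same_as_markup(
    name="Example Company",
    url="https://www.example.com",
    profiles=[
        "https://social.example.com/example-company",
        "https://video.example.com/channel/example-company",
    ],
)

# The result is typically embedded in the page inside a script tag.
print(f'<script type="application/ld+json">\n{markup}\n</script>')
```

Using the same spelling of the entity name here as in your profiles and link texts keeps the signals consistent.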

Are there certain industries or topics where SEO entities play a special role?

Google does not differentiate between industries in semantic search. With entity-based search engines, entities are the central element for the organization of information and content. When it comes to E-E-A-T and SEO, however, Google talks about a higher weighting for so-called Your Money or Your Life topics (YMYL topics) and corresponding search queries.

YMYL topics include:

  • News and current events: News about important topics like international events, economy, politics, science, and technology. Note that not all news articles are necessarily considered YMYL (e.g., sports, entertainment, and everyday lifestyle topics are usually not YMYL). Here, content and publishers have a central role in the Google News search.
  • Government and law: Content that is important for informing the population, such as information about voting, government agencies, public institutions, social services, and legal issues (e.g., divorce, child custody, adoption, writing a will).
  • Finance: Financial advice or information about investments, taxes, retirement plans, loans, banking, or insurance. Particularly websites where people can make online purchases or transfer money.
  • Shopping: Information or services related to researching or buying goods and services. Particularly websites where people can shop online.
  • Health and safety: Advice or information on medical topics, medications, hospitals, emergency preparedness, how dangerous an activity is, etc.
  • Groups of people: Information about groups of people, e.g., grouped by ethnic origin, religion, disability, age, nationality, veteran status, sexual orientation, gender or gender identity.
  • Others: There are many other topics that pertain to major decisions or important aspects of people's lives and are considered YMYL content, such as fitness and nutrition, housing information, choosing a university, job hunting, etc.

If your company needs to be found for such topics, you should follow the measures and strategies described above. Otherwise, you will find it difficult to rank on the first search result page.

You should avoid these mistakes when implementing SEO entities

As already mentioned, digital mirrors, links, and content play a key role in the identification of entities and their relationships to each other.

So in order for your entities to be successful in SEO, you should avoid the following mistakes:

  • The use of too many personal pronouns in content: To present Google with as understandable and consistent signals as possible, you should avoid using personal pronouns. Name the entity by its name.
  • Linking different target pages with the company name as anchor text: With the company name as anchor text, you should only link target pages that are closely related to your entity (homepages of your domain, social media profiles, author pages…). Here, you should restrict yourself to a few target pages.
  • Different spelling for your own entity name: You should pay attention to a uniform spelling of the entity name. If you switch, for example, between Andy and Andreas in the first name of a person entity, it is harder for Google to assign it.
  • Too broad thematic positioning of your entity: The broader and more diverse the topics in which you are mentioned as an entity, the more difficult it is for Google to assign you to the relevant thematic contexts.

These tools can help you consider entities in SEO analyses

There are only a few pieces of software that take into account entities in SEO analyses. The big SEO tool vendors have so far rather neglected the topic. However, you can fall back on some lesser known free or paid solutions:

  • Inlinks: Inlinks is a paid tool that analyzes websites based on the entities that occur.
  • Diffbot: With Diffbot, texts can be represented as an entity graph. Here you get an impression of how Google extracts entities from texts and establishes relationships using Natural-Language-Processing.
  • The Entities Swissknife: This free tool analyzes texts for entities, categorizing the text according to categories and thematic areas.

Semantic Text Analysis with Diffbot


Conclusion

With the Hummingbird update, Google began its journey towards a semantic, entity-based search engine. This development is still ongoing, and the SERPs increasingly show the influence entities have on the organization of information and the output of search results.

Through language models and technologies like Natural-Language-Processing, MUM and LaMDA, Google has a vast number of sources at its disposal to capture the knowledge of the world in the form of semantic databases and output it in multimedia search results.

For search queries with a relation to entities, one can now find a variety of different information in Knowledge Panels, Answer Boxes, carousels, images, and videos. The information is organized around the entity, as in the example of Tom Hanks, and divided into categories such as Movies, News, Videos, and Relationships.


Search result for "Tom Hanks" (query in the USA)

The SERPs are thus evolving from a simple list of links into a knowledge database that is organized around entities. It is less about which keywords appear in a search query or in content. Rather, keywords are often just identifiers for entities, whose actual meaning Google understands better and better.

Olaf Kopp

Olaf Kopp is an online marketing expert with more than 15 years of experience in Google Ads, SEO and content marketing. He is co-founder, Chief Business Development Officer (CBDO) and Head of SEO at the online marketing agency Aufgesang GmbH. He is an author, podcaster and recognized industry expert for semantic SEO, E-A-T, online marketing, content marketing strategies along the customer journey and digital brand building. Olaf has written hundreds of specialist articles and has contributed as a guest author to various book publications. He hosts the podcasts Content-Kompass and OM Cafe. Olaf Kopp is the author of the book "Content-Marketing entlang der Customer Journey" and co-organizer of the SEAcamp.
