What is “semantic search” and why does it matter? “Semantic search” and “semantic SEO” are getting a lot of press lately, but these terms are unfortunately still poorly defined. In truth, the term “semantic SEO” is a misnomer. Semantics is about defining relationships between entities, and the signifiers that act as flags for those relationships. So, technically, we’re not optimizing search engines. Rather, we’re slowly optimizing the web itself for semantics. This helps search engines more accurately identify topics, ideas and entities that are both relevant and related to specific pages.
With that brief distinction aside, let’s explore what the “semantic web” truly means and the widespread impact it will have for search marketers. The Semantic Web Wiki defines it in the following terms:
The “semantic web” by definition is an extension of the World Wide Web. It breaks down boundaries by defining entities instead of apps or websites. Search is simply a byproduct, a cog, a spoke in the wheel. As it applies to search, we simply create associations of known entities through the “structured data” within the page markup. Do not mistake this statement for a proclamation that there is no search benefit within the semantic web.
Structured data is another buzzword we’re hearing more and more, but one that is often misused.
Plenty of search opportunities are emerging from the semantic web, both in ranking prominence and in the new SERP landscape. Rest assured, many more will arise. What’s most exciting is that the semantic web is still in its infancy. The full potential of semantics is still up in the air. Search marketers have an enormous opportunity to define the semantic web’s place in search history.
For now, it’s clearly a search-powered technology. Seeing as it is leading the way in defining much of the language, Google will obviously reap most of the benefits. But, before we get too far ahead of ourselves, let’s explore a few structured data and search-related opportunities that the semantic web is already enabling.
The semantic web already has a standard for describing things. This “description framework” enables us to link increasing amounts of data in a format that computers can readily interpret and relate to other complex entities. The format, reference language and structure enable brand new relationships between datasets that previously didn’t exist. This is exactly what we are referring to when we say semantic web and structured data.
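To make the “description framework” concrete, here is a minimal sketch in Python of the subject-predicate-object “triple” structure RDF uses to describe things. The entity IRIs below are hypothetical examples, not real records:

```python
# A minimal sketch of RDF-style triples: every statement is a
# (subject, predicate, object) tuple, where each part is ideally an
# IRI so that independent datasets can refer to the same thing.
# The example.org identifiers are hypothetical.
triples = [
    ("http://example.org/person/alice", "http://schema.org/name", "Alice"),
    ("http://example.org/person/alice", "http://schema.org/birthPlace",
     "http://example.org/place/paris"),
    ("http://example.org/place/paris", "http://schema.org/name", "Paris"),
]

def objects_of(subject, predicate):
    """Return every object linked to `subject` via `predicate`."""
    return [o for s, p, o in triples if s == subject and p == predicate]

# Because statements share identifiers, separate facts link into a graph:
# Alice -> birthPlace -> Paris -> name -> "Paris"
birthplace = objects_of("http://example.org/person/alice",
                        "http://schema.org/birthPlace")[0]
print(objects_of(birthplace, "http://schema.org/name"))  # ['Paris']
```

The key design point is that the shared identifiers are what create the “brand new relationships between datasets”: two datasets that both use the Paris IRI are automatically joined at that node.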
The Knowledge Graph – Semantic Web
Think of a database of the world’s information. Google’s 2010 acquisition of Metaweb was a big eye-opener about Google’s intent to construct and utilize the Knowledge Graph. Data comes from many sources: Wikipedia, the CIA World Factbook and Freebase. As of 2012, the semantic network contained over 570 million entities and 18 billion facts about relationships between entities. Think TV shows, celebrities, businesses, events, locations, prices, times and so much more, all connected together.
Overview of Resource Description Framework and the Semantic Web
RDF provides a common framework that allows data to be shared and reused across boundaries and syntaxes. Internationalized Resource Identifiers (IRIs) are internationalized, cross-language generalizations of Uniform Resource Identifiers (URIs), which can be divided into URLs and URNs. To clarify before going much further: every URI is an IRI, but not every IRI is a URI; likewise, every URL is a URI, but not every URI is a URL. The URN is the name of a resource on the web, and the URL is its location.
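A rough illustration of the name-versus-location distinction, using Python’s standard `urllib` and illustrative identifiers:

```python
from urllib.parse import urlparse

# A URL identifies a resource by its location on the network.
url = "https://example.org/books/moby-dick"

# A URN identifies a resource by name, independent of any location.
# The ISBN value here is purely illustrative.
urn = "urn:isbn:0000000000"

# Both are URIs, so both parse with the same machinery; only the
# scheme tells them apart.
print(urlparse(url).scheme)  # 'https'
print(urlparse(urn).scheme)  # 'urn'
```

A resolver can turn the name (URN) into one or more current locations (URLs), which is why the semantic web prefers stable names for entities.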
Building the Resource Description Framework (RDF) as a standardized method for modeling resources and the information contained within web resources allows data to be more reusable and accessible for different purposes, or to describe different entities. For example, subsets of a large descriptive dataset can be queried in order to describe different resources that are looked up via an IRI. Essentially, this could mean that in the future, instead of individual databases for each resource on the web, there could be fewer, larger databases (or even one massive database) providing data for web resources.
To accommodate this, there are various proposed and implemented ontology languages, such as RDF Schema (RDFS) and the Web Ontology Language (OWL), and one front-runner as an RDF query language that can store, retrieve and manipulate data in the Resource Description Framework: SPARQL. Because it matches graph patterns rather than indexed tables, SPARQL allows data to be queried from a wider variety of sources than relational databases.
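The flavor of a SPARQL query can be sketched without a triple store. The hypothetical query in the comment asks “who are Alice’s children?”, and the toy Python matcher below mimics how a pattern containing variables is matched against a graph of triples:

```python
# A hypothetical SPARQL query over a made-up family vocabulary:
#   SELECT ?child WHERE { :alice :hasChild ?child . }
# The toy matcher below shows the core idea: variables (strings
# starting with '?') match anything and get bound; constants must
# match the triple exactly.
triples = [
    (":alice", ":hasChild", ":bob"),
    (":alice", ":hasChild", ":carol"),
    (":bob", ":hasChild", ":dan"),
]

def match(pattern, data):
    """Return one {variable: value} binding per matching triple."""
    bindings = []
    for triple in data:
        binding = {}
        for pat, val in zip(pattern, triple):
            if pat.startswith("?"):
                binding[pat] = val      # variable: bind it
            elif pat != val:
                break                   # constant mismatch: reject triple
        else:
            bindings.append(binding)
    return bindings

print(match((":alice", ":hasChild", "?child"), triples))
# [{'?child': ':bob'}, {'?child': ':carol'}]
```

Real SPARQL engines additionally join multiple patterns, filter, and federate across endpoints, but the variable-binding mechanic is the same.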
The future seems to be moving toward larger databases that hold data objects as opposed to tables of data. Properties of these data objects will be accessed to pull together new data objects and unique parcels of information, created from disparate sources.
Properly structured data will be the web of the future. For machines to be able to ‘understand’ and reference the information on the web, it will need to conform to the rules of the Resource Description Framework. This could be accomplished via the proposed Semantic Web Rule Language (SWRL), which provides a way to state explicit logical assertions about classes, individuals and properties on the web.
Property and Data Assertions about the Object/Individual “Scott”
An Object Property assertion: Scott hasChild Bianca (shows an object to object assertion)
A Data Property assertion: Scott hasMobilePhone “6468201666”^^string
SWRL Syntax would be used to state the relation of an individual (or object) of a class to either data or another individual (or object).
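The two assertions above can be written as plain data. In this sketch, the object property links one individual to another individual, while the data property links an individual to a typed literal, mirroring the `"6468201666"^^string` notation:

```python
# Object property assertion: relates the individual Scott to the
# individual Bianca via the hasChild property.
object_assertion = ("Scott", "hasChild", "Bianca")

# Data property assertion: relates Scott to a typed literal value,
# modeled here as a (value, datatype) pair.
data_assertion = ("Scott", "hasMobilePhone", ("6468201666", "string"))

def is_object_assertion(triple):
    """Distinguish the two kinds by the shape of the object slot:
    a bare name is another individual; a (value, type) pair is data."""
    return not isinstance(triple[2], tuple)

print(is_object_assertion(object_assertion))  # True
print(is_object_assertion(data_assertion))    # False
```

This distinction matters to a reasoner: object assertions add edges between nodes of the graph, while data assertions only decorate a node with attribute values.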
What does this mean?
To sum it up, this means that the future of the web will be built on a framework that ‘understands’ the data contained within it in a deeper way. It will use logic and a crude form of reasoning to pull data and serve information to clients, browsers and apps. This also pushes us to adopt these rules and practices in order to be sure that we are a part of the reasoning matrix being developed deep within the inner sanctum of the web.
Defining the New Standard and Applying to Search Accuracy with the Semantic Web
Google has always been a leader in providing answers. Now, tapping into the semantic web, it is able to do much more than provide answers from third-party sites. Entity relationships and user behavior allow prediction models that tell you what you need before you know you need it. The end goal is complete transparency in data, both online and off, coupled with location, empowering users to tap into all of this with mobile devices. This is where Google becomes much more of a personal assistant than a traditional search engine.
All search query types are relevant to connections between entities. Take, for example, the classic set of query types:
- Navigational – queries attempting to find a particular website or web page.
- Informational – a broad set of keywords attempting to find information on specific topics.
- Transactional – queries carrying the buyer’s intent to purchase.
Take, for example, asking Google an informational query:
Prior to the Knowledge Graph introduction to results pages, we would have seen something similar to this:
This would be largely organic search results, powered by traditional search engine optimization tactics: link building, on-site optimization, site size, keyword saturation and the rest. Now, let’s look at the new search engine results page layout:
We have a clear answer. This is great for voice search. No hands, no time for surfing. Siri knows what you need. Google does too.
Located on the right side of the screen, everything Google deems relevant to your query about the person is populated from multiple sources: movies, songs, basic life facts and even the ability to subscribe to new updates about Wilford Brimley, the entity, and the data that compiles around his digital identity.
User Intent Across All Devices Coupled with the Semantic Web
Machine learning, query predictions, the mobile revolution and voice search have given us insight into how we currently use, and will use, the web. As more entity relationships are defined and more sites adopt schema/structured data, functionality grows and adapts as well.
The last known figure (at least that this author could find) for Schema.org usage was 5 million websites. That means well over 5 million websites are already communicating to search engines just what entities their pages contain.
Google’s Hummingbird update showed it’s no longer a pipe dream, but something that algorithm updates are gearing toward.
In a recent Searchmetrics study, “Schema.org in Google search results,” nearly 40% of search queries returned Schema.org data.
Best Way to Get Started with Structured Data
A good way to begin is to define and outline all known entities within sites or web applications. Look for books, movies, events, objects, organizations, people, places, products, offerings, reviews and more. For a complete list, visit schema.org’s complete Type Hierarchy.
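Once an entity is identified, its description can be expressed as schema.org-typed JSON-LD. This Python sketch, with made-up values, builds and serializes one such object for a movie entity:

```python
import json

# A hypothetical Movie entity described with schema.org types.
# All names and dates here are illustrative placeholders.
movie = {
    "@context": "https://schema.org",
    "@type": "Movie",
    "name": "The Example",
    "datePublished": "2013-05-01",
    "director": {
        "@type": "Person",      # nested entity with its own type
        "name": "Jane Doe",
    },
}

# Serialize to the JSON text that would be placed in the page.
print(json.dumps(movie, indent=2))
```

Note how the nested `director` object is itself a typed entity: the markup encodes not just properties of the movie, but a relationship between two entities, which is exactly the “entity association” idea discussed above.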
By adding these additional tags to the HTML, you can help search engines and other applications better understand your content and display it in a useful, relevant way. Coupled with recent performance actions, CMS support and plugins will no doubt soon be on their way to make your life easier. For larger sites, entity extraction tools and APIs can help expedite the job.
- Identify all entities within web properties.
- Utilize the widely adopted Schema.org standards for markup.
- Be sure to use Authorship markup, as it could influence rankings in the future.
- Identify your consumers’ intent within websites, and provide more concise, properly marked-up content.
- Utilize structured data testing tools, like those Google and Yandex both offer, to ensure everything is set up properly.
- Avoid creating multiple pages with synonyms referring to the same underlying thing; that is exactly the wrong approach.
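Putting the checklist into practice, structured data is typically embedded in a page as a JSON-LD script block. This sketch, again with hypothetical values, shows the shape of the tag that the testing tools then validate:

```python
import json

# A hypothetical Event entity; all values are illustrative.
event = {
    "@context": "https://schema.org",
    "@type": "Event",
    "name": "Example Conference",
    "startDate": "2014-09-01",
    "location": {"@type": "Place", "name": "Example Hall"},
}

# Wrap the serialized entity in the <script> tag that search
# engines read out of the page's HTML (usually in the <head>).
snippet = (
    '<script type="application/ld+json">\n'
    + json.dumps(event, indent=2)
    + "\n</script>"
)
print(snippet)
```

Because the block is plain JSON inside a script tag, it can be added without touching the visible HTML, which is one reason JSON-LD is often easier to retrofit than inline microdata attributes.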
Semantic Web, Structured Data and the Future
Ten years ago, machines would treat keyword occurrences as indicators of a page’s content. Today, they are able to use massive connected datasets (such as DBpedia or Freebase) as well as semantic markup directly implemented by developers, and combine them with their own language-processing abilities to serve results that are much more relevant and useful.
Today, the devices powering search mean people on the go, Google Now aiming to be our personal assistant, and a focus on authority, expertise and reputation (integration with G+), so be proactive in the adoption. As mobile, voice search and wearable technology usage increases, we will find ourselves with answers to informational queries before we know we need to ask. Transactional queries may run as we view items at the store, with price comparisons and navigational answers with vendor reviews happening right before our very eyes.
In closing, let’s stop calling it “Semantic SEO.” This is what the pioneers of the world wide web envisioned. Google and aggressive ad models have revolutionized the digital marketplace through algorithms that are incomprehensibly efficient. SEO has changed, folks. The next stop for us is “Entity Optimization.” It’s a clearer description of what we are attempting to accomplish.
Whether it’s launching a Wiki page or adding Schema.org performance actions to events on a site, we’re not bound to search with this strategy, nor to one specific search engine.
Entity Optimization. Are you ready?