“To manage a system effectively, you might focus on the interactions of the parts rather than their behavior taken separately.”
Russell L. Ackoff
Speakers of English can understand what “giraffelivesinsavannah” means. They can easily split the sequence of letters into the words “giraffe lives in savannah”, and they have a set of distinct associations with the words. A “giraffe” is a tall animal with an extremely long neck, four long legs, and a distinctive coat pattern. It “lives” (meaning that it inhabits a certain place) in the “savannah”. We may not know exactly what a “savannah” is, but we can effortlessly classify it as a place. After all, we normally live in a “place”, don’t we?
Search engine bots are not that smart. Once they stumble upon “giraffelivesinsavannah”, they will save 22 Latin characters in their memory and head for a new piece of information to consume. Without a preloaded vocabulary and a set of special rules (or, if I were to use a buzzword, without “structured data”), computer programs can’t derive meaning from a string of characters. When someone searches the web for “giraffelivesinsavannah”, search engines will eagerly return the web page where this exact string appears. But if you search for “where do giraffes live?”, a search engine will likely not pick up that same web page — even though it has the answer.
Effectively, this means that you can’t expect search engines to understand language the way humans do. But as an SEO, it is in your best interest to help them understand it. And that’s where structured data comes in.
From this blog post, you will learn how to get extra traffic and exposure in the search results by helping search engines understand your site better.
The semantic web and structured data
Internet marketers like to hype fancy words. The semantic web and semantic search are good examples. There was barely an SEO meet-up or conference in 2017 that didn’t have them on the agenda. But you might be surprised to find out that the concept of the semantic web has been around since 1998 or so.
Let’s remove the marketing smoke and mirrors and figure out what the semantic web is about. Semantics is the study of meaning in languages. Particularly, it studies the relationship between signifiers (such as words, phrases, and symbols) and what they stand for (their meaning). Thus, the semantic web is a meaningful web. The semantic web isn’t about keywords and backlinks; it’s about relationships between concepts (or things). Instead of looking at the strings themselves (the “signifiers”), it looks at the concepts behind them and their properties.
There are vocabularies and grammars for the semantic web, much like for human language. You can use them to form logical statements on your site, and search engine bots can collect, analyze, and process them. What makes semantic search different from regular search is that the rules of logic can be applied to the information. If a search engine finds a logical statement on Barack’s website that says “Barack is a friend to Michelle”, and someone does a search for “Michelle’s friends”, then, even if Michelle’s website doesn’t mention Barack (even if Michelle’s website doesn’t exist!), the smart semantic search engine will let us know Barack considers himself to be Michelle’s friend.
From the example above we see that search engines can derive new knowledge from the data with a high degree of organization. We can call it meaningful or structured data.
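To make the Barack and Michelle example a bit more tangible, here is a minimal Python sketch of how a program could answer the query from collected statements. The triple layout and the predicate name "friend_of" are my own assumptions for illustration, not part of any real vocabulary.

```python
# Each statement is a (subject, predicate, object) triple, the way a bot
# might collect it from structured data on different websites.
# "friend_of" is a made-up predicate name for this sketch.
triples = [
    ("Barack", "friend_of", "Michelle"),
    ("Barack", "type", "Person"),
]


def friends_of(person, statements):
    """Return everyone who states that they are a friend of `person`."""
    return [s for (s, p, o) in statements if p == "friend_of" and o == person]


# Michelle never published anything, yet the query still has an answer.
print(friends_of("Michelle", triples))  # ['Barack']
```

The point is that the answer comes from the relationship itself, not from matching the string "Michelle's friends" on some page.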
Why use structured data on your website?
Over the years, search engine result pages (I’m talking about Google) have evolved from a boring list of blue links to a fairly informative page that abounds in useful information. In fact, the SERP itself may satisfy users’ requests, without them having to click on the actual search results.
These various widgets and cards are called search features. There are two types of search features:
- Content type features that appear as separate results. These include direct answers, knowledge graph panels, or news carousels.
- Enhancements of the search results. These are part of the actual search result snippets, such as breadcrumbs or ratings.
Search features hog space in SERPs, and to top it off, they have a substantially higher click-through rate. From what I observe, snippets with enhancements gain about 30% in CTR over their plain counterparts. If your website doesn’t leverage search features, you may be losing both additional impressions (e.g. in top stories or direct answers) and clicks.
Furthermore, structured data opens a new world of user functionality. Users can transfer structured data between applications and websites. If a site uses structured data, web browsers can provide an enhanced user experience. For example, an event on a web page can be directly imported into a user’s desktop calendar; users can book tickets to a movie or concert right from the search results page, and find the phone number of the nearest restaurant to have dinner afterward.
Search features are part of the semantic web, and they are based on the structured data that Google can understand and interpret. Google can enable rich features for your page in the search results if it understands the content of the page, and if you explicitly provide additional information in the page’s code with the help of structured data.
I hope I managed to persuade you that structured data markup is not an option anymore. Now, let’s dig into the techy details.
Schema.org, Microdata, Microformats, or RDFa?
The Internet community is not unanimous regarding the best way to mark up structured data. As a result, a whole bunch of new confusing terminology was born, including RDF, RDFa, Microformats, Microdata, Schema.org, and whatnot. I’ll try to explain them in plain English, and we will figure out which of those is the best choice for SEO.
Basically, if you want to convey information, whether in natural or machine language, you need two things:
- Vocabulary: a set of words that represent sign-meaning pairs, and;
- Grammar: a set of rules that tell how to use the vocabulary to convey the meaning.
Below is an example of a simple vocabulary for structured data markup. It includes only five entries.
- Person — a person (alive, dead, or fictional). Person may be described by the following properties:
- familyName — Family name or the last name of the Person;
- givenName — Given name or the first name of the Person;
- gender — Gender of the Person;
- birthDate — Date of birth of the Person.
And we need some grammatical rules that we must apply so that a computer program can comprehend and store the data. For example:
- Enclose the structured data in curly braces;
- Separate the property and its value by a colon and enclose them in double quotes;
- Separate the property-value pairs by a comma.
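The toy vocabulary and the three grammar rules above can be sketched in a few lines of Python. The `serialize` helper and the sample values are hypothetical; I only lean on the fact that the standard `json` module happens to follow the same rules (curly braces, quoted pairs separated by a colon, commas between pairs).

```python
import json

# The toy vocabulary from above: one type and its four properties.
VOCABULARY = {"Person": {"familyName", "givenName", "gender", "birthDate"}}


def serialize(type_name, properties):
    """Apply the grammar rules, rejecting words missing from the vocabulary."""
    unknown = set(properties) - VOCABULARY[type_name]
    if unknown:
        raise ValueError(f"not in the vocabulary: {sorted(unknown)}")
    # json.dumps emits curly braces, quoted "property": "value" pairs,
    # and commas between the pairs.
    return json.dumps(properties)


# Sample values are invented for illustration.
person = serialize(
    "Person",
    {"givenName": "John", "familyName": "Smith", "birthDate": "1980-01-31"},
)
print(person)
```

A property such as "nickname" would be rejected here, which is exactly what a vocabulary is for: both sides agree in advance on which words carry meaning.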
Without getting into unnecessary details, most of the scary terminology concerning the structured data markup can be put into two buckets — vocabularies and grammars. Except for Microformats, you can arbitrarily combine grammars and vocabularies to satisfy your needs. Follow the links if you are looking for specific information about any of them.
Note 1: Microformats specify both the grammar for embedding structured data into HTML documents and the vocabulary of specific terms. That’s why I included it in both columns. With Microformats you can only mark up your content if the Microformats community created and accepted an appropriate vocabulary. This is a big drawback of this format. On the contrary, you can use any vocabulary, even your own, with RDFa, Microdata, and JSON-LD.
Note 2: Twitter and Facebook encourage webmasters to use their own data markup as well. These are Twitter Cards and the Open Graph protocol. These formats are not intended for the search engines; thus they are not covered in this article. Both Twitter Cards and Open Graph may co-exist with other types of markup. With Twitter Cards enabled on your website, users who Tweet links to your content will have a “Card” added to the Tweet that’s visible to their followers. The “Card” may be prepopulated with images, videos, or text of your choice. The Open Graph protocol enables any web page to become a rich object in a social graph. Facebook uses it to give any web page the same functionality as any other object on Facebook.
What vocabulary and grammar should you use on your website?
Schema.org should be your vocabulary of choice. It is supported by the major search engines, including Google, Bing, Yahoo, and Yandex. Schema.org is well documented, versatile, and under active development.
As far as grammar is concerned, there’s no short answer. There are three major players now: RDFa (Resource Description Framework in Attributes), Microdata, and JSON-LD (JSON for Linking Data). RDFa and Microdata are conceptually very similar. Both of them allow one to reuse visible HTML data.
In the RDFa implementation (see below), “startDate”, “endDate”, and other marked-up values are actually visible to the user, so there is no duplication of the information:
<div vocab="http://schema.org/" typeof="SportsTeam">
  <span property="name">San Francisco 49ers</span>
  <div property="member" typeof="OrganizationRole">
    <div property="member" typeof="http://schema.org/Person">
      <span property="name">Joe Montana</span>
    </div>
    <span property="startDate">1979</span>
    <span property="endDate">1992</span>
  </div>
</div>
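JSON-LD, on the contrary, duplicates the data: a copy is inserted in the <head> or <body> of a page as a <script> element.
{"@context": "http://schema.org", "@type": "SportsTeam",
 "name": "San Francisco 49ers",
 "member": {"@type": "OrganizationRole", "startDate": "1979", "endDate": "1992",
            "member": {"@type": "Person", "name": "Joe Montana"}}}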
This is a core difference from the point of view of an Internet marketer or an SEO.
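If you are curious how a bot can reuse the visible text, here is a rough Python sketch that reads the `property` attributes from an RDFa fragment. It is a deliberately naive reader built on the standard `html.parser` module, not a real RDFa processor, and the `PropertyText` class is my own invention for this illustration.

```python
from html.parser import HTMLParser

RDFA = """
<div vocab="http://schema.org/" typeof="SportsTeam">
  <span property="name">San Francisco 49ers</span>
  <div property="member" typeof="OrganizationRole">
    <div property="member" typeof="http://schema.org/Person">
      <span property="name">Joe Montana</span>
    </div>
  </div>
</div>
"""


class PropertyText(HTMLParser):
    """Collect (property, visible text) pairs from simple elements."""

    def __init__(self):
        super().__init__()
        self.current = None  # property of the element being read
        self.pairs = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        # Elements with typeof open a nested item; only plain
        # property elements carry a text value of their own.
        if "property" in attrs and "typeof" not in attrs:
            self.current = attrs["property"]
        else:
            self.current = None

    def handle_data(self, data):
        if self.current and data.strip():
            self.pairs.append((self.current, data.strip()))
            self.current = None

    def handle_endtag(self, tag):
        self.current = None


parser = PropertyText()
parser.feed(RDFA)
print(parser.pairs)  # [('name', 'San Francisco 49ers'), ('name', 'Joe Montana')]
```

The values come straight from the text a visitor sees. With JSON-LD, the same values would sit in a separate `<script>` element, and a single `json.loads` call would be enough to read them.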
According to Web Data Commons (see the chart below), Microdata is the most widely used specification, followed by JSON-LD, which is gaining popularity. At the moment, Google recommends encoding data with JSON-LD, though the search engine is also able to parse Microdata and RDFa.
In my opinion, Schema.org + JSON-LD is the best bundle for most website owners.
How to implement structured data markup?
Finally, we are ready to put the theory into practice. You are just four steps away from structured data nirvana.
1. Choose the schemas.
Carefully study the available schemas at Schema.org. These are among the most widely used:
- Local Business
- Creative Work
- Music Recording
- TV Series
Create a map of schemas for your website in a spreadsheet. List the URLs of individual pages or website categories in one column and relevant schemas in the other column.
Note: Multiple schemas can be combined to describe one object. For example, Person is a good schema to describe a John Smith. But Person may also have an Address, and be associated with an Organization, which in its turn may have its own Address. Person, Address, and Organization are three different schemas we used to describe just one John Smith.
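Here is roughly what the John Smith case could look like once the schemas are nested, sketched in Python and printed as JSON-LD. All names and places are invented sample values. Note that the Schema.org type for an address is called PostalAddress, and the link from a Person to an employer is the worksFor property.

```python
import json

# All names and places are invented sample values.
john = {
    "@context": "https://schema.org",
    "@type": "Person",
    "name": "John Smith",
    "address": {
        "@type": "PostalAddress",
        "addressLocality": "Springfield",
    },
    "worksFor": {
        "@type": "Organization",
        "name": "Example Corp",
        # The organization carries its own, separate address.
        "address": {
            "@type": "PostalAddress",
            "addressLocality": "Shelbyville",
        },
    },
}

print(json.dumps(john, indent=2))
```

Three schemas, one object: the nesting mirrors how the concepts relate to each other.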
When the map of schemas is finished, you are ready to proceed to the next step.
2. Create structured data markup.
Thanks to Google, you don’t have to be a web developer if you need to mark up the structured data on a website. You can use Structured Data Markup Helper. It’s an easy-to-use tool that guides you through the whole process.
1) Open Structured Data Markup Helper, select the relevant schema, and enter a URL from the spreadsheet that you created at the previous step. Then click Start Tagging.
2) Highlight page elements and assign schema tags to them. If there’s no visible representation for some information, click the Add missing tags button to add it. When ready, click Create HTML.
3) Select JSON-LD from the drop-down menu. Copy the code and paste it inside the <head> or <body> element in the HTML code of the respective page on your website.
Note: If your website has thousands of pages that can make use of structured data, it would be more efficient to approach your web development team with this task.
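For a feel of what the developers might do at scale, here is a hedged Python sketch that turns rows of page data into ready-to-paste JSON-LD snippets. The `pages` rows, the example.com URLs, and the `json_ld_script` helper are all hypothetical; a real site would pull this data from its CMS or database.

```python
import json

# Hypothetical rows, e.g. the schema map spreadsheet joined with page data.
pages = [
    {"url": "https://example.com/giraffe", "type": "Article",
     "headline": "Where do giraffes live?"},
    {"url": "https://example.com/zebra", "type": "Article",
     "headline": "Where do zebras live?"},
]


def json_ld_script(page):
    """Build the <script> element to place in a page's <head>."""
    data = {
        "@context": "https://schema.org",
        "@type": page["type"],
        "url": page["url"],
        "headline": page["headline"],
    }
    # Escape "<" so a value containing "</script>" cannot end the element early.
    body = json.dumps(data, indent=2).replace("<", "\\u003c")
    return f'<script type="application/ld+json">\n{body}\n</script>'


snippets = {page["url"]: json_ld_script(page) for page in pages}
print(snippets["https://example.com/giraffe"])
```

Generating the markup from the same data that renders the page also keeps the two in sync, which matters because the markup is supposed to describe what the user actually sees.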
3. Test the markup.
Visit Structured Data Testing Tool and enter the URL of the page you want to test. The tool displays all the marked-up data and provides the information about errors and warnings.
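Before you even open the testing tool, a quick local sanity check can catch the most embarrassing mistakes. The Python sketch below is my own rough pre-check, not a replacement for Google's tool: it only verifies that each JSON-LD block parses and declares @context and @type. The `check_page` function name is made up for this example.

```python
import json
from html.parser import HTMLParser


class JsonLdExtractor(HTMLParser):
    """Collect the text of every <script type="application/ld+json">."""

    def __init__(self):
        super().__init__()
        self.in_json_ld = False
        self.blocks = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self.in_json_ld = True
            self.blocks.append("")

    def handle_data(self, data):
        if self.in_json_ld:
            self.blocks[-1] += data

    def handle_endtag(self, tag):
        if tag == "script":
            self.in_json_ld = False


def check_page(html):
    """Return a list of problems found in the page's JSON-LD blocks."""
    parser = JsonLdExtractor()
    parser.feed(html)
    problems = []
    if not parser.blocks:
        problems.append("no JSON-LD found")
    for raw in parser.blocks:
        try:
            data = json.loads(raw)
        except json.JSONDecodeError as err:
            problems.append(f"invalid JSON: {err.msg}")
            continue
        if not isinstance(data, dict):
            continue  # top-level arrays are out of scope for this sketch
        for key in ("@context", "@type"):
            if key not in data:
                problems.append(f"missing {key}")
    return problems


page = '<head><script type="application/ld+json">{"@type": "Person"}</script></head>'
print(check_page(page))  # ['missing @context']
```

An empty list only means the basics are in place. Whether a given type has all the properties a rich result needs is still a question for the testing tool.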
Now it’s time to sit back and relax. The rich snippets will not be displayed in search before Google re-crawls the website. Keep in mind, there’s no guarantee that your structured data will show up in search results even if the structured data is marked up and can be extracted successfully according to the testing tool. These are the most common reasons:
- The structured data is not representative of the main content of the page or potentially misleading;
- The structured data is incorrect in a way that the testing tool was not able to catch;
- The marked-up content is hidden from the user.
To put it simply, don’t try to trick Google. In the worst case scenario, your website may be penalized for the improper use of the structured data. There are cases when Google took manual actions against websites. The penalty message typically goes like this:
“Markup on some pages on this site appears to use techniques such as marking up content that is invisible to users, marking up irrelevant or misleading content, and/or other manipulative behavior that violates Google’s Rich Snippet Quality guidelines.”
You have been warned. For additional best practices and recommendations, read the Introduction to Structured Data by Google.
4. Use the structured data tool to diagnose issues.
As Murphy’s law goes, “things will go wrong in any given situation if you give them a chance”. Your web developers can commit buggy code, or a newbie marketing manager can add structured data with errors to your pages.
Make the structured data checks part of your SEO routine. Log in to Google’s Search Console. Click Search Appearance > Structured Data. Not only will you get details about the errors, but you will also see detailed information about the structured data types detected on your website.
If you are looking for an industry-grade tool to manage the structured data, have a look at Anything To Triples (Any23). You can use it to:
- Verify structured data;
- Extract structured data, and;
- Convert between structured data formats.
The “semantic web” is an old-school term that has been around since the late 90s. It denotes a meaningful web, where real relations between things are more important than keyword instances and href links. It’s easier for the search engines to understand the meaning of the data when it is structured. New search experiences, such as Recipes or Knowledge Graph panels, are based on structured data. You can implement structured data with the help of the Schema.org vocabulary and JSON-LD syntax. The easiest way to create the markup for individual pages is the Structured Data Markup Helper by Google, and the Structured Data Testing Tool lets you verify the result.
Now you have all the tools and knowledge that are necessary to get your website prepared for the semantic web. And if you have any questions or feedback, drop a comment below.
Thank you, SEO PowerSuite and Yauhen, for this wonderful article.
By: Yauhen Khutarniuk