Google’s Knowledge Graph is a continually growing part of Google’s search results and Google’s products. It’s different from the traditional organic algorithm. And there’s much less known about it.
In this episode, Nate & Brett explore what the Knowledge Graph is; what it means for businesses; bleeding edge research into ranking factors; and tips for how businesses can compete in a Knowledge Graph world.
- What the Knowledge Graph is
- Why the Knowledge Graph matters
- Variations of the Knowledge Graph in Search Results
- How Knowledge Graph Answers are created
- The Commercial Angle
- The Google+ Angle
- Knowledge Graph Optimization & Best Practices
- The Medical Knowledge Graph Controversy
- Making the most out of the Knowledge Graph
- The Future of Search
Brett Snyder: Hello, there. Welcome to Bamboo Chalupa, your digital marketing podcast. My name is Brett Snyder. I am the president of Knucklepuck. I am joined by my co-host, Mr. Nate Shivar, the author of ShivarWeb.com.
Nate Shivar: Howdy, howdy.
Brett Snyder: This afternoon we’re actually going to be talking about what Nate and I look at as one of the bleeding-edge elements of the search results pages. It’s something that Google released about two or three years ago, I think, maybe even a little bit more, that-
Nate Shivar: 2012.
Brett Snyder: -continues to fascinate consumers, but remains largely understudied in terms of how you actually can impact visibility in the search results. What we’re referring to here is the Knowledge Graph.
Nate Shivar: The Knowledge Graph is Google’s system for organizing information about millions and millions of well-known entities. Entities is the specific term that they use to think of distinct people, places, organizations, concepts that exist in the real world. Google actually said that when they move to building out the Knowledge Graph that this is the critical first step towards building the next generation of search, which taps into the collective intelligence of the web and understands the world a bit more like people do.
Brett Snyder: I think that’s huge, there, to tie into this fact that they want to understand the world a bit more like people do, because they say this as if it’s something new and innovative here, but it’s really what Google has been trying to do all along. It’s what they tried to do when they released the Hummingbird update. It’s what they tried to do when they released PageRank back in 1998. When Nate was talking about this idea of millions of well-known entities, your people, your places, your organizations in the real world, this distinction among entities is really important as we talk about the Knowledge Graph, because it’s a natural extension of this advent of semantic search.
We are now moving into a world where we want to move from strings to things. In other words, we used to look at strings of words, or keywords, and say, “We need to optimize for these strings of keywords.” We did that by using them at a certain density, or by having links with this anchor text in them. What the Knowledge Graph, in this advent of semantic search, is pushing us to do is move away from looking at just strings of keywords and embrace this idea of things. It sounds pretty confusing, but, when you boil it down, it really is very simple. It’s about understanding the context of relationships.
We’ve talked before on the podcast about how Google, and Yahoo!, and Bing, and any emerging search engine, really, is designed to model relationships across the web. They want to be able to take your offline relationships, the things that signify trust, authority, and relevance, the way that other people will endorse or validate your claims to trust, authority, and relevance, and they use that to impact your visibility in organic search. When we talk about this idea of the Knowledge Graph and we talk about entities, we’re really talking about the relationships among those people, places, and things that we see on the web.
Nate Shivar: It’s interesting that you just talked about Hummingbird, which was really a move towards creating search results not just based, like you said, on keywords, but also understanding the implicit intent behind the keyword searches.
Brett Snyder: The way that I always look at it is you want to be focusing on what a user means, rather than what they say.
Nate Shivar: One technique that Google is using to model this, when it comes to semantic search and the Knowledge Graph, is this concept of a triple. This is something that semantic strategists use when they talk about the three components that form any sentence: the subject, predicate, and object. These three things, when combined, build the relationship between the subject and the object, and really give an idea of what something is. If I were to say, “R.E.M. plays music. R.E.M. is from Athens,” the ideas of music and of Athens, Georgia, are going to be associated with R.E.M., so that when someone searches “Athens Georgia band”, Google doesn’t want to just return a list of music review sites, or whatnot. They want to actually understand what the person is searching for and what entity is associated with those ideas, and return that as an answer.
Brett Snyder: I think what’s important to understand, with regards to this concept of a triple, is that it’s actually different from the way that the Knowledge Graph works. The triple, as Nate mentioned, goes into this idea of semantic language. It’s the way that you can understand: are you referring to R.E.M. the band, or are you referring to REM, rapid eye movement, the phenomenon that occurs while you sleep? When Google looks at the Knowledge Graph, they actually want to understand these triples, they want to understand the semantic meaning behind a phrase, but they also want to, as they call it, disambiguate these entities, which is essentially breaking them down into their component parts and then grouping them according to overlapping themes.
Again, we’re going away from this idea of strings of keywords where we’re saying “Athens Georgia band”, and we want to understand what exactly somebody is looking for. What entity? What person, place, or thing are they looking for when they’re searching for bands? They may be looking for R.E.M., or they may be looking for any of the other bands who … unfortunately, I’m not as keyed into the Athens, Georgia music scene as Nate is.
Nate Shivar: Come on. Widespread Panic, B-52’s. There’s a wide variety of concepts that share information like that.
Brett Snyder: When Google looks at the Knowledge Graph, they want to have this disambiguation of the entities so that they can understand specifically what it is that somebody is looking for, and what is the relationship between that entity and the overall focus or context of what they’re searching for.
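The triples-and-disambiguation idea can be sketched in a few lines of code. This is a toy illustration only; the entity names and predicates are invented for the example and are not Google’s actual internal representation:

```python
# A toy sketch of semantic triples: (subject, predicate, object).
# Entity names and predicates are invented for illustration.
triples = [
    ("R.E.M. (band)", "plays", "music"),
    ("R.E.M. (band)", "is_from", "Athens, Georgia"),
    ("Widespread Panic", "plays", "music"),
    ("Widespread Panic", "is_from", "Athens, Georgia"),
    ("REM sleep", "is_a", "sleep stage"),
]

def subjects_with(predicate, obj):
    """Entities linked to `obj` by `predicate`."""
    return {s for s, p, o in triples if p == predicate and o == obj}

# A query like "Athens Georgia band" resolves to entities, not keywords:
# intersect the entities from Athens with the entities that play music.
athens_bands = subjects_with("is_from", "Athens, Georgia") & subjects_with("plays", "music")
print(sorted(athens_bands))  # ['R.E.M. (band)', 'Widespread Panic']
```

The intersection is what does the disambiguating: “REM sleep” drops out of a band query because it shares no triples with music or with Athens, Georgia.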
Nate Shivar: What’s really interesting is how Google is creating this. Think about how easily humans do this when they land on, say, a Wikipedia disambiguation page. If you search in Wikipedia for R.E.M. and you land on the disambiguation page, it asks, “Did you mean the band? Did you mean the sleep phenomenon? Did you mean something else?” That’s the kind of disambiguation Google is trying to replicate. What they’re doing is merging information about entities from all sorts of data sources to create, again, this literal graph of all these different types of information, and how they relate, and how they overlap. What’s interesting to us, as marketers, is to understand not only the data sources that they’re pulling from, but also the fact that Google has mentioned that, a lot of times, the best source of data is the actual entity itself.
Brett Snyder: Let’s talk a little bit specifically about some of the different variations of the Knowledge Graph, and then why it matters for us as marketers. When we talk about the Knowledge Graph, there are three primary ways that it will manifest in the search results. Again, it all pulls from this idea of being able to identify and disambiguate individual entities. The first one is what’s commonly referred to as the Local Carousel. If you type in, say, “pizza restaurant”, a lot of times it will show up as a horizontal ribbon at the top of the search results. It will actually start to pull in top photos, many of which come from Google+. You see photos, you see reviews, you see address and phone number. You have this information presented in a carousel at the top of the results.
We also have what I’ll call cards. They’ve also been referred to as graph panels. These are almost a note card, a single-takeaway abstract of information about, largely, proper nouns.
For example, if you type in “Barack Obama”, along the right side of the search results in Google you’re going to see a picture of Barack Obama. You’ll see that he is the President of the United States and married to Michelle Obama. You’ll see his kids, you’ll see his previous positions in the Illinois Senate. You’ll start to see who Barack Obama is in all of the different entities, all the different pieces of information, that tie into who this person is and why they are important.
The last one that we talk about is probably the most familiar, and it’s the one we’re going to focus on for the rest of this episode: the Knowledge Graph boxes, or abstracts. This is the one that, in my opinion, is most appropriately named the Knowledge Graph, because it’s presenting actual knowledge. It’s answering questions for you that you would previously have had to click through to one of the sources listed in the search results to answer. Now Google has this box at the top of their search results, right beneath the search bar, that actually includes a brief overview of that topic and, if Google did their job correctly, specifically answers or addresses the needs that you presented with your search query. The easier way to think about this is: when you search for something in Google, you’re asking a question. The Knowledge Graph is providing you an answer without you having to click through to the underlying sources.
Nate Shivar: When it comes down to why this matters to us, beyond the fact that Google says it’s the next generation of search, it’s the fact that the Knowledge Graph, in many ways, is taking over the search results. There are a lot of firms, including Pete Meyers at Moz, tracking how the Knowledge Graph is growing and expanding. The latest research shows that almost one in five search results now includes Knowledge Graph features. The potential impact on organic traffic is enormous. Wikipedia page views have actually declined over 21% since the Knowledge Graph came out.
If you are a company or a publisher that has previously relied on providing some of these basic answers, I think about weather or FAQ-type content, you can get into the question: is Google stealing your traffic? If people are coming to Google, they’re getting their answer right there in the search results from data that is actually scraped and pulled in from your site. Those people aren’t clicking through to your site. Even though, in a way, it’s providing a better user experience, there are going to be positives and huge negatives depending on whether or not you understand the Knowledge Graph.
Brett Snyder: Make no mistake, Google is stealing your traffic. One of the most compelling arguments that I’ve ever heard about this is that Google, who vilifies sites that scrape other sites’ content, is not only the world’s biggest scraper in terms of pulling information in to return in their search results, but is now actually taking that information and putting it on the front page of their results. They do include a link back to the citation source in the Knowledge Graph box, so you could argue that they are simply citing your content. Still, it’s important for us to understand that Google is actually taking page views, or traffic, that would have otherwise gone directly to your site.
This may impact your ability to build a retargeting audience. If you’re a publisher that relies on page views and generates revenue through AdSense, it may affect your bottom line.
It may also impact the likelihood that you naturally attract links, if people aren’t actually clicking through to the citation source and seeing the complementary or supporting information. There are a lot of implications here, especially for content-heavy sites. The fact of the matter is that the Knowledge Graph is here to stay. It benefits Google. Consumers, as we mentioned in the beginning, love the Knowledge Graph because it’s instant gratification. They don’t have to go searching for it. Google’s doing all the work for them. Now they don’t even have to search through Google.
Google is searching its own results to give you the answer to your question, and it’s able to keep you on its website longer. To circle back to something we’ve said before: the grim reality, in a lot of ways, is that Google doesn’t give a shit about you and your business. They care about themselves and their business. The Knowledge Graph allows them to keep you, as a consumer, on their website longer, on Google dot com longer. They’re not losing people who aren’t necessarily in what we’ll call a conversion mode, by Google’s definition, somebody who’s willing to click on an ad. Those people are just looking for quick FAQ content.
This allows them to keep you on the website, give you that initial information, and then potentially encourage you to refine your search or to send follow-up information that would put you in more of a conversion mindset, where you’re more likely to click through to say, “Now I know the answer to my question. This means that I want to buy x, y, and z product. Let me search for that and click on an ad,” or, “I want to look for this service,” or, “I need more information about this. Maybe now I’ll refine my search and be able to click on one of the ads for a white paper that will get somebody into the lead gen funnel for that particular client.” Make no mistake about the fact that Google does the Knowledge Graph because it benefits Google. Google will say that they’re there to be able to drive people to your website, but they are not. They are there to provide people with information, with answers to their questions, and the Knowledge Graph allows them to take advantage of your content but keep them on Google’s website.
Nate Shivar: Just like we talked about in the last episode about lessons from Black Hat, as frustrating as Google can be, the most sustainable approach is going to be working with them, and working with them to give Google what it wants for their interests, and trying to best align your interests with theirs. In that spirit, we did want to cover some of the key things that we need to know about the Knowledge Graph so that we can use it so that it does end up serving our interests for our websites, and not just giving information to Google for free.
Brett Snyder: The biggest thing, and this is really essential: the Knowledge Graph is a different algorithm. It’s not entirely independent. It’s still reliant on quality signals, and it’s designed to reward quality content. But it is a separate algorithm from the one that processes your traditional organic results. In the rankings that we’ve been tracking, we actually saw that the number one organic ranking has no real correlation with visibility in the Knowledge Graph. At least for the small subset of queries that I track regularly, there are only a couple of examples where the Knowledge Graph source is also the number one organic listing.
The Knowledge Graph result also almost always comes from page one. If your site is not good enough to be on page one, it’s highly unlikely that you’re going to be good enough to exist in a Knowledge Graph result. A lot of it comes down to the format of the content on your page, but it also has to do with what criteria Google looks at to be able to say, “This is a valuable Knowledge Graph result,” as opposed to, “This is a valuable comprehensive resource that people would link to, or click through to, from the results.”
Nate Shivar: I find this really interesting, especially when you start thinking about links, and how links may be less important for the Knowledge Graph. Since it is a different algorithm, you can think of it in a way where relationships are defined differently. In the traditional algorithm, links are cross-website relationships, one website saying, “I think that this content on this website is trustworthy, it’s relevant.” The Knowledge Graph is focusing more on the entities outside of the context of what website they appear on. Even though the content is obviously communicated on a website, it’s almost like the relationship is on a different level. I think that’s something really important to keep in mind when we start talking about optimizing for it.
The second thing to keep in mind about the Knowledge Graph is that Google’s mission is to organize the world’s information. That’s something that cannot be accomplished without organizing what we think of as offline information. For information that is in digital form, on a webpage, Google has a pretty good way of parsing that content, understanding it, and serving it up. That’s been their bread and butter for so long. Where they struggle, and we talked about this in our Local episode, is when you start going offline, with concepts that exist in the real world, that exist in conversation, things that humans understand intuitively but search engines don’t really understand. The Knowledge Graph is an important piece, not just for serving better search results, but also for helping Google achieve that mission of organizing the entire world’s information.
Brett Snyder: It’s also important for marketers to understand that, as they’re organizing that information, there is still a finite amount of real estate where they can present it. We won’t go so far as to say that it’s a zero-sum game in the search results, because the Knowledge Graph doesn’t replace your number one listing, it just pushes it down. But between the Knowledge Graph and the Google ads, in a lot of queries you’re now seeing one, two, or sometimes no organic listings above the fold on most browsers, meaning that in the primary real estate of the search results there is no content that isn’t in some way directly benefiting Google. Like we’ve said, the nice thing is it doesn’t appear to be taking away from your page one visibility; if I was number ten, I’m still number ten, and that’s still page one. But it devalues what that page one visibility means when somebody has no opportunity to see anything other than Google information above the fold.
Nate Shivar: When we talk about how the search results are changing, they’re also changing in a mysterious way. When Google released their PageRank patent, everyone had a good picture of how Google would produce the top results in the traditional algorithm that we’ve seen for years and years. The Knowledge Graph is much more mysterious. There are a lot of people doing interesting tests, and we’ll talk about some of the takeaways that we found. Google has shared some information, but, when it comes down to it, Google is drawing from what they call their Knowledge Vault, a bank of information that they have produced, and no one really knows for sure how they’re getting it, where they’re getting it, how they’re using it, how they’re combining it, and how they’re actually producing these results. It’s much less transparent than what Matt Cutts talked about for years and years, and now John Mueller, about SEO best practices.
Brett Snyder: There is a real danger in this lack of transparency. We’ll accept the fact that the current algorithm is not exactly transparent either, but between SEOs and Google patents, we can do some pretty solid, what we’ll call, ballpark reverse engineering. We’ve got a really good understanding of what types of criteria go into organic results so that we can align with them, and the guidelines are very clear. I’ll take that back. Maybe they’re not exceptionally clear, but at least there are concrete guidelines that we can follow in terms of presenting content that Google likes. That transparency can create abuses; if we say links will impact the ranking algorithm, that’s where these get-a-thousand-links-for-five-dollars sites pop up.
But it also solves problems. It allows Google to close loopholes; we talked about that in our Black Hat episode as well, where transparency identifies weaknesses and shortcomings. This locked-up Knowledge Vault has none of those checks and balances. There’s no quality assurance, no partial transparency that keeps everyone speaking the same language so that we know we’re aligning with what Google wants. Nate mentioned that earlier, where we talked about this idea that we have to play the game. We know that it’s here to stay, so there’s a lot of benefit to going along with them and finding a way to play within the rules. This lack of transparency means that there are no clearly-defined rules, and that limits our ability to construct or tweak our content to align with what Google is specifically looking for.
Nate Shivar: The other thing, when it comes to figuring out what Google’s looking for, is the commercial drive. Google’s mission is to organize the world’s information, but, when it comes down to it, they need to make money for their shareholders as well. I really think the commercial drive for the Knowledge Graph comes from Facebook. Facebook has what they call their Open Graph, which is absolutely incredible for marketers. Facebook has, what is it, 1.2 billion people voluntarily-
Brett Snyder: I think it’s close to 1.35, at this point.
Nate Shivar: Facebook has 1.35 billion people voluntarily structuring their data. Facebook has all these people saying what they like and who they’re associated with, and they have pages saying what they like. They have all this data that people are voluntarily throwing up so that Facebook can say to marketers, “Hey, if you want to buy an ad for pet owners who live in San Francisco and also love the San Francisco Giants, you can do that.” They can form those associations. With the traditional algorithm, where Google is depending on other websites, links, and signals, there are a lot of correlations Google would never otherwise find. That’s one reason that they’re building this Knowledge Vault and trying to form their own Knowledge Graph of the world: to be able to find these big data correlations and relationships between all these different entities.
Brett Snyder: That example right there, with dog owners in San Francisco that love the Giants, only begins to scratch the surface. You can look at unmarried female dog owners with no kids that live within a certain geographic segment that have a french bulldog and that have previously bought Purina Dog Chow. You can get so, so specific, and all this information has been volunteered through Facebook. That’s why, and I’m sure people expect whenever you talk about Facebook and what Google’s trying to do to be able to get that information, we always seem to circle back to this idea of what’s Google+ doing? Google+ was the start, there. Google+, again, I’ve said this since it came out, Google+ was not meant to compete with Facebook.
Google+ was meant to be able to curate the same type of information that Facebook has. They want to know who you are. They want to expand their Knowledge Graph about you and your interests so that they understand what is going to be most valuable to the community as a whole. Google+ was the first real way that they structured content about Google’s users. Google+ is them getting an understanding of who we are. It ties into authorship: who are you and what type of content do you write? It ties into this idea of Google+ for Local: who are these local businesses, and who do you review? They want to tie it back to an individual so that they can identify the wants and needs of an individual, and try to find correlations between those and other individuals that share criteria, like gender, where you live, kids or no kids, whether you have siblings, interests, likes. All of that kind of information is what Google+ is really looking to accumulate. It was really just the start of what now feeds into their ultimate goal of providing this information in the Knowledge Graph.
Nate Shivar: We have an idea of where Google is going with this. There’s a limited number of people writing about it. Brett, you’ve done a lot of tests and reverse engineering. Let’s talk about some of our more out-in-the-weeds findings, especially when we talk about what we call the Knowledge Graph abstract, the little snippets of information that appear just above the traditional search results, as opposed to the Knowledge Graph panels or the Local Carousels.
Brett Snyder: We’re going to go through these pretty quickly, but we can almost think of them as best practices for optimizing towards the Knowledge Graph. The first one: there’s a very strong association between Knowledge Graph content, and when we say “Knowledge Graph content” we mean the actual sentence or sentences that appear in that abstract box, and the header tags on the page. These header tags essentially identify the context of your query as it relates to the Knowledge Graph answer. In what I have been tracking, the Knowledge Graph information almost always immediately follows an H1, H2, or H3 that reinforces that context.
If we were going to look for something along the lines of “What are the symptoms of influenza?”, the header tag on that page will say “Symptoms of Influenza”. Then, underneath that, you are more likely to find the content that is actually presented in the Knowledge Graph. These very close and direct associations between header tags and the content beneath them are something we have been tracking for over a year and a half now, and they have remained consistent throughout everything that we’ve tracked for the couple of clients that we are actively optimizing for Knowledge Graph visibility.
Nate Shivar: What’s really interesting is that header tags … That’s not just older, traditional SEO best practice. That’s web development best practice. When you talk about structuring data on an HTML page, that’s … I think that circles back to the importance of really following best practices to a T. There’s a reason they’re there.
Brett Snyder: One of the other best practices that we’ve found has to do with what we’ll call a Q&A format. As we’ve alluded to throughout the whole episode, the Knowledge Graph comes down to providing answers to questions. The answer is the Knowledge Graph abstract. The question is the search query that’s typed in. We’ve found that content with an explicit question-and-answer format, meaning instead of that header tag being “Symptoms of Influenza”, that header tag is “What Are the Symptoms of Influenza?”, and the very first paragraph starts, “The symptoms of influenza include,” x, y, and z, has been very prominent in the Knowledge Graph listings that we’ve been tracking.
To me, this stands to reason. It stands to reason for the fact that they want to be able to say, “I want to know the associations among those entities. I want to be able to say, ‘Symptoms of influenza.
What are the symptoms of influenza?'” Now we have the entities of influenza. We have the entities of symptoms. We know that we’re looking for a specific type of influenza information. By having not only the header tag, but also explicitly calling out the introduction by means of a question, this is something that, to me, it makes sense why Google would look for this type of format and reward that, because they can say with an extremely high degree of confidence that this information is addressing the question of what are the symptoms of influenza, because it is explicitly stated right there on the page.
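As a sketch of the pattern being described here, the question lives in a header tag and the answer sits in the paragraph immediately after it. The sample page and the extraction logic below are our own toy illustration of why that format is easy to process, not Google’s actual parser:

```python
import re

# Toy HTML following the Q&A pattern: question in a header tag, answer in the
# paragraph that immediately follows. This parser is illustrative only.
page = """
<h2>What Are the Symptoms of Influenza?</h2>
<p>The symptoms of influenza include fever, cough, and fatigue.</p>
<h2>History of Influenza</h2>
<p>Influenza outbreaks have been recorded for centuries.</p>
"""

def answer_for(html, question):
    # Find a header containing the question, then grab the <p> right after it.
    pattern = re.compile(
        r"<h[1-3]>\s*" + re.escape(question) + r"\s*</h[1-3]>\s*<p>(.*?)</p>",
        re.IGNORECASE | re.DOTALL,
    )
    match = pattern.search(html)
    return match.group(1).strip() if match else None

print(answer_for(page, "What Are the Symptoms of Influenza?"))
# → The symptoms of influenza include fever, cough, and fatigue.
```

When the question and answer are adjacent and explicit like this, an extractor can lift the answer with very high confidence, which is exactly the property being described.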
Nate Shivar: I think this tactic would be really easy to peg as SEO content, or content written for search engines. But when we talk about FAQ structured data, I think about the old adage for journalists about not burying the lead, i.e., the most important bit of information; don’t bury that somewhere in the page. If the page is supposed to provide information for people who are looking for a simple answer to a simple question, provide it right there in a format that users like. It’s something that Google can easily process, and it will reward you for it.
Brett Snyder: Page placement is also something that we looked into: does the Knowledge Graph prefer content at the top of the page, the middle, or the bottom? We saw that it seems to prefer content towards the top of the page, but it’s not a hard and fast rule. We found that about two thirds of the Knowledge Graph content we tracked appeared in the top third of the page. This is what I think, honestly, is a prime example of recognizing the difference between correlation and causation. To Nate’s point, you don’t want to bury the lead. If it’s your most important information, you want it to be prominent on the page. You could make the argument that page placement is not actually something that Google explicitly includes as a ranking factor for the Knowledge Graph.
The factors that influence the Knowledge Graph, this trust, authority, relevance, are the same factors that would dictate that a piece of content or a paragraph belongs at the top of the page. It is the key premise of what your content is trying to communicate. For that reason, it belongs at the top of the page. I’ll admit that this could potentially be a situation of correlation more so than causation.
If we look at the underlying themes there and we recognize the fact that your most valuable content belongs in the most prominent position, it stands to reason that the Knowledge Graph would present content that exists at the top of the page more prevalently than it would for content elsewhere on the page.
Nate Shivar: Those are some really interesting on-page things that you can do. We talked about how Google’s pulling information from other sources. There’s also some more structural tips that I think we want to touch on.
Brett Snyder: The first one, and we don’t want to spend too much time on this, because we’ve already admitted that we’re going into the weeds, but we at least want to be able to see our way out of it: there was a platform, or a database, known as Freebase. This was essentially a structured data equivalent of Wikipedia. It was a user-generated, user-curated source where you could explicitly define the individual entities associated with your brand, meaning that you could go in there and say, “This is the name. This is the parent company. This is the type of content that my site involves.” Think about it almost as an external source: rather than adding structured markup to your data, it’s adding data within a structured markup framework. That’s what Freebase was. Fairly recently, about a month or two ago, they actually announced that Freebase is going to be deprecated and discontinued.
They suggested that people move towards a platform called Wikidata. It’s very similar: user-curated, letting you define entity information that is presented by the brands and the consumers themselves. Wikidata is not expected to be as influential as Freebase, though. Whereas Freebase was explicitly called out as an influencing factor for the Knowledge Graph, Wikidata explicitly says in its FAQs that adding information to Wikidata will not increase the likelihood that you will show in the Knowledge Graph. Wikidata, I think, does not want its results to be spammed or manipulated, so they want to make that clear. They actually drive you more towards the idea of using schema markup, rather than attempting to manipulate Wikidata, to provide the signals that Google looks for when deciding who to rank in the Knowledge Graph.
Nate Shivar: When you talk about using schema, this could be technically classified as on-page, but I think it’s more effective when it’s a structural part of your website. A little bit of background on schema: this is a shared standard for microdata and structured HTML markup that Google, Bing, Yandex, and some of the other large search engines all agreed upon. Basically, it’s a way to mark up the pages and elements on your website to highlight entity connections. Again, I think the importance of schema isn’t simply that it’s out there, but that it’s something you can control that’s structural, across your whole website.
You can not only display your logo in your header, you can add schema markup to that logo that tells Google explicitly, “I am this business, and this is my logo.” Elsewhere on your site you can tell Google, “This is my street address. This is my city,” explicitly, with that markup, and not rely on them trying to extrapolate it from page elements or from other data sources. It’s a recurring recommendation among SEOs, but I think it’s particularly important for the Knowledge Graph, because it’s basically a foundation of the Knowledge Graph, or at least according to Google.
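Brett Snyder: To make that concrete, here’s a minimal sketch of what that kind of Organization markup could look like, generated as JSON-LD. The business name, URL, logo path, and address here are all hypothetical placeholders, not a real site — but the schema.org property names are the real ones.

```python
import json

# Hypothetical example: JSON-LD structured data that tells Google
# explicitly "I am this business, this is my logo, this is my address."
# All values below are placeholders for illustration.
organization = {
    "@context": "https://schema.org",
    "@type": "Organization",
    "name": "Example Widgets Inc.",
    "url": "https://www.example.com",
    "logo": "https://www.example.com/images/logo.png",
    "address": {
        "@type": "PostalAddress",
        "streetAddress": "123 Main Street",
        "addressLocality": "Atlanta",
        "addressRegion": "GA",
        "postalCode": "30301",
    },
}

# This blob would be embedded in the page's <head> as:
#   <script type="application/ld+json"> ... </script>
print(json.dumps(organization, indent=2))
```

The point is that nothing is left for Google to infer: the logo, the address, and the entity type are all stated outright instead of extrapolated from page elements.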
Brett Snyder: The easiest way that I’ve found to remember this is: it’s explicit definitions. Schema is explicitly defining the context of your information. The Knowledge Graph is looking to do the same thing; it just takes it a step further, representing that information to a user and tying it to a search query. Schema allows you to explicitly define your entities, which makes it a lot easier for Google and other search engines. Obviously they each have different uses for schema markup, but that explicit definition makes it so much easier for them to understand how your content associates directly to a search query.
I also want to talk about … I know we said that we’re going to focus largely on the abstract content with regards to the Knowledge Graph, but we’d be remiss if we didn’t discuss what is unimaginatively being referred to as the medical Knowledge Graph. I’ll go so far as to call it the medical Knowledge Graph controversy, because there are very serious distinctions between what Google is doing with Knowledge Graph information for medical queries as opposed to regular queries. For a little bit of background: it was only six to eight weeks ago that Google rolled out Knowledge Graph results for medical queries. These results call out symptoms, treatment options, and prognosis or different diagnostic procedures, with that information right in the search results.
This is actually a bit controversial, because it’s not just an algorithm processing information and pulling the best result based on the criteria built into that algorithm. Instead, Google has partnered with Mayo Clinic: Google still uses an algorithm to crawl and generate the information, but then it’s essentially curated and polished by Mayo Clinic and what they claim to be up to twenty trusted sources. It comes down to Google collecting information, and then Mayo Clinic reviewing that information and saying, “All right. We’re going to QA it. We’re going to make sure that it’s accurate. We’re going to put our stamp on it.”
When this first came out, I was a little salty about it, to be perfectly honest, because this shows an overt preference for Mayo Clinic, where Google has traditionally at least tried to maintain some degree of objectivity, the idea that anyone has a chance to secure visibility. They don’t typically have partner brands that they show such heavy preference to. There have been antitrust lawsuits over the fact that they showed some preference for YouTube content versus sites like Vimeo or Wistia. I actually spoke to … spoke to, I shouldn’t say. In the SEO sense, I had a Twitter conversation with a woman named AK Anderson, who I believe works with WebMD. I asked, “Hey, was there any kind of discussion with some of these other prominent sources, in terms of how the information gets curated here?”
She made a great point: this health one box, as she referred to it, or the medical Knowledge Graph, whatever you want to call it, isn’t crowd-sourced authority. It’s a handpicked set of sources headed by Mayo Clinic. Google isn’t just scraping content anymore, which, as you can imagine, was problematic for medical queries because of the sensitivity there. I think the key distinction is that it’s no longer crowd-sourced information. It’s curated from a select group of sources, and that’s where the controversy comes into play, because who’s to say that Mayo Clinic’s information is better than some of these other sources? Mayo Clinic, I’m sure, can make a strong argument for that. Google can make a strong argument for that.
Let’s be perfectly honest, there is a strong argument for the fact that Mayo Clinic is an authoritative resource that deserves to be able to influence and ensure quality assurance of the information in that Knowledge Graph. In my opinion, it goes against what Google has always purported to do, and that is to provide these [inaudible 00:37:55] results and be able to curate sources from across the web to find the very best resource, regardless of any preexisting connotation associated with that brand.
Nate Shivar: The biggest issue, to me, in this situation is liability. In the past, when all of Google’s organic results were generated by an algorithm, in some ways, they could throw up their hands and say, “Hey, it’s a computer algorithm.” Once you bring in the human element and have actual humans QAing and crafting search results, you start talking about who’s liable? What’s the disclosure, here? Brett, I know you have clients that work in these spaces, where some of the information displayed in the Google search results in these health boxes, it’s flat out wrong. I think some of this goes back to the issues of transparency that we talked about earlier, and about how even though this is, in theory, locked up and away from abuses, that abuses and things can still happen.
Without the transparency, a lot of times, there’s fewer avenues to rectify it.
Brett Snyder: Those abuses may not necessarily even be conscious; they may be abuses of bias. What if Mayo’s wrong? What if there are differing opinions? Medicine is a field with tons of differing opinions on how to approach things. When you start to say this is the one result, and when you are in a position like Google is, where people now assume this to be veritable fact, and you present it as such … You call it the Knowledge Graph. You don’t call it the Opinion Graph. You’re presenting this information as objective fact, but it’s objective fact that is, at least in some sense, influenced by a bias. That, to me, is concerning, because it doesn’t allow for natural checks and balances against that information. It does remove the issue of, “Hey, this algorithm is going to pull information that’s wrong” — at least we know the information went through QA — but it went through QA from one particular source.
I think it’s controversial. I don’t have a solution for it, and I can’t say that they are inherently wrong to partner with Mayo. But I think it also warrants a discussion that, beyond Mayo, the other twenty trusted sources go unnamed. Are those twenty sources just whatever Google’s algorithm scraped and pulled and sent to Mayo? Because I can do that. I can go search for an hour and send a bunch of information to somebody to QA.
I think it goes back to that idea of transparency. The lack of transparency with regards to the Knowledge Graph is concerning, and the Knowledge Graph doesn’t appear to be going anywhere. It’s here to stay. That lack of transparency compromises my trust in the results. If you distrust the source, or you see that it’s a source that has provided incorrect information in the past, it compromises how you evaluate that source in the future.
Nate Shivar: When we talk about evaluating sources, I think one thing a lot of SEOs wish were part of the Knowledge Graph and Knowledge Vault is authorship. Google killed off Google+ authorship quite a while ago, quite a while in digital marketing terms, and we found that what SEOs like to call author rank doesn’t appear to be influencing the Knowledge Graph. Still, I think you could make a viable claim that authorship is a good signal, and that if you are providing those signals and building up your portfolio, it wouldn’t hurt, even though right now it doesn’t appear to influence the Knowledge Graph.
Brett Snyder: This idea of author rank: if a person is objectively more of an expert in a field, if they have published resources, if they have all of the things that we know go into regular Google results — if they’re linked to, cited as an expert, considered an authority — then that should count for something. This concept of author rank is heavily debated as to whether it even exists, but I want it to exist. I’ll admit that’s a slightly absurd way to phrase it.
I want author rank to exist because I think that being able to tie the influence of an individual to the influence of the content they produce very much aligns with Google’s goals, not only for their traditional results, but also for the Knowledge Graph results, as well, because you’re starting to be able to validate that what somebody says stands up to external QA, that there are endorsements, that there are ways that we can look at this person, and look at their information and their influence, and say, “Yes. I can say with a high degree of confidence that this person is an expert in the field, and what they say about this particular topic is valuable.”
Nate Shivar: The last thing that we’ll cover, I would say, is just this general principle of thinking about your semantics and how you’re using language. Use nouns in your writing. Don’t be vague. Remember that Google wants to return results based on what the user means, not necessarily what they say in their search. Making your content easy for users and search engines to know what you’re talking about, and what these entities are, and what this content is relevant for, makes it a lot easier for Google to understand and to serve up. It also makes it easier for users to recognize the information if they do click through to your page.
Brett Snyder: To round out this episode, we’re going to go back to our quick tips and rapid fire segment at the end, here. We’re going to talk about how to tackle Knowledge Graph optimization. A lot of these are pulled from AJ Kohn; any of our regular listeners know that both Nate and I are huge fans of AJ’s work. He wrote a post in March of last year, 2014, about Knowledge Graph optimization, and a lot of the principles in it I have validated through my own research. I know Nate has done the same. We’ll link to AJ’s post in the show notes, and we’ll use his framework to go through some rapid fire tips that can help optimize for Knowledge Graph visibility.
Nate Shivar: The first tip he gives is to get connected and link out to relevant sites. The whole idea of hoarding link juice, nofollowing, or refusing to link out to other relevant sites is very, very old school SEO, 2001, 2002. Now, with Hummingbird, and the Knowledge Graph, and creating relationships between entities, it’s important to make sure that entity information can flow between sites, and that Google can easily extract concepts and entities you mention on your site and see how they relate to concepts and entities on another site, so that the relationship isn’t just website-to-website, it’s entity-to-entity. If you can contribute to that, it will build your own authority and your own relevance, and make it more likely for you to do well.
Brett Snyder: It ties back to that whole concept of relationship modeling. By linking out to another site, you’re almost creating or manufacturing that context: a relationship between you and that site. If you’re linking out to highly-authoritative sites, and you’re getting links from highly-authoritative sites, you’re putting yourself in the conversation and in the neighborhood to be considered among those same types of resources. Linking is a natural part of the web. PageRank sculpting hasn’t worked for years. We’ve got to start really understanding the relationships between linking out and getting links back to your site, and how they impact the perceived authority of your domain.
Nate Shivar: Use structured data for entity detection — AKA use schema, and sitemaps, and whatever else you can do to make it easier for search engines to detect, extract, and connect entities to the Knowledge Graph. When we talk about search engine optimization, the key here is “optimization”. The idea is to make it as easy as possible for search engines to figure things out, so that you get rewarded in the process.
Brett Snyder: Let’s do another hat tip back to AJ here: you’ve got to treat search engines like a blind five-year-old. Put the information out there, explicitly communicate what your entities are, and don’t force Google to infer those relationships. You can even go a step further. We’ll give you one concrete example of a schema property that ties very specifically to the Knowledge Graph, and that’s the sameAs property. It can almost be referred to as the entity canonical. If you can say, “Hey, this is the same entity as you’ll find on Wikipedia,” now you’re able to say, “Okay, we’ve created an association between us and Wikipedia, which we already know heavily influences the Knowledge Graph, and we’ve done it through structured data. We can reference the exact Wikipedia entry for that particular entity.”
Now we’re helping Google to say, “Look, you’re seeing very similar information on both of these sites. You know that Wikipedia is a trusted resource. You can see from your crawling of both pages that there are a lot of associations.” That sameAs property, like we said, can be considered largely an entity canonical. It helps create those concrete associations in a way that is very easily implemented, in the sense that you don’t have to go digging for all this information, and it touches on some very explicit criteria associated with Knowledge Graph optimization.
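Nate Shivar: A quick sketch of what that looks like in practice: sameAs is a real schema.org property, and here we generate it as JSON-LD. We’ll use Mayo Clinic as the example entity, since it has an unambiguous Wikipedia entry; the exact URLs you’d list depend on your own brand’s profiles.

```python
import json

# Illustrative example: schema.org's sameAs property acting as an
# "entity canonical" -- pointing an entity at its Wikipedia entry
# (and any other authoritative profiles) so search engines can
# disambiguate exactly which real-world entity this page describes.
entity = {
    "@context": "https://schema.org",
    "@type": "Organization",
    "name": "Mayo Clinic",
    "url": "https://www.mayoclinic.org",
    "sameAs": [
        "https://en.wikipedia.org/wiki/Mayo_Clinic",
    ],
}

# Like the Organization example, this would ship inside a
# <script type="application/ld+json"> tag on the entity's page.
print(json.dumps(entity, indent=2))
```

The sameAs array can hold multiple URLs, so a brand could list its Wikipedia entry alongside its other official profiles, all pointing back at the same canonical entity.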
Nate Shivar: You can also claim and optimize your Google+ presence. Our next episode will be about Google+, and where it’s going, and where it’s headed. There is no doubt that Google+ still sits in the middle of a lot of the Knowledge Graph. That’s a place where businesses and individuals can go and basically submit their structured data. It’s something that you can go claim, optimize your presence, give Google all of the information they want. It also extends to getting reviews and getting other people to verify that all of the information that you’ve submitted is correct and it aligns with the information elsewhere on the web.
Brett Snyder: I’m sure everybody’s sitting there rolling their eyes: Google+ is stupid. There’s nobody on Google+. I don’t have the time to waste on Google+. But at the end of the day, say what you will about the involvement there, say what you will about it as a social network, Google+ is a Google-owned property where you can provide information about your product or service and specifically structure that information around the criteria that we know the Knowledge Graph looks for. It’s a no-brainer to at least make sure that your basic information and basic best practices are maintained on Google+, so that when Google — who doesn’t have to look very far to find that information — looks for more information about your entities through Google+, you’ve at least covered your bases there.
Nate Shivar: Lastly, I would say, get exposure on Wikipedia. I will say that with an asterisk. Especially with the death of Freebase, Wikipedia is even more important if you do it right.
Brett Snyder: Make sure you read those Wikipedia guidelines. Wikipedia does not want its results to be spammed. They don’t want people taking a marketing spin and compromising the objectivity there. Just make sure that you read the guidelines. Something I’ve suggested to people, as well, is don’t try to game the system. Create a Wikipedia profile. Add value to different posts, even ones outside your industry. Find ways to improve the quality of individual posts so that your Wikipedia profile is viewed as more authoritative. Then, if you do need to correct information about your particular brand, product, or service, you at least have some historical context around that profile, so that it doesn’t look like you’re just doing it for the marketing benefit.
Nate Shivar: I think, as far as a final takeaway, Nate Dame wrote a great piece on Search Engine Land that we’ll link to in the show notes, where he said that, “Knowledge Graph features can be treated similarly to Google algorithm updates: they are expressions of the search giant’s constant quest to provide a killer user-experience. As with the algorithm updates, the strategies might change, but Google’s end game never does: satisfied users.”
Brett Snyder: We want to remember these principles of semantic search. We want to optimize towards what a user means, not what they say, and to focus on entities, not just strings of keywords. If you focus on that, it aligns with the end goal that other Nate, as we’ll call him, talks about: Google’s end game of satisfying users, which is your end game as well. The Knowledge Graph is just another medium, another channel, through which your information can reach consumers. As with the rest of our SEO best practices, as long as you remember the general quality principles behind what Google is looking to accomplish, you put yourself in a very strong position to acquire visibility across this emerging channel.
Nate Shivar: You can find previous episodes, links to things that we mentioned, and our contact information at BambooChalupa dot com. If you don’t want to miss another episode, please go subscribe to the podcast in iTunes, or your favorite podcast app, and do leave a comment and rating while you’re there. We learn a lot from your feedback, and your ratings help others discover the show. For Brett Snyder, I am Nate Shivar. Thank you for listening.