Always be doubtful of academic papers with names like: “Knowledge Vault: A Web-Scale Approach to Probabilistic Knowledge Fusion.”   Be even more suspicious if the paper lists 9 – yes 9 – authors.  That paper had to do with Google’s creation of their “Knowledge Vault,” a repository of facts collected from the Internet.  This fabled “vault” contains, according to the last reference I saw, 2.8 billion facts.  What are these “facts”?  Heck if I know.

I do know that the paper has subtitles like “Neural network model,” “Path Ranking Algorithm,” and “Fusing Priors.”  The mind boggles.  The eyes glazeth over.

So is it any wonder that, not long after, we have another paper with a less weighty title from some of the same authors: “Knowledge-Based Trust: Estimating the Trustworthiness of Web Sources”?  The upshot is this:  Google has decided to collect facts.  Billions of facts.  Google would now like to rank search results based on the accuracy of the facts.

So, hang with me now: it means that you will rank higher if your website is more “factual.”  But how do we determine that?  By Google’s Knowledge Vault and their algorithms.  No need to worry about that, right?

Facts and Fiction on the Street Corner

Fast forward a year and another article comes to my attention.  This one is slightly less academic, from the NY Times, no less, and has the rather unexpected title “Fake Online Locksmiths May Be Out to Pick Your Pocket, Too.”

So what in the world does one have to do with the other?  Maybe the hour is too late and I’ve had too much pizza.

Photo of fake locksmith shop

Ok Google, which one is the real image? (The bottom one, as it turns out)

Well, here’s the crazy connection I draw: the NY Times article describes how lead-generation mills are able to game Google.  It cites the example of one locksmith featured in local search results whose address turns out to be a vacant lot with a building Photoshopped into it.  Note the photo (used without permission from the NY Times, but I hope they don’t sue me).  The image on the bottom is what exists at the address; the image on the top is what Google thinks exists at that address.

This is where the disconnect comes in for me.  Google wants to be the arbiter of the world’s facts?  And even with the resources of Google Maps and satellite technology that can see a dog chasing a cat, they still can’t tell whether an address holds a building or a vacant lot?  And we’re supposed to trust they can rank sites based on factual accuracy?

Obviously Google needs to work a bit more on the whole omniscient thing.

I am left wondering about the implications of a world where Google will determine what I see by how “factual” it is.  I hope they can tell a building from a vacant lot before that day comes.