Moz continues to provide interesting tools and site measures. I only follow these things because I find them interesting (not as a profession). I am not an SEO person, and the $100 a month (or much more) they charge for their tools isn’t worth it for my curiosity. But they make some things available for free and provide some interesting blog posts on what they find and about their tools.
This new Spam Score analysis by Moz seems very interesting: Spam Score: Moz’s New Metric to Measure Penalization Risk. The idea is sensible: they are trying to determine the spam riskiness of a site based on the correlations they can draw from their web crawl data and Google search results. Moz can see where sites are not ranking well when many factors indicate they should, and then conclude that Google has penalized those sites (and has not given credit to the sites they link to or, worse, has penalized the sites they link to).
This seems like a really good idea. They found 17 flags that are correlated with sites taking a spam hit. And as sites trip more and more of those flags, the likelihood of Google classifying them as spam rises. When a site has 0 spam flags Moz calculates a 0.5% chance of the site showing up in Google search results (or, more likely, not showing up) in a way that indicates Google sees the site as spam. 4 spam flags equals a 7.5% chance of being a “spam site.” A site with 6 spam flags has a 16% chance of being spam, 7 flags means a 31% chance, 8 is a 57% chance, 9 a 72% chance and 14 a 100% chance.
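To see how unevenly the risk grows, here is a small sketch that just stores the figures quoted above and compares them. The names (SPAM_CHANCE_BY_FLAGS, quoted_chance, jump_between) are my own, not anything Moz provides, and I have only included the flag counts I actually have numbers for.

```python
# Flag count -> quoted chance (in percent) that Google treats the site as spam.
# Only the counts quoted in the text are listed; nothing is interpolated.
SPAM_CHANCE_BY_FLAGS = {0: 0.5, 4: 7.5, 6: 16.0, 7: 31.0, 8: 57.0, 9: 72.0, 14: 100.0}


def quoted_chance(flag_count):
    """Return the quoted percentage, or None when no figure was quoted for that count."""
    return SPAM_CHANCE_BY_FLAGS.get(flag_count)


def jump_between(low, high):
    """How many times larger the quoted chance is at `high` flags than at `low` flags."""
    return SPAM_CHANCE_BY_FLAGS[high] / SPAM_CHANCE_BY_FLAGS[low]


if __name__ == "__main__":
    print(quoted_chance(5))    # None: no figure quoted for 5 flags
    print(jump_between(4, 8))  # doubling the flag count multiplies the chance far more than 2x
```

Going from 4 flags to 8 doubles the count, but the quoted chance goes from 7.5% to 57%, so each additional flag matters more than the previous one did.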
In their post Moz says that tripped spam flags are not meant to be an indication of something that needs to be fixed (after all the flags are just correlation, not causation – “fixing them” may do nothing for search results). That may be true but if sites are showing a 5-yellow for spaminess it is highly likely lots of people are going to want to reduce this scary looking feedback about their site.
It may well be that adding Twitter buttons to avoid that flag, and making whatever tweaks get rid of several more flags, is what is likely to happen.
My guess is that a spaminess rating that wasn’t just x/17, but a function of how many of the 17 flags tripped plus an understanding of how important each one was, would be more useful (I would imagine including which interactions between spam flags were more critical…).
I would be surprised if there isn’t a big difference between a certain 3 flags being tripped and 3 other flags being tripped (plus, say, 4 other random flags). That is true even with Moz’s limited ability to know what Google is directly reacting to, as opposed to correlations Moz can observe. I would imagine this could be improved into a 100 point (or whatever) system that gave much more valuable insight into spam sites than just treating each flag as equally important (and ignoring especially deadly interactions between flags: which flags, when tripped together, cause the likely spam hit to be seen in Google results).
Moz actually provides details on the power of the various flags. As I read it now, it seems a bit confusing how they get this measure. But the largest score is for the “no contact info or social links” flag. That has an 11.8 “odds ratio,” calculated by dividing the percentage of sites with that spam flag that are penalized (17%) by the percentage of sites without that flag that are penalized (about 1%): 1.x * 11.8 = 17.x. The least powerful flag is 1.3: 13% of sites with this flag are determined to be spam, while 10% of sites without it are determined to be spam.
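As a rough check of that arithmetic, here is a minimal sketch. The function names are mine, and it assumes the ratio really is just one percentage divided by the other, which is how I read Moz’s description. Strictly speaking that is a ratio of rates; a textbook odds ratio divides odds rather than percentages, so I have included that as well for comparison.

```python
def rate_ratio(pct_penalized_with_flag, pct_penalized_without_flag):
    """Ratio of the two penalized percentages (my reading of Moz's "odds ratio")."""
    return pct_penalized_with_flag / pct_penalized_without_flag


def textbook_odds_ratio(pct_with_flag, pct_without_flag):
    """Odds ratio in the strict statistical sense: odds = p / (1 - p)."""
    p_with = pct_with_flag / 100.0
    p_without = pct_without_flag / 100.0
    return (p_with / (1.0 - p_with)) / (p_without / (1.0 - p_without))


# Least powerful flag: 13% penalized with the flag, 10% without it.
weakest = rate_ratio(13.0, 10.0)              # 1.3, matching the quoted figure
weakest_odds = textbook_odds_ratio(13.0, 10.0)  # a little higher than 1.3

# Contact info flag: 17% with the flag and a ratio of 11.8 imply the "1.x" baseline.
implied_baseline_pct = 17.0 / 11.8            # roughly 1.44%

if __name__ == "__main__":
    print(weakest, weakest_odds, implied_baseline_pct)
```

The 1.3 figure falls straight out of 13 / 10, which is why I think the simple ratio of percentages is what is being reported.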
So sites that don’t get the contact info flag have only a 1% chance of being spam. It seems likely then that a site with 5 flags (but none of them being this flag) has a lower chance of being seen as spam than a site with just 2 flags that include it. That is only my guess, and without the raw data it may be wrong. So ignoring the power of the individual flags and relying on just a count of flags seems like a poor way of using the data Moz has gathered.
It would be better to have a spam score that weighted each flag by its power, rather than simply counting how many flags are tripped.
Even better would be one that did that and also paid attention to interactions between flags. It could be that, for example, having flags 3, 4 and 6 together drastically increases the odds (the proper risk is not just the sum of the power of each of these measures; the interaction when all 3 are flagged may be extremely powerful). I happen to very much like the power of interactions to influence results, based on my father’s work in statistics and design of experiments (which is very focused on the power of interactions).
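Here is a rough sketch of what I have in mind. It is not how Moz computes anything. The only real numbers in it are the two ratios quoted above (11.8 and 1.3); the flag names, the interaction multiplier, and the choice to add up logarithms (which assumes the ratios simply multiply when flags combine) are all my own assumptions for illustration.

```python
import math

# The two ratios quoted in the text. The flag names are made up for this sketch.
FLAG_RATIO = {"no_contact_info": 11.8, "weakest_flag": 1.3}

# Hypothetical interaction: an extra multiplier applied only when every flag
# in the set is tripped together. The 2.0 is invented, not a measured value.
INTERACTION_MULTIPLIER = {frozenset({"no_contact_info", "weakest_flag"}): 2.0}


def weighted_score(tripped_flags):
    """Sum of log-ratios for the tripped flags, plus any interaction bonus."""
    tripped = set(tripped_flags)
    score = sum(math.log(FLAG_RATIO[flag]) for flag in tripped)
    for combo, multiplier in INTERACTION_MULTIPLIER.items():
        if combo <= tripped:  # every flag in the combination is tripped
            score += math.log(multiplier)
    return score


# Hypothetical comparison: five flags, each as weak as the weakest quoted one,
# versus the single contact info flag.
five_weak_flags = 5 * math.log(1.3)
one_strong_flag = weighted_score(["no_contact_info"])

if __name__ == "__main__":
    print(five_weak_flags < one_strong_flag)  # True, since 1.3 ** 5 is well under 11.8
    print(weighted_score(["no_contact_info", "weakest_flag"]))
```

Under these assumptions, five weak flags score lower than the one strong flag, which a plain count of flags (5 versus 1) would get exactly backwards. Real weights and interactions would have to come from Moz’s raw data.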
Correlation, of course, is not causation. Moz developed these flags by analyzing correlations. So, for example, a site raises a spam flag if there is no contact info or link to social accounts on the site. It could well be that Google pays no attention to this, so adding the contact information or social links doesn’t reduce the true chance of a site being penalized by Google. But there is a correlation, in that sites penalized by Google are more likely to lack those details.
So Moz will report that a site raises a spam flag. Then the owners of the site may “fix” that so that it no longer raises the flag. And to Google this may have no effect. But their Moz Spam Score will decrease by one flag. So in effect the owners would be making an “SEO change” that really didn’t impact SEO but does impact how others may view the site (those looking at the Moz Spam Score may well be more comfortable if the flags are reduced from 6 to 4, even if Google doesn’t care). The Moz Spam Score itself becomes a metric that site owners pay attention to, outside of what Google cares about.
And really it may well matter. I can imagine sites getting scared about linking to high Moz Spam Score sites because they think Google may penalize them so then they don’t link to your site if you don’t address your Moz Spam Score. And that reduction in incoming links would have an impact on your ranking in search results.
Moz says they will use these flags to help refine their MozRank and Page Authority (and such like) measures. Which also makes a great deal of sense. So, for example, the page authority of sites with bad spam scores will be lowered (and those with better spam scores will rise). Hopefully when they do that they won’t use a simplistic 2/17, 7/17 etc. score but one that more fully captures the importance of the flags and their interactions.
I also have a bit of an issue with how they determine spam sites. They explain it in their blog post. It seems a bit too simplistic to me, but they probably know better than me on this score. If their ability to determine which sites Google is penalizing is off, then their analysis of the importance of various flags could be off. Their process seems sensible and would provide value; I just wonder if it is too simple, and therefore a bit less accurate than another option would be.
Another data question that I think could easily have an even bigger impact is how they actually determine something like the contact info or social buttons flag. Automated tools are wonderful and can be very accurate about some things. But they can also be somewhat questionable; I wonder what an audit of that measure by a smart human being would find. It could be that Moz’s process has a tendency to flag sites for this that a human actually would not (but in a way that is correlated with non-spam sites). So Moz’s data gathering could actually make this flag seem more powerful than it is. That is just an example; they could have correlated errors in setting flags that make the flags seem more or less powerful than they really are (the error being created solely by the code used to determine if the flag should be set or not).
This issue is likely to shrink over time, as sites complain that, for example, “we do have contact info listed (see, right here…) but you set that flag for us.” Then Moz keeps improving the code to better assess when flags should be set (separate from any analysis improvements, just the improvement of getting the flags correctly determined for every site).
Moz does let you see site reports for web sites with a free registration, and those will show you the spam score for the site (2/17 or 6/17 or whatever, but not which flags are tripped). You can see reports for maybe 3 or 4 sites a day? I am not really sure; I just look at those reports occasionally.