
Real-time Web traffic estimator tools


This week I shared some Webalizer configuration advice with you, although there were things I left out (for example, I did not go into how to filter robots out of your visitor and page counts).

If you’re using an analytics package that processes your server logs, running that process on a daily basis can help you identify trends and analyze visitor traffic patterns much better than waiting a week or a month for the analytics reports. When you make big changes on a Web site, particularly a large Web site, waiting a week, a month, or even longer could be the kiss of death.

Your own server logs are good, but analyzing that data is time-consuming. If you’re using a third-party package (they usually rely on JavaScript-based tracking), you’ll see about a third to a half less data than your server shows you, but you can still identify trends. Short-term trends help you see whether you improved conversions for obscure traffic or just slaughtered your sales by breaking something.
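
To make that concrete, here is a minimal sketch of the kind of daily trend report I mean. It assumes an Apache combined-format access log; the log path, the robot filter, and the “pages only” test are placeholders you would tune to your own site, and it is nowhere near as thorough as Webalizer or a real analytics package.

```python
#!/usr/bin/env python
"""Sketch: pull daily page-view counts out of an Apache combined-format
access log so short-term trends show up the day after you make changes.
The log path, robot filter, and "pages only" test are assumptions."""
import re
from collections import Counter

LOG_FILE = "/var/log/apache2/access.log"            # hypothetical path
BOT_HINTS = ("bot", "crawler", "spider", "slurp")   # crude robot filter

# combined format: host ident user [date] "request" status bytes "referer" "agent"
LINE_RE = re.compile(
    r'\S+ \S+ \S+ \[(?P<day>[^:]+):[^\]]+\] "(?P<request>[^"]*)" '
    r'\d{3} \S+ "(?P<referer>[^"]*)" "(?P<agent>[^"]*)"')

def daily_pageviews(path):
    counts = Counter()
    with open(path, errors="replace") as handle:
        for line in handle:
            match = LINE_RE.match(line)
            if not match:
                continue
            if any(hint in match.group("agent").lower() for hint in BOT_HINTS):
                continue                              # skip obvious robots
            request = match.group("request")
            if any(ext in request for ext in (".css", ".js", ".gif", ".jpg", ".png")):
                continue                              # very rough "pages only" test
            counts[match.group("day")] += 1           # day looks like 01/Sep/2008
    return counts

if __name__ == "__main__":
    # lexical sort is fine as long as the log covers a single month
    for day, views in sorted(daily_pageviews(LOG_FILE).items()):
        print(day, views)
```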

You can use third-party traffic estimators to compare your site’s performance with your competitors’ performance. Of course, those tools’ data isn’t always good. In fact, it’s rare to find a tool that a lot of people like or trust, and even rarer to find one of those tools that I don’t trash.

The current SEO metrics darlings of the industry are Alexa, Compete, and Quantcast. There are, of course, other tools that SEOs occasionally discuss. Traffic Estimator and StatBrain are two such tools, for example.

Among these five tools, you can pretty much size up yourself as other people see you, and you can equally evaluate your competitors. You’ll get better data from these tools than you will from Google Trends (which received a lot of buzz recently).

Google Trends gets its data from one place: Google. The Google Trends reports show you how well specific domains perform in Google’s queries, and Google records a lot of query data. If you run a report for, say, Xenite.Org, you’ll get no data even though Xenite receives thousands of referrals from Google every month (in some months Xenite has received more than 20,000 referrals from the Google network).

I can assure you that Google Analytics doesn’t report anything like 20,000 Google referrals for Xenite.Org, but my server logs don’t lie. There is no random Apache process that just makes up referrer strings saying someone found my domain through a Google search. For the month of July 2008, Google Analytics reports only about 11,000 referrals to Xenite from Google (I still don’t have the Analytics code on every Xenite page, but even allowing for that, the number is just way too low to be reliable).
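
If you want to run the same sanity check against your own logs, here is a rough sketch. The log path and month string are placeholders, and the referrer test is deliberately crude; it is not how Google Analytics (or anything else) actually counts referrals.

```python
# Sketch: count the hits in an Apache combined-format log that carried a
# Google search referrer during one month. Path and month are placeholders.
LOG_FILE = "/var/log/apache2/access.log"   # hypothetical path
MONTH = "Jul/2008"                          # appears inside the [date] field

google_referrals = 0
with open(LOG_FILE, errors="replace") as handle:
    for line in handle:
        if MONTH not in line:
            continue
        fields = line.split('"')
        if len(fields) < 6:
            continue                        # not a combined-format line
        referer = fields[3].lower()         # second quoted field is the referrer
        if "google." in referer and "/search" in referer:
            google_referrals += 1

print("Google search referrals in %s: %d" % (MONTH, google_referrals))
```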

If I cannot trust Google’s reports for my own sites, where I’ve installed the Analytics code and shared it with other Google services (and SEO Theory has a similar data quality issue), then why should I trust Google’s reports for other sites? Google Trends for Websites is a great idea but there is a lot of room for improvement.

Third-party services can obtain their data from the following sources:

  1. Embedded tracking code
  2. Internet Service Provider click data (which some ISPs sell to metrics companies)
  3. Custom toolbar data
  4. Other third-party services

Google Analytics provides us with embedded tracking code. That code is written in JavaScript and therefore does not capture all the traffic data that a raw server log will capture. Still, Google Analytics allows you to share your data (for example, with the Benchmarking service). You can trust that if Google isn’t getting more than 60-70% of your data, it’s probably not getting more than 60-70% of anyone else’s data. Hence, the benchmarking reports should be statistically acceptable for most sites.
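
For anyone who has never looked under the hood, here is a toy sketch of the beacon pattern that embedded tracking code relies on: the page-side JavaScript requests a tiny resource from a collection server, and the server records the hit. Every name in it (the port, the query parameter, the log file) is invented for illustration and has nothing to do with Google Analytics’ actual implementation; it only shows why visitors who never execute the script, robots above all, never make it into the numbers.

```python
"""Toy sketch of a tracking "beacon" endpoint. A page-side script would
request /collect?page=...; anyone who never runs that script never lands
in this log, which is why script-based analytics undercounts compared to
raw server logs. Port, parameter names, and log file are all invented."""
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import urlparse, parse_qs
from datetime import datetime

class BeaconHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        query = parse_qs(urlparse(self.path).query)
        with open("beacon.log", "a") as log:
            log.write("%s\t%s\t%s\n" % (
                datetime.utcnow().isoformat(),
                query.get("page", ["?"])[0],
                self.headers.get("Referer", "-")))
        # real trackers usually return a 1x1 GIF; an empty 204 will do here
        self.send_response(204)
        self.end_headers()

if __name__ == "__main__":
    HTTPServer(("", 8008), BeaconHandler).serve_forever()
```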

If you’ve opened up your AdSense data, then it makes sense that Google should be able to take its query data, your Analytics data, and your AdSense data to produce some sort of comprehensive multi-angle analysis of your site’s performance. I have complained about the Webmaster Central functions but I still look at the ranking and keyword performance reports they share. Some information is better than no information.

Still, Google just doesn’t provide enough data for reliable competitive analysis.

Services like Compete buy ISP click data every month (NOTE: Compete describes its data sources for us).

Since Google’s real search market share is less than 40% of search destination visitors, you have to ask if search engine audiences are searching for the same thing. Your server logs can tell you the whole story about why you get more traffic from Google than Yahoo! (or vice versa). In fact, I have some sites that get more traffic from Yahoo! than from Google. Why? In part because they rank better on Yahoo! for relevant keywords, and in part because people search Yahoo! for different things than people search for on Google.

Google’s query data thus doesn’t show you everything that is happening in search. That is, Google’s query data is no more a statistically valid sampling of search than Yahoo!’s or Microsoft’s. In other words, the different search behaviors of each search engine’s audience introduce a bias into that engine’s reporting.

But Internet Service Providers’ users also exhibit different search behaviors. At the very least, there are differences between people on dialup and people on broadband connections. These inconsistencies in ISP user surfing patterns render any analysis based solely on ISP click data suspect, especially since not all ISPs sell their click data, so the sample only reflects some providers’ users.

Compete (and other companies) seeks to balance out the data by mixing source information as much as possible. How well they succeed at achieving a scientifically valid estimate is impossible for anyone to really determine. Web surfing and searching behavior are very quantumacious — immeasurable by any resources we can use today.

Now, like Compete, Alexa incorporates toolbar data into its calculations. Alexa has long been badmouthed by the SEO industry for being easily manipulated, but they reengineered their process earlier this year. In fact, they now have a completely new database (and I hope they leave the old data in whatever dusty alcove they placed it in). I don’t know how much the Alexa database can be manipulated. If the old “get your friends to install the toolbar” trick still works, then I suppose nothing has changed. They do claim, however, that Alexa obtains data from multiple sources, and other companies have alleged that Alexa buys click data.

So while none of these options really offers us great quality, they at least offer us diversity in data sources and we can sort of play them off against each other. But we can do more than that. Whenever we’re evaluating trends we can compare the trends to our server logs AND to analytics reports (regardless of whose analytics software is installed). Trends don’t require complete data sets, as long as the data that is captured is accurate and statistically acceptable.
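
Here is a tiny sketch of what I mean by comparing trends across incomplete data sets: normalize each daily series against its own average and see whether the movements agree. The numbers are invented purely for illustration.

```python
# Sketch: two incomplete daily counts (say, raw server-log visits and
# JavaScript-analytics visits) can still tell the same trend story.
# The numbers below are invented purely for illustration.
server_log = [4200, 4350, 4100, 3900, 5200, 5400, 5100]
analytics  = [2700, 2800, 2650, 2500, 3400, 3500, 3300]

def normalize(series):
    """Express each day as a fraction of the series' own average so two
    sources with very different coverage become comparable."""
    mean = sum(series) / float(len(series))
    return [value / mean for value in series]

for day, (a, b) in enumerate(zip(normalize(server_log), normalize(analytics)), 1):
    print("day %d  server %.2f  analytics %.2f" % (day, a, b))
```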

Still, many people in the industry long for a secret window into their competitors’ traffic reports. We all suffer from traffic envy, either mildly or severely, and we just want to know who else is getting traffic and how much. In some industries if you know who all the major players are you can develop a substantial projection for when the market may peak (or die out completely). You cannot make such reliable projections solely on the basis of your own traffic.

But if you’re making changes to your site’s structure and content, being able to peek at your competitors’ logs in real-time would help settle some of those queasy feelings you experience. What should you do if you make a sweeping change and suddenly traffic drops off? Knowing that it may have dropped off for your competitors will help you make a better decision about whether to undo your changes or wait out the storm.

For the most part, people have to watch the search results and guess whether their competitors are suffering. For a large site that is not so easy to do. I can lose key rankings with my sites for a few days, even a few weeks, and hardly notice any drop in search referral traffic. Or I may notice substantial losses in traffic without having done anything at all. So being able to track a competitor’s traffic and watch his visits bob up and down along with yours is indeed both comforting and strategically advantageous.

So, by now you’re probably ready to shoot me for that lengthy preamble. Do I have some resources to share? Sure. Have you seen them before? Maybe. Most likely for many of you these resources have proven to be problematic. I can only offer you my opinion on their usefulness and a brief explanation of why I think they may be useful.

Alexa - There are actually quite a few sites out there (most of which seem to be made-for-ads sites) that estimate your traffic on the basis of Alexa data. These sites underwhelm me, but I’ll mention one: WebTrafficForecast.com. The estimated monthly traffic is on the low side but it looks better than many other estimators’ results.

Nonetheless, the Alexa-based traffic estimator tools mostly focus on monthly projections (like WebTrafficForecast does). But Alexa itself will give you a daily heads-up, provided your site makes it into the top 100,000. Still, you can grab the one-week averages and play with them. On the other hand, if your site is outside the top 100,000 and a competitor’s site is in the top 100,000, you can at least get Alexa’s idea of how much daily traffic that competitor receives.

Site: Alexa’s Real-time Web traffic estimator

Compete - Many people have complained about Compete’s awfully low traffic estimates. If my sites truly received as few visitors as Compete suggests, I’d probably shut them down. Nonetheless, the trend patterns look pretty good. When I compare them with trends based on my server logs (AND Google Analytics), I feel like they are capturing a reasonable sample.

Of course, Compete only updates its data once a month. That’s a horribly long time to wait if you want to see what is happening with your competitors. Well, here’s hoping they don’t make any changes on the basis of my revelation, but you can look at real-time data for free on Compete.

Just look at the Daily Attention report under their ENGAGEMENT button. You’ll get a rolling snapshot of the past 30 days’ worth of estimated activity.

You can also look at the Velocity report under their GROWTH button but I don’t feel that data is as useful in most cases as the Daily Attention report. Velocity shows you the change in Daily Attention, so it’s a derivative value and is not as useful for competitive analysis (the timeframe is too short).
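
If the “derivative value” remark isn’t clear, here is a toy illustration (invented figures, and not necessarily Compete’s actual formula): velocity is just the day-to-day change in the attention series.

```python
# Toy illustration: "velocity" as the day-to-day change in a daily
# attention series. The figures are invented; Compete's formula may differ.
attention = [0.0021, 0.0023, 0.0022, 0.0027, 0.0026]

velocity = [today - yesterday
            for yesterday, today in zip(attention, attention[1:])]
print([round(v, 4) for v in velocity])   # [0.0002, -0.0001, 0.0005, -0.0001]
```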

Site: Compete’s Real-time Web traffic estimator

Me.dium.com - So, Me.dium.com believes that the sites people are currently viewing indicate BOTH what is “hot” AND what is relevant. Frankly, I think it’s only a matter of time before they are spammed into oblivion. But I could be wrong, and you should take a serious look at what Me.dium.com can tell you about a Web site’s traffic.

This is an intuitive measurement, so don’t expect any numbers. Rather, if you find a site ranked first on Me.dium.com for a specific keyword, that’s an indication that people are somehow looking at that site AND associating it with the keyword. This is not based on click-through data like DirectHit’s old spamwagon was (and that technology, you may recall, has been incorporated into Ask’s Edison algorithm).

Me.dium.com explains its technology in somewhat broad and vague terms. They imply strongly that their search results are as close to real-time as possible.

If Me.dium.com survives to become a viable search engine, I think you should look for a whole new set of metrics operating on their principles. Real-time metrics may one day make the difference between failure and success in some search optimization strategies. That day could still be far off, but my heart forebodes that it may be closer than we are all prepared for.

A lot of SEOs are talking about social media, social search, and social networking. However, the social Web technologies are still in their infancy. Of course, I say this as someone who works for a social media technology company, so understand there is some bias in my point of view. One of the very real unmeasured factors of social media technology is that no one has yet found a limit to these technologies’ vulnerability to manipulation.

That’s a complex statement. What I mean is that, so far we have only seen limited manipulation of social media technologies: toolbars have been spoofed, sock puppets have been used to inflate voting results, and voting gangs have all but seized control of social media resources. These are only the first sorties in what promises to be a long, drawn-out conflict between the social media enablers and the social media manipulators.

All that said, if you see your site bobbing up and down in the search results at Me.Dium, take note. Understand that you may be passively participating in a hot topic area. That means you have to think in terms of event-driven SEO, and you have to figure out ways to ride the trend waves and anticipate the next trend wave that will come your way. If you have enough resources at your disposal, you’ll even explore options for creating trend waves (some people call that “marketing”).

Site: Me.Dium’s Real-time Web traffic estimator

Traffic Estimate - No one seems to know how this site comes up with its data. A lot of people have speculated that it uses Alexa data as a basis, but I’ve never found any admission from the site founder or any other verifiable proof. NOTE: The charts may be the best clue it offers to its data source. And the unavailability of data for some sites also implies it’s using Alexa data.

So even though the estimates tell you they are for the past 30 days, they could be based on the previous month’s estimate from some other service.

A lot of people have claimed (since the site came online in 2004) that it delivered pretty accurate estimates for their own sites. Then many of them followed up by saying it didn’t do so well with other sites. I could not help but wonder how they knew it wasn’t accurate for sites they did not control.

People in the SEO community say some really bizarre things.

Anyway, this site comes the closest of any traffic estimator I have used to matching my server log data. I have no idea how well it does for sites that I don’t control, or for sites that the people who claimed it did well for their own sites don’t control. Based on that limited data, you’re probably going to get the most bang for your buck from this site, but the estimates appear to be monthly rather than daily.

I wish it could do real-time estimates, but it was worth mentioning anyway.

Site: Traffic Estimate’s Web traffic estimator


