November 23, 2009
Google Indexes Its Own Toolbar Content(?) 
posted by Erik Dafforn in category: Crawling and Indexing
I don't think this is a particularly big deal, but I am fascinated by crawler behavior and the wheres and whys of crawlers not honoring sites' specific robots directives.
And it makes it even more interesting when the robot and the site belong to the same company.
A few weeks ago, I was trying to find out exactly when Google overtook Yahoo in the race for search engine market share. (It's not important why, but it will help you understand why I was searching for such an odd phrase.)
I ended up searching for this query:
["google passes yahoo" "search market share" 2004]
And the results page looked like this:

If you click over, you can clearly see that we're in the /archivesearch portion of the toolbar.google.com site:
If you go to the Google Toolbar site's robots.txt file, however, you'll see that this portion is supposed to be off-limits to Googlebot:

(Note: This robots.txt file also has certain "allow" commands, but none that should pertain to this particular page.)
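For anyone who wants to reproduce this kind of check, here is a minimal sketch using Python's standard `urllib.robotparser`. The rules below are a hypothetical stand-in, not the actual contents of the Toolbar site's robots.txt: the only assumption is a blanket `Disallow` on the /archivesearch section. The query string and the `is_crawlable` helper are made up for illustration.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules standing in for the real robots.txt (not reproduced here).
ASSUMED_RULES = [
    "User-agent: *",
    "Disallow: /archivesearch",
]

def is_crawlable(rules, user_agent, url):
    """Return True if the given robots.txt rules let user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(rules)
    return parser.can_fetch(user_agent, url)

# Made-up URLs on the Toolbar host, for illustration only.
blocked = is_crawlable(ASSUMED_RULES, "Googlebot",
                       "http://toolbar.google.com/archivesearch?q=example")
allowed = is_crawlable(ASSUMED_RULES, "Googlebot", "http://toolbar.google.com/")
print(blocked, allowed)
```

One caveat: Python's parser applies rules in file order, so a file that mixes "allow" and "disallow" lines may be resolved differently here than it would be by Googlebot.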
But wait. Couldn't this just be an "uncrawled reference" -- that rare-but-easily-recreated instance where Google indexes pages based on incoming links, but doesn't actually crawl the page, so therefore still honors the robots.txt exclusion protocol?
No, I don't think so, at least in this case. Uncrawled references generally don't have snippets attached to them, and if you look at the SERP above, you'll see a snippet pulled from deep within the actual page:

I'm not claiming to know each subtle nuance of uncrawled references, but I study robots exclusion pretty closely, and this is the first instance I've seen of a section from within an excluded page being used as its snippet.
I'm certainly willing to concede that Google just happened to find this information somewhere else and attributed it to this page, but before I make that concession, I'd like someone to show that it actually happened. I'm not tied to any particular outcome; I'd just like to learn more about why this happens.
Posted by Erik Dafforn at 5:22 PM
November 6, 2009
Social Media URL Duplication and the Canonical Link Element 
posted by Erik Dafforn in category: Crawling and Indexing
When most people discuss the canonical link element, they describe its usage in the context of duplicate content mitigation, such as www vs non-www content, print-friendly pages, and so on. This is entirely appropriate. But the ways that we're all creating duplicate content are constantly growing and changing, which means that even if you think you don't need to canonicalize your pages, you might be wrong.
This post discusses how using the canonical link element might help you even if you don't think you need it.
Quick question: Should you use the canonical tag on your pages even if you're not sending out multiple versions of them?
Absolutely.
Why?
Because someone else might be creating versions of your pages that you don't even know about.
Here's an example: When I share something in my Google Reader, here's what happens:
- Twitterfeed grabs my Google Reader "public" RSS feed, which is how my shared items are dispersed.
- Twitterfeed takes the URL I'm sharing and appends two UTM tags to it -- "source" and "campaign".
- Via Twitterfeed, Bit.ly shortens the long URL (including UTM tags) that I'm sharing.
- Twitterfeed shoots the title of the post and shortened URL out over the @intrapromote Twitter stream.
In other words, I might read this URL:
http://searchengineland.com/blocking-and-tackling-10-fundamentals-of-local-seo-29115
But when I share and tweet it, it ends up looking like this:
http://searchengineland.com/blocking-and-tackling-10-fundamentals-of-local-seo-29115?utm_campaign=ipshare&utm_source=reader
Basically, I've created a duplicate URL for Search Engine Land, which they didn't ask for and probably don't know about. But the crew over there has anticipated this, because when you look at the source code for the page I created, you see this code:
<link rel="canonical" href="http://searchengineland.com/blocking-and-tackling-10-fundamentals-of-local-seo-29115" />
This tag tells engines that no matter what tags I (or anyone else, including SEL) put on those pages, this one is the authority.
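To make the duplication concrete, here is a small sketch of how the tagged URL and the clean URL relate. It only illustrates the idea; the canonical link element does the real work declaratively, and the `strip_utm` helper is a hypothetical name, not something any engine or SEL is known to use.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

def strip_utm(url):
    """Return url with every utm_* query parameter removed."""
    parts = urlsplit(url)
    kept = [(key, value)
            for key, value in parse_qsl(parts.query, keep_blank_values=True)
            if not key.lower().startswith("utm_")]
    return urlunsplit(parts._replace(query=urlencode(kept)))

tagged = ("http://searchengineland.com/"
          "blocking-and-tackling-10-fundamentals-of-local-seo-29115"
          "?utm_campaign=ipshare&utm_source=reader")
print(strip_utm(tagged))
```

Both URLs collapse to the same address once the tracking parameters are gone, which is exactly the relationship the canonical tag announces to engines.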
UTM tags, of course, are primarily for measuring the effectiveness of your own social media endeavors on your own content, but the idea of someone appending tags to your content isn't far-fetched. Don't rule out people wanting to measure everything -- including their effect on other sites' traffic. Agencies use them to measure their efforts across a variety of client sites, and ad-selling sites use them in case studies to illustrate their reach.
Search Engine Land likely uses the canonical tags to consolidate authority because of their own tracking tags. But in this case, I've shown how someone on the outside can splinter your authority. It's pretty easy to add this tag to your pages (despite the obvious fact that I haven't done it on this blog yet), and the more ways you distribute your content, the more sense it makes to find the time to do it.
Posted by Erik Dafforn at 7:33 AM
September 2, 2009
It's Time for Twitter to Take URL Structure Seriously 
posted by Erik Dafforn in category: Social Media
About three months ago, I mentioned in a ClickZ article that Twitter should consider tightening up its structure to avoid some of the duplication it's creating in its URLs.
Back then, for example, Twitter had about 1.4 million URLs indexed on its secure (HTTPS) server. Today, that number has more than doubled to about 3.5 million. That latter number is just a shadow of the total number of URLs indexed that aren't on the HTTPS protocol, which is about 314 million.
Cap style -- the way URLs appear in your browser (upper or lowercase) -- is just as bad a problem. This link shows the tip of the duplication iceberg using @CNNbrk as an example. A smart server issues URLs in only one cap style and accepts only those same URLs, while redirecting any variations that get requested.
The following image, taken from that link, shows six different cap styles for the single account:

And don't forget the mobile site, m.twitter.com, which gets indexed right alongside the full-bodied version.
About the only canonicalization that Twitter is getting right is the www/non-www issue. Other than that, chaos rules.
The index size wars are over. Lean is the new fat. It's time for Twitter to make a few small tweaks and consolidate some of its splintered authority. All they need to do is agree on a case and protocol style, then either redirect non-conformers or issue canonical tags, and the problem will dry up relatively painlessly.
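The "agree on a style, then redirect non-conformers" idea can be sketched in a few lines. Everything chosen here is an assumption for illustration: plain HTTP, the twitter.com host, and all-lowercase paths as the canonical form, plus the hypothetical `redirect_target` helper. Twitter could just as reasonably pick a different style.

```python
from urllib.parse import urlsplit, urlunsplit

# Assumed canonical choices, for illustration only.
CANONICAL_SCHEME = "http"
CANONICAL_HOST = "twitter.com"
HOST_VARIANTS = {"twitter.com", "www.twitter.com", "m.twitter.com"}

def redirect_target(url):
    """Return the canonical URL to redirect to, or None if url is already canonical."""
    parts = urlsplit(url)
    if parts.netloc.lower() not in HOST_VARIANTS:
        return None  # not one of the hosts being consolidated
    canonical = urlunsplit((CANONICAL_SCHEME, CANONICAL_HOST,
                            parts.path.lower(), parts.query, parts.fragment))
    return None if canonical == url else canonical

for variant in ("https://twitter.com/CNNbrk",
                "http://m.twitter.com/cnnBRK",
                "http://twitter.com/cnnbrk"):
    print(variant, "->", redirect_target(variant))
```

A server would answer any request with a non-None target using a 301 redirect; the same value could instead be written into a canonical tag if redirects aren't practical.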
I also need to acknowledge the work of my industry colleague Edward Lewis, a vocal proponent of what he calls proper "Pascal casing." I make this acknowledgment despite the fact that he didn't appreciate my joke about the way he's "Blaising a trail" for proper Pascal case.
Posted by Erik Dafforn at 8:00 AM
May 6, 2009
Google Vanity Profiles Buggy on the iPhone 
posted by Erik Dafforn in category: Google
I wouldn't categorize this under DEFCON 1, but there's a bug with Google Profiles in the iPhone version of Safari.
When you do a vanity search and you've filled out your Google Profile sufficiently, it might show up at the bottom of the first SERP. My profile, for example, does show up correctly on normal browsers.
In an iPhone vanity search, however, something weird happens. Where the profile link would be, there is instead a link to a page called "prose%200", as seen here:

%20 is the escape code for a space, so in reality, the page is called "prose 0". When you click the link in the iPhone, you land here:

I imagine the "pro" in "prose" stands for "profile." As for the "se," I'm not sure. "Search engine," perhaps, although that seems too easy, as well as redundant.
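The escape code mentioned above is easy to confirm with Python's standard library. This is only a quick check of the decoding, not a diagnosis of the bug itself.

```python
from urllib.parse import unquote

# "%20" is the percent-encoded form of a space character.
decoded = unquote("prose%200")
print(repr(decoded))
```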
Conclusion: If you have thousands or millions of people searching their iPhones for your Google profile (like I don't), they're not finding you.
Posted by Erik Dafforn at 8:39 AM
February 10, 2009
Finally ... John Dvorak Exposes SEO Industry 
posted by Erik Dafforn in category: SEO Industry News
PC Mag's John Dvorak has declared SEO to be snake oil. Guess it's time to close up the shop.
Sigh. Not exactly a new theme, but as weak arguments go, Dvorak's is particularly so. I'll sum it up for you in case you don't have the time:
- John gets bad advice about optimizing his blog.
- John's page views decline.
- John equates his bad advice with SEO practice.
- John picks another third-tier technique (tagging) and also equates it with SEO.
- John anecdotally proves that tagging is ineffective.
- John concludes that SEO "simply doesn't work."
But wait! He completes the formula. Don't forget about the final, disclaiming paragraph, designed to hedge himself against any criticism:
Now don't get me wrong. I'm not saying there's nothing you can do to get more attention. Much of what you can control is structural. If you have a blog full of fancy AJAX code, it's going to be difficult to index, for example. Making your Web site search-engine-friendly is one thing, in other words. But using stupid human tricks such as the long URL and tags to get more attention is folly -- and bad advice, from what I can tell. Beware!
In other words, real SEO isn't bad, but bad SEO is bad, but you don't get to know that until you wade through his lesson on why singular techniques, in a vacuum absent an overall strategy, are unhelpful. Come on, John. You're better than this.
Posted by Erik Dafforn at 7:58 AM
November 20, 2008
Google Lets Users Promote, Remove, Comment on Listings 
posted by Erik Dafforn in category: Google
This has been discussed for a few months here and there, but this is the first time I've seen it in the wild. Google SERPs are giving users the ability to "promote," "remove," or "comment" on listings:

Here's a closeup of the three. See if you can figure out which is Promote, Remove, or Comment:

I've only begun to play with them, so I have no idea what the implications are. I suspect that, as with most things, Google will harness the data and use it in aggregate to try to improve the relevance of results. I'm sure we'll read more about that in the next couple of days, along with the inevitable speculation about "what it all means," which, in the grand scheme, is usually very little. Still, it's cool.
Posted by Erik Dafforn at 10:32 PM
Google SERPs Showing MySpace + other Videos 
posted by Erik Dafforn in category: Universal SEO
I'm surely not the first to notice this, but I saw MySpace video thumbnails in Google SERPs for the first time today:

Looking around, G is pulling from multiple sources, including MetaCafe, CollegeHumor, and this example from Spike:

A couple months ago, AccuraCast noticed two video results in a horizontal line, but in that sample, both videos were from Google-owned YouTube.
This is the next logical step in the universality of Universal Search, so to speak. Is it also the beginning of the end of big corporate presence on shared video sites?
Posted by Erik Dafforn at 6:32 PM
November 7, 2008
Social Media Reality Check: Facebook vs. MySpace 
posted by Erik Dafforn in category: Social Media
Submitted without comment:

Posted by Erik Dafforn at 9:26 AM
October 28, 2008
Follow Intrapromote on Twitter 
posted by Erik Dafforn in category: Social Media
We've been using Twitter as an internal communications tool for a while as a "protected" feed. In the spirit of TwitterGlasnost, however (and because we were surprised that several people found the feed and requested to follow it), we want to open it up.
What's in the stream?
- Links to posts from this blog
- Links to other SEO-related posts and articles from Intrapromote staffers
- SEO/M "required reading" -- a list of important SEO/SEM-related articles from around the web that our staff members have shared with one another
- Any upcoming speaking gigs or seminars we'll be attending
- The obligatory, enigmatic "and anything else we can think of..." items
Posted by Erik Dafforn at 8:28 AM
October 20, 2008
Error in Google's robots.txt Docs 
posted by Erik Dafforn in category: Crawling and Indexing
Update: This was fixed rapidly; see Riona's comment.
I don't want to get too deep into the complexities of robots.txt parsing (if you want that, try this, this or this), but I found something odd at the bottom of this page, one of Google Webmaster Help's many pages on robots.txt.
The page says:
URLs are case-sensitive. For instance, Disallow: /private_file.asp would block http://www.example.com/junk_file.asp, but would allow http://www.example.com/Junk_file1.asp.
Here's a picture just so you trust me:

This is wrong in a lot of different ways. Let's look at them with my comments following in bold.
URLs are case-sensitive. **So far, so good.**
For instance, Disallow: /private_file.asp would block http://www.example.com/junk_file.asp **It would? How?**
..., but would allow http://www.example.com/Junk_file1.asp.
**I suppose Disallow: /private_file.asp would allow /Junk_file1.asp, but not because of capitalization style. It's because /Junk_file1.asp has nothing to do with the excluded file, /private_file.asp.**
So what did they mean? If they're anything like me, this was a paragraph started, edited a few times, and never really finished. It appears to try to cover a variety of the issues covered on the page, including cap style, pattern matching, and wildcard characters. Here are a couple alternatives I'd suggest:
URLs are case-sensitive. For instance, Disallow: /private_file.asp would block http://www.example.com/private_file.asp, but would allow http://www.example.com/Private_file.asp.
or, to continue along the pattern-matching theme also discussed on the page, this would work:
URLs are case-sensitive. For instance, Disallow: /private_file*.asp would block http://www.example.com/private_file.asp, but would also block http://www.example.com/private_file1.asp. It would not, however, block /Private_file1.asp.
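The two suggested rewordings can be checked with a small matcher. This is a sketch under stated assumptions, not Google's actual parser: it assumes case-sensitive prefix matching, `*` matching any run of characters, and a trailing `$` anchoring the end of the path. The `rule_matches` helper is hypothetical.

```python
import re

def rule_matches(pattern, path):
    """Return True if a Disallow/Allow pattern matches a URL path.

    Assumes case-sensitive prefix matching, '*' as a wildcard for any
    run of characters, and a trailing '$' as an end-of-path anchor.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape each literal piece, then join the pieces with ".*" for each '*'.
    regex = ".*".join(re.escape(piece) for piece in pattern.split("*"))
    if anchored:
        regex += "$"
    return re.match(regex, path) is not None

for path in ("/private_file.asp", "/private_file1.asp", "/Private_file1.asp"):
    print(path, rule_matches("/private_file*.asp", path))
```

Under those assumptions the first two paths are blocked and the capitalized one is not, which matches the second suggested wording. The same function shows why the original documentation sentence can't be right: `/private_file.asp` never matches `/junk_file.asp`.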
This is a pretty minor detail at the bottom of an esoteric page, but if you're looking for specific information on cap style and robots.txt, it could cause some head-scratching.
Posted by Erik Dafforn at 7:30 AM
September 19, 2008
Linking External and Internal Search Terms in Google Analytics 
posted by Erik Dafforn in category: User Behavior
Have you ever wanted to match up internal search terms (i.e., terms that people searched for from your site's internal search feature) with their corresponding external search terms (i.e., terms that people used to find your site in the first place)?
In Google Analytics you can, and while finding the information is not particularly intuitive the first time, it's pretty quick once you know how to do it.
First, of course, you have to set up Site Search, which simply amounts to identifying your site's specific search parameter for Google Analytics so it can scrape the query terms out of your site's search results URLs. Once you've done that (and have begun to gather data for a little while), you're ready to go.
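To see what "scraping the query terms out of your search results URLs" amounts to, here is a minimal sketch. The parameter name `q`, the example URL, and the `internal_search_term` helper are all assumptions for illustration; your site's parameter may be named something else entirely, and this is not how Google Analytics is implemented.

```python
from urllib.parse import urlsplit, parse_qs

def internal_search_term(url, param="q"):
    """Return the search term carried in url's query string, or None."""
    values = parse_qs(urlsplit(url).query).get(param)
    return values[0] if values else None

# Hypothetical results-page URL; "q" is an assumed parameter name.
print(internal_search_term("http://www.example.com/search?q=404+redirect"))
```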
Next, drill down to the Content | Site Search | Search Terms report, as shown here:

This shows you all the terms that people searched for on your site, from within your own internal search feature, in the given time period. Pick a term and click it, as shown here:

The resulting screen is the Search Term Overview, which tells you how many people searched for that term, etc. From the Dimension drop-down list, select Keyword, as shown below. This tells Google Analytics to report which external keyword was used by the visitor(s) who eventually searched for "404 redirect" (or whatever search term you selected).

The resulting screen will list the keyword that the user searched for at a search engine to first arrive at your site. In this case, the user searched for "seo using 404 301," as shown here:

If you have a popular search term on your site, the image above would likely be populated with several different external search terms. In this example, however, only one person searched for "404 redirect" on the site in the time period, so there's only one external search phrase that drove the traffic. To find the referring engine, select Source instead of Keyword from the Dimension drop-down.
Exactly what to do with this data is the topic for a separate post, which I hope to have ready soon.
Posted by Erik Dafforn at 11:50 AM
August 23, 2008
Intrapromote Welcomes Angela Moore as Director of Link Development 
posted by Erik Dafforn in category: Link Building
I wanted to let everyone know how happy we are with a new addition to our staff. Angela Moore has joined us as Director of Link Development, a position that we built around her significant experience and skills. She'll be managing a team and will really broaden the scope of our link building services. We have already seen great things and expect that to continue.
Here's the release. Angela is also a mod at SEW Forums and is already a veteran blogger, so keep an eye on our link-building category (& feed). Welcome, Angela.
Posted by Erik Dafforn at 8:26 AM
August 18, 2008
Twitter and the "Black Box" of Reputation Management 
posted by Erik Dafforn in category: Social Media
I keep reading stories about social media sites like Twitter and how they're revolutionizing customer service. Comcast. H&R Block. Southwest Airlines. On and on. (The organic tie-in here is that for many companies, pages like Twitter profiles -- as well as the news stories that discuss them -- are already showing up on SERPs for company names, and that's going to continue for a while.)
All this is great, of course. But at the same time, it reminds me of an old joke I'm sure you've heard. If an airplane's "black box" is the single, indestructible element of the plane that is nearly always recoverable after a crash, why don't they just make the whole plane out of black box?
Silly, I know. But similarly, if using something like Twitter is the perfect, efficacious form of customer service we've all been waiting for, why is it the exception instead of the rule? Why do companies frequently use social media to apologize for more traditional forms of customer service that garner complaints, instead of propagating these rapid-response techniques across their traditional customer service and support environments? It's a cynical perspective, but I think one reason that Twitter users get quick reaction and kid-glove treatment is that their complaints "have legs." In other words, they're being broadcast to the world, not just to the company. If a company doesn't respond to your forum post or answer your email, yet they respond to your Tweet in 12 minutes, part of you should be happy, and part of you should be angry. You're being addressed because your method of complaint has the most potential to harm them.
If a company had a queue set up so that any 800 call or support forum post that languished unanswered for 24 hours was re-broadcast as a press release, now THAT would be some accountability.
Posted by Erik Dafforn at 3:12 PM
August 8, 2008
The Difference Between Crawling and Indexing 
posted by Erik Dafforn in category: Crawling and Indexing
I talk a lot about crawling and indexing (to the point that we have a dedicated category), but I think it's worthwhile to back up and describe some of what's going on.
The terms crawling and indexing (and indexing's cousin, caching) are frequently used together, but you should not consider them synonyms.
Exact definitions probably differ from person to person, but following is how I explain the processes:
Crawling is the process of an engine requesting -- and successfully downloading -- a unique URL. Obstacles to crawling include no links to a URL, server downtime, robots exclusion, or using links (such as some JavaScript links) from which bots cannot find a valid URL.
Indexing is the result of successful crawling. I consider a URL to be indexed (by Google) when an info: or cache: query produces a result, signifying the URL's presence in the Google index. Obstacles to indexing can include duplication (the engine might decide to index only one version of content for which it finds many nearly identical URLs), unreliable server delivery (the engine may decide to not index a page that it can access during only one-third of its attempts), and so on.
What's the difference between crawling and indexing, in terms of time? Here's a recent example. I recently watched a newly introduced URL to see when it would be indexed. I monitored the text cache query of the URL every four hours starting when the URL went live on July 2. (This URL was one of a number of URLs linked to on a new site map.)
On July 17, the text cache showed results and finally stopped saying "Your search - cache:[URL] - did not match any documents." But what was interesting is that the cached file showed the results of the URL "as retrieved on 8 Jul 08." So make special note that the URL was crawled and cached over a week before it appeared in the index.
A better, more comprehensive test would be to watch server logs and see how many times the file was requested, and with what frequency, between the original request date and date at which the cache query showed results. Additional testing would try to detect ways to shorten that time by increasing the number (and prominence) of incoming links and so on.
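The server-log approach could start with something like the sketch below. It assumes logs in the common "combined" format and counts requests per day for one path where the user agent mentions Googlebot. The sample lines, IP addresses, and path are made up for illustration and are not taken from the test described above.

```python
import re
from collections import Counter

# Matches the date, request path, and user agent in a combined-format log line.
LOG_LINE = re.compile(
    r'\[(?P<day>[^:]+):[^\]]+\] "(?:GET|HEAD) (?P<path>\S+) [^"]*" '
    r'\d{3} \S+ "[^"]*" "(?P<agent>[^"]*)"'
)

def googlebot_hits_by_day(lines, path):
    """Count requests per day for path whose user agent mentions Googlebot."""
    hits = Counter()
    for line in lines:
        match = LOG_LINE.search(line)
        if match and match.group("path") == path and "Googlebot" in match.group("agent"):
            hits[match.group("day")] += 1
    return hits

# Made-up log lines for illustration only.
sample = [
    '66.249.0.1 - - [08/Jul/2008:04:12:55 -0400] "GET /new-page.html HTTP/1.1" 200 5120 "-" '
    '"Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
    '10.0.0.7 - - [08/Jul/2008:09:30:01 -0400] "GET /new-page.html HTTP/1.1" 200 5120 "-" '
    '"Mozilla/5.0 (Windows; U)"',
    '66.249.0.1 - - [12/Jul/2008:17:45:10 -0400] "GET /new-page.html HTTP/1.1" 304 0 "-" '
    '"Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
]
hits = googlebot_hits_by_day(sample, "/new-page.html")
print(dict(hits))
```

Keep in mind that user-agent strings can be spoofed, so a count like this is a first pass rather than proof of a genuine Googlebot visit.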
Posted by Erik Dafforn at 12:04 PM
July 1, 2008
Google to Index Flash Content ... Again 
posted by Erik Dafforn in category: Crawling and Indexing
In a post last night entitled "Improved Flash Indexing," the Google Webmaster Tools blog reports that
We've improved our ability to index textual content in SWF files of all kinds. This includes Flash "gadgets" such as buttons or menus, self-contained Flash websites, and everything in between. ... In addition to finding and indexing the textual content in Flash files, we're also discovering URLs that appear in Flash files, and feeding them into our crawling pipeline—just like we do with URLs that appear in non-Flash webpages. For example, if your Flash application contains links to pages inside your website, Google may now be better able to discover and crawl more of your website.
This brings up several satellite issues:
- Since it's been so difficult to index Flash content, a virtual cottage industry sprang up with ways to circumvent that disability, including methods like SWFObject, sIFR, user-agent-based delivery of plain text vs. Flash content, and so on. With these techniques becoming more sophisticated and easy to implement, is it likely that sites will abandon them soon?
- It appears that for now, Flash files spawned when users fail a JavaScript test will still be uncrawlable, since engines too typically fail a JS sniffer.
- If you have a SWF file embedded as only a part of a larger HTML page, trust me that you do NOT want only that SWF file being returned in search results. It typically looks awful, lacking both the size requirements you implemented, as well as the critical navigation that resides in your HTML. The Webmaster Central post didn't say that SWF files would be returned in SERPs, so I'm not saying that's what will happen. But I've tested client sites by searching for strings of text that only appear in Flash files, and I've seen it happen. So test with your own site and cross your fingers.
I chose a somewhat sarcastic post title because ever since search engines and Flash have butted heads, the ability for engines to index text embedded in Flash files has been "just around the corner." In 2002, for example, hearts were briefly aflutter about the Macromedia Flash Search Engine SDK, which was going to be the end of engines' inability to index Flash content. Hear that? The end. 2002.
So I enter into this new era with guarded optimism. Optimistic because Google never releases anything "new" until it's been tested in the wild for months or years. Guarded because the "right" recommendation for clients is never quite as black and white as people think it will be.
Posted by Erik Dafforn at 9:18 AM
June 27, 2008
Exactly How Accurate IS Google Trends for Websites? 
posted by Erik Dafforn in category: Web Analytics
Much has been made of the week-old announcement that Google is in the traffic trending game. I weighed in earlier this week at ClickZ, focusing mostly on ways you can benefit from the information and largely sidestepping the already-trodden issues of Google being the only company able to opt out of the reporting, etc.
One question that hasn't been discussed to death, however, is the actual accuracy of the traffic numbers that Google is reporting. I ran some numbers on some sample sites and laid the Google Trends lines over the actual traffic numbers:
Example 1:

Example 2:

The verdict? In general, Google doesn't do too badly, especially considering that neither of the sites above uses Google Analytics or Urchin to measure its traffic.
The peaks and valleys are roughly similar. Roughly. Yet the scale is off pretty dramatically, with Google underreporting the traffic on one of the sites by a factor of two.
So my recommendation is that to gauge large trends (seasonality, results of large offline campaigns, etc.), Google Trends is a decent first look. It's probably a safe bet that when you plot two sites within the same vertical, their relative lines will be more or less accurate when contrasted. But don't trust it for raw numbers.
Just to be fair, Google never said it was 100% accurate, stating in the post that "because data is estimated and aggregated over a variety of sources, it may not match the other data sources you rely on for web traffic information."
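The distinction between "the peaks and valleys match" and "the scale is off" can be put into numbers. The sketch below separates the two: a correlation near 1 means the shapes agree, while the scale factor shows how far the raw totals diverge. The visit counts are made up for illustration and are not the data behind the charts above.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation: how closely two series move together."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

def scale_factor(actual, estimated):
    """Ratio of total actual traffic to total estimated traffic."""
    return sum(actual) / sum(estimated)

# Made-up daily visit counts: same shape, half the scale.
actual = [1000, 1400, 1200, 1800, 1600]
estimated = [500, 700, 600, 900, 800]
print(round(pearson(actual, estimated), 3), scale_factor(actual, estimated))
```

A series like this one would be fine for spotting seasonality but would understate raw traffic by a factor of two.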
Posted by Erik Dafforn at 2:25 PM
June 4, 2008
A Guide to Robots Exclusion Protocol 
posted by Erik Dafforn in category: Crawling and Indexing
Google's Prashanth Koppula wrote a ready-to-bookmark post over at the official Webmaster Tools blog, showing tons of different robots-exclusion protocol (REP) directives that can be implemented in various ways. Following is a listing of directives discussed and the methods of implementation:
Directives for the robots.txt file:
- Disallow
- Allow
- $ Wildcard
- * Wildcard
- Sitemaps location
Meta tags for insertion into HTML:
- NOINDEX
- NOFOLLOW
- NOSNIPPET
- NOARCHIVE
- NOODP
Of special note are the two different wildcard uses; the post links to usage models for each. One additional funny bit is in the explanation of NOARCHIVE, in which the post describes the tag's usage as "Do not make available to users a copy of the page from the Search Engine cache." Contrast this with "Do not cache the page," which I believe is most people's idea of the tag's effect. I love little semantic hooks like that.
The post notes that the directives above are observed by Google, Yahoo, and MSN/Live, which is a nice bonus. In addition, the post discusses some directives that only Google honors, such as UNAVAILABLE_AFTER (which I discussed about a year ago), NOIMAGEINDEX, and NOTRANSLATE.
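As a rough illustration of the meta-tag side, the sketch below pulls robots directives out of a page's HTML with Python's standard `html.parser`. The page markup and the `RobotsMetaParser` class are made up for this example, and it only reads the generic `robots` meta tag, not engine-specific ones.

```python
from html.parser import HTMLParser

class RobotsMetaParser(HTMLParser):
    """Collect directives from <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = {name: (value or "") for name, value in attrs}
        if attributes.get("name", "").lower() == "robots":
            for item in attributes.get("content", "").split(","):
                if item.strip():
                    self.directives.add(item.strip().upper())

def robots_directives(html):
    """Return the set of robots meta directives found in html."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.directives

# Made-up page head for illustration.
page = '<html><head><meta name="robots" content="noindex, noarchive"></head></html>'
print(sorted(robots_directives(page)))
```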
I appreciate what engines are doing with the REP advancements. It's as if the basic Robotstxt.org protocol is the vehicle and the engines have become after-market accessory specialists, showing you how to get additional mileage, power, and stunts out of your car.
Posted by Erik Dafforn at 8:07 AM
May 22, 2008
Tumblr and SEO: A Case Study in Rapid Response 
posted by Erik Dafforn in category: Social Media
Here's a quick case study in how social media sites (more important, the conversations going on at social media sites) are enabling companies to interact with and respond to their users.
Here's the rough chronology. I may have missed some letters in the middle, but points A and Z are pretty accurate.
- Melissa Chang runs a blog on her own domain, using the Tumblr platform. (For the uninitiated, Tumblr is roughly similar to Blogger or WordPress, although many people seem to use "Tumblogs" as a middle ground between article-length posts and Twitter-like microblog posts.) She is unhappy with her search traffic and writes a post saying so.
- Steve Rubel reads the post and bookmarks it at Del.icio.us.
- Steve's bookmark shows up at FriendFeed, where he aggregates his various social media endeavors.
- A conversation begins at FriendFeed about whether, and to what extent, the Tumblr platform is or is not search-friendly. A somewhat lively and mostly constructive discussion takes place.
- Others lend various perspectives at their own blogs.
- Tumblr reps follow -- and join -- the FriendFeed conversation(s).
- Tumblr responds on its official blog, saying it has already made many of the changes that came from the discussion on FriendFeed and elsewhere.
- Many are happy with the changes; some are not. My personal opinion is that Tumblr may have entered the egg-breaking stage of omelet-making. The site will be better off in the long run.
So a logical question is, how is a "conversation" like the one at FriendFeed different from Tumblr users merely writing to the Tumblr staff and making the same recommendations -- which some users claim they've been doing for a while? I don't know the answer to that. But I think the interest in and productivity resulting from the FriendFeed conversation had a lot to do with it.
Back in the day, big brands used to respond to customer letters. I mean respond. Like type up a reply and send it. This is because they realized that for each person who took the time to write or type a letter, stamp it, and walk it down to the mailbox (later known as the "barrier to entry"), there must be about 10,000 people who feel exactly the same way.
Today, you can send an email as easily as you can cook a Hot Pocket. Anyone can do it. So the 10,000:1 ratio of yore is more like 1:1 today. The FriendFeed conversation shows that not only is more than one person affected, but that actual recommendations can be spat out the back end. I think that's why the response was more rapid.
Very soon, this will be the norm in customer relations, at least for progressive, consumer-focused companies.
Posted by Erik Dafforn at 11:15 PM
April 30, 2008
Real and Imagined Errors in Google Sitemap Feeds 
posted by Erik Dafforn in category: Crawling and Indexing
When you upload your XML sitemap feed to your server -- especially if it's GZipped -- don't expect it to look pretty. I got a nervous call from a client because when he called the XML feed URL in his browser, he saw this:

While it looks like an error, it's really not. Not in the traditional sense, at least. The error here is that your browser (in this case, Firefox) isn't able to view the file without a little help -- specifically, a stylesheet that tells it how it should look to human viewers.
The bottom line is that this message doesn't mean that engines can't read your XML feed -- only that you can't see it. To see whether Google can process it, for example, check the Sitemap Summary report. For some reason, this report isn't in the main GWT left nav. To find it, you need to click the "Details" link at the far right of the Sitemap Overview report. When you click that link, here's what you see:

Real sitemap errors do exist, even in the example I used above. In this case, I've inadvertently included in the sitemap a URL that I also excluded via robots.txt. So I'm sending Google a mixed message there. Fortunately, the robots.txt file overrides the URL's inclusion in the sitemap, so it ends up being more of a gentle nudge than a true, crippling error. If the error doesn't specifically say that the sitemap is invalid and unreadable, then it's probably not.
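If you'd rather catch that kind of mixed message before Google reports it, here is a minimal sketch of the check. It assumes you have the gzipped sitemap and the robots.txt text in hand; the example.com URLs, the robots rule, and the function names are all made up for illustration, and Python's standard robots parser is only an approximation of how Googlebot reads the file.

```python
import gzip
import xml.etree.ElementTree as ET
from urllib.robotparser import RobotFileParser

# Namespace used by the sitemaps.org XML format.
SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"


def sitemap_urls(gz_bytes):
    """Return the <loc> values from a gzipped XML sitemap."""
    root = ET.fromstring(gzip.decompress(gz_bytes))
    return [loc.text.strip() for loc in root.iter(SITEMAP_NS + "loc")]


def blocked_urls(urls, robots_txt, agent="Googlebot"):
    """Return the URLs that the given robots.txt text disallows for the agent."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [u for u in urls if not parser.can_fetch(agent, u)]


# Hypothetical robots.txt and sitemap, built in memory for the example.
robots_txt = "User-agent: *\nDisallow: /private/\n"
sitemap_xml = (
    '<?xml version="1.0" encoding="UTF-8"?>'
    '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">'
    "<url><loc>http://www.example.com/</loc></url>"
    "<url><loc>http://www.example.com/private/login.html</loc></url>"
    "</urlset>"
)
gz_bytes = gzip.compress(sitemap_xml.encode("utf-8"))

# URLs that appear in the sitemap but are excluded by robots.txt.
conflicts = blocked_urls(sitemap_urls(gz_bytes), robots_txt)
print(conflicts)
```

Anything the script prints is a URL you are both submitting and excluding, so it belongs in only one of the two files.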
Posted by Erik Dafforn at 8:52 AM
April 23, 2008
Update on Google Showing Excluded URLs as Sitelinks 
posted by Erik Dafforn in category: Crawling and Indexing
A little over a month ago, I wrote about Google showing robots-excluded URLs as Sitelinks. Here's a shot of what Google showed for the query [seo speedwagon] in mid-March:

The ip login link was (and is) excluded via robots.txt. A month prior (in February), a link to one of our monthly archives -- a page with the robots "noindex" meta tag -- appeared as a Sitelink also.
Since then, the SERP has been cleaned up. I use the passive voice because I don't exactly know who to thank. Either the algo picked it up on its own, or someone hand-washed it. Either way, it looks better now:

I'm not sure if we're an isolated case, so if you have any examples of excluded URLs still showing up in Sitelinks, please let us know in the comments.
Posted by Erik Dafforn at 8:12 AM
April 1, 2008
Google Serves Ads Based on Previous Queries 
posted by Erik Dafforn in category: Adwords
In 2005 (as reported by Search Engine Journal), Google applied for a patent called "Results based personalization of advertisements in a search engine." Part of the patent abstract reads as follows:
The search results are personalized based on a user profile of the user providing the query. The user profile describes interests of the user, and can be derived from a variety of sources, including prior search queries, prior search results, expressed interests, demographic, geographic, psychographic, and activity information.
Until now, I hadn't seen any instances of Adwords being served based on prior queries in the same session. (This doesn't mean it hasn't happened -- only that I haven't seen it.) But recently I've begun to notice it when signed in to my Google account. Each time I've noticed it (it's been hard to reproduce) it typically occurs after several searches for one particular topic, followed by a sudden shift to a query for another topic. For example, here is one recent search pattern:
[laptops]
[laptop repair]
[laptop parts]
[trucks]
Here is the resulting SERP for the [trucks] query. I've compressed the page so you can see both organic and paid results:

Here is the query set for the second example:
[gloves]
[work gloves]
[gardening gloves]
[jersey gloves]
[heavy duty gloves]
[wheelbarrows]
And here are the organic/paid results for [wheelbarrow]:

The second example is admittedly less convincing, because it's plausible that glove retailers could purchase bids for "wheelbarrow" terms. But I was unable to see any "glove" ads in subsequent searches for "wheelbarrow" terms.
This is interesting because query results like this allow the ad to really stick out contextually and give the advertiser the whole stage, so to speak, for a certain term. And even though the user has changed gears and is searching for something new, the "old" vein of queries is certainly still in his or her mind. I would love any feedback about how widespread these results are, CTR data for "residual" query ads, etc.
Posted by Erik Dafforn at 7:39 AM
March 13, 2008
Google Showing Robots-Excluded Links in Sitelinks 
posted by Erik Dafforn in category: Google
You might have noticed that Google rolled out sitelinks for a new batch of sites a couple weeks ago. This blog was included in that batch, as you can see if you do a query for [seo speedwagon].
The goal here isn't to beat up on Google, but I think it's significant enough that site owners should be aware of it. In a couple cases, the sitelinks that Google shows (or showed) for our site have been links specifically excluded from robots, either via robots.txt or by the "noindex" attribute in the robots Meta tag. Following is a screen shot of the [seo speedwagon] query taken on February 26, which is roughly when the new batch of sites started noticing their sitelinks:

Note the two red-outlined links. The one in the left column, ip login, is our staff login page. It's been excluded by our robots.txt file for almost three years. Coincidentally, Google couldn't index that page if it wanted to, as it's password-protected. I know that robots.txt exclusion isn't a totally reliable way to keep a URL from showing up in SERPs, as it often causes what's known as a "partially-indexed" URL (example). But come on -- a Sitelink?
The outlined link in the right column (November 2007) is a typical (if capriciously chosen) monthly archive page -- exactly the kind you see in the third column of this blog. They're ugly, more or less useless (both for SEO and for people), and I'll probably eventually do away with them, but for now, there they are. But the important thing here is that I added the robots "noindex" tag to them well over a year ago.
Just this week, Google changed the format slightly. Here's a current shot:

The November 2007 link (excluded via Meta tag) is now off the list (automatically -- I didn't do it), but the ip login link remains.
Yes, I know I could block specific sitelinks from within Webmaster Tools. And I might, but I wanted to show it to you first.
It seems like excluding specific URLs via robots.txt or via the robots meta tag should be a sufficient method of opting URLs out of sitelinks.
This topic is especially timely as Matt Cutts just recently asked users how they'd prefer that a meta-tag-excluded URL appear -- if at all -- in the Google index. As of this writing, 83% say "Don't show a link at all." I don't want to speak for his readership, let alone all site owners, but I can confidently predict that most people don't want a robots-excluded URL (regardless of whether the exclusion mechanism was robots.txt or a robots "noindex" Meta tag) showing up in a Sitelink.
Posted by Erik Dafforn at 10:33 PM
March 4, 2008
NYT Traffic Doubles, Revenue Grows Since Killing Subscriptions 
posted by Erik Dafforn in category: Old Media
John wrote a few times last fall about the NY Times tearing down its paid subscription wall and allowing spiders in.
Now, in an interview at The Deal, Google's David Eun (on p. 5) confirms that it was a good idea:
We have some partners that have made very bold steps, such as The New York Times, which went from a pay model to a free model. After they went free, the traffic they got from us alone doubled. Their math says they make more money by offering content free to consumers, but stimulating demand and making it work with advertising. The Financial Times did the same thing, and at least early on in the process they experienced at least a 100% growth in traffic.
Don't hold your breath waiting for further breakdown of the math, especially for the NYT example. Note that while Eun says traffic doubled, he was less specific about the money, saying only that "they make more" under the current scenario.
It should be no surprise that it's Google -- not the Times -- telling us the good news about expanded indexation. After all, Google has more to gain from all of us knowing about it, because it now gets a slice of the pie:

Thanks to BeetTV via SearchCap.
Posted by Erik Dafforn at 10:51 AM
February 28, 2008
Heirs Still Fighting Over the Page View Estate 
posted by Erik Dafforn in category: Web Analytics
Good article in Computerworld this week called Life After Page Views: Web Analytics 2.0.
To sum up, the page view has been tossed into the Pythonesque "bring out your dead" cart by a lot of people, including me, in an article I wrote at ClickZ a year ago:
Page views have long been one of the Web's most reliable measurements. But because of technologies like AJAX, Flash, and RSS, a site can perform at engines better than ever and users can spend as much (or more) time on your site than ever before, but the page view count won't reflect it. Page views rely on Web 1.0's click-and-wait model. ...
Sites with an income model that relies on excellent search engine positioning and subsequent page views must be especially diligent in showing potential advertisers a true picture of the site's user experience. Whether it's shifting the influence of time spent on a site, adding script-based click tracking to internal AJAX applications, or something entirely different, a multifaceted approach to Web measurement is becoming more and more important for Web monetization.
So imagine how vindicated I felt when, last July, Nielsen / NetRatings decided to abandon the page view as the primary web analytics metric. From the CW article:
At the time, the Internet benchmarking firm cited the growing popularity of Asynchronous JavaScript and XML, or AJAX -- which can refresh content without completely reloading a Web page -- as the main reason for the change to measuring time spent on a site.
But it turns out that video, not AJAX widgetry, is the major culprit in the growing chasm between falling page views and climbing "time spent" online. All of which leaves us with the same question: How do we measure consumer engagement in a post-page-view web publishing landscape?
The article is a little too long to sum up quickly, so I do recommend the read. The basic issue is that companies like Nuconomy are trying to be the first out of the gates with new engagement-measuring metrics such as "comments added to blogs, ratings, applications shared with friends, clicks on ads and online video use -- all of which can show how 'engaged' a user is with a particular brand or product," while folks like Avinash Kaushik (Google Analytics guru and recent SEMMY winner) caution us against rushing out and arbitrarily defining concepts while totally abandoning concrete measurements.
"I am not saying don't create engaging experiences," he added. "[Just] don't use the term engagement, because it has been bastardized to the point that it doesn't mean anything."
More questions than answers, certainly, but that's not necessarily bad.
Posted by Erik Dafforn at 10:30 PM
February 18, 2008
MSN's Berkowitz Pulled from the Index 
posted by Erik Dafforn in category: MSN
I haven't seen this anywhere except ClickZ and I thought you might be interested. As of last Thursday, Steve Berkowitz, the SVP of Microsoft's Online Services Group, is out. He'll be staying through August "to ensure a smooth transition."
In the big picture, two years doesn't seem like quite enough time to have turned the MSN Search ocean liner around, despite the fact that Berkowitz is credited with Ask's financial turnaround during his tenure there. But someone has to fall on the sword in situations like this, and it looks like he was the logical choice. One wonders whether a simple management shuffle will have a significant effect, or whether it's merely bringing a sharper knife to the gunfight.
Further reading:
Posted by Erik Dafforn at 4:06 PM
February 8, 2008
Predictive Search Merges into Consumer Apps 
posted by Erik Dafforn in category: Tools
This isn't breaking news, but in their recent versions, both Netflix and iTunes have integrated some very smart internal search utilities into their systems.
They're using a type of search function that goes by several different names, including "predictive," "intuitive," or "suggestive" to offer users additional help during the internal search process. Here's an example of Netflix's system in action:

The Netflix system appears to list terms in alphabetical order, and the search term itself is always first in the list of suggestions. This is good intuitive search, but it's not as good as iTunes' method:

iTunes uses a pretty sophisticated algorithm that appears to rank by popularity (instead of alphabetical order) and perhaps more important, inserts the typed term anywhere in the query that makes sense -- not just as the first term in the string.
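To make the difference concrete, here is a rough sketch of the two styles of suggestion as I've described them. This is only my guess at the observable behavior, not either company's actual algorithm; the function names and the popularity scores are invented for the example.

```python
def suggest_prefix_alpha(term, titles, limit=5):
    """Prefix matches in alphabetical order, with the typed term listed first."""
    t = term.lower()
    matches = sorted(
        title for title in titles
        if title.lower().startswith(t) and title.lower() != t
    )
    return ([term] + matches)[:limit]


def suggest_anywhere_popular(term, popularity, limit=5):
    """Titles containing the term anywhere, most popular first."""
    t = term.lower()
    matches = [title for title in popularity if t in title.lower()]
    # Sort by descending popularity, breaking ties alphabetically.
    return sorted(matches, key=lambda title: (-popularity[title], title))[:limit]


# Hypothetical catalog: title -> made-up popularity score.
catalog = {"Star Wars": 90, "Stardust": 40, "A Star Is Born": 70, "Starman": 10}

print(suggest_prefix_alpha("star", catalog))
print(suggest_anywhere_popular("star", catalog))
```

The first function never surfaces "A Star Is Born" for the typed term "star", because the term isn't at the front of the string. The second one does, and ranks it by popularity.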
Why does predictive search matter? Because when users select the right artist, song, film, whatever -- that's a conversion. These intuitive search features shorten the click path between a user wanting something and getting something. Compare these two potential search paths:
Without intuitive search:
- User types terms at a search box
- User clicks "submit"
- Site (or app) returns search result
- User scans search results page
- User clicks result that matches his/her query
- Site (or app) delivers correct page
With intuitive search:
- User types terms at a search box
- Site (or app) displays potential queries immediately
- User clicks term from dropdown suggestion box
- Site (or app) delivers correct page
A click path is like plumbing with loose joints. The more twists, turns, and connections, the more cargo (visitors) you lose due to leakage. In the cases above, the addition of intuitive search reduces the plumbing overhead by a third.
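Here is the arithmetic behind that, as a quick sketch. The six-step and four-step counts come from the two lists above; the 90% per-step retention rate is purely a hypothetical number I picked to illustrate the leakage, not measured data.

```python
def surviving_fraction(steps, retention_per_step):
    """Fraction of visitors left after a path with the given number of steps."""
    return retention_per_step ** steps


STEPS_WITHOUT = 6  # click path without intuitive search
STEPS_WITH = 4     # click path with intuitive search
RETENTION = 0.9    # hypothetical: 90% of visitors survive each step

# Share of the original steps that intuitive search removes.
step_reduction = (STEPS_WITHOUT - STEPS_WITH) / STEPS_WITHOUT

without_search = surviving_fraction(STEPS_WITHOUT, RETENTION)
with_search = surviving_fraction(STEPS_WITH, RETENTION)

print(f"Steps removed: {step_reduction:.0%}")
print(f"Visitors reaching the page: {without_search:.1%} vs {with_search:.1%}")
```

Under that assumed retention rate, the shorter path delivers noticeably more of the original visitors to the correct page, because the loss compounds at every joint.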
Coincidentally, the respective features of Netflix and iTunes parallel those of Google Suggest and Yahoo Search Suggest, which I wrote about a few months ago.
Posted by Erik Dafforn at 7:02 AM
January 28, 2008
Keyword Research as a Predictor of Sales 
posted by Erik Dafforn in category: Keywords
Here's a short but important note about relying on keyword demand to predict general industry sales trends or successes.
Sometimes, keyword demand is a fairly accurate reflector (or predictor) of interest and/or sales:

Stats: According to this game industry blog, here are the respective console sales for 2007:
Wii: 6.29M units
Xbox: 4.29M units
PS2: 3.97M units
PS3: 2.56M units
And sometimes it is not:

While this chart might correspond roughly to the sales of player units ("578,000 HD DVD and 370,000 Blu-ray machines will be sold by the end of [2007]"), one would be advised against picking a format "winner" from this chart (see this or many other articles like it). Most of the technorati (small "t") realize that a PS3 console also comes with a built-in Blu-ray player, so those searching for [blu-ray] are only a fraction of those searching for Blu-ray. If that makes sense.
Disc sales tell a story different from the sales of hardware units. PC World says "Blu-ray Disc movie titles outsold HD DVD in the United States by a nearly 2-to-1 margin last year, according to sales figures from Home Media Research."
Using trending charts to estimate sheer search volume is a pretty sure bet. But be careful about drawing conclusions about popularity and intent out of those raw numbers.
Posted by Erik Dafforn at 12:00 PM
January 23, 2008
Duplicate Content - Thinking Inside the 'Big Box' Stores 
posted by Erik Dafforn in category: Crawling and Indexing
SEOs (myself included) love to preach the Organic Gospel as if knowledge is the barrier, not implementation. "If people only knew this or that," we say, "they'd be saved."
But a lot of times, knowing the right stuff only leads to the next barrier, which is, "How the heck do we DO it?" Various facets of site maintenance -- user experience and tracking, to name only two -- frequently compete with SEO techniques for front-burner attention.
As an example, I want to look at Circuit City's site and show a couple examples of things they're doing that aren't optimal from an organic SEO perspective, but that are probably necessary for other reasons. (I should note that I LOVE the CC site from a user's perspective. They -- and a lot of the Big Box stores, to be fair -- have a great way to narrow and expand the choices to help you find exactly the product(s) you're looking for.)
The first example, however, is one I'm fairly critical of, because I feel like whatever benefit they're gaining probably isn't worth it. The CC home page has two links to the main "TV & Home Entertainment" page -- one in the top nav, and one in the lower set of links. You can see the links in the following screen shot, and I've listed them afterward:

Here are the respective links. I've bolded the points at which the dynamic strings diverge:
http://www.circuitcity.com/ccd/categorySpecial.do?catOid=-12866&N=20012866&c=1
http://www.circuitcity.com/ccd/categorySpecial.do?catOid=-12866&N=20012866&SESSIONLINK&cm_re=011308%20HOME%20PAGE%20A-_-navboxes%20TV-_-TV%20and%20Home%20Entertainment
The content of the pages is identical, so you know where this is going. They're diluting the potential power of each by having a similarly named mirror.
Let's look at a more complicated example. Consider two more URLs on the Circuit City site, and I'll contrast the query strings and the page contents. I've stripped out the "http://www.circuitcity.com" portion of the URL for brevity, but I've linked to them so you can see for yourself if you want. In addition, after each URL, I've shown the breadcrumb navigation from each page so you can see the subtle difference.
Page 1: TVs that cost $500-$999 that are Sony:
URL: /ssm/Televisions/sem/rpsm/c/1/catOid/-12867/Ns/net_price|0||accm_grs_mgn_dllr|1/link/ref/N/20012866+20012867+312867003+40000229/link/ref/rpem/ccd/categorylist.do
Breadcrumb trail:

Page 2: TVs that are Sony that cost $500-$999:
URL: /ssm/Televisions/sem/rpsm/c/1/catOid/-12867/Ns/net_price|0||accm_grs_mgn_dllr|1/link/ref/N/20012866+20012867+40000229+312867003/link/ref/rpem/ccd/categorylist.do
Breadcrumb trail:

The on-page content from these two URLs is identical. After all, no matter in what order you query the database, it should theoretically produce the same products (in this case, three specific TVs). But notice (in bold) how the order of two parameters is swapped in the two URLs, in effect causing a duplication. The content (and breadcrumb navigation) is generated based on the order in which the user selects search criteria. This makes a fantastic user experience -- no doubt about it. But it's hurting them subtly because engines either crawl too many pages and dilute each one's unique potential for ranking well, or, more likely, the bots hit a nav scheme like this, turn a few corners and crawl a handful of pages, then bail because they can recognize what a sinkhole it is.
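One way to spot this kind of duplication in a crawl or log file is to normalize the order of the filter IDs before comparing URLs. The sketch below is only an illustration: it assumes the "+"-separated numbers after /N/ are the selected filters and that their order has no effect on the products returned, which is what the two pages above suggest. The function name is mine, and I have no idea how Circuit City's platform handles this internally.

```python
import re


def canonicalize_facets(path):
    """Sort the '+'-separated IDs in the /N/ segment so their order no longer matters."""
    def sort_ids(match):
        ids = sorted(match.group(1).split("+"), key=int)
        return "/N/" + "+".join(ids) + "/"

    return re.sub(r"/N/(\d+(?:\+\d+)*)/", sort_ids, path)


# The two paths from the example above, split into shared and differing parts.
PREFIX = "/ssm/Televisions/sem/rpsm/c/1/catOid/-12867/Ns/net_price|0||accm_grs_mgn_dllr|1/link/ref/N/"
SUFFIX = "/link/ref/rpem/ccd/categorylist.do"
page1 = PREFIX + "20012866+20012867+312867003+40000229" + SUFFIX
page2 = PREFIX + "20012866+20012867+40000229+312867003" + SUFFIX

print(page1 == page2)
print(canonicalize_facets(page1) == canonicalize_facets(page2))
```

The raw strings differ, but the normalized versions match, which flags the pair as one page with two addresses. Whether a site can actually serve a single normalized URL without hurting the breadcrumb experience is the hard part, and that's an implementation question, not a knowledge question.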
About a month ago, a WebmasterWorld thread ($upport required) discussed a topic similar to this. Member PageOneResults discussed a client's site, which offers multiple paths and entry points to specific product URLs, with the final product URL varying based on the entry point used and the path taken to that product. Following is a response to his original post, followed by his reaction:
>>If the URL depends on the route taken through the site, then you have a major problem to figure out and fix.
Yes, we have a major challenge ahead of us in regards to the one example provided where there were 10 access points for one product. That takes into consideration 5 under www and 5 under non www which is what is happening.
I'm all for as many access points as can be possibly provided as to not hamper the visitor experience. And, as pointed out, as long as that product leads to the same URI from all access points, life is sweet. But, that is not the case...
One important note is that PageOneResults is aka Edward Lewis, who runs SEO Consultants (of which Intrapromote is a proud member). Edward has probably forgotten more about SEO in the last 12 hours than I have ever known, so when he asks for input, it's not due to lack of knowledge. The bottom line is, this stuff can get extremely complicated regardless of your SEO knowledge level.
Posted by Erik Dafforn at 11:35 AM
January 7, 2008
All Eyes on Wikia Search Launch 
posted by Erik Dafforn in category: SEO Industry News
After more than a year since the initial news, Wikia Search officially launched this morning. I won't bore you with the reviews, which are mixed (although seldom neutral).
Probably the funniest line came from Matt Cutts, whose
...reaction is pretty simple: congrats to the Wikia crew on your public launch, and welcome to the search industry! I’m glad that you’re jumping into the search space.
This seems a little like Tom Brady welcoming his grandmother to the pickup scrimmage at the family reunion.
Posted by Erik Dafforn at 7:46 AM
December 19, 2007
Supplemental Index, We Hardly Knew Ye 
posted by Erik Dafforn in category: Crawling and Indexing
The Google Webmaster Central Blog put another nail (the final one?) in the coffin of Google's infamous Supplemental Index just now by declaring it fully immersed into the main index:
We improved the crawl frequency and decoupled it from which index a document was stored in, and once these "supplementalization effects" were gone, the "supplemental result" tag itself -- which only served to suggest that otherwise good documents were somehow suspect -- was eliminated a few months ago. Now we're coming to the next major milestone in the elimination of the artificial difference between indices: rather than searching some part of our index in more depth for obscure queries, we're now searching the whole index for every query.
This is, in my opinion, much more significant than the prior act of simply removing the "Supplemental Index" label. The main problem has never been the label applied (or not applied) to URLs, but the fact (or at least the fear) that SI pages were being given short shrift in their efforts to contend for queries. So what's the intended result?
From a user perspective, this means that you'll be seeing more relevant documents and a much deeper slice of the web, especially for non-English queries. For webmasters, this means that good-quality pages that were less visible in our index are more likely to come up for queries.
Of course the onus doesn't fall entirely on Google here. SI pages were SI for a reason. If you think they're worth ranking for, the old rules still apply. Make sure you remove any obstacles to crawling and indexing that may remain, and try to get some additional links -- internal and external -- pointing to them.
Posted by Erik Dafforn at 12:29 PM
December 14, 2007
Big Update at Google Analytics 
posted by Erik Dafforn in category: Web Analytics
Late yesterday, the Google Analytics team announced a major update to its free analytics package.
Taking full advantage of the upgrade requires something that I'm sure that the GA team wishes didn't have to happen -- the modification of the tracking codes on every page of your site. Basically, you'll need to change the small snippet of code that used to refer to urchin.js so that it now will reference ga.js -- Google's new JavaScript tracking file.
But not to worry. The team has assembled a 22-page Tracking Code Migration Guide (PDF) designed to, um, walk you through the process.
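Before you start, it helps to know which pages still carry the old snippet. Here is a small sketch that checks page source for the two file names mentioned above. The page names and script tags are invented for the example, and the check only looks for the file names in the markup; it says nothing about whether the tracking code around them is set up correctly.

```python
import re

# File names taken from the post; everything else here is hypothetical.
LEGACY_TRACKER = re.compile(r"\burchin\.js\b", re.IGNORECASE)
NEW_TRACKER = re.compile(r"\bga\.js\b", re.IGNORECASE)


def tracking_status(html):
    """Classify a page by which tracking file name its source mentions."""
    legacy = bool(LEGACY_TRACKER.search(html))
    new = bool(NEW_TRACKER.search(html))
    if legacy and new:
        return "both"
    if legacy:
        return "legacy"
    if new:
        return "new"
    return "none"


# Made-up page sources standing in for files on your server.
pages = {
    "index.html": '<script src="http://www.example.com/urchin.js"></script>',
    "about.html": '<script src="http://www.example.com/ga.js"></script>',
    "contact.html": "<p>No tracking snippet here.</p>",
}

# Pages that still reference the old file and need the new snippet.
needs_update = sorted(
    name for name, html in pages.items()
    if tracking_status(html) in ("legacy", "both")
)
print(needs_update)
```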
Beyond simply explaining how to update your code (which shouldn't be a problem if you input the original code in the first place), the guide explains the benefits of the new system by showing additional features, such as:
- Tracking virtual page views
- Tracking downloaded files
- Tracking a page in multiple accounts
- Tracking subdomains
- Track a visitor across domains using a link
- Track a visitor across domains using a form
- E-commerce transactions
- Adding organic sources
- Segmenting visitor types
- Restrict cookie data to a subdirectory
- Control data collection settings
- Control session timeout
- Control campaign conversion timeout
- Custom campaign fields
- Using the anchor (#) with campaign data
- Setting keyword ignore preferences
- Control the data sampling rate
Some of these features already exist in one form or the other. For example, you can track file downloads by defining one of your conversions as such. But the new iteration promises more simplicity, which is never a bad thing.
Remember, as always, this is a beta release. (But you knew that, didn't you?) I haven't updated the code on our sites yet, so I can't vouch for any particular improvements. But I am eager to get into it and will certainly post any interesting tidbits right here.
Posted by Erik Dafforn at 8:40 AM
November 30, 2007
Will Google's New Linking Stance Create Innocent Victims? 
posted by Erik Dafforn in category: Link Building
Color me at least somewhat concerned about the latest revision to Google's stance on buying and selling of links. Here's the phrase that worries me:
Buying or selling links that pass PageRank is in violation of Google's webmaster guidelines and can negatively impact a site's ranking in search results.
Conventional wisdom, until now, has stated that "you can't be penalized due to who links to you; you can be penalized only because of whom you link to." Because otherwise, if you could be penalized based on inbound links, all a competitor would have to do is purchase a ton of "noteworthy" links on your behalf, right?
Isn't this reason for concern?
Posted by Erik Dafforn at 2:32 PM
November 23, 2007
SEO Won't Help You... 
posted by Erik Dafforn in category: Corporate Reputation Management
... When you're Sears and your site's not prepared for the Black Friday Effect:

Posted by Erik Dafforn at 4:17 PM
November 12, 2007
Is Jill Whalen a Scam? Yes, She Is! 
posted by Erik Dafforn in category: SEO Humor
In my research for last week's post on click distribution across Google Sitelinks, I found something pretty funny. I was testing to see what names, when used as queries, generate sitelinks when the names themselves are not part of the domain.
About the only person in the SEO/M space who can claim that is Jill Whalen, because a query for [jill whalen] brings up a set of Sitelinks for her site, HighRankings.com. And that is very impressive.
But what concerns me is the paid ad that comes up for that query:

I don't want to give that site any real credence (so if you want to key it in, go ahead, but don't expect a link). I clicked over, expecting to at least find some valid accusations. Instead, it was more like a biography written by an eighth-grader (to be read, apparently, by sixth-graders).
But if we start to put the pieces together, I think we might find that Jill really IS a scammer. To wit:
- She offers a newsletter on Thursdays, yet I can recall several instances of issues coming out on Wednesdays, or worse yet, Fridays. LIES.
- On the High Rankings Forum, several topics are labeled as "pinned." Yet they're not really "pinned" at all, are they, Jill? Aren't they really suspended in place using some sort of code? HALF-TRUTHS.
- Jill co-founded SEMNE, or the Search Engine Marketing Network for New England. Yet sometimes it is called the Search Engine Marketing Organization for New England. So which is it? And shouldn't it be SEMNNE? Or SEMONE? Where did the other letters go, Jill? Where? DECEPTION.
I think I've made a pretty strong case. Proceed with caution.
Posted by Erik Dafforn at 5:30 PM
November 9, 2007
Web 2.0.1: Introspection and Backlash 
posted by Erik Dafforn in category: Social Media
It's hard to toss a drunken twenty-something across a room these days without him or her landing on a "social media expert" (who, coincidentally, happens to be another drunken twenty-something), but here's a collection of fogies (people over 30) who have a nice sense of history and perspective:
- Rich Skrenta, Network Effect Entrepreneurs. New technology is fine, but many recent successes came from using old technology in a new way -- and being the first one to capitalize on it.
Ebay was like this too. You could write a clone of ebay in a weekend. It's printf's and a database. But there's no point, because the trick would be how you would get everyone from over there onto your site.
- Nicholas Carr, The Social Graft. A live vivisection of Facebook's advertising announcement, with a light dose of snark.
There is no intimacy that is not a branding opportunity, no friendship that can't be monetized, no kiss that doesn't carry an exchange of value. The cluetrain has reached its last stop, its terminus, the end of the line.
- Jill Whalen, Social Media Marketing: The New SEO? A really good take on what social media really is -- and isn't.
My fear with all the hype about social media marketing is that people new to search marketing will believe it's what SEO demands and what SEO is all about.
It isn't. Not by a long shot.
Posted by Erik Dafforn at 7:58 AM
November 6, 2007
Google Sitelinks Expansion: Early Results in Traffic Funneling 
posted by Erik Dafforn in category: Google
As you probably recall, Google rolled out an enhanced version of Sitelinks in mid-October. I thought it would be interesting to monitor the early results and see how effective the new links are.
Following is a typical example of Sitelinks. Google now shows up to eight links instead of the maximum of four that it showed only a month ago. When I refer to traffic later in the post, these Sitelink numbers are the links/URLs I'll be talking about:

The point of this post is to show you what happened (if anything) to the traffic that had traditionally funneled to either the main link or to the four Sitelinks. I plotted traffic to each of the nine links through October to see what would happen. The following charts reflect these filtered criteria:
- The query term was a single word
- The referrer was Google (organic)
- The entry page was the exact URL of the link being discussed
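If you want to apply the same three filters to your own logs, here's a minimal Python sketch. The record fields (`query`, `referrer_host`, `is_paid`, `entry_url`) and the sample visits are hypothetical; they aren't the export format of any particular analytics package, so adjust them to whatever your tool gives you.

```python
from collections import Counter

def is_google_host(host):
    """True for google.com and its subdomains (e.g. www.google.com)."""
    return host == "google.com" or host.endswith(".google.com")

def is_counted(visit, link_urls):
    """Apply the three filters: one-word query, Google organic, exact entry URL."""
    one_word = len(visit["query"].split()) == 1
    google_organic = is_google_host(visit["referrer_host"]) and not visit["is_paid"]
    exact_entry = visit["entry_url"] in link_urls
    return one_word and google_organic and exact_entry

def click_share(visits, link_urls):
    """Return each link's percentage of the visits that pass all three filters."""
    counts = Counter(v["entry_url"] for v in visits if is_counted(v, link_urls))
    total = sum(counts.values())
    if total == 0:
        return {}
    return {url: 100.0 * n / total for url, n in counts.items()}

# Hypothetical sample records, for illustration only.
MAIN = "http://www.example.com/"
SITELINK_1 = "http://www.example.com/products/"
visits = [
    {"query": "keyword", "referrer_host": "www.google.com", "is_paid": False, "entry_url": MAIN},
    {"query": "keyword", "referrer_host": "www.google.com", "is_paid": False, "entry_url": MAIN},
    {"query": "keyword", "referrer_host": "www.google.com", "is_paid": False, "entry_url": SITELINK_1},
    {"query": "keyword reviews", "referrer_host": "www.google.com", "is_paid": False, "entry_url": MAIN},
    {"query": "keyword", "referrer_host": "search.yahoo.com", "is_paid": False, "entry_url": MAIN},
    {"query": "keyword", "referrer_host": "www.google.com", "is_paid": True, "entry_url": MAIN},
]

shares = click_share(visits, {MAIN, SITELINK_1})
for url, pct in sorted(shares.items()):
    print(f"{url}: {pct:.1f}%")
```

Only the first three sample visits survive the filters; the other three fail on query length, referrer, and paid status respectively.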
In addition, here are some important caveats:
- These charts are NOT all the same scale; I can't give actual visit numbers, but I will give the percent of all clicks received. The "main" link, as well as Sitelinks 1-3, pull in some serious numbers. While the traffic spikes in Sitelinks 5-8 will look pretty large, they shouldn't be construed as having the same traffic numbers. More on that as I discuss each link.
- Don't necessarily infer any proposed correlation between drop in traffic to one link and rise in traffic to another. These things are controlled by many, many more factors than the mere existence of new Sitelinks.
- The Sitelinks change was announced around 10/18, but it took a while to roll it out to all DCs. I didn't see it for any searches until at least the 25th. Keep a gradual rollout in mind when you look at the charts for links 5-8.
Okay, here we go. Following are descriptions of each link followed by a graph of the traffic to that link for October.
URL/Sitelink 0: The "main" link -- represented by "Company Name and Stuff" in the shot above. A slight drop overall, but it appeared to happen across the month, not necessarily at the same time as the Sitelinks rollout. Total traffic: 67.6% :

URL/Sitelink 1: The first true "Sitelink" link. A slight decline throughout the month, but again, not necessarily correlating to the Sitelinks rollout. Total traffic: 25.2% :

URL/Sitelink 2: Like the previous two links, it declined slightly. It looks a little more aligned with the rollout, but not completely. Total traffic: 5.1% :

URL/Sitelink 3: This link actually showed modest gains, starting about the time of the rollout. Total traffic: 1.6% :

URL/Sitelink 4: In early October, this link was already coming down from an offline push that peaked in late September. But the Sitelinks rollout didn't seem to help it, as it shows additional decline after the rollout period. One additional thing about this link: It's not what you'd traditionally think of when you type "keyword," so I attribute a lot of its clicks to curious onlookers who didn't expect to see it there. The other side of that sword is that now, the query shows four new, shiny links in the other column that will continue to drain clicks away from this guy. Total traffic: 0.22% :

URL/Sitelink 5: Okay, the first of the new links. From out of nowhere, it starts getting traffic on 10/17. But not that much. Total traffic: 0.12% :

URL/Sitelink 6: Like Sitelink 5, this one really jumped when the rollout started. It had just a few clicks before the rollout for this query, because this URL also ranks for "keyword" on its own somewhere beyond Page 2. Total traffic: 0.07% :

URL/Sitelink 7: In addition to its new location as Sitelink 7, this URL also lives above the fold on Page 2 for the same query. Since the Sitelinks rollout, it's on a pace to roughly triple its former traffic (for this keyword only, of course). Total traffic: 0.06% :

URL/Sitelink 8: This link came from nowhere, but it didn't do much. Part of it might have to do with being in the eighth spot, but more likely it's because I believe this particular link doesn't interest people who are searching for "keyword." Total traffic: 0.02% :

Required disclaimers. This click distribution across the nine links (main link plus eight Sitelinks) is highly variable and will change depending on what Google picks for your Sitelinks, how well the links match the query itself (and the intent of the searcher), etc.
The interesting thing for me here is not that Sitelinks 5-8 are getting clicks. That's not newsworthy. But from a behavioral perspective, it's interesting to watch how users react to links they might not have expected to see associated with their query.
One final note: These eight links are the ones that Google auto-generated. We'll be doing more posts about the ability to subtly affect the Sitelinks choices in the future.
Posted by Erik Dafforn at 2:29 PM
October 19, 2007
Using Yahoo Search Assist for Keyword Research 
posted by Erik Dafforn in category: Keywords
In addition to using "typical" keyword research tools like WordTracker and Keyword Discovery, I frequently pop in to Google Suggest because I appreciate the quick interface and I like the "from the horse's mouth" approach to spotting keyword trends. While Google never really comes out and says it, I think it's defensible to suggest that the listings are ordered based on popularity. Here's a look at a Google Suggest query for [sports]:

A couple weeks ago, Yahoo announced Yahoo Search Assist, a tool similar to Google Suggest that helps refine and suggest queries based on what other people are searching for. Here is the resulting screen for a Yahoo Search Assist query for [sports]:

You should immediately see a critical difference in how the engines serve the suggestions. Google Suggest displays only those terms that begin with your search term.
Yahoo Search Assist shows terms and phrases that include your terms anywhere in the query. That's a huge improvement, and I hope Google Suggest takes a cue from that feature.
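To make the difference concrete, here's a rough Python sketch of the two matching styles. It's only an illustration: the query list is made up, plain substring matching is my simplification of "anywhere in the query," and neither engine's actual ranking or popularity data is modeled.

```python
def prefix_suggestions(typed, known_queries):
    """Google Suggest style: keep only queries that begin with what was typed."""
    return [q for q in known_queries if q.startswith(typed)]

def anywhere_suggestions(typed, known_queries):
    """Yahoo Search Assist style: keep queries containing the typed text anywhere."""
    return [q for q in known_queries if typed in q]

# Hypothetical query list, for illustration only.
known_queries = ["sports authority", "sports illustrated", "fox sports", "yahoo sports"]

print(prefix_suggestions("sports", known_queries))
print(anywhere_suggestions("sports", known_queries))
```

With this sample list, the prefix version never surfaces "fox sports" or "yahoo sports," which is exactly the kind of phrase you'd miss in keyword research.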
Both Google Suggest and Yahoo Search Assist contain a feature worth noting, and it can throw you if you're not paying attention. Typically, once you complete a word, that word disappears from the list of suggestions, because (I assume) the engine believes you're thinking beyond that single word. Here's a good example. At Yahoo, as you type the word vacation, you see the word vacations in the list of suggested queries:

But as soon as you completely type vacations, that term disappears from the list of suggestions:

As I said before, the same thing happens at Google. So if you typed too quickly, you might think the term "vacations" isn't popular. But nothing is further from the truth. So if you're doing quick, impromptu keyword research at either Google Suggest or Yahoo Search Assist, type slowly, because a lot can happen between keystrokes.
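The disappearing-term behavior is easy to picture in code. The following Python sketch is only my guess at the logic, based on what I see on screen: show prefix matches, but drop any suggestion identical to what has already been typed. The query list is hypothetical.

```python
def suggest(typed, known_queries):
    """Prefix matches, minus any entry identical to the text already typed."""
    return [q for q in known_queries if q.startswith(typed) and q != typed]

# Hypothetical query list, for illustration only.
known_queries = ["vacations", "vacation rentals", "vacations to go"]

print(suggest("vacation", known_queries))   # "vacations" is still listed
print(suggest("vacations", known_queries))  # "vacations" itself is gone
```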
Posted by Erik Dafforn at 8:05 AM
October 4, 2007
Someone's Been Playing with Wikipedia's Google Coop Feed 
posted by Erik Dafforn in category: Wikipedia
I won't even bother trying to figure out what "Hannah is a silly billy" means (other than the obvious), but it's worth noting that Wikipedia's Google Co-op feed has been compromised. Or, at a minimum, poorly maintained, as the following shot shows:

For no particular reason, I subscribe to Wikipedia's Google Co-op feed, which means that if Wikipedia has built a custom query result around a particular query -- and I search for that exact query -- then Wikipedia's result will show up first on my SERP -- above all organic results (but not above paid listings).
For the record, here's how it's supposed to work. Following is a "real" Wikipedia Co-op entry, this time for [hank aaron].

The text on the Co-op entry isn't pulled from the actual Wikipedia entry for Hank Aaron. Typically, they're custom written and uploaded through Google Co-op and delivered only to Google account holders who have subscribed to a specific organization's feed.
Posted by Erik Dafforn at 1:15 PM
October 1, 2007
Descriptive Snippets for News Sites 
posted by Erik Dafforn in category: SERP Comparisons
On the heels of Google noting the importance of a strong meta description, I felt compelled to remind you that while I agree with that in theory for most sites, not all Google properties are using it the way publishers would like. Old media is a big enough ship to turn around, and it has finally swung around to see that search is important (see John's post about Times Select), so while they're feeling nimble, I want to offer them some additional advice on click-throughs from news sites such as Google News.
Descriptive snippets on news sites have a tough job. Newspapers need their descriptions to be the "hook" that entices readers like me to click through. They need to be written to fully reside in the confined character quarters that news SERPs allow them. They need to be short enough to tease and convince me that I won't get the full story by skimming headlines, but they need to be long enough that I believe THIS SPECIFIC SITE has the full story.
But the problem is that Google News doesn't consistently pull descriptive data from the meta description. Instead, it tends to pull characters from the byline, wire data (if applicable), graphics captions (if applicable), and the first paragraph of the story.
Take the Houston Chronicle as a site that just doesn't get it. In looking through my customized home page of Google News, the algo determined I might be interested in the articles shown here:

The Chronicle messes up because the first text the engine sees after the author's byline is the error text belched out by the Flash sniffer. Not exactly your article's best foot forward.
In a situation like this (an algo-generated list of stories I might like), the Chronicle has the only article about Franchione (the Texas A&M football coach). So the headline itself might be enough to convince me to click through. But if I'd searched Google News for [Franchione], the Chronicle's article would be one of many, and due to the lack of description, my click would almost certainly go elsewhere.
So what's a poor paper to do, beyond making sure its no-Flash error text gets buried out of the way? Is the rest of the article set up to give your description maximum exposure? Take a look at the following headlines and descriptions and see who really gets it:


In the shot above, the green text is descriptive text about the story itself, while the yellow text is author/byline/wire information. The LA Times gets it because they bury the byline AFTER the story's lede, as shown in the shot at the left. Only the Times gets its WHOLE abstract on the SERP. The other papers' abstracts get cut off because they lead with author bylines. On the actual article page (shown at left), notice how the Times' placement of the intro paragraph followed by the byline is mirrored on the actual Google News SERP above.
Unless your story's author IS PART OF the story itself or is part of the brand (think Dowd, Ebert, Buckley, etc.), you'll need to experiment to ensure that your byline doesn't distract readers and keep them from getting the full benefit of the description you've written. Test, test, test, and make sure your readers get the most tempting view of the story you can manage.
Posted by Erik Dafforn at 3:30 PM
September 24, 2007
Topix TLD Migration -- Six Months Later 
posted by Erik Dafforn in category: Crawling and Indexing
I'm a big fan of Rich Skrenta, co-founder of NewHoo (née Gnuhoo, which eventually took the more recognizable name DMOZ), co-founder of Topix.net, and -- what may be the coolest of all -- author of one of the first known computer viruses, one of the few to be written before the actual term "computer virus" was even coined.
So that's all very cool, but the search-related part of all this was how, six months ago, Topix finally purchased the .com version of its domain and decided to make the move away from .net. The Wall Street Journal, in a mainstream SEO article that actually managed to hit most of the salient points pretty accurately, highlighted Skrenta's anxiety at the global domain change:
Such a simple change, Mr. Skrenta has discovered, could have disastrous short-term results. About 50% of visits to his news site come through a search engine -- and about 90% of the time, that is Google. Some companies say their sites have disappeared from top search results for weeks or months after making address switches, due to quirky rules Google and other search engines have adopted. So the same user who typed "Anna Nicole Smith news" into Google last week and saw Topix.net as a top result might not see it at all after the change to Topix.com.
Like a lot of SEOs, at the time I wondered what was so "wrong" with the time-tested (at least in my experience) method of full-on 301 redirects from the old site to new -- especially since the code would be short and sweet, with each old .net URL going directly to its .com counterpart.
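To show what I mean by "short and sweet," here's a minimal Python sketch of the one-to-one mapping. It is not Topix's actual setup (on a real server this would more likely live in the web server's rewrite configuration), and the sample path is made up. The only rule is: same scheme, path, and query string, new domain, status 301.

```python
from urllib.parse import urlsplit, urlunsplit

OLD_HOST = "topix.net"
NEW_HOST = "topix.com"

def permanent_redirect(url):
    """Return (status, location) for an old-domain URL, or None if it isn't one."""
    parts = urlsplit(url)
    host = parts.hostname or ""
    if host == OLD_HOST:
        new_host = NEW_HOST
    elif host.endswith("." + OLD_HOST):
        # Keep any subdomain prefix such as "www." and swap only the domain.
        new_host = host[: -len(OLD_HOST)] + NEW_HOST
    else:
        return None
    location = urlunsplit((parts.scheme, new_host, parts.path, parts.query, parts.fragment))
    return 301, location

# Hypothetical URL, for illustration only.
print(permanent_redirect("http://www.topix.net/city/detroit-mi?page=2"))
```

Because the rule is a pattern rather than a lookup table, it doesn't matter whether the site has a hundred pages or a few million.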
The Topix crew had apparently heard too many domain migration horror stories. On his own blog, Skrenta noted,
...there've been a whole bunch of the seo posts saying essentially "hey, it's easy to move a domain, you just 301 it." Of course I know about 301 and 302 redirects. The problem is that half of these people follow up and say "you'll only be out of the index for a few months". They also ignore the problems that big sites have. A redirect for a small site may work great, but if you have hundreds of thousands of pages or more, there are lots of cases where this caused some form of not-in-the-index-anymore doom. The number of seo consultants who claim to know how to move a 100k+ page site is much smaller than the number who have actually done it.
That last point is a good one. Conventional wisdom in SEO is frequently spawned by 5% research and 95% extrapolation, which is often the best you can do. The other dirty little secret of SEO is that when you start seeing the same sort of anecdote often enough, it's tempting to put it in the "research" column.
Still, I'd not seen or heard of the type of monumental tragedies that Skrenta was talking about (at least within the last few years), and neither had Danny Sullivan:
I still remain surprised that the 301 is that much of a problem for even a big site. I just haven't heard of that trouble, of half the people saying you'll be out or whatever. If that's what you had been hearing, I can understand your concern. But it seems a pretty straight-forward change, and it shouldn't even be a burden on the server in that you're not actually talking about 100,000 of physical redirects that have to be created and check but a change of one domain to the other.
Ultimately, Topix listened to its heart (or maybe its board) and surprised me by opting for all-out duplication, running identical content on the .net and .com sites, avoiding any sort of redirection plan. So to call it a TLD migration isn't quite accurate. It's more like Topix.net bought a summer house and called it Topix.com. Again, from the WSJ article:
Concerned about that [redirection] strategy, Topix has run its site at both Topix.net and Topix.com for awhile. One danger with that approach is that it is unpredictable; Google will see two versions of the same page and could choose to show the Topix.net page most prominently.
This course would appear to run contrary to the advice even Matt Cutts gave in the article:
Google's Mr. Cutts says the search engine should ultimately understand what is going on when a site changes its Web address. He says the best strategy is to move one section of the site to the new address and see what happens before switching the whole thing.
Skrenta has since left Topix, but the duplicated domain strategy hasn't. Go to any page on the Topix.net site, and you'll see that the exact same page exists on the .com version of the site. Currently, Google shows about 2 million Topix pages indexed on the .net TLD and about 1.2 million on the .com TLD.
So six months ago, had you asked any SEO "expert" about what to do, almost no one would have suggested the present course. But has it hurt anything? Maybe, maybe not. Topix.com has a PR6 home page with about 1.2 million inbound links, while Topix.net has a PR8 home page with almost 7.4 million inbound links. So at this point, in terms of raw accumulated power, consolidating those domains would create one very powerful site.
But would that be better than the current situation? Perhaps not. What about that "unpredictability" the WSJ (and zillions of SEOs) talk about with dupe domains? That even in the best case, Google will pick one page or the other at its own discretion -- and it might not be the one you want? Consider the SERP for [detroit local news]:

A page from Topix.net shows up in spot 8, and a page from Topix.com -- an exact mirror of the .net page -- comes in at spot 9. Not exactly Google choosing one over the other.
But if the pages are identical, why do they have different titles on the SERP? Because Topix.com is heavily linked to from DMOZ, and the .com page shown in this screen shot shows the title used on the Detroit News and Media category of DMOZ. This leads to a question: When Google pulls one page's title and/or meta description from DMOZ, does that override the duplication filter?
Detroit's hardly alone here. Do some searches yourself with a city name and "news" or "local news" and see for yourself.
There's really no moral to this story, other than every time something like this happens, the "guidelines" from engines lose more and more bite. I'm happy that (at least according to my superficial research) Topix is doing well; it's a great idea, smartly executed. But these SERPs are yet another frustrating case of mixed signals from engines.
In my opinion, in a case like this, once Topix owned the .com version, that was all that needed to happen. I don't think the .com even needed to have any content for the problem to be solved. Looking back, my advice would have been to keep all content on Topix.net and immediately set up a 301 from Topix.com to Topix.net -- the exact opposite direction that most people recommended. That way, the authority of Topix.net content would never have been in jeopardy, and any links that mistakenly found their way to Topix.com would immediately transfer link popularity (as well as the user) back to the main (.net) site. They could have even promoted the site as "Topix.com" with no major headaches. Almost no one would care (or even notice) if they were redirected, whether it was type-in traffic or someone that clicked over from a news story.
Posted by Erik Dafforn at 3:14 AM
September 12, 2007
302 Redirects: A Guide to Search-Friendly Usage 
posted by Erik Dafforn in category: Crawling and Indexing
Here are a few of the "absolutes" in the SEO field:
- Never use Flash
- Never use cloaking
- Never use 302 redirects
These are, of course, wild oversimplifications. Equating Flash or cloaking with bad SEO is like classifying Hank Aaron as a poor hitter because he batted cross-handed.
So when is a 302 redirect better than a 301 redirect? Here are a few examples. Keep in mind that in these scenarios, the 302 is not the only possibility to achieve the desired results. Plenty of CMS options likely exist that will do the same thing. It all depends on your setup. But in each of these cases, the 302 is certainly better than a 301. Let me say that one more time. Each of these scenarios likely has several non-redirect options that perform the same task. But the point of this post is that the 301 is not always the redirect you want, and sometimes it can hurt you if you buy into the "301 is the only good redirect" argument.
Example 1: New Products, Fresh Content. You run a cell phone information site, and one of your targeted phrases is [newest cell phones]. You have a URL called /newest-cell-phones.php, which is your users' go-to page for the latest in cellular technology.
For the last few days, the /newest-cell-phones.php page has redirected (via 302) to /lg-vx8350.php, which is the latest phone you've torn apart and reviewed. At the same time, you also have a static link in the LG portion of your site directly to /lg-vx8350.php, because you want to get that content crawled on its own as well. You're not particularly worried about dupe content issues, because tomorrow, you'll be done with your /nokia-2610.php page, and it will then become the target URL for the /newest-cell-phones.php redirect.
Example 2: Restaurant Content, Lazy-Susan Style. You run a popular restaurant with a large user base who check in each morning to see the day's menu. Because you buy local and fresh, you often have only a few days' notice about what you'll serve on any given day. You've done some user testing and found that your customers HATE the dreaded "giant PDF menu download" they find at most restaurant sites. Consequently, you have a link on your home page directly to a URL called /todays-menu.htm. In addition, you have seven other pages:
- /sunday-menu.htm
- /monday-menu.htm
- /tuesday-menu.htm
- /wednesday-menu.htm
- /thursday-menu.htm
- /friday-menu.htm
- /saturday-menu.htm
On Monday, for example, /todays-menu.htm uses a 302 redirect to /monday-menu.htm. The next day, it redirects to /tuesday-menu.htm, and so on.
Would a 301 work here too? Absolutely not. You want /todays-menu.htm to remain in the indexes and rank for terms like [restaurant name menu]. You do NOT want a URL like /wednesday-menu.htm to become associated with [restaurant name menu], because it's counterintuitive for such a URL to appear in SERPs (at least six-sevenths of the time!), and you have no control over what URL the engines will choose to replace /todays-menu.htm in the index, since you don't control what day they crawl it.
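Here's a minimal Python sketch of the logic behind /todays-menu.htm. How you actually wire it into your server or CMS depends on your setup; the function name is mine, and all it does is pick the day's page and pair it with a 302 so that /todays-menu.htm itself stays the indexed URL.

```python
import datetime

# Index matches date.weekday(): Monday is 0, Sunday is 6.
DAYS = ["monday", "tuesday", "wednesday", "thursday", "friday", "saturday", "sunday"]

def todays_menu_redirect(today):
    """Return (status, location) for a request to /todays-menu.htm."""
    return 302, f"/{DAYS[today.weekday()]}-menu.htm"

status, location = todays_menu_redirect(datetime.date.today())
print(status, location)
```

The target changes every day, but the status code and the URL you promote never do.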
So what do these examples have in common? Here are some guidelines when deciding whether 302 is the way you want to go:
You could consider using a 302 when, for the following redirect:
URL A ---> URL B
- ...it is important that URL A be indexed and remain indexed.
- ...it's not critical that URL B be indexed, but its content is very helpful for users.
- ...you have several different URLs that might fit logically in the URL B spot above.
- ...you have put some resources into strong internal and directory linking for URL A.
As I said before, a 302 is not the only way to achieve this. Plenty of dev environments and independent scripts will enable you to achieve this also, but depending on your setup, a 302 might be the easiest way.
A little background on the often-misunderstood 302: If you spend any time slogging through HTTP header documentation (and who doesn't?), you've probably noticed that the 307 -- not the 302 -- is the true "temporary redirect." But older clients apparently don't always know how to handle a 307 properly, and the 302 does more or less the same thing.
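If you'd rather not slog through the header documentation, Python's standard library carries the same names. This short snippet just prints the three redirect codes discussed in this post alongside their official reason phrases:

```python
from http import HTTPStatus

REDIRECTS = (
    HTTPStatus.MOVED_PERMANENTLY,   # the "permanent" redirect
    HTTPStatus.FOUND,               # the 302 discussed in this post
    HTTPStatus.TEMPORARY_REDIRECT,  # the one actually named "temporary"
)

for status in REDIRECTS:
    print(status.value, status.phrase)
```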
Plenty of development environments use 302 redirects to direct users to the "appropriate" version of the home page when the root is called. For example, if you go to www.adidas.com, you'll be redirected (twice, actually) to the appropriate language and country version of the home page -- in my case, /us/shared/home.asp. This, in my opinion, is not the best environment for a 302 redirect. (See Bill Slawski's excellent analysis of 302s at domain roots from earlier this year.)
The downside to the whole approach I've laid out here is that it's sometimes hard to get good external links to "URL A," because by the time users click it, they've been redirected, and if they copy the URL from their browser, it's "URL B." Some good directories will allow you to submit a URL that redirects, as long as it redirects to a page on the same domain. But to get people to link to URL A, you'll need to make a conscious effort to give them the correct URL.
Posted by Erik Dafforn at 4:55 PM
September 10, 2007
Google Book Search Results Defy Clustering, Quantity Precedents 
posted by Erik Dafforn in category: Google
Sean pointed this out to me early this morning. While we've known that Google has rolled out Book Search results in its main results column, some of the things Sean is seeing seem a bit out of whack with G's traditional clustering and placement precedents.
For example, here's the first page of results for [monopoles]. When I run the query, I don't see these results, but Sean does, along with a few other people that Sean has (mono)polled around the country:
![the first 10 results for [monopoles]](http://seoblog.intrapromote.com/google-books-serp.jpg)
Note how the results are unclustered. In other words, results from a specific subdomain typically get grouped together on the SERP for the sake of convenience, user experience, or ... well, for some reason, anyway. The results pulled from the books.google.com subdomain seem immune from the clustering behavior. And when results appear in the "regular 10" (as opposed to one-box) results, any given group of 10 results typically shows only two results from a given subdomain. This SERP shows three.
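If you want to check a SERP of your own against the "two per subdomain" habit, here's a small Python sketch. The limit of two comes from the behavior described above; the URLs are placeholders I made up, not the actual [monopoles] results.

```python
from collections import Counter
from urllib.parse import urlsplit

def hosts_over_limit(result_urls, limit=2):
    """Return {hostname: count} for hosts appearing more than `limit` times."""
    counts = Counter(urlsplit(url).hostname for url in result_urls)
    return {host: n for host, n in counts.items() if n > limit}

# Hypothetical page of results, for illustration only.
page_one = [
    "http://books.google.com/books?id=EXAMPLE1",
    "http://en.wikipedia.org/wiki/Example",
    "http://books.google.com/books?id=EXAMPLE2",
    "http://www.example.com/monopoles",
    "http://books.google.com/books?id=EXAMPLE3",
]

print(hosts_over_limit(page_one))
```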
This is hardly the clustered behavior that some bloggers like Seth Godin have noticed. The examples he gives in that post are neatly organized at the top of the SERP in a one-box-style format.
If you move to the second page of SERPs (results 11-20) you'll see even more instances. In the case of [monopoles], Google Book Search holds six positions in the 11-20 group:
![results 11-20 for [monopoles]](http://seoblog.intrapromote.com/google-books-serp2.jpg)
So for at least this query (and several others he's shown me today), Google Book Search has nearly half the top organic positions. That's hard to beat.
Posted by Erik Dafforn at 1:43 PM
August 16, 2007
IAB, DMA, and SEO: WTF? 
posted by Erik Dafforn in category: SEO Industry News
I just noticed this posted by Barry Schwartz over at SEL: The UK flavors of the Internet Advertising Bureau (IAB UK) and the Direct Marketing Association (DMA) have joined forces "to establish industry-wide search standards", as they put it in their release on the IAB's UK web site.
Every few years I see this stuff and I try -- really, I do try -- not to be cynical. But trying to qualify and quantify best practices is like sprinting like hell to get to the end of a Mobius strip. Historically, any efforts to define acceptable and unacceptable practices in SEO have been either so rigidly prescriptive as to except significant portions of successful (and lauded) SEO companies, or they've been so toothlessly vague as to allow access to anyone who can forge a backstage pass.
To which of these camps does the IAB/DMA "charter" belong? Judge for yourself: Following (in bold) are the minimum corporate qualifications found in the IAB's charter document (MS Word, 238K), with a little commentary (mine) in italic.
Many -- many -- of the industry's best SEOs are one-(wo)man shops.
That's actually not a bad benchmark. For PPC. How about the other 80% of clicks?
I'm not exactly sure what "trading" means, but I think it's a UKism for "having been in business." I certainly concede that most good SEOs have been in business for more than 6 months. But most of the lousy ones have been too.
Now we're getting somewhere. Explore the links to the membership pricing levels of the IAB UK, IAB Europe (PDF), DMA, and SEMPO.
I have nothing personally against any of these organizations, but answer this question honestly: With mass adoption of this charter by SEO companies, who benefits more -- these four membership organizations, or companies in search of a reputable SEO firm?
And in case you're still reading, thanks. Here's your reward, pulled from the original charter Word document, and delivered in the world's most accepted currency -- laughter:
Monitoring compliance
The charter will be self-policed by the SEM industry.
Posted by Erik Dafforn at 10:08 PM
August 14, 2007
Speedwagon Rolls Out Category Feeds 
posted by Erik Dafforn in category: Blogging

In an effort to thrust ourselves into 2005, we're happy to announce the rollout of category-specific feeds in addition to our blog's regular "global" XML feed. In other words, if you're completely addicted to corporate reputation management (feed) and Wikipedia (feed) articles but simply can't fathom suffering through another crawling and indexing (feed) lecture, you're in luck.
At the left, you'll see a portion of our list of categories. The full list is always in the left column of the site. As usual, clicking the text itself takes you to all posts in that category. But what's new is the little green RSS button to the left of the text. That takes you to the feed for that category. So to subscribe to a particular category feed, just right-click the button next to your favorite category, copy the link location, and paste it into your feed reader/aggregator.
Big thanks to an old but very effective tutorial that made it very easy to do in Movable Type.
The next step would probably be offering the ability to create "combination" feeds that let users select categories a la carte and merge the selected feeds into one master feed. I'm not quite ready to tackle that yet on an automated scale, but you could certainly set up a Yahoo Pipe and accomplish the same thing.
Posted by Erik Dafforn at 7:35 AM
August 1, 2007
Happy Fourth Birthday, High Rankings Forum 
posted by Erik Dafforn in category: SEO Industry News
Let us be one of the first "outside" sites to wish Jill Whalen and her High Rankings Forum a happy fourth birthday.
Here's a great quote from Jill from the announcement on July 30, 2003:
I know, I know. The last thing the world needs is another search engine marketing forum! But I like to think that this one will be unique because I've lined up some of the best and the brightest in the search engine marketing industry to be expert moderators. This means that people who are truly "in the know" will be answering your questions. I've met most of the moderators in person, as a good portion of them speak at the same conferences that I speak at. These guys and gals know their stuff!
I joined the forum a little "late" -- about a month after it launched. I visit the site pretty often, but I rarely contribute, for what I consider a couple good reasons. I simply don't have time to follow up on posts and join too many "conversations," and I don't want to be one of those "drive-by" posters that I find so annoying. Plus, it's not as if I have a ton to add: Jill has assembled a crack staff of moderators, and I rarely disagree with the consensus over there.
So congratulations Jill.
Posted by Erik Dafforn at 7:12 AM
Google Supplemental Index Label Formally Dropped 
posted by Erik Dafforn in category: Google
Yesterday's announcement from Google's Official Webmaster Central Blog represents a formal declaration of what Matt Cutts hinted at in a recent SEOMoz post: The "Supplemental Result" label that used to appear for pages in Google's "junior varsity" index will no longer appear.
Do not misinterpret this as "Supplemental Result pages no longer exist." They still do, if you read the Google post with subtlety. The gist of it is that Google crawlers are now able to cruise through these pages with more frequency and reliability. This apparently negates the need for a special label, as while these two indexes are certainly not treated the same, the differences appear to be waning.
Personally, I don't really mind that this is disappearing, because even in the best cases, it wasn't always easy to determine if pages were "really" in it, and people seldom agreed on what caused pages to be there (despite several Googlers saying exactly what got you there). Still, the presence of Supplemental pages forced webmasters to figure out some better ways to organize their sites, which is the silver lining.
With a good analytics program, you have the capability of seeing how many pages on your site are performing well or not performing at all. Seeing them labeled as "Supplemental" was a shortcut to diagnosis, but its absence is hardly cause for panic.
Posted by Erik Dafforn at 7:06 AM
July 30, 2007
Business.com Appreciates (Because of) Your Support 
posted by Erik Dafforn in category: Directories
One of the big stories last week was the sale of Business.com to R.H. Donnelley. Because the business.com domain has been one of the rock stars of domain buying & selling, I found a few other numbers you might find interesting. Here are a few of the years in which business.com changed hands, along with the price:
- 1997: $150,000
- 1999: $7,500,000
- 2007: $345,000,000
I remember the sale in 1999, and how many believed its record-setting price epitomized what Alan Greenspan called "irrational exuberance." But what I didn't remember is that the 1997 sale also set a record, at least according to the article I cite above.
In terms of raw numbers, $345 million is quite a haul. But note that statistically, the domain had significantly better annual appreciation between '97 and '99 (2450%) than it did between '99 and '07 (563%).
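The percentages above appear to be simple (non-compounded) annual averages of the total gain. Here's a quick sketch, assuming that reading, that reproduces them from the three sale prices and also computes a compound annual rate for comparison. The function names are mine, purely for illustration.

```python
def simple_annual_appreciation(start_price, end_price, years):
    """Total percentage gain divided evenly across the years (no compounding)."""
    total_gain_pct = (end_price - start_price) / start_price * 100
    return total_gain_pct / years


def compound_annual_growth(start_price, end_price, years):
    """Compound annual growth rate, as a percentage."""
    return ((end_price / start_price) ** (1 / years) - 1) * 100


# Sale prices listed in the post.
sales = {1997: 150_000, 1999: 7_500_000, 2007: 345_000_000}

print(f"{simple_annual_appreciation(sales[1997], sales[1999], 2):.1f}")  # 2450.0
print(f"{simple_annual_appreciation(sales[1999], sales[2007], 8):.1f}")  # 562.5 (the post rounds this to 563)

# The compound view tells the same story with smaller numbers.
print(f"{compound_annual_growth(sales[1997], sales[1999], 2):.1f}")
print(f"{compound_annual_growth(sales[1999], sales[2007], 8):.1f}")
```

Either way you measure it, the '97-to-'99 jump dwarfs the later one on a per-year basis.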
Today, of course, business.com is more than just a domain name. I'll be very interested to see what Donnelley does with it and whether -- or how quickly -- it can recoup its investment.
Posted by Erik Dafforn at 12:00 PM
July 25, 2007
How the U. of Kentucky Web Team Spent Its Summer Vacation 
posted by Erik Dafforn in category: Spam
I really don't consider myself a spam cop, but this is too good to pass up. Just one more, then I'm done; I promise.
A few days ago when I was working on a previous post about how small sites are using big, well established sites to "vouch" for them, I showed some examples of SERPs for [cialis]. One thing I didn't get a chance to explore is how the University of Kentucky shows up on page 3 at Google:

When you click over with JavaScript disabled, you get the typical UK search page:

When you click over with JavaScript turned on, you're almost immediately redirected to the Cialis landing page at extra-drug.com:

At first I thought maybe there was some hijacking going on, but a cursory look at the UK Search page shows it was an inside job. Here's the smoking gun:

So we know how the page ranks for the terms. And also note that when you click over to the UK Search page from another UK page -- even with JavaScript enabled -- you don't redirect. So there's some referrer-based server-side magic going on here too. The real nagging question here is, who is making the money? A single guy in the UK CS department, or is it shared among many of them? Surely code like this doesn't go unnoticed for long.
Like the techniques I described in the earlier post, pill sites hooking up with the college web site crowd is nothing new. Danny Sullivan blogged about it over two years ago, after Search Engine Watch users found the Stanford Daily selling text links.
The Stanford editors pleaded ignorance, but that excuse doesn't cut it in the Southeastern Conference. These Kentucky guys know what they're doing.
So what's the most interesting angle of this whole mess? The fact that it takes place on the Search page on the University of Kentucky web site. How could this be anything BUT an Oedipal glove-slap to UK's biggest search-focused alumnus?
Posted by Erik Dafforn at 2:21 AM
July 22, 2007
Pharmaceutical Sites Riding In On Established Coattails 
posted by Erik Dafforn in category: Spam
While I typically find sports metaphors trite and unimaginative, this one works fairly well, so I beg indulgence. The pharmaceutical game has always been considered a pretty cutthroat environment in organic SERPs, and that's still the case today. Some sites are using the equivalent of offensive linemen to penetrate an otherwise difficult defense and make room for their own site to squeeze through the gap.
Here's the philosophy: If your site can't rank by itself for certain queries, use the broad back of an established site to knock some of the weaker sites down the SERP and make room for your own.
Take a random but popular drug -- Cialis. The query for [cialis] shows two such instances on the first page of results, in spots 8 and 10.
![Spots 8 and 10 on a search for [cialis] are held by two respectable but non-pharm sites.](http://seoblog.intrapromote.com/cialis-01.jpg)
First case: Alexa
So how does Alexa fit into the Cialis game? The folks at pills-deals.com (which is where pillsdeals.com redirects) spread guestbook and comment spam across 20,000+ sites and link to ... not their own site, but the Alexa profile page for their Cialis page. The Alexa page gets crawled and begins to rank for [cialis] due to factors like the anchor text dropped in the guestbook spam. If a user clicks over to the Alexa page from the [cialis] SERP, it still takes another click to get to the pills-deals.com site, so I'm curious about the clickthrough on a #8 result that requires an extra click beyond the one away from the SERP. Still, the rankings portion of the equation seems to be working.
Second case: Technorati
Here's a double slap in the face of Google. The crew at xlpharmacy.com (or maybe just an adoring fan) set up a fake Cialis blog -- on Google's own Blogspot.com domain, no less -- and used Technorati tags to tag each post the same way: "Buy Cialis Online. FDA Approved Quality Pills. cialisxl." So queries like [cialis] and [buy cialis online] show sites like the Technorati tag page in their top results, since Technorati tag pages are crawled and indexed by Google. Again, once the user clicks over, it requires another click to get to the actual pill sales site. And even if the user clicks over from the Technorati tag page to the fake blog (cialisbuyonline1.blogspot.com), the user never even sees the blog -- instead, falling prey to a JavaScript redirect to xlpharmacy.com.
Neither of these techniques is particularly new, which is part of the problem. Pharm SERPs seem to be as useless and irrelevant as they've always been, representing one of the biggest problems that engines' anti-spam teams face right now.
Posted by Erik Dafforn at 2:30 AM
July 12, 2007
Web 2.0 Product and Domain-Naming -- from 1877 
posted by Erik Dafforn in category: Corporate Reputation Management
I came across a perspective-offering gem today while listening to the Writer's Almanac (Real Audio version).

It's the birthday of the man who gave us the Kodak camera, George Eastman, born in Waterville, New York. He was working at a bank when he got interested in photography around 1877. He took his first dry plate photograph the next year with the camera that he invented—a view of the building across the street from his window. He developed this little handheld camera, and he called it the Kodak because it was easy to remember, difficult to misspell, and it meant nothing, so it could only be associated with his product.
With company (i.e. domain) names always a hot issue, especially as they pertain to search reputation management, typo traffic, and owning the SERP for your company/product searches, it's quite interesting to hear similar philosophies dredged up from yesteryear.
Stories like this always make me happy that I'm not in charge of the search presence for the Hilton hotel in Paris.
Posted by Erik Dafforn at 1:40 PM
July 6, 2007
Google Weighs in on Image Replacement (sIFR) 
posted by Erik Dafforn in category: Crawling and Indexing
On the rare occasion when an engine expresses an actual opinion on a real technique, it's a welcome, welcome sight. So imagine my glee when I read Google Webmaster Central Blog's take on dealing with Flash.
While John lamented the mixed signals just last month, we've been asking the question internally for years: From a best practices standpoint, does image replacement stand safely in the DMZ of glorified CSS, or does it boldly encroach on the territory of "showing engines one thing and users another"?
And that's just the beginning. The real problem, when you're using image replacement, is not the insertion of stylized copy, but instead, what you do with the HTML text you're replacing. Some systems simply let it lie underneath the script-spawned Flash layer, while some use "hidden" status in CSS, while still others pull it off the visible screen and hard-code it somewhere in the -5000px range -- each of which is detectable and grounds for a good spanking if your motives are anything but pure. Traditionally, that left us with the worries of trusting the algorithm to detect our motives.
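To show what "detectable" might look like in practice, here's a minimal sketch that scans a chunk of CSS for the three hiding approaches just mentioned. The pattern names, the four-digit offset threshold, and the list of properties checked are all my own assumptions for illustration; this is not how any engine actually does it.

```python
import re

# Illustrative patterns only: one per hiding approach named in the post.
HIDING_PATTERNS = {
    "display:none": re.compile(r"display\s*:\s*none", re.I),
    "visibility:hidden": re.compile(r"visibility\s*:\s*hidden", re.I),
    # Large negative offsets such as -5000px push text off the visible screen.
    "off-screen offset": re.compile(
        r"(?:text-indent|margin-left|left|top)\s*:\s*-\d{4,}px", re.I
    ),
}


def find_hiding_rules(css):
    """Return the names of every hiding pattern found in the CSS text."""
    return [name for name, pattern in HIDING_PATTERNS.items() if pattern.search(css)]


print(find_hiding_rules("h1 { position: absolute; left: -5000px; }"))
print(find_hiding_rules("h1 span { visibility: hidden; }"))
print(find_hiding_rules("h1 { color: red; }"))
```

Note that spotting the pattern is the easy part. Nothing in a check like this says anything about motive, which is exactly the worry.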
But forget all that, because today we know, and we know it based on the way all such things are Known -- because it's mentioned in an official Google blog, midstream in a list of "practical suggestions" about how to deal with Flash:
sIFR: Some websites use Flash to force the browser to display headers, pull quotes, or other textual elements in a font that the user may not have installed on their computer. A technique like sIFR still lets non-Flash readers read a page, since the content/navigation is actually in the HTML -- it's just displayed by an embedded Flash object.
This proclamation, coming on the heels of Independence Day, is fitting, because no longer are we bound by the tyranny of not knowing on whose side of the fight sIFR truly sits.
Posted by Erik Dafforn at 7:20 AM
July 3, 2007
SEO Speedwagon Enters its Third Year 
posted by Erik Dafforn in category: SEO Industry News
Over the weekend, SEO Speedwagon celebrated its second birthday, which I suppose means we're beginning to enter the "terrible twos."
With any luck, we'll be able to effectively deal with problems that surround typical two-year-olds, such as the following:
- Increasing our vocabulary (more categories!)
- Effectively dealing with our waste (pages in the Supplemental Index)
- Learning how to share (better linking out to SEO resources)
- Handling growth (Intrapromote is adding staff -- and that means more bloggers!)
- Trying not to annoy you by constantly asking "why?" (we are inquisitive, after all)
For some historical perspective, here are some of the issues that were on the plate in the summer of 2005, when we started blogging:
- Adwords introducing geo-targeting and dropping five-cent minimum bids
- Click fraud reaching $1B annually (by some estimates)
- Ask (Jeeves) and MSN publicizing their imminent PPC systems
- Google's quarterly profits growing four-fold over the same period in 2004
Looking back, we've acquired a really strong and loyal group of readers, and we really appreciate the feedback we receive. Here's to a strong third year.
Posted by Erik Dafforn at 12:17 PM
June 26, 2007
Checking Supplemental Index Status for URLs in Large Sites 
posted by Erik Dafforn in category: Crawling and Indexing
For sites with fewer than 1000 pages, it's possible (if monotonous) to see which URLs are in Google's Supplemental Index. Simply run a site: command for your domain (example) and scroll through the results pages until you start to see "Supplemental Result" next to some of the URLs.
But what if your site has 50,000 pages and the supplemental results don't start until the final 10,000? Even the fairly common site:domain.com *** -view query isn't totally accurate, and it's still subject to the 1000 URL display limit.
Depending on which case you find yourself in, it can be either tedious or impossible to detect whether a specific URL is Supplemental.
Using our blog site as an example, suppose I suspect -- but can't confirm -- that an old post about Yahoo Sitemaps is in the SI. A simple info: query doesn't tell you whether the URL is supplemental or not. For example, the following shot came from the query:
[info:http://seoblog.intrapromote.com/2006/11/an_update_on_ya.html]

Instead, a quick way to check Supplemental status is to pull a unique string from the URL in question (such as a folder or filename) and tack it into an inurl:-filtered site: query. In other words, the following shot came from this query, in which I added the filename (minus extension) into the inurl: command:
[inurl:an_update_on_ya site:seoblog.intrapromote.com]

In this result, note the Supplemental Index status.
The bottom line is to find an inurl: string that will quickly filter down the site: query results so that your specific URL shows up quickly.
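If you check a lot of URLs this way, the query is easy to generate. Here's a small sketch that builds the inurl:/site: query from a full URL, using the filename minus its extension as the unique string, just as in the example above. The function name is hypothetical, and picking the last path segment is only one reasonable choice; a folder name may filter better on some sites.

```python
import posixpath
from urllib.parse import urlparse


def supplemental_check_query(url):
    """Build an inurl:-filtered site: query for one URL."""
    parsed = urlparse(url)
    # Use the last non-empty path segment as the unique string.
    segments = [segment for segment in parsed.path.split("/") if segment]
    if not segments:
        raise ValueError("URL has no folder or filename to filter on")
    stem, _extension = posixpath.splitext(segments[-1])
    return f"inurl:{stem} site:{parsed.netloc}"


print(supplemental_check_query(
    "http://seoblog.intrapromote.com/2006/11/an_update_on_ya.html"
))
# inurl:an_update_on_ya site:seoblog.intrapromote.com
```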
Posted by Erik Dafforn at 8:50 AM
June 19, 2007
High Rankings Seminar in Denver - June 28-29 
posted by Erik Dafforn in category: SEO Industry News
Jill Whalen asked us to mention her upcoming seminar in Denver next week, June 28-29. I'm getting around to it a little late, but there's still room if you're a) going to be in Denver next week and b) need rock-solid SEO advice from one of the best-known names in the biz.
Jill's been doing SEO since the late '60s, loves chocolate, and is giving away two free tickets for non-profit groups (more details). By themselves, those three qualities are okay. But put them together, and that's a good show.
Posted by Erik Dafforn at 3:50 PM
June 12, 2007
A Quick Route to the Google Text Cache 
posted by Erik Dafforn in category: Crawling and Indexing
I am constantly looking at cached copies of web pages through Google's cache. It's a wonderful way to double-check that your content is showing up the way you want it to and quickly tell if Googlebot has noticed any recent changes you've made. It's also a great way to sniff out some rather "brave" techniques on behalf of your competition.
As a primer, you can view the cached version of any web page by typing cache:URL in a Google search box, where URL is any full URL string. For example,
cache:www.united.com
will show a cached copy of the home page of United.com:

But beyond the normal cached version, I prefer to look at the text-only cache. The pink rectangle above shows the link to the text-only cached version. Clicking this shows you a version of the page much more like what a robot really sees.
But sometimes because of the site layout, the box at the top of the cached page doesn't appear. Consequently, the link to the "text only" version of the cached copy is hidden -- as in this sample shot from the BMW USA home page:

When this happens, the text-only cached version is still not hard to find. Simply append
&strip=1
to the URL of the regular cached version, and you'll see the text-only cache.
Now to take this to the next level (if you're a Firefox user ... and you are, right?), here's how to use Firefox Quicksearch Bookmarks to find the cache of a page so fast you'll amaze your friends and stun your competition.
(If the concept of Quicksearch Bookmarks is new to you, get some background here and here. It's a way to search any site from the Firefox address field. Trust me: If you search a lot, you will LOVE this.)
When you create a new Quicksearch bookmark, here's the data to use:
Name: Google Text Cache
Location: http://72.14.209.104/search?q=cache%3A%s&strip=1
Keyword: tc
So typing this in Firefox's Address field:
tc www.yahoo.com
will show you this. Pretty cool, huh?
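For anyone who'd rather script it than bookmark it, here's a minimal sketch that builds the same text-only cache URL. It simply mirrors the bookmark's Location pattern, including the host, and makes no attempt to check whether that address responds. The function and constant names are mine.

```python
from urllib.parse import quote

# Host and path copied from the Quicksearch bookmark's Location field above.
CACHE_ENDPOINT = "http://72.14.209.104/search"


def text_cache_url(page):
    """Return the text-only cache URL for a page, e.g. 'www.yahoo.com'."""
    # %3A is the URL-encoded colon in "cache:"; strip=1 requests the text-only view.
    return f"{CACHE_ENDPOINT}?q=cache%3A{quote(page, safe='')}&strip=1"


print(text_cache_url("www.yahoo.com"))
# http://72.14.209.104/search?q=cache%3Awww.yahoo.com&strip=1
```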
Posted by Erik Dafforn at 2:24 PM
June 5, 2007
SEO Case Study: Press Release Archives 
posted by Erik Dafforn in category: Crawling and Indexing
We recently worked on the press release archive for a pretty large company and I wanted to show an informal case study about what happened.
Here's the initial structure:
Each of the smallest document icons represents a specific press release. The issues with this architecture are fairly obvious when displayed graphically. It's a linear linking format, with the main press page showing the most current releases. You need to click a "next" button to hit a page linking to releases 11-20, 21-30, etc. etc. So the oldest releases were literally dozens of clicks away from the home page.
With this setup, here are the baseline stats. I can't use real numbers, but I'm hopeful that your algebra knowledge is current enough that variables will suffice:
- Total releases indexed: X (representing about 6% of total possible pages)
- Search traffic (monthly visits from search engines where the landing page is a press release): ~Y visits / month
Here's the modified structure:
The main press page now has child pages devoted to releases from specific years, as well as pages devoted to releases based on their subject area. Consequently, each release has links from at least two internal pages, and no release is further away from the home page than 3 clicks.
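To put rough numbers on the difference, here's a sketch of click depth under each structure. It assumes ten releases per archive page (as the "11-20, 21-30" paging suggests) and that the main press page is one click from the home page; both assumptions are mine, since I can't share the real site.

```python
def clicks_linear(release_rank, per_page=10):
    """Clicks from the home page to a release in the 'next'-button archive.

    release_rank 1 is the newest release.
    """
    if release_rank < 1:
        raise ValueError("release_rank starts at 1")
    next_clicks = (release_rank - 1) // per_page
    # home -> press page, then the "next" clicks, then press page -> release
    return 1 + next_clicks + 1


def clicks_hub():
    """Clicks in the modified structure: home -> press -> year/subject page -> release."""
    return 3


for rank in (1, 11, 400):
    print(rank, clicks_linear(rank), clicks_hub())
```

Under these assumptions, the 400th-newest release sits 41 clicks deep in the old structure and 3 clicks deep in the new one.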
Stats 45 days after implementation:
- Total releases indexed: 16.8X (representing >98% of total possible press releases)
- Search traffic (monthly visits from search engines where the landing page is a press release): ~2.4Y visits / month
A couple important notes:
Don't fool yourself. This is content that people actually search for in pretty significant numbers. If you write press releases just to write press releases -- and your content doesn't have the pent-up demand to justify it -- don't expect results like these. Jill Whalen wrote a smart article about this very topic at Search Engine Land.
Sitemaps aren't a back door. For 6+ months prior to the change implementation, all press release files were included in a Google XML sitemap file. So don't expect a sitemap feed to dramatically increase your indexing if your site's architecture doesn't back it up.
Posted by Erik Dafforn at 3:50 PM
May 25, 2007
Keyword Research: Use Data -- and Your Head 
posted by Erik Dafforn in category: Keywords
We often run into contradictory data when we use multiple tools for keyword research. Take the difference, for example, between predictions in Keyword Discovery and WordTracker.
KWD predicts that the term [ford f150] will be typed about 1665 times per day, while WordTracker predicts 1298. Actually, that's pretty close, considering that we sometimes see a factor of 5 or 10 between the two services. Take [cardiology], which weighs in with "predict" counts at 1485 (WT) and 209 (KWD).
So which one's right? That part's easy. Clearly, they're both way off. In my opinion, both services utilize a sample size so small that the level of extrapolation required to estimate a "predict" count sends the possible error percentage through the roof.
So does that mean I don't use them? Hardly. I use them both. But I also use my head.
WordTracker claims that the ratio of searches for [ipod nano] to [video ipod] is about 15:1, respectively. Keyword Discovery shows a ratio of about 17:1. To me, that's the critical factor. Across and within their respective samples, [ipod nano] is more frequently searched for by roughly the same margin. I don't need to know a valid daily prediction for this data to be helpful.
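Here's the same point as a quick calculation, using only the figures quoted in this post. The sketch compares how far apart the two tools are on absolute "predict" counts versus how far apart they are on the [ipod nano]-to-[video ipod] ratio. The helper name is mine.

```python
def disagreement_factor(a, b):
    """How many times larger the bigger estimate is than the smaller one."""
    return max(a, b) / min(a, b)


# Daily "predict" counts quoted above: (Keyword Discovery, WordTracker).
predict_counts = {
    "ford f150": (1665, 1298),
    "cardiology": (209, 1485),
}

for term, (kwd, wt) in predict_counts.items():
    print(term, round(disagreement_factor(kwd, wt), 2))

# Ratio of [ipod nano] to [video ipod]: about 15:1 (WT) and about 17:1 (KWD).
print("ratio agreement", round(disagreement_factor(15, 17), 2))
```

The absolute counts can be off by a factor of seven, while the relative popularity of the two terms differs by only a factor of about 1.1.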
(While the two services surely do not agree on all term ratios like they did here, this was my first try. I didn't "shop around" plugging in terms until I found terms that showed similar ratios. Your mileage may vary, of course.)
Another thing to remember is sample location. If you're playing around with keyword tools and see that no one searches for [liquor stores cleveland] but plenty of people search for [liquor stores atlanta], don't necessarily believe that the northern Buckeyes have adopted a philosophy of temperance. (I think we all know that's not true.)
Be aware that sample data in keyword tools isn't always perfectly geographically distributed. In my experience, markets of similar sizes often have search patterns that are very similar -- with the exception of regionally specific keywords. In other words, you might very well find more people searching for [tanning beds] in Bismarck than in Scottsdale, since there's less natural sun in the Dakotas.
You get the point. While sometimes the information in keyword tools does qualify as "good news," I certainly wouldn't call it "gospel."
Posted by Erik Dafforn at 1:36 PM
May 16, 2007
Google Universal - a Quick Look at Google's New UI 
posted by Erik Dafforn in category: Google
Today Google rolled out "Google Universal," a major update to its SERP interface. For me, it literally happened about 45 minutes ago, as I was looking up some restaurant information.
In the following query for [mentos experiment], the first thing you'll notice is that the links to Web, Images, Video, News, Maps, and "more" are now at the very top left of the screen (see pink highlight below). The "more" link drops down to offer (as expected) more options.

Note the Google Video thumbnails near the bottom of the shot. Also highlighted in pink here is an additional set of options -- possibly. In the shot above, only "Web" results are available in this search. But look at the following search for [adidas stan smith], which shows the additional options of Products and Images:

Clearly, this is a little bit of server-side decision-making on Google's part: Show additional vertical search options only when the index warrants it. Still, this could potentially get a little too "busy," as the "Images" link right above the Google logo points to the same page as the "Images" link below the logo, which seems redundant.
Basically, Google's new interface dumps every conceivable data format onto the SERP and lets you decide which direction you want to take it. On first look, it's a pretty nice upgrade, which, along with the new release of Google Analytics last week, gives SEOs something new to chew on for a while.
Note: Don't bother telling me that Danny Sullivan already did a post on this that was earlier, longer, and funner. I get it.
Posted by Erik Dafforn at 10:25 PM
May 3, 2007
SEO Lessons from Jurassic Park 
posted by Erik Dafforn in category: Web Analytics
I get a kick out of it when SEO and Analytics overlap into other areas of life.
There's a great passage in Crichton's Jurassic Park (I don't think it made it into the film) that is applicable in all sorts of business and personal situations.
I can't find my copy of the book anywhere, since it's been about 15 years (!) since I read it. So I'll have to paraphrase. Here's the scene: An outside auditor is watching a Jurassic Park computer guy monitor a specific type of dinosaur within a certain part of the park from his computer workstation.
Here's the paraphrased scene. Remember (as if it will be hard) that I'm no Crichton:
Auditor: Hey Computer Guy. What's up?
Computer Guy: Hey Auditor. Just checking on how many T. Rexes we have in Area 8H.
Auditor: Cool. How do you do that?
Computer Guy: Oh, you know -- each one emits some fiction novel-based signal that is picked up by my computer sensor here.
Auditor: Cool. How many T. Rexes do you have?
Computer Guy: Twelve.
Auditor: Cool. How do you know that?
Computer Guy: Well, according to our records, we're supposed to have 12. When the computer counts the signals and finds 12, it tells me everything's okay.
Auditor: Cool. Why don't you try searching for 25 of them?
Computer Guy: Okay. Hey now, that's interesting...
Auditor: What happened?
Computer Guy: It found 25.
Auditor: Cool. Why don't you try searching for 50?
Computer Guy: Okay. OH CRAP.
Auditor: What's wrong?
Computer Guy: Can you lock that door over there?
AND ... SCENE.
So what does this have to do with SEO or web analytics? It means that getting the right answer is very important, but only when you're asking the right question or looking for the right data.
- If you're going after a "trophy phrase" -- and you actually start ranking for it -- you might think you've hit the jackpot, when in reality, some keyword research would reveal a much smarter strategy.
- If you're buying traffic to hit a certain visit or pageview goal -- and you hit those marks -- you might think you've achieved something, while sales languish.
- If you're pursuing a link-building strategy based on PageRank or raw numbers -- and you get that SuperLink or hit the right IBL count -- you might expect your sales to skyrocket, while instead you get a bunch of curious tire-kickers who do nothing but suck bandwidth.
So remember: When you're looking for dinosaurs, find out how many there are, period. Don't just stop when you get to 12.
I hope you've enjoyed this edition of SEO Morality Theatre.
Posted by Erik Dafforn at 4:40 PM
April 23, 2007
Wikipedia Traffic Since Adding Nofollow 
posted by Erik Dafforn in category: Wikipedia
Many have wondered what effect it might have on your site's search traffic to tag nearly every outbound link with the "nofollow" attribute. According to Alexa, Wikipedia hasn't suffered. Looking at the following graph, with the red vertical line showing the rough date on which Wikipedia added "nofollow" attributes to its outbound links, one could draw the superficial conclusion that a global "nofollow" addition has had neither particularly positive effects (i.e., the "PageRank hoarding" theory) nor negative effects (i.e., the "if Wikipedia doesn't trust its links then Google won't trust Wikipedia's pages" theory):

While this graph supposedly reflects all traffic, Hitwise suggests that Wikipedia gets over 50% of its traffic from Google. So theoretically, a large hit in Google traffic would appear on this chart.
It might be more accurate to look at Wikipedia traffic from the source itself. Pulled from this page (a very cool resource), we see a traffic graph (measured in bits/sec -- not visits or pageviews) that similarly confirms no traffic loss following the "nofollow" implementation:

An interesting footnote: This graph shows incoming traffic (e.g., new articles, picture uploads, comments, etc.) below the X-axis (the red horizontal line near the bottom), while outgoing traffic (typical file requests, etc.) is above the X-axis. It's very cool that they show this.
Posted by Erik Dafforn at 2:16 PM
April 17, 2007
Putting SEO into Perspective 
posted by Erik Dafforn in category: SEO Companies
In random fits of uncontrollable self-importance, SEOs (myself included) sometimes get quite a rush when they consider the power they believe they wield over SERPs. But in a recent conversation, I had a chance to put that power into perspective.
A few months ago, I was talking with a VP at a large agency who handles PPC (along with having a hand in offline media) for a large organic client of ours. We were lamenting that despite the strongest rankings ever, along with really current keyword research and a constantly fine-tuned PPC campaign, traffic was lower than expected.
And it wasn't just us. He had some nice, subscription-only Yahoo Buzz Index charts showing that industry-wide, there was less demand than usual for the important terms we targeted, even after adjusting for seasonality.
Finally, he shrugged it off. "Well," he said, "I guess what I need to do is to get more people searching for this stuff."
And he did.
SEO capitalizes on what people search for. This guy dictates it. Now THAT'S power.
Posted by Erik Dafforn at 11:48 PM
April 10, 2007
Del.icio.us Cloaking Update, More on Google Link Data 
posted by Erik Dafforn in category: Social Media
Last August, I wrote about how Del.icio.us was cloaking its robots.txt file, showing engines one version (which gave them full access) and showing users another (which appeared to restrict crawling and indexing). In addition, it was showing a set of robots meta tags to users, but not showing them to engines.
Here's an example of what Del.icio.us was doing back then, at the page meta tag level:
Following is the famous meta tag from the Del.icio.us "SEO" tag page - the meta tag that makes everyone think the page won't be crawled:
But if you set your user-agent to Googlebot, here's what you see:
Since then, Del.icio.us has stopped one of these two techniques. The site still cloaks at the page level -- showing the robots meta tags above to users, but not to engines. But the robots.txt issue (discussed in the first paragraph above) has been changed. Now everyone sees the same version, with all major engines given these crawling parameters:
Allow: /
Disallow: /inbox
Disallow: /subscriptions
Disallow: /network
Disallow: /search
Disallow: /post
Disallow: /login
Disallow: /rss
The /subscriptions and /rss lines above, for those keeping score, are new since August.
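One thing worth noting about that rule set: the blanket Allow: / sits above the Disallow lines, so the outcome depends on how a crawler resolves conflicts. Here's a minimal sketch that assumes longest-match precedence (the most specific rule wins, with Allow winning ties), which is how Google documents its handling. Parsers that evaluate rules in file order would read this list very differently. The sketch ignores wildcards and user-agent groups, and the function name is mine.

```python
# The rules quoted above, in file order.
RULES = [
    ("allow", "/"),
    ("disallow", "/inbox"),
    ("disallow", "/subscriptions"),
    ("disallow", "/network"),
    ("disallow", "/search"),
    ("disallow", "/post"),
    ("disallow", "/login"),
    ("disallow", "/rss"),
]


def is_allowed(path, rules=RULES):
    """Apply longest-prefix-match precedence; Allow wins a tie on length."""
    best_length = -1
    allowed = True  # no matching rule means the path is crawlable
    for kind, prefix in rules:
        if not path.startswith(prefix):
            continue
        longer = len(prefix) > best_length
        tie_for_allow = len(prefix) == best_length and kind == "allow"
        if longer or tie_for_allow:
            best_length = len(prefix)
            allowed = kind == "allow"
    return allowed


print(is_allowed("/tag/seo"))     # True
print(is_allowed("/rss/popular")) # False
```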
Also note that Del.icio.us has used the "nofollow" link attribute for quite a while -- possibly since its inception. As a result, the cloaking matter is moot to many people, because to them, who cares if a page is crawled or indexed if the OBL aren't given any weight anyway?
The other reason I'm writing about Del.icio.us today is due to a comment on a recent post about Google Webmaster linking data. Offhandedly, I mentioned to "remember that Google reports nofollowed links" in its reports of incoming links to specific URLs, and I'm not sure a lot of people realize this.
(Important: Now, the "nofollow" I'm talking about is the link attribute, not the robots meta tag.)
So let me rephrase:
Just because Google sees and reports a link coming into your site does not mean that link does you any good.
As an example, I've looked through many Google link reports and gone to the specific page linking in to our site or our clients' sites. Links such as the following will show up in Google link reports, but according to everything Google has said over the past two years, the links aren't helping you:
- Del.icio.us
- Stumbleupon
- Links from comments and signatures from any blog/forum site that utilizes "nofollow"
- etc.
So again, don't take those linking reports at face value, at least to the point of making an assumption that all links are beneficial, even when the site they come from is highly respected and authoritative. Certainly, they're important for the potential traffic, but not for building your site's link popularity.
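If you want to check a linking page yourself, here's a small sketch that sorts a page's links into followed and nofollowed piles using Python's standard html.parser. The class name and the sample markup are hypothetical, and it only looks at the link-level rel attribute, not the robots meta tag.

```python
from html.parser import HTMLParser


class LinkSorter(HTMLParser):
    """Collect hrefs, split by whether rel contains 'nofollow'."""

    def __init__(self):
        super().__init__()
        self.followed = []
        self.nofollowed = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attributes = dict(attrs)
        href = attributes.get("href")
        if not href:
            return
        # rel can hold several space-separated tokens, e.g. "external nofollow".
        rel_tokens = (attributes.get("rel") or "").lower().split()
        if "nofollow" in rel_tokens:
            self.nofollowed.append(href)
        else:
            self.followed.append(href)


# Hypothetical markup standing in for a page that links to you.
sample_html = """
<a href="http://example.com/a">editorial link</a>
<a rel="external nofollow" href="http://example.com/b">comment link</a>
"""

sorter = LinkSorter()
sorter.feed(sample_html)
print(sorter.followed)    # ['http://example.com/a']
print(sorter.nofollowed)  # ['http://example.com/b']
```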
Posted by Erik Dafforn at 10:56 AM
March 29, 2007
A Tale of Two Link Counts 
posted by Erik Dafforn in category: Link Building
The following doesn't tell the whole story, but I think it's an important chapter. The image at right shows a section of an inbound link report from Google Webmaster Tools.
This is the "external" link report -- that is, measuring links from outside domains. I've highlighted the inbound link counts for two deep URLs. On first glance, you might expect the first one (638) to be a dominant force in driving traffic, but you'd be wrong. Here's some deeper data on both URLs:

URL 1: 638 inbound links
- The 638 inbound links represent 14 total domains. (For the purposes of this analysis, I'm saying that foo.blogspot.com and bar.blogspot.com are distinct domains.)
- 615 of the links come from an ROS (run-of-site) blogroll link on one personal blog (blogspot.com).
- Of the remaining 23 links, 14 come from eight other blogs at sites like blogspot, livejournal, or blogsome.
- Of the remaining nine links, four come from social bookmarking-type sites, most of which "nofollow" their outbound links. (Remember that Google reports nofollowed links too.)
- The final five links come from a total of three separate blogs on unique domains.
When you break down the links, there's not a great deal of substance there. The page is a poor traffic driver, although it's premature to blame that entirely on the quality of the incoming links.
URL 2: 38 inbound links
- The 38 inbound links represent 33 unique domains.
- Only one domain in the list of 33 is easily identifiable as a blog host (livejournal.com)
- About half of the remaining 32 are low-quality and/or scrapers.
- The other half of the remaining 32 are decent sites whose foci match the point of the page on my client's site.
Week in, week out, URL 2 is the site's top entry page, outperforming even the home page. Again -- not necessarily because of the quality of the links. But this reinforces the point that quantity of inbound links cannot make up for lack of quality.
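The breakdown above can be approximated in a few lines once the linking URLs are exported from the report. This is a rough sketch, assuming the export is just a list of URLs; the function names and the sample URLs are hypothetical. Hostnames are kept whole, so foo.blogspot.com and bar.blogspot.com count as distinct, matching the convention used for URL 1.

```python
from collections import Counter
from urllib.parse import urlparse


def links_per_host(urls):
    """Count inbound links per linking hostname (subdomains kept distinct)."""
    return Counter(urlparse(u).hostname for u in urls)


def summarize(urls):
    """Report total links, unique hosts, and the share held by the top host."""
    counts = links_per_host(urls)
    total = sum(counts.values())
    top_host, top_count = counts.most_common(1)[0]
    return {
        "links": total,
        "hosts": len(counts),
        "top_host": top_host,
        "top_share": top_count / total,
    }


# Hypothetical export: one blogroll host dominating the raw link count
sample = [
    "http://foo.blogspot.com/2007/01/a.html",
    "http://foo.blogspot.com/2007/02/b.html",
    "http://foo.blogspot.com/",
    "http://bar.blogspot.com/post.html",
    "http://www.example.org/resources.html",
]

print(summarize(sample))
```

A high `top_share` is the signature of a run-of-site link like the one behind URL 1: many links, very few independent sources.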
A Tale of Two Link Counts
Posted by Erik Dafforn at 9:08 PM
March 23, 2007
NBC to Give Internet Domination a Second Try 
posted by Erik Dafforn in category: Social Media
Search Engine Land has the story about NBC and News Corp. (aka Fox) and their joint plans to create "the largest Internet video distribution network ever assembled with the most sought-after content from television and film" (their words, not mine).
In addition,
AOL, MSN, MySpace and Yahoo! will be the new site’s initial distribution partners. Their users, who represent 96 percent of the monthly U.S. unique users on the Internet, will have unlimited access to the site’s vast library of content.
Notice anyone missing from the list of initial distribution partners?
At launch, full episodes and clips from current hit shows, including Heroes, 24, House, My Name Is Earl, Saturday Night Live, Friday Night Lights, The Riches, 30 Rock, The Simpsons, The Tonight Show, Prison Break, Are You Smarter than a 5th Grader and Top Chef, plus hits from the studios' vast television libraries, will be available free, on an ad-supported basis.
You might think that NBC and Fox are crazy to go after YouTube, but then you realize that Jeff Foxworthy is the secret weapon. I'm no online video expert, but is the popularity of YouTube due to its ability to show things like copyrighted clips of Are You Smarter Than a 5th Grader?
Isn't it more due to showing things like the guy who takes a picture of himself each day for 6 years, or would-be Norwegian beatboxers with too much time (and electronic equipment) on their hands?

At any rate, I'm glad NBC is giving it another shot. You might recall NBC's "other" venture into online dominance. In late 1999, the media giant launched NBCi, the, ahem, "Yahoo killer" of the day -- a portal/search engine that quickly shot out of the gate and in less than a year accrued exactly 0.0% of search engine market share.
One wonders whether the new NBC/Fox video site will offer all episodes of Emeril's short-lived NBC sitcom for free, or whether those will reside in the premium section. Regardless, those with a diet heavy in schadenfreude will be watching the launch closely.
NBC to Give Internet Domination a Second Try
Posted by Erik Dafforn at 4:46 AM
March 19, 2007
Google's Supplemental Index: Questions and Inconsistencies 
posted by Erik Dafforn in category: Crawling and Indexing
What I want to discuss in this post is Google's unreliable method of showing which pages are and are not in the Supplemental index.
What I don't want to get into with this post, beyond a very superficial level:
- Whether pages being in the Supplemental index is bad (or merely not good)
- How pages end up in the Supplemental index
Let's just say that all things being equal, I'd rather have 10,000 pages in Google's main index as opposed to its Supplemental index.
Using the fairly commonly accepted method of determining which pages from a site are in Google's Supplemental index (site:yoursite.com *** -view -- first read at SEOBook -- see Resources), I decided to pick a random URL -- in this case, our blog's article archives in the "Crawling and Indexing" category:

So clearly -- at least according to Google (who should be the authority) -- this page does sit in Google's Supplemental index. Right?
But let's refine the query to be a mere listing of the site contents -- site:yourdomain.com. To find the URL I discussed before, I needed to scroll to page 37 of the results:

This time, it doesn't show the "Supplemental" label. But is this a contradiction? Maybe in a straight-up site: query, the Supplemental label doesn't appear.
No, that's not it either, because if you click over to one more page of results -- page 38 in this case -- Supplemental results DO start to show up with the Supplemental label. Here (some on page 38, and all on page 39 and beyond), most of our pages are labeled as Supplemental:

So there's an irrefutable contradiction. Some Google results call this page Supplemental, and some don't. Why is that? Is it a data-center thing? Have some machines in the server farm not yet received the proper memo?
But let's cut to performance. For the query [crawling and indexing], the page ranks #1 at Google (even with no account sign-in). It doesn't bring a ton of traffic, but it does bring some:

So some would say, "There's your answer. It doesn't matter, because it shows up in SERPs and brings traffic. Quit overthinking it."
That's true -- IF the page is truly a Supplemental page. But because we have mixed signals, it's not so clear. Is this a Supplemental page that performs well despite its Supplemental status, or a Main index page that performs well and happens to be sometimes mislabeled as "Supplemental"? That's an important question I can't answer right now.
To follow up this post, I'll try to hit another angle: When a URL that is consistently labeled as Supplemental does not show up in search results, despite the fact that it's the most relevant post for that query.
Resources, Notes:
- Barry Schwartz first (to my knowledge) cracks the issue with this post last September. The technique worked briefly and sporadically.
- Adam Lasnik clarifies some Supplemental concepts at Google Webmaster Help (Google Groups).
- SE Roundtable brings up more questions.
- Aaron Wall brings it up again, with the added refinement of subtracting a nonsense string, which is more or less its current form.
- In pre-emptive response to one of my favorite audience niches (the beside-the-point nitpickers), I have adapted our robots.txt file in response to the third screen shot above. It now excludes search results and stray comment previews.
Google's Supplemental Index: Questions and Inconsistencies
Posted by Erik Dafforn at 3:20 PM
March 16, 2007
Google Webmaster Tools Beefs Up Anchor Text Report 
posted by Erik Dafforn in category: Web Analytics
As if you needed another reason to verify your site within Google Webmaster Tools, that specific team announced early this morning that they've improved the report that shows incoming anchor phrases that point to your site. Get to the report in the Webmaster Tools area in the Statistics tab, then by clicking Page analysis:

Previously, the report showed only individual words that made up anchor text phrases. Now, the report shows up to 100 specific phrases themselves, which is significantly more helpful:

Data like this ranges from interesting to very helpful. It offers great insight to the behavior of people who link to you, since many people probably think incoming anchor text focuses mainly on company or site names. In addition, it might lend some guidance about some strange referring keywords you've seen in your analytics reports - as well as why you might not be seeing some specific referring phrases that you want.
Google Webmaster Tools Beefs Up Anchor Text Report
Posted by Erik Dafforn at 9:55 AM
March 9, 2007
Apple Falls Far from the Search Tree 
posted by Erik Dafforn in category: Organic SEO
About a month ago, I did a quick review of Midomi, the audio search engine that accepts singing or humming as a search query.
The results were good. I hummed a tune, and Midomi recognized it immediately. It was "Love is Blue."
So here's the thing: When Midomi picked the tune, it gave me the option of purchasing Al Martino's version for 99 cents. But what if Al Martino's version wasn't the one I wanted?
You can probably guess the next thing I did. I booted up iTunes and punched in [love is blue], which turned up 150 results - no fewer than 45 of which had the exact title I was looking for:

Listening to a few clips, it was obvious that Al Martino's version was not the one I wanted. Instead, I wanted the version from Paul Mauriat and His Orchestra, and iTunes, not Midomi, promptly got my $.99.
While it's true that iTunes got my money, it had to work for it. I booted up the iTunes music store because I knew that querying any search engine for [love is blue] would not bring up any iTunes search results. And even if it had, I'd have to jump-start the ten-ton beast to buy it anyway. (Let's be honest. It has some nice features, but running iTunes simply to buy and play your music files is like using a Buick Roadmaster to get from the kitchen to the living room.)
Right now, iTunes (the data warehouse/ecommerce portion - not the front-end music player) is a little bit like AOL was in 1994 - an isolated island of content, cut off from the rest of the world with proprietary technology and firewalls. And for the good of humanity, you really want to smack people who constantly rave about how good it is.
Still, there's been a meager attempt to have the iTunes Store inventory live on the web. And I mean meager. Apple has set up a sort of web-based parallel universe to tie its iTunes database to the web. Let's say it falls short of its potential.
While you won't find it simply for [grease soundtrack], you can find these Apple URLs painfully limping along the HTTP turnpike if you filter your searches by site, such as [grease soundtrack site:apple.com]:

A couple tiny problems with that approach:
- The pages currently rest on the soft, red velour divan of the Supplemental index
- Nobody filters queries by site anyway
- The on-page widget fails at its raison d'etre - it's an iTunes detector that cannot detect iTunes:

(At least it doesn't work with Firefox. The Apple page had better luck sniffing iTunes when I ran it through IE. But that still doesn't change the fact that the iTunes library is almost totally invisible in search engines.)
Here's the bottom line. Apple engineers, if you're reading, hit "pause" on your Nano and read the rest intently:
Steve Jobs thinks only 3% of the songs on the average iPod come from iTunes. Understandably, he wants more. Joe Wikert believes that Jobs wants to increase his market share by abolishing DRM. He might be right, but come on, Steve - there's a much easier way:
Start showing up at the top when people search for [grease soundtrack]. Or [foreigner 4]. Or [green day american idiot]. Or [pink floyd meddle]. Get real and understand that DRM isn't your biggest problem. Neither is Sony, BMG, Warner, or EMI. (Maybe you've been fighting Microsoft so long you have some David/Goliath issues, but you need to get over them.) Your biggest problem - and ironically, the easiest to overcome - is Amazon.
Here's what you need to do:
- Port your entire iTunes database - songs, artists, reviews, everything - to the web with a REAL crawlable architecture. And while you're at it, publish the lyrics too.
- Get in bed with the browsers. Not your iTunes pseudo-browser evangelists. Real browsers. IE and FF.
- Get those browsers to help you build an extension, plugin, widget - whatever you want to call it - just like Flash or Quicktime. But this plugin serves as a bridge between your data store and the web. This plugin makes it possible to purchase songs from the iTunes database without requiring iTunes to run. If you're on the same machine as your music library, it will download songs also. If you're not on the same machine as your library, the song will download to that machine the next time you start iTunes on that machine. Make sure this plugin works.
- Give each track a REAL URL devoted to that track only. Give it content - not just cover art and an iTunes sniffer.
- Build that content by ensuring that every comment and review posted on iTunes gets written onto the web-based page too.
- Leverage the existing enthusiast base by getting links from sites that already rank for band names, music genres, and lyrical content. You can drive links to every song, tv show, movie, and podcast in your catalog by creating an affiliate program similar to Amazon Associates, which pays up to 10% commissions. (By the way, this will help you sell a few billion songs in the process.)
- Create and actively promote an API that lets other sites feature your 30-second music clips. (This will help you sell a few million more.)
- Have iTunes spit out indexable URLs on demand (those same URLs you created in bullet 4) so bloggers and journalists can link to them easily.
Apple's iTunes is the envy of the music database world. It has more content and more popularity than just about anyone. When Apple realizes that these two ingredients can spell total search domination, the company will have a real issue on its hands - where to keep all that money.
Apple Falls Far from the Search Tree
Posted by Erik Dafforn at 2:38 AM
February 22, 2007
Steve Jobs and Apple - 1997 and Today 
posted by Erik Dafforn in category: Corporate Reputation Management
This post has very little to do with search other than a) I found an interesting article while searching for some Apple information at Google, and b) I continue to plod along on a large post that will discuss how Apple is missing out on tremendous revenue opportunities via search.
I'm neither a particular fan nor foe of Steve Jobs, although I think he's extraordinarily smart and savvy. I was doing some research and found a great article from Business Week that's almost exactly 10 years old. In 1997, Jobs had just returned to a languishing Apple as an "advisor" to then-CEO Gil Amelio.
The article is full of tidbits that seem either silly or ironic, including quotes from Jobs like "They want me to be some kind of Superman. But I have no desire to run Apple Computer. I deny it at every turn, but nobody believes me." Later that year, of course, he became Apple's CEO.
Some parts in particular are quite prophetic:
These days, every doing at Apple is examined for the Jobs factor--a management change could be a power play, a strategy shift might be proof Jobs is remaking the company, a new product direction becomes confirmation that the good old days of ''insanely great products'' are returning.
While progress was slow at first, the stock chart since then (along with a line of products that speaks for itself) tells the tale:
Steve Jobs and Apple - 1997 and Today
Posted by Erik Dafforn at 11:23 AM
February 16, 2007
Selling SEO: How I Lost a Big Potential Client 
posted by Erik Dafforn in category: SEO Companies
To the potential client with whom I recently talked, who said that organic SEO will be critical to the success of his yet-to-launch web site, who needed specific tips about how to optimize for long-tail phrases in an ultra-Flash-heavy environment, I have something to say.
I am sorry.
I am sorry that I offended you when I asked what your site was going to be about. I am sorry that I asked what industry you were going to target. I am sorry that I asked about your intended audience. I am sorry for all those things, even for volunteering to sign an NDA; the login on the home page should have warned me that the contents are super-secret -- too secret to be exposed, even with a contractual obligation to keep my mouth shut.
Maybe you've been burned before. Who hasn't? But let me tell you what I won't do with any information a potential client gives me:
- I won't tell anyone outside my company.
- I won't steal your idea.
- I won't start a competing web site.
- I won't call your competitors and tell them what you're up to.
- I won't tell you you're stupid or that your idea sucks.
Now, let me tell you what I will do with the information you give me:
- I will do some preliminary keyword research on your topic, so that I can discuss -- intelligently -- what your target audience is searching for, and how they're phrasing it.
- I will see how other sites in your market are building their sites to see how -- or whether -- they are integrating best-practices SEO into their sites, and how the engines are reacting to it.
- I will see who links to sites like yours to get an idea of how your future site can compete out of the gate.
- I will take a look at what you've developed so far to let you know if you've done anything that will cause grave harm to your organic efforts.
- I will tell you, honestly, whether I think we can help you.
So again, please accept my apologies. I wish you the best of luck in finding that SEO company who will give you the exact advice you need without having even the slightest notion of what you do.
Selling SEO: How I Lost a Big Potential Client
Posted by Erik Dafforn at 4:20 PM
February 1, 2007
Midomi Nails It on the First Try 
posted by Erik Dafforn in category: Vertical Search Engines
Few people know that Bill Gates uses one of my inventions.
A long time ago, I told my wife that I had a killer idea spawned from a lifetime of shoveling snow in the Midwest. When pouring a driveway, you should immerse low-current heating coils into the concrete or asphalt to melt snow as it fell. Did I act on it? Of course not.
A few years later, as Bill and Melinda Gates built their estate on Lake Washington in Seattle, I read that he was employing just such a technology.
I didn't learn my lesson then, and only a couple years ago, I had another "thought invention" that I never put into practice. This one was a search engine that used your PC's microphone and allowed you to hum a song whose name you couldn't remember. Scanning patterns and frequency ratios, it would match your humming and tell you (assuming you're not tone-deaf) what song you were thinking of.

Well, a few days ago, Dr. Watlington via Search Engine Watch informed me I'd been scooped again. Midomi is the exact engine I had envisioned.
Currently on a crusade to put a name to the tune of several easy-listening hits I remember from my childhood, I decided to give it a try. All you need to do is give Midomi's Flash interface "permission" to use your computer's microphone (and camera, if you want to give them a visual record of your making a fool of yourself).
In ten seconds, I'd hummed a few bars, and ZING -- out popped "Love is Blue" -- including a link to purchase Al Martino's version, as well as links to other users' versions of this song.
Midomi's methodology is intriguing. The engine doesn't compare my humming with the actual professionally recorded songs. It compares my humming with the humming or singing of other users who actually knew what song they were humming or singing. So this way, the engine doesn't have to compete with professional background music or vocals of the studio tracks when it looks for a match.
I plan to follow up this post with some thoughts about how Apple is missing out on some significant search traffic. But for now, I'll ask Apple: Why haven't you bought this engine yet and integrated it into iTunes?
Midomi Nails It on the First Try
Posted by Erik Dafforn at 1:30 PM
January 27, 2007
Taking an Ad-Targeting Lesson from Seinfeld 
posted by Erik Dafforn in category: User Behavior
One of Jerry Seinfeld's older, better standup routines involved a laundry detergent ad - specifically, its ability to remove blood stains easily.
Jerry went on to suggest that if your clothing routinely requires the removal of blood stains, maybe picking the right detergent isn't your biggest concern.
A couple days ago, I was searching for a list of top Google subdomains, and on a simple search for [google], I saw this Adwords ad on the page:
![This ad came up in a search for [google]](http://seoblog.intrapromote.com/google-debt-serp-crop.jpg)
Here's a full shot of the results page in a new window.
Now this is probably a glitch, and I've been able to reproduce it only a few times, so I really don't want to go into the technical aspects of why or how it happened.
Instead, I am intrigued by the possibility that it's intentional, and that the folks at Compare.com have come to the brilliant conclusion that maybe - just maybe - if you're a person typing the word "google" into a Google search box, you might have problems stemming beyond computers, including but not limited to management of personal finances. I think it's a pretty safe bet.
Taking an Ad-Targeting Lesson from Seinfeld
Posted by Erik Dafforn at 3:20 PM
January 21, 2007
One text link? Is that all it takes for Page 1? 
posted by Erik Dafforn in category: Link Building
I was looking at rankings for auto terms and noticed this SERP for [2007 Ford Explorer]. Notice the site in the #4 spot, 2007fordexplorer.com:

Curious to see whether that site was an official Ford site or just an enthusiast site, I clicked over. It's neither, apparently. Just the words "2007 Ford Explorer" on an otherwise blank page.
So how can it rank for that phrase with just the domain name and that title and simple body copy going for it? Must have a ton of high-quality inbound links, right? Not exactly.
![One inlink to the site ranking for [2007 Ford Explorer]](http://seoblog.intrapromote.com/yse-ford-exp.jpg)
A few things of note here. First, slightly off-topic, is that Yahoo is clearly reading CSS files, just as a few people are discussing about Google right now. But that's not important.
What's impressive is that the page (according to Yahoo, at least, which is about the most accurate source) has just one external link pointing to it. Clicking the Inlinks (1) link shows us the page that's linking in:

And on that page? You probably guessed it - nothing but anchor text to various other pages with only the year and model name (or other similarly shallow text) as body copy:

And this page full of text links has only one incoming link - from its root page. As far back as I cared to search, nothing but garbage links. Maybe I've been working too hard...
One text link? Is that all it takes for Page 1?
Posted by Erik Dafforn at 11:00 AM
January 12, 2007
Future-Proofing Your Site by Resolving at the Folder Root 
posted by Erik Dafforn in category: Search Engine Friendly Design
It's become pretty common advice over the last several years to tell site owners to have their home pages resolve at the "root" (e.g., http://www.domain.com/) as opposed to some form of "home page" filename like home.asp, index.html, etc. Google's getting pretty good at getting those canonicalization issues figured out, but I'm sure it doesn't mind a little bit of help.
But this advice holds true throughout your site for a different reason than mere canonicalization. Suppose you have a site with a directory structure like Virgin Atlantic's. Here's a sample category URL for its flight search page:
http://www.virgin-atlantic.com/en/us/flighttimes/index.jsp
Decent URL (even though I'm not a big fan of the /en/us/ quagmire) - at least it's not littered with dynamic arguments like it could be. But if I were working for that company, I'd recommend having that page resolve at the root of the final folder, instead of at index.jsp. Why? The earlier mentioned canonicalization issue is one reason. We don't want Google crawling both
http://www.virgin-atlantic.com/en/us/flighttimes/index.jsp
and
http://www.virgin-atlantic.com/en/us/flighttimes/
and thinking they're different pages. But Virgin Atlantic has already covered this. In fact, the latter URL mentioned above results in a 404. So for that site, canonicalization/duplication problems aren't an issue.
Here's the big reason I'd recommend that the URL resolve at /flighttimes/: Because in a year or two from now, when the company rolls over into a new platform, the filename and/or the file extension of that folder's home page will most likely change. When that happens, they'll be mired in 301 redirects from each directory's old home page to its new one - all the way across the site. That won't waste hundreds of staff hours, but it will consume a few.
Contrast that with a platform rollover from .jsp to .aspx, .net, or .cfm, or whatever, when the page resolves at the folder's root. In that case, we'd go from this url:
http://www.virgin-atlantic.com/en/us/flighttimes/
to this one:
http://www.virgin-atlantic.com/en/us/flighttimes/
Get it? They're the same. No lag time whatsoever. No 301s to "wait out" while they sort through their processes. With no URL changes for engines to digest, the SEO process is much smoother. Knowing how to fix SEO issues is great, but knowing how to prevent them from happening is better.
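For a site that does resolve at index-style filenames today, the mapping from each old URL to its folder root is mechanical. Here is a minimal sketch of that mapping, assuming the default documents are named index, home, or default (the `DEFAULT_DOCS` set and the `folder_root` helper are illustrative, not anything a particular platform provides). It only rewrites the URL string; the server still has to be configured to answer at the folder root.

```python
import posixpath
from urllib.parse import urlsplit, urlunsplit

# Assumed default-document names; adjust to match the platform in question
DEFAULT_DOCS = {"index", "home", "default"}


def folder_root(url):
    """Map a default-document URL to its folder root; leave other URLs alone."""
    parts = urlsplit(url)
    head, tail = posixpath.split(parts.path)
    stem, _ext = posixpath.splitext(tail)
    if stem.lower() in DEFAULT_DOCS:
        path = head if head.endswith("/") else head + "/"
        return urlunsplit(
            (parts.scheme, parts.netloc, path, parts.query, parts.fragment)
        )
    return url


old_jsp = "http://www.virgin-atlantic.com/en/us/flighttimes/index.jsp"
old_aspx = "http://www.virgin-atlantic.com/en/us/flighttimes/index.aspx"

# Both platform generations collapse to the same public URL
print(folder_root(old_jsp))
print(folder_root(old_aspx))
```

The point of the exercise is the last two lines: whatever the extension, the public URL stays the same, so a platform change leaves nothing to redirect.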
Future-Proofing Your Site by Resolving at the Folder Root
Posted by Erik Dafforn at 4:19 PM
January 4, 2007
How to Calculate Keyword-Based Conversion Numbers in Google Analytics 
posted by Erik Dafforn in category: Web Analytics
Google Analytics is great for assigning goals to certain events, showing you the referring keyword that triggers those events, and displaying what keywords have the best conversion percentage. But how do you determine the exact number of conversions that took place? If G1 is the download of a file, and G2 is the sale of a book, how do you determine the number of downloads or the total number of books sold, sorted by individual keywords?
You can easily determine a hard number of conversions (over a specified date or date range) in the Goal Verification menu:

But assigning raw conversion numbers to specific keywords is a bit trickier. Not rocket science, but it takes a little work. Here's what I do.
- First, create a Google Analytics keyword report. You have to determine conversions-by-keyword on a source-by-source (i.e., domain-by-domain) basis. So navigate to All Reports -> Marketing Optimization -> Visitor Segment Performance -> Referring Source. Pick a specific source (I chose "google [organic]"), then specify the Keyword report as shown here:

Note: To produce a keyword report, the source must be labeled as [organic] by Google Analytics.
- Once you've created the keyword report, export it into Excel, which should give you something like this. Note that I've added some column head colors, and that I've replaced actual keywords with "Keyword 1," etc., to protect client privacy:

- So now we know that Keyword 1 converted 1.53% of the time, over 652 visits. But how many raw conversions is that? It's a simple calculation, and we'll add it into the first available column to the right. The following shot gives the formula:

This formula simply takes the number of visits from each keyword and multiplies by the conversion percentage, then divides by 100 to account for the percent. Note: This specific formula works only if the existing columns appear as shown here. You'll need to change the [-1] or [-3] as necessary if you have more or fewer columns in your spreadsheet.
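The same arithmetic can be done outside Excel. The sketch below assumes the exported report has been reduced to (keyword, visits, conversion percentage) rows; the function name `raw_conversions` is made up for illustration, and the only figures used are the Keyword 1 numbers quoted above.

```python
def raw_conversions(visits, conversion_pct):
    """Visits times conversion percentage, divided by 100 for the percent."""
    return visits * conversion_pct / 100


# (keyword, visits, conversion %) as they appear in the exported report
rows = [("Keyword 1", 652, 1.53)]

for keyword, visits, pct in rows:
    conversions = raw_conversions(visits, pct)
    print(f"{keyword}: {conversions:.2f} conversions")
```

For Keyword 1, 652 visits at 1.53% works out to about 9.98. Because the report rounds the percentage, the product is rarely a whole number, so in practice that means 10 conversions.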
I'm a little surprised that the raw number of conversions isn't already a part of the keyword report. It's valuable data and would be easy to add to the programming. Fortunately, unlike some full referring URL strings, conversion-by-keyword data is available with only a few clicks.
How to Calculate Keyword-Based Conversion Numbers in Google Analytics
Posted by Erik Dafforn at 1:02 PM
December 12, 2006
Finding Referring URLs with Google Analytics - Sort Of 
posted by Erik Dafforn in category: Web Analytics
About a month ago, I was very happy to see the official Google Analytics blog come out with a post about finding the exact referring URL. It's one of the reports I use most regularly, and it has always surprised me that it's buried so deeply in the menu structure of Google Analytics. I was nearly giddy when I saw what the post was going to discuss, but then disappointed, because it didn't address my bigger concern, which I'll discuss further down.
A little background: Google Analytics makes it very easy to see which domains are referring traffic to your site. But finding specific referring pages within those domains is a little trickier - thus the post from the Analytics blog.
The following shot shows how to find specific referring pages within a specified domain. Here, I'm drilling down to see the specific page(s) at Sitepoint that sent traffic to our site:

That post solved only part of the problem. Google Analytics still does a poor job of showing the exact referring URL when it contains a dynamic string. vBulletin PHP pages are a good example. The following shot shows what happens when I click the Content selection in the menu above:

As you can see, Google Analytics doesn't report any dynamic arguments after the page name. This is a problem, because just seeing this page name does almost no good at all. Tens of thousands of Sitepoint pages begin with this filename. What I need is for the report to show this:
/forums/showthread.php?t=419917
instead of simply this:
/forums/showthread.php
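To see what is being lost, it helps to split a referrer into its parts. This is a small sketch using Python's standard library, assuming you can get the raw referrer from your own server logs. The hostname in the sample is a guess, since only the path and query string appear above, and `referrer_detail` is a hypothetical helper.

```python
from urllib.parse import parse_qs, urlsplit


def referrer_detail(referrer):
    """Split a referrer into host, bare path, path plus query, and parameters."""
    parts = urlsplit(referrer)
    full = parts.path + ("?" + parts.query if parts.query else "")
    return {
        "host": parts.netloc,
        "path": parts.path,               # what the report shows
        "path_with_query": full,          # what would identify the thread
        "params": parse_qs(parts.query),
    }


# Hostname assumed for illustration; path and query are from the example above
ref = "http://www.sitepoint.com/forums/showthread.php?t=419917"
detail = referrer_detail(ref)

print(detail["path"])
print(detail["path_with_query"])
```

The bare path is shared by every thread on the forum. Only the `t` parameter tells you which discussion actually sent the visitor.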
A current thread at the Google Analytics Help group page is called Wishlist. I've posted here, requesting this feature. But there are many, many, many instances of similar requests that have gone unrequited. If you think this feature is important, I urge you to add your thoughts to the Wishlist thread.
Finding Referring URLs with Google Analytics - Sort Of
Posted by Erik Dafforn at 6:37 AM
December 7, 2006
Smells Like ... Thursday Potpourri 
posted by Erik Dafforn in category: SEO Industry News
Just a few items of interest on a dark and stormy night...
- Segmentation of the SEO/SEM convention audience? As Doug noted, Danny Sullivan is heading out on his own - and not quietly. He and his team have launched a new media company, news & blog site, and a webcast site. But the one that caught my eye was the new convention he's planning, Search Marketing Expo. The first SMX event will be in June 2007, last for two days, and "will be especially geared toward advanced search marketers" - a distinction that separates it from the ever-expanding Search Engine Strategies franchise. At four days long and up to five sessions deep, SES is on the verge of suffering from the all-things-to-all-people syndrome. With Danny leaving, SES seems ripe for becoming, as Yogi Berra might have said, "so crowded nobody goes there anymore." Alan Meckler also weighs in.
- Happy Birthday, SE Roundtable. A belated third birthday wish to SER. Nearly two years ago, as we scoped out the editorial vision for our own blog, one of the potential routes was a SEO-news-as-it-happens approach. It was quickly put aside, however, because SER was already doing it, and doing it so well.
- SES Coverage. Speaking of SE Roundtable, make sure to catch its coverage of the current SES show, including an especially erudite post from our old pal, Amy Edelstein.
- Adsense and Al Qaeda. First spotted at WebGuerrilla and followed up at Search Engine Journal is what promises to be a very sensational story - one on which Webmaster Radio has apparently done much research - regarding terrorist groups using Google Adsense and Adwords programs to fund their organizations. No one (at this point) is accusing Google of actual complicity, but this will surely ignite further debate on click fraud and finding the balance between privacy and openness in the PPC money trail. Stay tuned.
Posted by Erik Dafforn at 3:54 AM
November 27, 2006
Evidence of Yahoo Crawling Google Sitemaps 
posted by Erik Dafforn in category: Crawling and Indexing
Given Yahoo's recent promise that it would begin to support the Google Sitemaps protocol, it's a bit anti-climactic to document evidence now, but I promised a follow-up.
Back in October, before the "big 3" officially admitted that they would read the same type of sitemap files as a benefit to site owners, I had my suspicions and ran a test to see if and when Yahoo would actually pull a URL from a Google sitemap and add it to the Yahoo index.
I created this orphan page and put it on the blog server. I added the URL to our Google Sitemap file and told Yahoo about the file via the YSE interface. Over Thanksgiving, using a text string query, I noticed that the file had been crawled by Slurp and was now appearing in the main Yahoo index:

Having been too busy to keep a close eye on it that week, I scurried over to YSE to check further and noticed that the file did indeed appear in the list of pages on our blog:

Note that the crawl date for the file - November 16 - is only a day after Yahoo announced its support for the protocol. That's impressive. I submitted the current sitemap file on November 7, and it was processed on the 8th. It's possible that my test file was crawled even before the 16th, since that's only the last crawled date - and I wasn't paying much attention to it during that week.
Regardless, hats off to Yahoo for making good on their promise - and quickly.
Posted by Erik Dafforn at 11:13 AM
November 15, 2006
Are You Giving Away Links You Don't Know About? 
posted by Erik Dafforn in category: Link Building
Recently, I was looking through a client's list of indexed pages at Yahoo Site Explorer. (Get ready for another "All Hail YSE" post.) I noticed what looked like a lot of junk pages, and I found a site vulnerability that many sites could potentially have.
Link spammers had been attacking the site with an interesting attempt to get more links to their sites:
- Use the site's internal search feature to create a search results page that "searched" for links to the spam sites
- Get my client's site to output a search results page that links to the spam site
- Link to that spammy search results page to get it crawled and indexed
If none of that makes sense, here's an example. Let's say the spammers were trying to create links to Apple Computer (they weren't). They go to your internal search box and type the following:
![]()
...and then hit Submit.
Their goal is that your site outputs a search results page that includes text showing the search term. For example, this is what they want the search results page to say:
Search Results for iPod stuff
Next, they link to the page from their own site (or some site in their ugly network) and it gets crawled and indexed. And voila - they have a new link pointing to their site - from yours.
The spammer's plot failed for several reasons - one of which is that my client's site does not output a heading (or any text) that lists the search term.
But many sites do. So be careful: if you have an internal search engine that outputs unique search URLs containing the query string, make sure that no one is getting more of your site indexed than you'd like.
Based on the client's unique needs, fixing the issue isn't as easy as you might think. We're looking for ways to ensure that this doesn't happen in the future, including some creative uses of robots.txt, changing form methods, and some contact with Yahoo.
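As a rough illustration of two of those defenses, here is a minimal Python sketch. It is not the client's actual fix: the /search path, the example.com URLs, and both function names are hypothetical. The first function echoes a query as inert text so that injected markup never becomes a live link; the second checks whether a robots.txt rule keeps crawlers out of search results pages.

```python
import html
from urllib import robotparser


def render_results_heading(search_term):
    """Echo the visitor's query as plain text, never as live markup."""
    return "<h1>Search Results for " + html.escape(search_term) + "</h1>"


def crawler_may_fetch(robots_lines, url):
    """Return True if a generic crawler is allowed to fetch the URL."""
    parser = robotparser.RobotFileParser()
    parser.parse(robots_lines)
    return parser.can_fetch("*", url)


# Hypothetical rule that keeps search results pages out of the crawl.
ROBOTS = ["User-agent: *", "Disallow: /search"]

# A spam-style query that tries to smuggle a link into the results page.
spam_query = '<a href="http://spam.example/">cheap stuff</a>'

print(render_results_heading(spam_query))
print(crawler_may_fetch(ROBOTS, "http://www.example.com/search?q=ipod"))
print(crawler_may_fetch(ROBOTS, "http://www.example.com/products/"))
```

Escaping handles the case where the page does get crawled; the robots.txt rule is meant to keep it from being crawled in the first place.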
Posted by Erik Dafforn at 5:12 PM
November 6, 2006
An Update on Yahoo Sitemaps Optimization 
posted by Erik Dafforn in category: Yahoo
Reaction to my recent post on optimizing Yahoo sitemaps has been mixed, ranging from "that's amazing" to "you're crazy and your testing methods are shoddy - that will never work!" (thanks for writing, Mom). So I'm trying to take an honest look at the actual probability that Yahoo is able to pull (and subsequently index) URLs found in a Google-style sitemaps file.
The original site I referred to in the post is still showing signs of increased indexing from Yahoo, and the sitemap file I've told Yahoo to use is the same sitemap.xml that I created for Google.
This alone, obviously, does not prove that Yahoo is pulling URLs from the sitemap.xml file. In addition, there are a few other reasons to be skeptical:
- About a month ago, in the YSE forum, the Yahoo rep ("Mr. Slurp") said flat-out, "we currently do not support Google's sitemaps protocol."
But does that mean that Yahoo can't even open the file, or merely that it doesn't recognize and work with the various tags within the file, such as <lastmod>, <changefreq>, <priority>, etc.?
- Following on that point, on the feed submission page, Yahoo says "For any URL (directly submitted or obtained from a feed) our crawler will extract links and find pages we have not discovered already. We will automatically detect updates on pages and remove dead links on an ongoing basis."
So should this statement not apply to URLs such as www.site.com/sitemap.xml?
- When I submitted the sitemap file to Yahoo, it was "processed" within an hour of uploading and gave no indication of error or incompatibility.
But why should I expect such an error message? Sometimes all you get is an error if the page throws a 404, but little more.
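For readers who haven't looked inside one of these files, here is a minimal Python sketch that builds a small sitemap with the optional tags discussed above. The URLs and values are made up, and the namespace shown is the sitemaps.org 0.9 one, which is an assumption on my part (older Google-only files used a different namespace). The point is that the URL itself lives in <loc>, separate from the optional hints.

```python
import xml.etree.ElementTree as ET

# Assumption: sitemaps.org 0.9 namespace; older files may declare another.
NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(entries):
    """Build a sitemap string from dicts keyed by tag name."""
    urlset = ET.Element("urlset", xmlns=NS)
    for entry in entries:
        url = ET.SubElement(urlset, "url")
        # <loc> is required; the other three tags are optional hints.
        for tag in ("loc", "lastmod", "changefreq", "priority"):
            if tag in entry:
                ET.SubElement(url, tag).text = entry[tag]
    return ET.tostring(urlset, encoding="unicode")


# Hypothetical entries for illustration only.
ENTRIES = [
    {"loc": "http://www.example.com/", "lastmod": "2006-11-01",
     "changefreq": "weekly", "priority": "0.8"},
    {"loc": "http://www.example.com/orphan-test-page.html"},
]

print(build_sitemap(ENTRIES))
```

A crawler could ignore every optional tag here and still come away with two usable URLs.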
I am currently running some tests that should prove definitively whether Yahoo can (and will) extract URLs from an xml sitemap. It could take a few weeks, but I'll certainly share my results here.
Posted by Erik Dafforn at 11:49 PM
October 31, 2006
Optimizing Sitemaps Feeds for Yahoo 
posted by Erik Dafforn in category: Yahoo
If you're submitting sitemap feeds to Yahoo, consider using the exact same file you use for your Google feeds (often sitemap.xml or sitemap.xml.gz by default).
Until recently, I'd been using another of Yahoo's recommended formats, urllist.txt (due to its minimal file size), but I hadn't been watching the output code as closely as I should have. I'd been exporting the sitemap.xml file directly to urllist.txt.
As it turns out, this can create bloat even in a text file, because (depending on the program you use to create it) your Google sitemap.xml file contains many URLs you might not actually want to be crawled.
To clarify, I create many Google sitemap.xml files and tell Google to "check" but not "crawl" the incidental graphics files (used in design, nav, and so on). But upon export to urllist.txt, my program was simply listing these graphics files in the list to be crawled, just like all html files. That more or less tripled the size of the file, with two-thirds of the content being URLs I didn't even care about.
As a result, I deleted the reference to urllist.txt in Yahoo Site Explorer, and instead told it to fetch sitemap.xml, and within a week, the index count at Yahoo tripled. (Note that we've been working on a few other things for this site too, so I'm not necessarily claiming a 1:1 relationship here. But I know my change didn't hurt.)
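If you do want to keep a urllist.txt, the bloat described above can be avoided by filtering during export. This is a minimal Python sketch, not the program I actually use: the namespace, the list of graphics extensions, and the sample URLs are all assumptions.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlsplit

# Assumption: sitemaps.org 0.9 namespace; adjust to match your file.
NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
# Assumption: file types treated as incidental graphics.
GRAPHIC_EXTENSIONS = (".gif", ".jpg", ".jpeg", ".png", ".ico")

SAMPLE = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>http://www.example.com/</loc></url>
  <url><loc>http://www.example.com/images/nav-left.gif</loc></url>
  <url><loc>http://www.example.com/articles/sitemaps.html</loc></url>
  <url><loc>http://www.example.com/images/logo.PNG</loc></url>
</urlset>"""


def sitemap_to_urllist(sitemap_xml):
    """Return urllist.txt content: one URL per line, graphics dropped."""
    root = ET.fromstring(sitemap_xml)
    lines = []
    for loc in root.iter("{%s}loc" % NS):
        url = (loc.text or "").strip()
        # Compare on the lowercased path so .PNG and .png both match.
        path = urlsplit(url).path.lower()
        if url and not path.endswith(GRAPHIC_EXTENSIONS):
            lines.append(url)
    return "\n".join(lines) + "\n"


print(sitemap_to_urllist(SAMPLE))
```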
Also follow this thread at YSE forums, where later, "Mr. Slurp" offers a user some keen insight into how Yahoo interprets typical "home" pages such as default.htm, etc. I guess the moral of the story is, canonicalization is in the eye of the beholder - never exclude when you can redirect.
Posted by Erik Dafforn at 11:59 PM
October 30, 2006
Searching for Horror Movies on TiVo 
posted by Erik Dafforn in category: Misc
Okay, it's not technically SEO, but it's still "search."
It's time to utilize TiVo to find some good horror classics from your local cable/satellite provider. If you're in the mood for something scary but don't know exactly what you want to watch, here's how to find all the horror movies shown in your area over the next two weeks:
- Start at TiVo Central.
- Select Find Programs.
- Select Search by Title.
- Select Movies.
- Select Horror.
- Scroll down through the alphabet listing and select the 0 (zero).

The list of all horror movies will appear in the right column. Scroll through the entire list to see what sounds good during this spooky season.
Posted by Erik Dafforn at 7:59 AM
ISBN-13 and SEO for Publishers 
posted by Erik Dafforn in category: User Behavior
Just a note to those who still publish on paper (and that's a lot of you): Even though ISBN-13 doesn't officially go into effect until January 1, it's not too early to start integrating your second set of book numbers into your site's copy, page titles, and internal & external linking strategy.
ISBN searches typically don't make up your bread and butter, but put together, it's a nice long set of crumbs - especially if you're one of the first ones to get indexed with the new data.
Posted by Erik Dafforn at 7:37 AM
October 25, 2006
SES Registration - Now Five Times as Open! 
posted by Erik Dafforn in category: SEO Industry News
Say, does anyone know whether registration is now open for Search Engine Strategies Chicago 2006?

Oh, wait. Forget I asked.
Posted by Erik Dafforn at 12:54 PM
October 24, 2006
Help Google Tackle Vertical with a Custom Search Engine 
posted by Erik Dafforn in category: Google
The big buzz this morning (Threadwatch, SEW Blog, WebmasterWorld) appears to be Google's announcement that it's allowing users to create custom engines based on the main Google index.
On the technical side, this is little more than using a subset of Google's index to narrow the number of specific sources that appear on a Google results page. In other words, if you're a dentist hoping to create the Next Great Dental Search Engine, you could tell Google to include only the sites you want, perhaps such as the American Dental Association, your own site, a few hand-picked dental blogs (I'm sure they exist ... right?), and so on.
This way, when someone searches for [plaque] on your custom dental engine, you won't get any trophy shops popping up in the results.
This is nothing new. Sites like Rollyo have been doing it for a while (although they're somewhat parasitic, using Google's index as a back end). But Google offers more horsepower than the others, letting your trusted users help you build the list of resources your engine uses, as well as letting you add thousands of sites into your custom index.
So let's call this what it really is. I've talked before about the potential power of small, vertical engines vs. large, catch-all indexes like Google's. It's pretty much understood that Google owns the latter segment. With this announcement, Google is offering its users the distinct privilege of enabling Google to own the first.
Matt Cutts pretty much sums up the potential here (emphasis added):
I do think that this launch will kick off a lot of opportunity that not everyone will see or understand at first. For example, the first person to make a truly kick-butt search engine about biking will likely start to attract volunteers and traction and first-mover attention, and could very well become the authority search for that niche. I think that this launch could kick off a wave of search over a long tail of niches; rather than a big vertical like “health,” someone could make a search for the much much smaller “health at every size” movement.
So Google is leveraging the power of communities to let communities themselves build their own vertical search sites. And Google runs its ads alongside these highly-targeted, loyal-user sites.
Smart and efficient. Sounds like in addition to C++ Programming Fundamentals, a few Googlers have been reading Tom Sawyer too.
Posted by Erik Dafforn at 8:20 AM
October 19, 2006
Using Google Analytics Bounce Rates to Gauge Site Stickiness 
posted by Erik Dafforn in category: Web Analytics
Buried deep in the guts of Google Analytics is a report called "Entrance Bounce Rates." It's off the beaten path of "Referring Source" and "Total Visits," so it doesn't always get a lot of attention. But it offers valuable information about visitors' habits on your site. Here's how to access the report:

The Entrance Bounce Rates report shows you a list of "entrance" URLs for your site (those URLs that people used to enter the site, whether via a search engine, third-party link, etc.) and the percentage of visitors who left your site after viewing only that page. Thus, if your bounce rate for a page is 100%, that means each person who entered the site on that page viewed that page only, then left the site. Like golf, the lower the number, the better.

If you don't know what to look for, the numbers can be confusing, or worse, useless. But when you filter the data by content area, things begin to make sense. For instance, if we wanted to measure bounce rates on this site for articles written in 2005, we need only enter that folder in the filter box, hit the plus sign, and we have our data.

The filter button is a toggle. When you press the green plus sign once, it becomes a red minus sign. If you press this, it enables you to see the bounce rates for every page except those in the filtered directory:

So comparing the stickiness of the articles written in 2005 vs. those written in 2006 happens in only a few seconds. Use this method across multiple categories of your site to see the rates in your case studies, executive bios, pages within a certain product or service area, and so on. If people leave one area of the site more frequently than they do in others, why is that? Did you offer a call to action there? Did you give them further opportunity to find out more?
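If you'd like to see the arithmetic behind the report, here is a minimal Python sketch. It is not how Google Analytics computes anything internally; the visit data, folder names, and function name are made up for illustration. Each visit is a list of pages in the order viewed, the first page is the entrance, and a one-page visit counts as a bounce. The folder and exclude arguments mimic the plus and minus states of the filter button.

```python
from collections import defaultdict


def entrance_bounce_rates(visits, folder=None, exclude=False):
    """Return {entrance_url: bounce rate in percent}."""
    entrances = defaultdict(int)
    bounces = defaultdict(int)
    for pages in visits:
        if not pages:
            continue
        entry = pages[0]
        if folder is not None:
            in_folder = entry.startswith(folder)
            # Plus sign keeps only the folder; minus sign keeps the rest.
            if in_folder == exclude:
                continue
        entrances[entry] += 1
        if len(pages) == 1:
            bounces[entry] += 1
    return {page: 100.0 * bounces[page] / entrances[page]
            for page in entrances}


# Hypothetical visits: each list is one visitor's path through the site.
VISITS = [
    ["/2005/article-a.html"],
    ["/2005/article-a.html", "/contact.html"],
    ["/2006/article-b.html"],
    ["/2006/article-b.html"],
    ["/2006/article-c.html", "/2005/article-a.html", "/contact.html"],
]

print(entrance_bounce_rates(VISITS, folder="/2005/"))
print(entrance_bounce_rates(VISITS, folder="/2005/", exclude=True))
```

Note that article-a's appearance in the middle of the last visit doesn't affect its rate, because only entrance pages are counted.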
Answering these questions requires some time and perhaps some tough content decisions, but it's an effective way to gauge the effectiveness of certain segments of your content - and in turn, create a more compelling, sticky, and (ideally) profitable site.
Posted by Erik Dafforn at 10:34 AM
October 17, 2006
Google Earth Gets Duped - by Google Earth 
posted by Erik Dafforn in category: Google
Conventional SEO wisdom has generally arrived at the point that claims duplicated content won't necessarily hurt you, but it won't really help you either. If you present Page X on two unique URLs, so goes the lore, engines don't know which version to pick, so they'll probably just pick one, although you don't necessarily have control over which one they'll decide to include.
That is, unless the engine owns Page X.
I was doing a little research on Google Earth, so I typed what I figured would be the correct URL: www.google.com/earth. Google is usually really good about guessing what people will type, and if that person is wrong, redirecting him to the proper page. But I wasn't wrong, because Google Earth did resolve at that address.
But I clicked around for a while, and wouldn't you know it, before long I was on the earth.google.com subdomain, and I was pretty sure I hadn't been redirected.
So www.google.com/earth/ and earth.google.com are identical. But, no big deal, right? After all, won't Google simply decide which of its pages to show in a query for [google earth]?
Not necessarily. Following is the results page for that query (notice the listings in red boxes):
![The first and seventh results for [google earth] go to the same page](http://seoblog.intrapromote.com/google-earth-serp2.jpg)
The bottom line is, despite the fact that Google says "Don't create multiple pages, subdomains, or domains with substantially duplicate content," Google gives itself double the exposure on the results page because it has the same content sitting on two different URLs. Kids, if you try this at home, don't expect the same results. (And wear a helmet.)
And as a neat parlor trick, the two different entries for the page (again, boxed in red) even have faux differences. The top listing shows the DMOZ description for Google Earth. The lower listing shows copy pulled from the page body.
Oh, and notice that boxed in yellow, the exact same thing happens with Wikipedia. The only difference is an internal redirect on the Wikipedia site between /Google_Earth and /Google_earth. Get it? When you're Wikipedia, a character in lowercase is enough to get you a dual listing - to the exact same content.
Posted by Erik Dafforn at 11:31 PM
October 12, 2006
What Google Sitemaps Can and Can't Do 
posted by Erik Dafforn in category: Crawling and Indexing
On the heels of my complaint that Google is sending mixed messages about how it interprets the "nofollow" link attribute, I need to give credit where it's due. Earlier in the week, Matt Cutts asked his readers to report genuine SERP bugs. Not the type of "bug" characterized by your competitor's site ranking higher than yours, but queries that return truly crazy results.
The post itself was fairly unremarkable because its purpose was merely to narrow his definition of "buggy." But in the comments section, when asked by a reader why the reader's site showed only two pages indexed at Google "when the site has several more pages that are search engine friendly and a Google Sitemap," Cutts dropped a nugget of gold:
The fact is that if you want Google to crawl you deeply (more than the 1-2 urls), you do need to have some links. Submitting a sitemap to Google lets us know those urls exist, but sitemaps are also not a back door; if no one at all in the whole web links to your domain at all, Google won’t crawl you as deeply.
Sounds like a great excuse to plug a link building service, but that would be crass. Instead, I'll just mention that a Google Sitemap is great for truncating the crawling time required for a site, but it's not a shortcut to ranking well, and as this quote states, it's not even a shortcut to getting indexed.
Per his request, I did leave a comment about the [therapy products] query that Sean caught in June. He said he'd pass it along.
Posted by Erik Dafforn at 11:25 PM
October 10, 2006
High Rankings Seminar in Dallas/Ft. Worth - Oct. 19-20 
posted by Erik Dafforn in category: SEO Industry News
Colleague and fellow chocolate lover Jill Whalen asked us to post the details of her upcoming SEO seminar.
For the uninformed - if there are any left - Jill has been doing SEO since long before it was called "SEO." She's a regular speaker at SES and has been since Larry and Sergey were studying for their SATs. (That's probably an exaggeration, but it makes a good line.)
Here are the specs:
What: High Rankings® Search Engine Marketing Seminar
When: October 19 & 20, 2006
Where: American Airlines Training and Conference Center in Dallas/Ft. Worth
Why? To deliver the proven strategies and techniques of search engine optimization that will make your site work harder than it ever has before.
Get full details at Jill's site.
Discount: If you use INTRAPROMOTE as your discount code when registering, you'll save 25% off the seminar's sticker price.
Note that we do not receive any sort of referral fee, nor would we ask for one. Our recommendation is not for sale. We mention this only because Jill's that good, and sites of any size will benefit from her program.
Posted by Erik Dafforn at 3:34 AM
October 5, 2006
Keyword Research with Keyword Discovery: A Few Tips 
posted by Erik Dafforn in category: Keywords
If you're smart, your SEO project begins with solid keyword research. We've subscribed to Wordtracker since (it seems like) the late '60s, but we've also subscribed to Keyword Discovery for about a year.
This post shares a few simple tips on how to interpret and maximize the data you get from Keyword Discovery.
Dealing with Strange Numbers in Results: Sometimes a term search results in some strange numbers. For example, in the following graphic, you'll note that a lot of people seem to search for odd terms, such as 10025 cotton shirts and discount women27s hanes t shirts.

You'll see odd results like this when hexadecimal character codes (the kind used in URL encoding, such as %25) aren't properly translated back into the characters they represent. On a hex-to-HTML chart, 25 corresponds to the % sign, and 27 is a single prime (more commonly used as an apostrophe). So the actual terms here are 100% cotton shirts and discount women's hanes t shirts.
If you run into some confusing instances, look at the table on a page like this one to convert hexadecimal codes to their actual HTML characters.
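If you'd rather not consult a chart, the lookup is easy to script. This is a minimal Python sketch using only the standard library; the helper name is my own. It assumes the odd numbers come from URL-encoded queries whose percent signs were dropped somewhere along the way, which is my reading of the data rather than anything Keyword Discovery has confirmed.

```python
from urllib.parse import unquote


def hex_code_to_char(code):
    """Translate a two-digit hexadecimal code into its character."""
    return chr(int(code, 16))


# The two codes from the examples above.
print(hex_code_to_char("25"))
print(hex_code_to_char("27"))

# With the percent sign still in place, the standard decoder does the work.
print(unquote("100%25 cotton shirts"))
print(unquote("discount women%27s hanes t shirts"))
```

Once the percent sign is gone, a script can't tell whether "25" is a code or part of a number, so garbled terms still need a human eye.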
Finding the Demand Trend: Early on, in the Analyze pane, you could get a nifty graph of the last 12 months' demand for a specific word or phrase. I was blown away when I noticed that the feature seemed to disappear. Thankfully, it didn't; it was merely relocated. Now, to find the demand graph, you need to click the specific number of results in the Searches column in the Research pane immediately after you hit the Search button. For example, in the graphic above, you'd click the number that I've highlighted in pink. Following is a sample demand graph for a specific phrase:

This data isn't always perfect. For example, the spike in April looks a little suspect, or it might correspond to a large TV ad campaign or other offline project. But if you trust your data, we feel the best time to make changes to your pages or test some optimization is during the beginning of a lull - so that you can have time to refine your changes before the next seasonal growth spurt hits.
Negative Filters: You can filter out terms from the search box to save time. For example, if you're looking for terms related to social or business networking in Los Angeles, you might enter the following string at the search box:
los angeles networking -computer -computers -it
Currently, you can filter out up to five terms. After that, if you need to delete additional veins, I recommend exporting to Excel and doing additional custom sorts to find the irrelevant terms.
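Once you've exported, the same style of negative filtering is easy to apply yourself with no five-term limit. Here is a minimal Python sketch; it is my own approximation, not Keyword Discovery's matching logic, and the sample terms are made up. It requires every plain word in the query and rejects any term containing a word prefixed with a minus sign.

```python
def parse_query(query):
    """Split a query into required words and excluded words."""
    required, excluded = [], set()
    for token in query.lower().split():
        if token.startswith("-") and len(token) > 1:
            excluded.add(token[1:])
        else:
            required.append(token)
    return required, excluded


def filter_terms(terms, query):
    """Keep terms with all required words and no excluded words."""
    required, excluded = parse_query(query)
    kept = []
    for term in terms:
        words = term.lower().split()
        has_required = all(word in words for word in required)
        if has_required and not excluded.intersection(words):
            kept.append(term)
    return kept


# Hypothetical exported terms for illustration.
TERMS = [
    "los angeles networking events",
    "los angeles computer networking",
    "los angeles it networking jobs",
    "business networking los angeles",
    "networking chicago",
]

print(filter_terms(TERMS, "los angeles networking -computer -computers -it"))
```

Matching is on whole words, so -computer alone would not remove a term containing computers; both forms have to be listed.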
Plurals: KD doesn't yet handle plural forms well. For example, if you search for terms with computer, you'll need to do a similar search for the same terms with computers as well. According to the KD support forum, they're working on it.
Flushing a Project: Depending on how you allocate your projects, you might (as I did) find yourself taking forever to delete the contents of a permanent project, 100 keywords at a time (the program's max). I don't know why it took me so long to figure out, but a much smarter way of working is to simply delete the project and immediately create a new project with the same name.
The growth of Keyword Discovery has (in my opinion) forced Wordtracker to make some improvements of its own. In a follow-up post, I'll discuss some of Wordtracker's latest enhancements.
Posted by Erik Dafforn at 6:57 AM
October 3, 2006
This is Why Users Mistrust PPC Ads 
posted by Erik Dafforn in category: PPC
I'm often a little wordy in my posts, so as an experiment, I'm going to try to get my point across almost entirely with screen shots (and Alt tags, of course). Here's the scene: I wanted to buy a Spirograph for my daughter. Let's see how that goes.
Step 1. The search:

Step 2. The click:

Step 3. The PPC landing page:

Step 4. The internal search:

Step 5. The internal search result:

Step 6. Checking Adwords policies:

Step 7. Whatever.
Posted by Erik Dafforn at 11:16 PM
September 27, 2006
Revisiting the Many Faces of Nofollow 
posted by Erik Dafforn in category: Crawling and Indexing
About a month ago, in my post about Del.icio.us cloaking its robots meta tags, I got into a great discussion about the relative functions of the "nofollow" robots meta tag vs. the "nofollow" link attribute.
Jason Dettbarn, in the comments, said
I know one is global and one is used for individual links. But end result, what is the difference? Both tell robots not to follow the links. The reason I bring this up, is because del.icio.us still has the "nofollow" link attribute on the individual links, regardless if your user agent is "normal" or Googlebot or whatever. So the link juice still doesn't pass, regardless of the meta tag.
to which I replied,
As for the nofollow meta tag and the nofollow link attribute having the same effect, that's not my understanding. As I understand it, the nofollow meta tag tells the bot to literally not crawl the target page, while the nofollow link attribute does NOT instruct the bot to avoid crawling the link, but instead, tells it to merely not pass link popularity (or PR, or however you want to think about it). So having a nofollow meta tag on a page does (or should) put a stop to indexing pages linked from that page, while having the nofollow link attribute enables indexing of links on the page but does not allow them to pass popularity.
Jason set me straight by pointing me to an earlier Cutts post that described the two as being similar in functionality. With that, I felt a little foolish, although it didn't negate the main point of my post, which remains that Del.icio.us is misleading its users.
Fast forward. I feel vindicated today, because while I still wasn't right (at least in terms of what Matt Cutts says, which I'll consider authoritative), I certainly wasn't the only one who believed that the two "nofollow" attributes have different purposes.
In an interview with John Battelle yesterday, Matt Cutts once again equated the attributes. This morning, Danny Sullivan asked for clarification, saying
Let's back up. You can put a meta robots tag on your pages with the value of "nofollow," as described here. This tag, about 10 years old now, long predates any concerns about link selling skewing search results or the nofollow attribute. It is supposed to tell a search engine not to follow any links on a page, for purposes of indexing those links....
Now on to the nofollow attribute. Created in January 2005, it was a way to flag particular links to search engines as those a site owner doesn't explicitly approve of. It was never defined as a means to telling search engines not to actually "follow" the link. It was more a way to say that you don't endorse the link. In fact, to my knowledge, Yahoo and perhaps others will still "click on" or follow links even if they make use of the nofollow attribute.
I doubt that Matt Cutts misunderstands Google's methodology in dealing with the two "nofollow" attributes, so I'm officially changing my beliefs on their usage. But I do feel better knowing that I'm not crazy, and that others (of significant influence and industry knowledge, no less) were lured into believing as I did.
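Whatever the engines do with them, the two forms are easy to tell apart in a page's source: one is a page-wide robots meta tag, the other is a rel value on an individual link. Here is a minimal Python sketch that reports both for a page. The class name and sample markup are my own, and it says nothing about how any engine actually treats what it finds.

```python
from html.parser import HTMLParser


class NofollowAudit(HTMLParser):
    """Record the page-level meta directive and per-link rel values."""

    def __init__(self):
        super().__init__()
        self.meta_nofollow = False
        self.links = []  # (href, carries rel="nofollow")

    def handle_starttag(self, tag, attrs):
        attr = {name: (value or "") for name, value in attrs}
        if tag == "meta" and attr.get("name", "").lower() == "robots":
            content = attr.get("content", "")
            directives = [d.strip().lower() for d in content.split(",")]
            if "nofollow" in directives:
                self.meta_nofollow = True
        elif tag == "a" and "href" in attr:
            rel_values = attr.get("rel", "").lower().split()
            self.links.append((attr["href"], "nofollow" in rel_values))


# Hypothetical page using both forms at once.
PAGE = """<html><head>
<meta name="robots" content="noarchive,nofollow">
</head><body>
<a href="http://example.com/a">plain link</a>
<a rel="nofollow" href="http://example.com/b">flagged link</a>
</body></html>"""

audit = NofollowAudit()
audit.feed(PAGE)
print(audit.meta_nofollow)
print(audit.links)
```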
Posted by Erik Dafforn at 10:05 AM
September 20, 2006
How Will Ask Profit? Should've Asked the Prophet 
posted by Erik Dafforn in category: Ask
Our own Sean Bolton goes on a rant now and again, and more often than not, he's right.
Last December, he took Ask to task for not being particularly realistic in continuing with its homegrown PPC program. In particular, he had this to say:
Perhaps Mr. Jeeves should ask himself a few questions:
- Does it make sense for me to continue PPC when I can just earn similar or possibly better revenue by just leveraging the existing relationship with Google for AdWords rev?
- Will I do an effective enough job in PPC sales and customer service to some day kill my relationship with Google and keep all the green to myself?
- Why do I have less than 6% market share in the search engine war?
I kept this in mind when reading today's MediaPost article, Diller: Ask.com To Continue Outsourcing Paid Search, which specifically states that back when IAC purchased Ask,
...one of the company's priorities was developing its own paid search platform for advertisers. But Barry Diller said Tuesday that the company has since changed its strategy. Now, he said, IAC is focusing more on drawing consumers to the site than selling its own pay-per-click ads to marketers. "Queries will build revenue," Diller said at a Goldman Sachs investor conference. He said the company's goal is to capture 10 percent of search queries--up from around 2 percent on Ask.com now and 5-6 percent considering other offerings.
Sean's above saying he told us so, but I'm not. He told us so.
Posted by Erik Dafforn at 4:06 PM
September 19, 2006
Search Engine Optimization as Defined by the US Government 
posted by Erik Dafforn in category: Misc
When our former PPC Director, Adam Lasnik, took a job at Google working with Matt Cutts, we knew it was Google's gain.
But I'm not above making sure his time is well spent. Just yesterday, Adam was to have taught an SEO seminar at Catholic University. The target audience? Government employees.
But do government agencies really need SEO help? I decided to query a few .gov sites to see what they say about how to optimize their sites. (I fully acknowledge that many of these might be officially outdated, but they are still live pages.)
- Here are a few tips on search engine usage (PDF) from the Department of Energy's Oak Ridge Office:
- Use two or three search engines since no one will cover all Web sites
- Keep up with information about the Web by surfing various sites and talking to friends
- The US Government's Export Portal (PDF) challenges you to keep a close, regular watch on your conversions:
How will I know that my site is successful? Look at your goals every three to six months. Have you met them? If so, is it time to create new, more challenging goals?
- FirstGov reminds us (cached version of a PPT file) that "MSN and Yahoo! obey robots exclusion more often than Google." Ouch. Also, in case you were going to send it a Christmas card, "Google’s algorithm is called Page Rank."
I hope Adam had a full house (sounds like he did) and that the participants were able to take away a great deal of information. I'm considering hitting universities next to see what they're teaching about SEO. I'm sure Adam would love to head back to school to star in his own lecture series.
Posted by Erik Dafforn at 5:08 PM
| Comments (1)
| TrackBacks (0)
Printer-friendly version
September 13, 2006
Site Verification Headaches with Yahoo and Google Sitemaps 
posted by Erik Dafforn in category: Yahoo
In the spirit of Doug's most recent post about Google Webmaster Central, I wanted to add a few notes about both it and its Sunnyvale counterpart, Yahoo Site Explorer's recently updated webmaster area.
Yahoo Site Verification. I spent about an hour this morning preparing to verify about 20 sites for a very large client. One thing that's REALLY annoying about Yahoo's site verification process is that each site requires a unique text file - complete with unique filename and unique 16-character text string within the file - uploaded to the root.

Now, of course you can't create all 20 verification files, dump them into an email message, and send them to the client for uploading, because the client won't know what file goes with what site. So I created a folder for each file and zipped all the folders into one Zip archive.
I also added an Excel sheet with columns for the site, filename, and character string, because more than once, I've sent Yahoo authentication files over and over, only to have the recipient complain that the attachment didn't make it through. Apparently, many zealous mail clients look askance at curiously named, 16-byte file attachments. With the Excel file, I had a failsafe record of each verification file's contents in case they needed to be recreated by the client.
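For anyone facing the same chore, the packaging step can be scripted. The following is only a rough sketch: it swaps the Excel sheet for a plain CSV manifest, and the function name, site names, filenames, and tokens are all made-up placeholders rather than anything Yahoo actually issues.

```python
import csv
import io
import zipfile


def build_verification_bundle(sites):
    """Return the bytes of a zip with one folder per site plus a CSV manifest.

    `sites` is a list of (site, filename, token) tuples. Every value passed
    in this example is a placeholder, not real verification data.
    """
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        manifest = io.StringIO()
        writer = csv.writer(manifest)
        writer.writerow(["site", "filename", "token"])
        for site, filename, token in sites:
            # One folder per site, so the recipient knows which file goes where.
            archive.writestr(f"{site}/{filename}", token)
            writer.writerow([site, filename, token])
        # The manifest is the failsafe record for recreating a lost file.
        archive.writestr("manifest.csv", manifest.getvalue())
    return buffer.getvalue()
```

Writing the returned bytes to a single .zip file gives the client one attachment, and the manifest inside it lets them rebuild any file that a mail filter eats along the way.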
I'm sure Yahoo has a reason for giving each user a different authentication filename AND character string for EACH site that needs to be authenticated. I'm just not sure what the reason is.
Contrast this with Google Verification. First, I have to be honest and admit that I'd verified about a half dozen sites through Google before I realized that the server was spitting out the exact same authentication file every time. Few people understand that with Google, your unique verification file (tied to your personal Google account) is your backstage pass to any concert you want. You can view the stats for any site that hosts your verification file in its root, and a site can host verification files for as many people as need access to the stats.
So verify one site, then keep that verification file in a place you'll remember. From then on, you don't need to go through the process of having Google spit out the same info again and again, each time you want to verify a new site. Just upload your file to the root and Verify.
Like Yahoo's verification files, Google's also suffer from Napoleon Complexes - in fact, with no recommended content at all (just unique filenames), email clients are even more suspicious of them, because at 0 bytes, they're infinitely smaller than Yahoo's 16-byte files. While Google doesn't specifically demand that your file contain text, it doesn't discriminate against files that do. So here's a tip: Add some nonsense text to your Google verification file, and I think you'll find it more easily passable through email.
Posted by Erik Dafforn at 11:34 PM
September 5, 2006
MSN Loves Blogs Too 
posted by Erik Dafforn in category: Blogging
We (and our ilk) often talk about how Google favors, at least algorithmically, the blog format. But MSN is right up there in terms of giving preferred treatment to blogs.
"Preferred treatment," of course, is a misnomer and a bit of a joke. After all, the preferential nature goes both ways. Well optimized blogs give engines what they like, and engines respond in kind.
A client started a blog recently that contains posts built of press releases, industry news, pointers to other articles across the web that highlight how his industry's technology is utilized around the world, and occasional links to new content on the main client site. All new material - nothing reprinted.
This client also retains the services of a very smart host/web dev consultant who wrote some nice code to query the blog database, pull the five most recent posts, and link to them statically from the main site's home page. (Often, a "syndication" technique like this would use scripted links to pull the blog's most recent entries.)
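I haven't seen the consultant's code, so the following is only a guess at the general shape of the idea: query the blog's database for the newest posts and write them out as ordinary HTML links that a crawler can follow. The `posts` table, its columns, and the function name are hypothetical.

```python
import sqlite3
from html import escape


def recent_post_links(conn, limit=5):
    """Render the newest posts as plain, crawlable HTML links.

    Assumes a hypothetical `posts` table with title, url, and published
    columns, where published is an ISO date string that sorts correctly.
    """
    rows = conn.execute(
        "SELECT title, url FROM posts ORDER BY published DESC LIMIT ?",
        (limit,),
    ).fetchall()
    items = [
        f'<li><a href="{escape(url, quote=True)}">{escape(title)}</a></li>'
        for title, url in rows
    ]
    return "<ul>\n" + "\n".join(items) + "\n</ul>"
```

The point of doing it this way is that the links land in the page source itself, instead of being written in by a script after the page loads.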
The blog began pulling long-tail Google queries within a couple weeks of its first post. But a month later, MSN was out-referring people to the blog. In fact, the blog had taken over MSN's very top spot for a two-word phrase that the client's main site had formerly held.
Too many times, site owners leave it at that - "Search engines love blogs" - and don't grasp that the logical next step is to treat all sites, not just blogs, like blogs. Constant content generation. Generous linking. Smart, keyword-based nomenclature. Be sure to give your "bread and butter" content - everything about your business - the benefit of an architecture that gets the engines' attention.
Posted by Erik Dafforn at 11:54 PM
August 30, 2006
Del.icio.us Leaves a Bad Taste 
posted by Erik Dafforn in category: Link Building
If this has been covered already, let me know. If so, I'll graciously provide attribution.
"Social tagging," the process of users sharing bookmarks and feedback about specific sites and pages, is near the top of the list of cornfields on which SEOs are trying to erect slick new subdivisions.
As social media sites have gained popularity, many SEOs have lamented the fact that Del.icio.us uses the robots meta tags nofollow, noindex, and noarchive as a way to avoid spam. If links don't pass popularity, then they won't be abused, so the theory goes. (Don't confuse this nofollow with link attribute nofollow.)
This has left many people wondering why a query for [site:del.icio.us] shows about a million and a half pages indexed, and why the site ranks for queries like [seo] and [popular]. Some people believe it's due to incoming linkage and Google's tendency to show URLs in results even though Google has been told not to index them.
For better or worse, the truth is much simpler. Google was never told to not index Del.icio.us pages. YOU were told that GOOGLE was told not to index pages. But Google? They never got the message, because Del.icio.us has been using user-agent delivery (yes, cloaking) to tell you one thing, and engines another.
Following is the famous meta tag from the Del.icio.us "SEO" tag page - the meta tag that makes everyone think the page won't be crawled:

But if you set your user-agent to Googlebot, here's what you see:

Where did those highlights go? Only her hairdresser knows for sure.
The robots.txt file for the site is no different. Here's the file for standard user-agents:

I left some extra whitespace in the screen shot to show that nothing follows the code lines.
User-agents Googlebot and Slurp each get additional lines in their versions of robots.txt. Following is what Google sees:

What annoys me about this process is not that Del.icio.us is trying to put one over on Google or Yahoo (the latter would be especially odd, given that Yahoo owns Del.icio.us), but that Del.icio.us is trying to put one over on YOU. Certainly Google and Yahoo know what's going on. Millions of pages don't magically appear when valid noindex tags are in place. Del.icio.us wants to be a popular destination, wants its search engine rankings, but it doesn't want all the riff-raff that popularity brings. Old-school cloaking that a 10-year-old could detect isn't a way to achieve that.
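If you'd like to run this kind of check yourself, here is a minimal sketch of the comparison step. It assumes you've already fetched the same URL twice, once with a normal browser user-agent and once with a crawler's, and it only compares the robots meta directives in the two copies. The class and function names are mine, and the fetching is deliberately left out.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collect the directives from every <meta name="robots"> tag."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr_map = {name: (value or "") for name, value in attrs}
        if attr_map.get("name", "").lower() == "robots":
            for part in attr_map.get("content", "").split(","):
                if part.strip():
                    self.directives.add(part.strip().lower())


def robots_directives(html):
    """Return the set of robots meta directives found in an HTML string."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.directives


def cloaking_diff(html_for_browser, html_for_bot):
    """Return directives shown to the browser but withheld from the bot."""
    return robots_directives(html_for_browser) - robots_directives(html_for_bot)
```

An empty result means both visitors were told the same thing. Anything left in the set is a directive that people see and the crawler doesn't.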
Posted by Erik Dafforn at 4:52 PM
August 22, 2006
When User Behavior Strays from Query Volume 
posted by Erik Dafforn in category: User Behavior
It's important but often overlooked: User behavior is rarely as steady and predictable as keyword research might lead us to believe. Eye-tracking studies, statistics about SERP clicks ("the first result gets X% of all clicks"), etc. are helpful if you understand that they're aggregates and not absolutes.
If you rank in the top slot for two phrases, and one of them is three times more searched-for than the other, you might assume that, all things being equal, the more popular term will deliver three times the traffic.
That's rarely the case, especially when those two queries straddle the border between branded and non-branded.
Following are some examples that we've noticed in multiple industries, for multiple clients. The companies and queries are fictitious; it's the types of queries, however, that matter - [product type] vs. [brand + product type]. The raw numbers - queries per day and monthly traffic - are irrelevant. Instead, it's the ratios we're watching.
Note that in the first example, [conflators] is searched for about 4.5x as often as [merrick conflators] (despite the fact that in the conflator world, Merrick is tops). At Google, Merrick ranks #1 for both terms. Yet [merrick conflators] delivers about twice the traffic of [conflators]:

In our second example, Simonaire is well known in the flot scram industry, but probably not as well known as Merrick is in the conflator biz. Still, Simonaire ranks #1 for both [simonaire flot scrams] as well as [flot scrams]. The non-branded term has about 30 times the query volume, but again, delivers only about half the traffic of the branded term.

Note: The query volume figures were pulled from Keyword Discovery. Wordtracker data varies slightly but is similar.
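To put the two examples on the same footing, it helps to divide the traffic ratio by the query-volume ratio. The sketch below uses only the rough ratios quoted above (4.5x and 30x the volume, about half the traffic in both cases); the function name is just for illustration.

```python
def relative_yield(volume_ratio, traffic_ratio):
    """Visits per search for the generic term, relative to the branded term.

    volume_ratio:  generic searches / branded searches
    traffic_ratio: generic visits / branded visits
    """
    return traffic_ratio / volume_ratio


# Ratios as stated in the two examples; both are approximations.
conflators = relative_yield(volume_ratio=4.5, traffic_ratio=0.5)
flot_scrams = relative_yield(volume_ratio=30, traffic_ratio=0.5)
```

By that measure, a search for [conflators] sends roughly one-ninth as many visitors as a search for [merrick conflators], and a search for [flot scrams] roughly one-sixtieth as many as its branded counterpart.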
The conclusions of this non-scientific study aren't so easily drawn, but here are some observations and speculations:
- The point of this analysis is not to dissuade brands from going after single-word product queries. They should, however, realize that the percentage of clicks they receive from a top slot might not be what they expect.
- These results imply that people searching for [conflators] are not very far along the information cycle yet and might actually want to avoid a specific brand at this stage in their research, opting instead for a comparison site, wiki-style information site, consumer-focused FAQ site, etc.
- It's tempting to tweak titles, descriptions, and content to try to appear more cross-brand informational and capture more of the [conflators] traffic. But I don't recommend doing it at the expense of your branded traffic, because click for click, I believe a branded click is more valuable than a non-branded click.
- Visitors from the product-only searches stick around 50-60% as long as the branded visitors, and they view about 75% as many pages in a visit. So we're gaining mind share a few at a time, and we certainly don't mind that their first look at the industry comes from our clients.
Posted by Erik Dafforn at 4:10 PM
August 10, 2006
Yahoo Redirects Site: Queries to YSE 
posted by Erik Dafforn in category: Yahoo
Sometime in the last 10 hours or so, Yahoo started redirecting users (this user, at least) from search.yahoo.com to siteexplorer.search.yahoo.com whenever they enter a site: query at the search box:

One possible reason for this is that as I mentioned yesterday, Yahoo has really been beefing up its Site Explorer area, and it's the ideal place to run such queries.
Another reason might be to help balance the server load, although I doubt that site: queries are a serious threat to the Yahoo server farm.
In a somewhat related move, Yahoo seems to have phased out the sitedomain: command - both from a regular search box and from YSE. Is this new? I typically use site: at Yahoo, so this could have happened some time ago.
UPDATE: It looks like others (including SEW) noticed this in testing a few weeks ago. It does appear, however, that today marks more widespread implementation, as the Yahoo Search Blog has just posted a description of what sorts of queries do and do not get redirected.
Posted by Erik Dafforn at 10:31 AM
August 9, 2006
SEO Quick Hits: NYT Mocks AOL, Inaccessibility Turns 10 
posted by Erik Dafforn in category: SEO Industry News
Everyone's either at SES this week or too busy to attend (I fall into the latter camp), so I wanted to give you a fast, dim sum-style post today. If the first bite doesn't taste good, just move on to the next plate.
NYT + AOL = FUBAR
Gotta love the New York Times. While you might have heard about the fiasco involving AOL releasing the search queries of over 600,000 "anonymous" searchers, here's the kicker: In less than a day, the Times looked at the search queries of one particular searcher and identified her.
Perhaps next, the Wall St. Journal will both identify another user and decry the anti-privacy implications of identifying users.
Flash Turns 10
In a Wired article today, the tenth anniversary of the release of Flash, Michael Calore interviews Robert Tatsumi, one of the program's two inventors. Personally, I love Flash, in sort of the same way that exterminators love termites: job security.
Yahoo Expands Site Explorer
I've said again and again how much I love YSE. Yesterday, the team announced upgrades to the service, including the ability to "claim" your site to find additional information, upload sitemap feeds, and see when those feeds were last accessed. It's a lot like Google's new "Webmaster Central" area (formerly known simply as Google Sitemaps), which is a nice indication that Yahoo is equally committed to good webmaster relations. And it's pretty fast too; it authenticated me instantly and fetched my sitemap feed in under an hour.
I'll keep an eye on its reporting and see if there's any effect on indexing and let you know anything I find.
Posted by Erik Dafforn at 1:53 PM
August 2, 2006
More SEO Tips for Domain Management 
posted by Erik Dafforn in category: Search Engine Friendly Design
If you work in online marketing for a medium to large company - and you joined the company sometime after, say, 1996 - here's a quick quiz: Do you know how many domains your company owns? If you're directly involved in your company's current domain strategy sessions, you might, but my guess is that you don't.
Many of our client contacts have no idea how many domains their companies own, and what's worse, frequent inquiries across their organization often lead nowhere. If you fit the profile of someone who should know more about your company's domains but doesn't, it certainly doesn't mean you're doing your job poorly. Keep in mind that securing domains for business purposes (such as copyright protection and basic phonetic variance) predates search engine optimization - not to mention search engines - by several years. The web dev mercenaries who built your site and went on a dot-com shopping spree in 1994 are long gone, and bills from the registrar come sporadically, representing a gradual increase in domain ownership.
If you're curious about whether you have multiple domains diluting your search engine presence, a fast diagnostic is to search for a specific string of text that should appear only on your site. Copy about 7-10 consecutive words from a page on your site, then search for that exact string - in quotation marks - at various engines. Traditionally, this has been a great way to find sites that steal your content. But it's equally effective at detecting your crimes against yourself - and your SE visibility potential.
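Building the test string takes seconds by hand, but here's a small sketch in case you want to check many pages. It pulls a run of consecutive words from a page's text and wraps it in quotation marks; the function names and the default length of eight words are my own choices within the 7-10 range.

```python
from urllib.parse import quote_plus


def exact_phrase_query(text, start=0, length=8):
    """Return `length` consecutive words from `text`, wrapped in quotes."""
    words = text.split()
    phrase = " ".join(words[start:start + length])
    return f'"{phrase}"'


def encoded_query(text, start=0, length=8):
    """URL-encode the quoted phrase for use in a query-string parameter."""
    return quote_plus(exact_phrase_query(text, start, length))
```

Pick a stretch from the middle of your body copy rather than a headline or slogan, since boilerplate phrases tend to turn up on unrelated sites.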

Despite a glamorous Hollywood visual metaphor, Googlebot and duplicate sites aren't a good mix.
I talked with a business owner this week who had a decent idea how many domains he owned, but he had no idea that having each one mirror his "main" site was a bad idea. Google had partially indexed about nine different domains. Yahoo knows about two. MSN knows about one. All engines show at least two variations of canonical problems, including home page (index.asp vs. root) and subdomain (www vs. non-www) duplicate indexing. This is always one of the very first things we look for when we're starting a comprehensive site evaluation, and very, very few companies have all their domains wrangled correctly, down to the last 301. So the lessons bear repeating: Give the engines what they want, but give it to them only once, or else you risk looking suspicious - even if you're old-school innocent.
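The two canonical problems mentioned here (index.asp vs. root, www vs. non-www) boil down to several URLs that should all resolve to one address. As a hedged sketch of that mapping, assuming you've picked a preferred hostname, something like the following shows where each variant ought to end up. The function name and the list of default document names are my assumptions, and the code doesn't issue any redirects itself.

```python
from urllib.parse import urlsplit, urlunsplit

# Assumed default-document names; adjust to whatever your server uses.
DEFAULT_DOCUMENTS = ("index.asp", "index.html", "index.htm", "default.asp")


def strip_www(host):
    """Drop a leading "www." from a hostname."""
    return host[4:] if host.startswith("www.") else host


def canonical_url(url, preferred_host):
    """Map www/non-www and default-document variants to one canonical URL."""
    parts = urlsplit(url)
    host = parts.netloc.lower()
    # Only rewrite the host when it is a www/non-www twin of the preferred one.
    if strip_www(host) == strip_www(preferred_host.lower()):
        host = preferred_host.lower()
    path = parts.path or "/"
    head, _, tail = path.rpartition("/")
    if tail.lower() in DEFAULT_DOCUMENTS:
        path = head + "/"
    return urlunsplit((parts.scheme, host, path, parts.query, ""))
```

Any URL whose canonical form differs from the URL itself is a candidate for a 301 to that canonical address.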
(Note: Last December, I touched on a few points of search engine-friendly domain management, including wildcard subdomains and relative vs. absolute links in a nav scheme.)
Posted by Erik Dafforn at 11:49 PM
July 25, 2006
Conversions and Query Length 
posted by Erik Dafforn in category: User Behavior
At Search Engine Watch yesterday, Barry Schwartz noted that OneStat posted a study showing the breakdown of query length for July 2006. For search marketers, this is great user behavior intelligence that can benefit both organic and PPC campaigns.
I decided to do a very unscientific mutation of the report. I overlaid conversion data from a campaign I'm working on against the OneStat query-length data, just to see what would happen:

(source of pink data: OneStat. source of blue data: Intrapromote.)
If you remember much about calculus, you know that if the two lines overlap exactly, that doesn't mean that two-word phrases, for instance, convert better than three-word phrases. Instead, if the two lines overlap exactly, it means that all query lengths convert at relatively equal rates. For example, if 35% of all queries and conversions come from two-word queries, and if 15% of all queries and conversions come from four-word queries, then two-word and four-word queries convert at the same rate.
Therefore, the noteworthy locations on the graph are where the two lines diverge most dramatically. In my example, one- and two-word phrases convert at higher rates than their respective query volume rates.
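One way to read the divergence without squinting at a graph is to divide each query length's share of conversions by its share of queries. This is only a sketch of that arithmetic, and the function name is mine. The 35% and 15% figures in the usage note are the hypothetical ones from the paragraph above, not the OneStat or campaign data.

```python
def conversion_index(query_share, conversion_share):
    """Ratio of conversion share to query share for each query length.

    1.0 means that query length converts at the overall average rate.
    Above 1.0 means it converts better than its share of searches would
    predict; below 1.0 means worse.
    """
    return {
        length: conversion_share.get(length, 0.0) / share
        for length, share in query_share.items()
        if share > 0
    }
```

Feeding in the hypothetical from above, where two-word queries are 35% of both searches and conversions and four-word queries are 15% of both, gives an index of 1.0 for each length, which is the "lines overlap exactly" case.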
Practically speaking, what does this mean to a search campaign?
In my case, it's an indicator that two-word phrases are a revenue-rich target because they have both the raw search numbers and the conversion rate to pay off. Single-word terms have great conversion rates but far fewer raw searches.
Note that this is extremely unscientific. It pits worldwide query string length data against conversion data in a specific vertical. So I certainly need to crunch more numbers. But it gives some good hints about where to look to increase traffic and conversions.
Posted by Erik Dafforn at 11:30 PM
July 19, 2006
Google, Hoops and Hat Tricks 
posted by Erik Dafforn in category: Keywords
In doing some keyword research this morning, I was checking to see what Google considers to be synonyms of various sports-related keywords.
I frequently use the tilde (~) character in such searches to find ways to vary text but still keep the pages relevant for specific terms and concepts. Google has this to say about the tilde search on its search refinement page:
If you want to search not only for your search term but also for its synonyms, place the tilde sign ("~") immediately in front of your search term.
So I was a little surprised when I did a tilde search for [~nhl] to see that Google considers NBA synonymous with NHL:

I expected to see National Hockey League, and Sports didn't surprise me. But NBA? That's a stretch. (Strangely, searching for [~nba] doesn't return NHL as a synonym, as you might expect.)
I did more tilde searches for abbreviated sports leagues, such as [~mlb], [~wnba], [~nfl], and a few others. In all cases, Google typically returned as synonyms the full name of the league or another related term - but no surprises.
So who knows more about hockey than Canada? Logic would then suggest that google.ca would fix this American error - right? Almost. While google.ca still shows NBA in its results for [~nhl], NBA.com appears at number 2 instead of number 1, where google.com puts it.
Posted by Erik Dafforn at 2:50 PM
July 12, 2006
Vertical Search vs. Big Search: Drawing Battle Lines 
posted by Erik Dafforn in category: User Behavior
It's late; let's lay it on the table:
In the near future, who can offer the user more - the always-innovating "big" engines, or a slew of vertically focused niche engines?
- Slack Barshinger and SearchChannel say verticals[*] (PDF - registration req'd):
While Yahoo!, Google and the like will continue to dominate the scene and - in aggregate - comprise the bulk of the online consumer's share of mind and media consumption, a myriad of vertical search engines are emerging to address the particular informational and research needs of niche audiences and professions.
- Search Insider (Aaron Goldman @ Media Post) says the Big Engines:
Clearly, until search results can be better customized on the general engines, many searchers will prefer (and find value in) going to an engine or directory tailored specifically to their needs. But think about how far the Big 4 [**] have come in just the past couple years in terms of personalization and tools for refining search queries. The time is not long before the general engines will be able to deliver results as relevant as today’s vertical engines–if not more, when overlaid with past browsing behavior, social networking, tagging, etc.
- Alan Meckler, while not saddling up for this particular battle, has said enough in the past to make me think he'd pick the verticals.
- This is only a guess, but I think the big engines would say "the big engines."
Personally, I'll go with verticals. First, Internet history is littered with the pink slips of people who argued against Alan Meckler. Second, if they're in the game for the long term, people will find the vertical engines and stick with them. The GYM will always win in raw numbers of the great unwashed, but don't underestimate the drawing (read: earning) power of a tightly focused audience of enthusiasts and advertisers.
Goldman argues, rightly, that users will consider it a pain to hop from vertical engine to vertical engine to get into the specific data silo they're looking for. But user fickleness is a sword that cuts both ways. Is the user any more likely to delve into the "big" engine's personalization settings? Might that not entail getting (gasp) an account with the engine? Maybe we should ask the .5% of the internet population who knows what Froogle or Y!Q is, or maybe the 3% who know how to use any engine's advanced search features. Google can attribute much of its success to its visual simplicity and having avoided the temptation of cluttering up ("portalizing," if you will) its prime real estate. Requiring the user to dig past that to opt in and configure personalized searching would be robbing Peter to pay Paul.
[*] I suppose you should keep in mind that SearchChannel is a developer of niche search engines, which might account for some of their exuberance. Still, the report is worth downloading, if for no other reason than the nice directory of vertical engines that makes up the back 60% of the report.
[**] In case you're curious, #4 in this instance is Ask.com.
Posted by Erik Dafforn at 11:49 PM
July 11, 2006
Are you B2C or B2B? Are you sure? 
posted by Erik Dafforn in category: User Behavior
Wendy Davis at MediaPost shared some interesting numbers earlier today (pulled from a JupiterResearch report) about how small businesses use the web for online shopping.
According to the report,
Sixty-two percent of those that make online purchases said familiarity with the vendor is among the most influential considerations; 46 percent said the same for online research and 39 percent said that advice of friends and business associates plays a major role. (Respondents were asked to choose up to three factors that influence online shopping.) E-mails and coupons were influential for just 21 percent of small businesses’ online purchases.
In the quote above, I've emphasized the key factors that drive employees to select an online vendor:
- Familiarity with the vendor. How strong is your web presence? Does your name consistently appear for searches within your niche?
- Online research. Do you own your online reputation?
- Advice of friends and business associates. What's your track record for keeping customers happy, and giving potential customers a reason to come back when they're more motivated (i.e., further along the purchase track)?
One of the report's major findings was that "almost eight in 10 small businesses, 79 percent, shop online regularly, compared to 65 percent of online consumers."
The end result is a blurring of the lines between B2B and B2C. In other words, while you can be pretty sure that an order of 8000 boxes of thumb tacks is a "business" purchase, there's also a pretty good chance that when Mark in Memphis orders a microwave oven, he might need it for the company break room. And maybe Mark's company is growing, so he might need an espresso machine soon.
What does this have to do with Search?
- Do your page descriptions and web copy (and thus, your search results) discuss corporate relationships? Corporate accounts? Bulk discounts? Despite the type of business you're in, are you friendly to both the big "B" and the big "C"?
- Does your PPC dayparting (changing bid strategy based on time of day) make (perhaps faulty) assumptions about who's coming to your site at 2 pm?
Search results mean very little if the user clicks over and doesn't find what she's looking for - either specific products, or even a subtle vibe. Ensuring that your site appeals to people when they're both on and off the clock, despite what you think you know about your vertical, is never a dumb move.
Posted by Erik Dafforn at 11:17 PM
July 5, 2006
A SERP of One's Own 
posted by Erik Dafforn in category: Misc
With apologies to Virginia Woolf, a bit of SEO potpourri today, revisiting topics both old and recent.
Google Finally Knows Us. One of Tom's pet peeves has always (and I mean always) been that a Google search for [seo speedwagon] brought up Google's famous "did you mean...?" line, suggesting a typo. (Apparently, there's a band with a similar name.)
Finally, the confusion seems to be over; a query for [seo speedwagon] gives the user just that - like it or not.
Of course it's impossible to accurately define the "tipping point" at which Google decides that a query no longer needs spell-check assistance, but it's likely a combination of the following:
- Age of the subdomain that lists the query term
- Number of backlinks and the anchor text used
- Number of times a specific phrase is searched for, as measured by the engine
- Sheer number of times the phrase exists on the Web
- A comment from the CEO's mother
Now that the Speedwagon issue is resolved, I'll leave it to Tom to sort out his similar issues with Tom Huston.
Hardly Therapeutic. I'm a little surprised that no one has picked up on Sean's find from last week - namely that Google, in a mid-SERP "see also" result, suggests that the user try Yahoo when searching for [therapy products]:

An inside joke? Perhaps. Regardless, we'll not speculate as to why Sean was seeking therapy products in the first place, but instead hope that he found relevant results.
Posted by Erik Dafforn at 11:44 PM
June 27, 2006
When Do We Turn Down a Prospective SEO Client? 
posted by Erik Dafforn in category: SEO Companies
Cab fare to nowhere is what you are
A white line to an exit sign is what you are
Or so begins a now-nearly-two-decades-old song by Paul Carrack, called "Don't Shed a Tear."
We received a Request for Proposal this week that caught our eye. The product and/or service offered by the site was extremely vague. The copy hinted around at an industry where we don't spend too much time. Here's why:
- A quick reverse-IP check (paid account req'd) showed 1160 other sites on this site's IP address. Not a big deal in today's world of shared virtual hosting, of course. Except that each of the sites is exactly the same. Same graphics. Same copy. Same Everything. The only difference was the domain names.
- All 1161 sites are hosted by a company that not only hosts sites, but offers a full "internet marketing solution" for all its hostees.
- A quick check of a random text string from the site shows the text duplicated across 14,600 sites at Google, and nearly 3000 at Yahoo. So it wasn't exactly written from scratch.
Sorry, but taking on this client would be a huge waste of her money and our time. This site network has more strikes against it than the Brooklyn Dodgers facing Don Larsen on that fateful day in 1956.
There's a very strong likelihood that the person who sent us the RFP has no idea how many sites are out there identical to hers. Or else she knows all too well, and she wants a leg up on (all 1160) of them. Either way, no thanks.
Posted by Erik Dafforn at 12:05 PM
June 21, 2006
SEO Speedwagon Makes the Sherpa Blog Short List 
posted by Erik Dafforn in category: Blogging
We want to express our gratitude for the nomination of SEO Speedwagon to Marketing Sherpa's list of Best Blogs on Search Marketing.
We're in some great company this year, including last year's winner, Barry Schwartz's SE Roundtable. Despite what Barry says, we're pretty sure he wants the award to go to someone else this year, so true to our mission, we'd like to recommend a site with which we're quite familiar, and of which we're quite proud: Us.
So vote today! (Please.)
Seriously, though, following is the list of blogs in our category. Some you're probably already familiar with, and some might be new:
- Search Engine Lowdown
- Search Engine Roundtable
- Aaron Wall's SEO Book
- Top Rank
- iBlogMarketing
- SEO Speedwagon
- Kieden Blog
- Search Views
- Kelvin Hui
- Make Easy Money with Google and Adsense
Check out the entire list, as well as the blogs in the other categories.
Posted by Erik Dafforn at 2:33 PM
June 14, 2006
Just a Second - Was That an Ad? 
posted by Erik Dafforn in category: Old Media
Via Threadwatch, I read an interesting bit about Clear Channel considering one-second radio spots for advertisers. One second. Despite the interjectory worst-case scenarios about the character limit in one second of air time ("GoToHooters!"), if well placed, they might just cause the listener enough momentary imbalance to be relatively memorable.
My first thought (alas, my cross to bear) was wondering how such an advertising vehicle would integrate with search. Not too badly, I think. Assuming you approached the campaign correctly (that is, backwards) - finding a unique, memorable, interest-piquing, easy-to-spell phrase, then making sure your organic and PPC positioning was in place before the radio spots aired - you could probably get a lot of page views.
(And maybe even some desired actions, assuming your landing page was tailored to users with one-second attention spans.)
Stepping back, part of the bigger story here is that traditional radio is finding less and less solace in the overstuffed wing chairs at the Old Media Country Club. The terror troika of satellite radio, podcasting, and internet radio is stealing eardrums, and advertisers have a bad habit of noticing things like that.
Still, whether commercials are 30 seconds, or one second, or a photon burst directly into the cerebral cortex isn't really the issue. New media is about pull, and it's going to take more than the audio equivalent of pop-up ads to keep advertisers in FM, where day after day, the morning drive team slings out
Nothin' but blues and Elvis
And somebody else's favorite song
Posted by Erik Dafforn at 11:52 PM
June 7, 2006
Google Analytics Data Blackout 
posted by Erik Dafforn in category: Google
Google Analytics appears to have missed about five hours' worth of data from yesterday. The following shot shows a representative picture that appears for all my clients who use GA's statistics. The outage ran from 1pm through 5pm, though the exact hours may differ depending on your time zone.
No one's saying much about this yet. We'll have to wait to see whether the data is merely not there yet or whether it's gone for good.

Posted by Erik Dafforn at 10:17 AM
June 5, 2006
Google Adds to Online Office Suite with Spreadsheet 
posted by Erik Dafforn in category: Google
Coming only a few months after it purchased the Writely online word processor, Google has added another application to its online suite by announcing an online spreadsheet.
While the online spreadsheet concept is new for Google (and the Reuters article implies that it's new to everyone), plenty of alternatives already exist, including iRows, Zoho, NumSum, and spreadsheet-database mashups like Trackslife.
More and more, when we post about Google, it's not always easy to answer the question, "What does this have to do with 'Search'?" Google's move into things like desktop gadgetry sometimes distracts us from the bigger picture. But an online office suite really isn't a distant leap away from the search box - or at least the Desktop search box. Assume that any spreadsheet you create with Google Spreadsheets will be fully indexed by your personal Google Desktop search, and that you (and anyone with whom you collaborate, chat, gmail, or otherwise share sheet-based information) will have quick access to it from any computer on which you've signed into your Google account.
The forums and blogs have plenty of armchairistics and first-round critiques, ranging from "great idea" to wondering who in their right mind would trust sensitive data to a web storage system. Few from the latter camp seem to remember a similar brouhaha when we learned companies like Amazon would store our credit card data (!) within their server farms.
I eagerly await Google's imminent version of PowerPoint and its likely ability to bore large masses of people on a much more efficient, scalable level.
Posted by Erik Dafforn at 11:35 PM
May 30, 2006
Former Intrapromote Client Openlist Sold for $13M 
posted by Erik Dafforn in category: Local Search
Over the years I've watched industry titan Alan Meckler tout vertical search again and again and again. So it was entirely appropriate that a former Jupiter analyst would co-found a vertical search engine that now serves as a model for next-generation search platforms.
Openlist, a former client of Intrapromote, was just purchased by Marchex for $13 million in cash and stock. (See the Marchex press release for details.) The Openlist site itself helps users select hotels, restaurants, and local attractions using content aggregation technology that narrows down search results to the most granular level imaginable, completely redefining the concepts of relevance and user satisfaction. But the real story is that Openlist technology is widely transferable and scalable, which is what will ultimately benefit many of Marchex's 250,000 sites. (You read that number correctly.)
I had the opportunity to work closely with Openlist founders Matthew Berk and Bejul Somaia and can honestly say that Marchex would be getting a bargain at many times the $13 million price.
Posted by Erik Dafforn at 11:47 PM
May 24, 2006
SEO Considerations for AJAX Development 
posted by Erik Dafforn in category: Search Engine Friendly Design
If web development is even remotely within your periphery, you've probably heard of AJAX, which stands for Asynchronous JavaScript and XML. While most serious developers realize the hype is overblown, even the most caffeine-addled code monkeys are impressed with some recent AJAX applications.
Calling AJAX "new" is a little misleading; it's no different from calling gin & tonic poured over Cap'n Crunch a "new" breakfast treat. The ingredients have been around forever; only the unique combination is recent.
Explained simply, the key benefit of AJAX applications is their ability to work in the background to supply data to the client browser and provide a relatively seamless "application" experience instead of the click-wait, click-wait game of traditional web pages.
The "J" in AJAX (JavaScript) has been a stumbling block for developers with an eye for search-engine friendliness, but that need not be the case. While it's true that engines typically ignore scripted data, good AJAX programs can occasionally come out of their JS trances long enough to feed even the most demanding bots. Following are some notes about AJAX development as it pertains to smart SEO.
The Problem: Not Enough Unique URLs
In my opinion, the single greatest SEO issue with AJAX is the tendency (although not necessity) of AJAX applications to skip creating unique, bookmarkable (and therefore indexable) URLs.
I'll use Google Maps as an example, not only because it's used in this excellent AJAX backgrounder written by an IBM engineer, but because Google Maps has come to be known as the "classic" AJAX application. If you have brand awareness like Google, you don't necessarily need too many deep, internal URLs, because everyone remembers and links to "maps.google.com". But for the rest of us, getting many internal pages indexed is critical. As the IBM article mentions, the fact that Google put the "Link to this page" feature on the Maps page shows that they understand the need for unique URLs pulled from within the application. Depending on what you're doing with AJAX, you'll derive a ton of SEO benefit from a similar philosophy.
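To make the "Link to this page" idea a little more concrete, here is a minimal TypeScript sketch of how an application might turn its current in-page state into a bookmarkable URL and read it back again. Everything in it is an assumption for illustration: the MapState fields, the parameter names (lat, lng, zoom, q), and the function names stateToUrl and urlToState are hypothetical, and this is not how Google Maps actually builds its links.

```typescript
// Hypothetical view state for a map-style AJAX application.
// Field names are illustrative assumptions, not Google Maps' real parameters.
interface MapState {
  lat: number;
  lng: number;
  zoom: number;
  query: string;
}

// Build a bookmarkable URL that captures the current in-page state.
function stateToUrl(base: string, state: MapState): string {
  const pairs: Array<[string, string]> = [
    ["lat", state.lat.toString()],
    ["lng", state.lng.toString()],
    ["zoom", state.zoom.toString()],
    ["q", state.query],
  ];
  const queryString = pairs
    .map(([key, value]) => `${encodeURIComponent(key)}=${encodeURIComponent(value)}`)
    .join("&");
  return `${base}?${queryString}`;
}

// Recover the state from such a URL, so the same view can be shown to a
// visitor (or a crawler) who arrives through the link.
// Returns null when the URL has no query string or the numbers are missing.
function urlToState(url: string): MapState | null {
  const queryStart = url.indexOf("?");
  if (queryStart === -1) {
    return null;
  }
  const params: Record<string, string> = {};
  for (const pair of url.slice(queryStart + 1).split("&")) {
    const eq = pair.indexOf("=");
    if (eq === -1) {
      continue; // skip fragments that are not key=value
    }
    params[decodeURIComponent(pair.slice(0, eq))] = decodeURIComponent(pair.slice(eq + 1));
  }
  const lat = Number(params["lat"]);
  const lng = Number(params["lng"]);
  const zoom = Number(params["zoom"]);
  if (isNaN(lat) || isNaN(lng) || isNaN(zoom)) {
    return null;
  }
  const query = params["q"] === undefined ? "" : params["q"];
  return { lat, lng, zoom, query };
}
```

Under these assumptions, calling stateToUrl with a placeholder base such as "http://example.com/map" yields a link that urlToState can turn back into the same state. The point is simply that every meaningful view of the application gets an address of its own, which is what makes it linkable and indexable.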
A secondary point is that once you've created the capability to create unique internal URLs, you'll need to post them somewhere



