HTML Form Selections Being Indexed
I had a client write in and ask a question:
"How do you prevent an engine from indexing a huge list of form selections?"
Initially I started to put together my own list of ways to block, move or get rid of huge amounts of worthless text. Then I dug a little deeper and realized this particular client's situation wasn't a problem at all.
For one thing, the pages in question are buried deep in the SERPs and only show up in the index for this query because they contain one instance of the “keyphrase”.
My reply…
The form selections are being indexed in these SERPs because the form HTML is the only place the "keyphrase" is mentioned on those pages. Google is taking the only text around that “keyphrase” and using it as the snippet in the SERP. If the "keyphrase" wasn't in that form selection (which Google reads as plain text), that page would never even come up for that query. That leaves three options:
1. Create content on the page that supports the “keyphrase” and Google will use it instead.
2. Move the form selections into a script that the engines do not crawl, such as JavaScript. Then you can hold the text in a separate file. This option will decrease your indexed page count for that query, because the one instance of the “keyphrase” that Google was using for the SERP will be gone, unless by chance there is some obscure link out there pointing to the page with the “keyphrase” in it.
3. Leave it alone. This is not a huge concern. We are more worried about what comes up in the first few pages of the SERPs, not the last few.
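Before picking one of these options, it is worth confirming that the form really is the only place the “keyphrase” appears on the page. Here is a minimal sketch of that check using only Python's standard library. The names `KeyphraseLocator`, `count_keyphrase` and the sample page are my own, made up for illustration, and the script only looks at the raw HTML you feed it. It says nothing about how any engine actually builds its snippets.

```python
from html.parser import HTMLParser


class KeyphraseLocator(HTMLParser):
    """Collects page text, split by whether it sits inside a <form>."""

    def __init__(self):
        super().__init__()
        self.form_depth = 0   # > 0 while we are inside a <form>
        self.skip_depth = 0   # > 0 inside <script>/<style>, which is not page text
        self.form_text = []
        self.body_text = []

    def handle_starttag(self, tag, attrs):
        if tag == "form":
            self.form_depth += 1
        elif tag in ("script", "style"):
            self.skip_depth += 1

    def handle_endtag(self, tag):
        if tag == "form" and self.form_depth:
            self.form_depth -= 1
        elif tag in ("script", "style") and self.skip_depth:
            self.skip_depth -= 1

    def handle_data(self, data):
        if self.skip_depth:
            return
        target = self.form_text if self.form_depth else self.body_text
        target.append(data)


def _normalize(chunks):
    # Lowercase and collapse whitespace so line breaks do not hide a match.
    return " ".join(" ".join(chunks).lower().split())


def count_keyphrase(html, keyphrase):
    """Count case-insensitive keyphrase hits inside and outside forms."""
    parser = KeyphraseLocator()
    parser.feed(html)
    parser.close()
    needle = " ".join(keyphrase.lower().split())
    return {
        "in_form": _normalize(parser.form_text).count(needle),
        "outside_form": _normalize(parser.body_text).count(needle),
    }


# Hypothetical page: the keyphrase only appears as a form selection.
sample = """
<html><body>
<h1>Order form</h1>
<form action="/order">
  <select name="product">
    <option>Red gadgets</option>
    <option>Blue widgets</option>
  </select>
</form>
<p>Contact us for pricing.</p>
</body></html>
"""

print(count_keyphrase(sample, "blue widgets"))
# {'in_form': 1, 'outside_form': 0}
```

If `outside_form` comes back as 0, the page matches the situation described in my reply: the form is the only thing supporting the “keyphrase”, and option 1 (adding real content) is what would give Google something better to show.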
Yahoo has markup (the class="robots-nocontent" attribute) to keep a block of text out of its index and snippets, but Google still does not.
As complicated as stuff can get sometimes, it’s nice to see the simple things come across my desk every once in a while.