PubCon: The Duplicate Content Zone

Thursday 16 November 2006

A PubCon session entered a place beyond indexing and search traffic: The Duplicate Content Zone, where websites sometimes go and are never seen again. WebProNews tagged along as the session hosts played the Rod Serling role for the audience.

Too much duplicate content on a website will drop it in the SERPs faster than the Tower Of Terror at Disney World plummets its riders. Only you don't have Matt Cutts dressed in a bellhop outfit pulling the lever in Orlando.

Bill Slawski makes me envious not only with his patent coverage, but also because he lives a short drive from steamed blue crabs when they're in season. He touched on the topic of printer-friendly pages, which many sites make available as a convenience for their visitors.

These pages should go in a separate folder and be protected from spidering by a relevant entry in the site's robots.txt file.
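As a minimal sketch of that advice, the snippet below checks a robots.txt rule with Python's standard urllib.robotparser module. The /print/ folder name, the example.com URLs, and the ExampleBot user agent are assumptions made for illustration; they were not named in the session.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content: the /print/ folder name is an assumption.
ROBOTS_TXT = """\
User-agent: *
Disallow: /print/
"""


def is_crawlable(robots_txt: str, url: str, agent: str = "ExampleBot") -> bool:
    """Return True if the given robots.txt rules let `agent` fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    # The printer-friendly copy is blocked; the regular article is not.
    print(is_crawlable(ROBOTS_TXT, "http://example.com/print/article-42.html"))
    print(is_crawlable(ROBOTS_TXT, "http://example.com/articles/article-42.html"))
```

Keeping every printer-friendly page under one folder means a single Disallow line covers all of them.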

If the same page is reachable at several different URLs, use 301 (permanent) redirects to send visitors and spiders along to the one desired URL.
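One way to picture the redirect advice is a small WSGI application that answers duplicate URLs with a 301 pointing at the preferred one. This is only a sketch: the CANONICAL mapping and every path in it are hypothetical, and a real site would more likely configure redirects in its web server.

```python
# Hypothetical map of duplicate URLs to the single preferred URL.
CANONICAL = {
    "/index.html": "/",
    "/home": "/",
    "/article.php?id=42": "/articles/42",
}


def app(environ, start_response):
    """WSGI app: 301-redirect known duplicate URLs, serve everything else."""
    path = environ.get("PATH_INFO", "/")
    query = environ.get("QUERY_STRING", "")
    requested = f"{path}?{query}" if query else path

    target = CANONICAL.get(requested)
    if target is not None:
        # A permanent redirect tells browsers and spiders the page has one home.
        start_response("301 Moved Permanently", [("Location", target)])
        return [b""]

    start_response("200 OK", [("Content-Type", "text/plain; charset=utf-8")])
    return [f"Content for {requested}".encode("utf-8")]


if __name__ == "__main__":
    from wsgiref.simple_server import make_server

    # Serve locally on port 8000 for a quick manual check.
    make_server("127.0.0.1", 8000, app).serve_forever()
```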

Slawski noted that duplicate content happens sometimes when one site takes content from another. This infringement could end up costing a site publisher in terms of duplicate content penalties. He recommended contacting the site owner and its host before embarking on more serious legal action or a DMCA notice.

While many dynamically generated sites use session IDs to track a visitor's session, these should not be served to the search engine spiders that visit. Some spiders ignore session IDs by default, but if a site's session ID pages are showing up in a search engine, the publisher will need to take steps to stop it.
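A rough sketch of one such step is to strip session parameters from a URL before it is linked or handed to a spider. The parameter names in SESSION_PARAMS are assumptions based on common conventions, not something the panel listed, and this handles only query-string session IDs.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumed session parameter names; adjust to whatever the site actually uses.
SESSION_PARAMS = {"phpsessid", "sid", "sessionid", "jsessionid"}


def strip_session_id(url: str) -> str:
    """Return `url` with any known session parameters removed from the query."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in SESSION_PARAMS
    ]
    return urlunsplit(parts._replace(query=urlencode(kept)))


if __name__ == "__main__":
    print(strip_session_id("http://example.com/shop?item=7&PHPSESSID=abc123"))
```

With the session parameter gone, every visit to the same page yields the same URL, so a spider sees one page rather than one copy per session.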

Yahoo's Tim Converse illustrated the point by noting Yahoo won't even index a crawled site if it is determined to be a duplicate.

Yahoo looks at approximate copies as well, not just word-for-word ones. Being similar does not necessarily mean a site will be excluded from the index.
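The session did not describe how Yahoo measures approximate copies. Purely to illustrate the idea of near-duplicate detection, the sketch below compares two texts by the overlap of their word shingles (Jaccard similarity), a common textbook technique. The shingle size of three words is an arbitrary assumption.

```python
def shingles(text: str, k: int = 3) -> set:
    """Return the set of k-word sequences (shingles) found in `text`."""
    words = text.lower().split()
    if len(words) < k:
        # Too short for a full shingle: treat the whole text as one.
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def similarity(a: str, b: str, k: int = 3) -> float:
    """Jaccard similarity of two texts' shingle sets, from 0.0 to 1.0."""
    set_a, set_b = shingles(a, k), shingles(b, k)
    if not set_a and not set_b:
        return 1.0  # two empty texts are trivially identical
    return len(set_a & set_b) / len(set_a | set_b)


if __name__ == "__main__":
    original = "the quick brown fox jumps over the lazy dog"
    tweaked = "the quick brown fox leaps over the lazy dog"
    print(similarity(original, original))
    print(similarity(original, tweaked))
```

A score near 1.0 suggests a near copy even when the wording is not identical. Where to draw the line between similar and duplicate is a judgment call that this sketch does not settle.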

Not all duplication is evil. Hosting content in HTML and Microsoft Word format for visitor choice would be an example, as would syndication of content. Abusive stuff like scraper sites and weaving content from different pages to make a new page will get one in trouble.

Google's Brian White said his company filters content in a number of pipelines. Anyone hoping for additional insight will be disappointed, as White did not provide details of how this is done.

If other sites showing up in Google are scraping one's content, Google can help under the DMCA. The company provides a contact page with more information about DMCA takedowns.

Using the DMCA can be more perilous than anything in the Duplicate Content Zone. Ask a lawyer for advice before handing out a takedown notice.

---
Tag: PubCon
