Or...
Just make a really horrible puke-coloured website so nobody will bother reading the content - like this: http://www.the-acap.org/
A long-awaited new standard designed to give webmasters more control over how search engines and newsreaders access their content will be unveiled in New York today. After a year-long pilot the Automated Content Access Protocol (ACAP) will be launched at the headquarters of the Associated Press. It aims to improve on the …
Erm, it's not in the copy because it's not relevant.
I mention the Copiepresse-Google News case as one of a few examples of publishers worried about how their content is used, which is why the ACAP project was started - relevant.
The same publisher then sues because the dominant search engine doesn't index them, in a tit-for-tat action - interesting, but irrelevant to this story.
See you at the tinfoil hat shop, anyway.
Doesn't HTTP already support an "Expires" header? Can't a page be unpublished from a site or moved into an authenticated area of the site when it should no longer be visible to search engines?
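For what it's worth, the expiry side of that already looks like this in a plain HTTP response (a minimal sketch; strictly speaking Expires governs caches rather than indexes, and how search engines treat expiry hints varies):

    HTTP/1.1 200 OK
    Content-Type: text/html
    Expires: Sat, 29 Dec 2007 16:00:00 GMT

And if the page genuinely shouldn't be visible any more, returning 410 Gone (or 401 once it's behind the login) tells a crawler exactly that.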
robots.txt is a small, elegant, simple solution for keeping crawlers away from non-content areas which still need to be publicly accessible (e.g. JavaScript files, site templates, etc.).
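A typical one is only a few lines - something like this (hypothetical paths, obviously):

    User-agent: *
    Disallow: /templates/
    Disallow: /js/
    Disallow: /cgi-bin/

Any well-behaved crawler skips those directories and indexes everything else.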
ACAP, by contrast, seems like rather a lot of committee-designed, overly complex, redundant cruft that tries to force the "old media" way of doing things onto the web.
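Compare that with what the ACAP pilot layers on top of robots.txt - as I understand it, the extended fields look roughly like this (field names from memory, so treat this as approximate rather than gospel):

    ACAP-crawler: *
    ACAP-disallow-crawl: /archive/
    ACAP-allow-crawl: /news/

All to express permissions that, for crawling at least, robots.txt already handles.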
The problem was that the content producers didn't want to go to the effort of understanding and obeying the web's existing conventions - such as the fact that when your webserver is asked for a page and hands it over, you're allowing the recipient to copy it.
They want all the web indexers to change THEIR stuff to make their lives easier.