# robots.txt file for DD virtual server
# in principle: allow all robots access - except:
# - Gulliver of Northern Light
# - InternetSeer
# - PicSearch
# - IBM Almaden crawler
# - NameProtect's bot NPBot
# explicitly allow the Google Adsense bot
# for the rest: don't follow "counted" external links

User-agent: AboutUsBot
Disallow: /

User-agent: Gulliver
Disallow: /

User-agent: InternetSeer
Disallow: /

User-agent: InternetSeer.com
Disallow: /

User-agent: sitecheck.internetseer.com
Disallow: /

# Ban Picsearch.org's bot: SE for images only - http://www.picsearch.com/menu.cgi?item=Picsearch
# NOTE: If your site has been indexed by Picsearch and you do not wish to be
# included in the Picsearch index, please e-mail Picsearch at remove@picsearch.com
# and provide the full URL you wish to have removed. Picsearch will promptly deal
# with your request and remove your site along with the thumbnail references.
User-Agent: psbot
Disallow: /

# Ban almaden's crawler: Info sold, not for my benefit;
# "For more information please refer to http://www.almaden.ibm.com/WebFountain"
User-agent: http://www.almaden.ibm.com/cs/crawler
Disallow: /

# Ban NPBot - see http://www.nameprotect.com/botinfo.html; does seem to respect robots.txt, but check IPs!
# see also: "carfac" on http://www.webmasterworld.com/forum11/1832.htm and http://weblog.bergersen.net/archives/000540.html
User-Agent: NPBot
Disallow: /

# allow Google bot for AdSense
User-agent: Mediapartners-Google*
Disallow:

# all other robots
User-agent: *
# don't follow "counted" links
Disallow: /go/