#
# --- Notes for Bots and Webcrawlers ---
#
# - Please crawl our website using the crawl-delay in our robots.txt file (never more than 1 page every 5 seconds)
# - Identify yourself with a unique agent, preferably using the word bot within your agent name.
# - Refer to our robots.txt and ignore any pages using meta tag ROBOTS with NOINDEX or NOFOLLOW
# - Please do not process images
# - Response time is best between 0100 and 0700 GMT
#
# --- Copyright ---
#
# - Unless your webcrawler is gathering data for a search engine, you must read our copyright notice (at the bottom of each page)
# - We regularly search the internet for images and text that originated from us, using text pattern matches, image watermarking and other methods.
# - If you are in violation of international copyright laws, we will prosecute to the full extent of the law.

# Site archiver that keeps a cache for several years
User-agent: ia_archiver
Disallow: /

# an index of media on the web
User-agent: MLBot
Disallow: /

# Please do not index these pages
User-agent: *
Disallow: /system/
# Crawler parameters - 19Jan17: this is fairly old, Google ignores it
Crawl-Delay: 7

Sitemap: https://www.babycollection.co.uk/site_map_bc.xml

# Removed as very old:
# 24jun18 - allow Google to load javascript to work out how pages load
#
# # Please do not index these pages
# User-agent: *
# Disallow: /css/
# Disallow: /js/
# Disallow: /system/
#
# # Image bots: Google, Yahoo and PicSearch
# User-agent: yahoo-mmcrawler
# Disallow: /
#
# User-agent: psbot
# Disallow: /
#
# # Czech Republic bot for //jyxo.cz/
# User-agent: jyxobot
# Disallow: /