Googlebot

Googlebot is the web crawler of the Google search engine. It is a computer program that downloads text and images from the World Wide Web and makes them discoverable through Google's web and image search.

Operation

A few days usually pass between the download of a new version of a file and the update of Google's index with the changed contents. How often Googlebot visits a page depends, among other things, on how many external links point to the page and on the PageRank of those links. In most cases, however, Googlebot accesses a website on average no more than once every few seconds.

To keep the number of requests to an indexed site as low as possible, each crawl result is first stored in a cache shared by all Googlebots. If several bots visit the same page within a specified time period, the later requests can be served from the cache.
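The shared cache described above can be illustrated with a minimal sketch. This is not Google's implementation: the class name SharedCrawlCache, the ttl_seconds parameter and the injected fetch and clock functions are all hypothetical, chosen only to show how a second request inside the time window avoids hitting the site again.

```python
import time
from typing import Callable, Dict, Tuple


class SharedCrawlCache:
    """Hypothetical time-limited cache shared by several crawlers."""

    def __init__(self, fetch: Callable[[str], str], ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._fetch = fetch        # function that actually downloads a URL
        self._ttl = ttl_seconds    # how long a stored copy stays valid
        self._clock = clock        # injectable so the cache can be tested
        self._entries: Dict[str, Tuple[float, str]] = {}

    def get(self, url: str) -> str:
        now = self._clock()
        entry = self._entries.get(url)
        # Serve the stored copy if it is still inside the time window.
        if entry is not None and now - entry[0] < self._ttl:
            return entry[1]
        # Otherwise download once and remember the result for other bots.
        body = self._fetch(url)
        self._entries[url] = (now, body)
        return body
```

Every bot that calls get() on the same instance within ttl_seconds of the first download receives the stored copy, so the site sees a single request for that period.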

Googlebot obeys the robots.txt file and the robots directives in HTML meta tags.
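How a robots.txt rule set is evaluated for a given crawler can be sketched with Python's standard urllib.robotparser module. The rule set in ROBOTS_TXT, the helper name is_allowed and the example.com URLs are assumptions made for illustration. The standard-library parser may also differ in details from the matching Googlebot itself performs.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rule set: Googlebot may crawl everything except /private/,
# while all other crawlers are excluded entirely.
ROBOTS_TXT = """\
User-agent: Googlebot
Disallow: /private/

User-agent: *
Disallow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the rules permit user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)
```

With these rules, is_allowed(ROBOTS_TXT, "Googlebot", "https://example.com/private/data") is False, while the same call for a URL outside /private/ is True.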

Dynamic page content

Googlebot can index page content that is reachable only through PHP sessions or URL variables with difficulty, or not at all. This is because the bot usually lacks the required variables and does not know the associated parameters. Google is working on adapting the crawler so that it can also reach content that has so far remained hidden behind multiple AJAX requests. The aim is for the crawler to detect content that a website loads dynamically as well. It is also planned that the crawler will send POST requests to web pages. The problem is that POST requests can unintentionally trigger user actions.

Identification

Depending on the task, Googlebot identifies itself with, among others, the following User-Agent strings:

Googlebot/2.1 (+http://www.google.com/bot.html)
Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)
Googlebot-Image/1.0

Another Google crawler downloads pages in order to identify relevant ads for the Google AdSense program. It identifies itself as follows:

Mediapartners-Google/2.1

Verification

Some web users and crawlers falsely present themselves as Googlebot by sending these identifiers, in the hope that a site operator serves Googlebot particularly good or ad-free content.
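Because the User-Agent header is supplied by the client, matching it shows only what a visitor claims to be. The following sketch detects such a claim. The function name claimed_google_crawler and the regular expression are assumptions for illustration, and the product tokens are the ones listed under Identification.

```python
import re
from typing import Optional

# Product tokens from the identifiers listed under Identification.
# The longer token comes first so "Googlebot-Image" is not cut short.
_GOOGLE_TOKENS = re.compile(
    r"\b(Googlebot-Image|Googlebot|Mediapartners-Google)/\d+(?:\.\d+)*"
)


def claimed_google_crawler(user_agent: str) -> Optional[str]:
    """Return the Google crawler token a User-Agent claims, or None.

    This is only a claim: anyone can send these strings, so a match
    must not be treated as proof that the visitor is Google.
    """
    match = _GOOGLE_TOKENS.search(user_agent)
    return match.group(1) if match else None
```

A match is best used only to decide whether the more expensive DNS-based verification is worth running for a visitor.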

To determine whether a visitor really is one of Google's crawlers, Google recommends using the Domain Name System. First, the visitor's IP address is translated by a reverse lookup into a domain name, which should end in googlebot.com. Then a regular DNS query (forward lookup) on that name checks whether it resolves back to the visitor's original IP address.
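The two lookups can be sketched with Python's standard socket module. The function name is_verified_googlebot and the injectable reverse and forward parameters are assumptions made so the logic can be exercised without network access. The default forward lookup uses socket.gethostbyname_ex, which covers IPv4 addresses only.

```python
import socket
from typing import Callable, List


def _reverse_lookup(ip: str) -> str:
    # Reverse DNS query: IP address -> primary host name.
    return socket.gethostbyaddr(ip)[0]


def _forward_lookup(hostname: str) -> List[str]:
    # Regular DNS query: host name -> list of IPv4 addresses.
    return socket.gethostbyname_ex(hostname)[2]


def is_verified_googlebot(
    ip: str,
    reverse: Callable[[str], str] = _reverse_lookup,
    forward: Callable[[str], List[str]] = _forward_lookup,
) -> bool:
    """Check a visitor IP with a reverse lookup followed by a forward lookup."""
    try:
        hostname = reverse(ip).rstrip(".").lower()
    except OSError:  # socket.herror and socket.gaierror derive from OSError
        return False
    # Require a real subdomain of googlebot.com, not a name that merely
    # contains that text somewhere.
    if not hostname.endswith(".googlebot.com"):
        return False
    try:
        # The name must resolve back to the original address.
        return ip in forward(hostname)
    except OSError:
        return False
```

The forward lookup matters because the owner of an IP range controls its reverse DNS records and could point them at any name. Only the operator of googlebot.com can make that name resolve back to the visitor's address.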
