
Google and the Marching Robots

  • Posted: Sun 19 Aug 2007

On Friday, I called my bffs at Google to get a straight answer on the robots.txt file. If you have never heard of a robots.txt file, it is a simple text file that is placed in the root directory of most websites. Its contents are very short, as you can see from the one in my directory, but they can make a huge impact on your traffic and search engine rank:

User-agent: *
Disallow:

This tells Google to spider everything and disallow nothing on my server. The file tells Google’s web robots and other search engines which parts of the website, or all of it, they are allowed to crawl.
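To see how a crawler might read those two lines, here is a minimal sketch using Python's standard `urllib.robotparser` module. The helper name `is_allowed`, the `Googlebot` user-agent string, and the example.com URLs are assumptions for illustration only; this is not how Google itself evaluates the file.

```python
from urllib.robotparser import RobotFileParser

# The two-line file from this post: an empty Disallow blocks nothing.
ALLOW_ALL = """\
User-agent: *
Disallow:
"""

# For contrast: a single slash blocks the whole site.
BLOCK_ALL = """\
User-agent: *
Disallow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given rules let user_agent fetch url (hypothetical helper)."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    page = "http://example.com/any/page.html"  # placeholder URL
    print(is_allowed(ALLOW_ALL, "Googlebot", page))
    print(is_allowed(BLOCK_ALL, "Googlebot", page))
```

The only difference between the two files is the slash after `Disallow:`, which is why a small typo in robots.txt can hide an entire site from search engines.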

The Google Webmasters tool was where I first really became aware of the file. Google provides handy basic and beta tools to help web developers get the most out of their SEO keywords and help their websites get spidered effectively.

If you don’t use this file, it will show up in the webmasters tool as a “404 page”. Google obviously uses complex algorithms to find the links itself, but this is the ultimate tool to direct the search giant where to go.

But the real question I had when I called Google was whether it is safe to do this with WordPress-powered websites. There is a lot of controversy over the best way to do it.

It is OK to show everything with a WordPress installation. Sensitive data and passwords are stored in the database and should not be kept in files on the server. So it is OK to let crawlers see the whole root directory, unless you have information you don’t want displayed.

For example, this is the reason why your Facebook profile doesn’t get spidered by Google (from Facebook’s robots.txt file):

User-agent: *
Disallow: /profile.php
Disallow: /album.php
Disallow: /photo.php
Disallow: /p.php
Disallow: /feeds/
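As a rough sketch of what those rules mean in practice, the snippet below feeds the lines quoted above into Python's standard `urllib.robotparser` and lists which paths a crawler would skip. The helper `blocked_paths`, the `Googlebot` user-agent string, and the sample paths `/feeds/notes.php` and `/about.php` are made up for illustration.

```python
from urllib.robotparser import RobotFileParser

# The rules quoted above from Facebook's robots.txt file.
FACEBOOK_RULES = """\
User-agent: *
Disallow: /profile.php
Disallow: /album.php
Disallow: /photo.php
Disallow: /p.php
Disallow: /feeds/
"""


def blocked_paths(robots_txt, user_agent, base_url, paths):
    """Return the subset of paths that user_agent may NOT fetch (hypothetical helper)."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [p for p in paths if not parser.can_fetch(user_agent, base_url + p)]


if __name__ == "__main__":
    # /feeds/notes.php and /about.php are invented sample paths.
    sample = ["/profile.php", "/feeds/notes.php", "/about.php"]
    print(blocked_paths(FACEBOOK_RULES, "Googlebot", "http://www.facebook.com", sample))
```

Each `Disallow` value is matched as a path prefix, so `/feeds/` covers everything underneath that directory, while a path that matches none of the rules stays crawlable.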

You can also check out any website’s robots.txt file by going to http://rootdirectory.com/robots.txt, where rootdirectory.com stands for the site’s own domain.
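If you want to work out that address from any page on a site, the pattern can be sketched in a few lines of Python with the standard `urllib.parse` module. The function name `robots_txt_url` and the example.com address are assumptions for illustration.

```python
from urllib.parse import urlsplit, urlunsplit


def robots_txt_url(page_url: str) -> str:
    """Build the robots.txt address for the site that serves page_url (hypothetical helper)."""
    parts = urlsplit(page_url)
    if not parts.scheme or not parts.netloc:
        raise ValueError("expected a full URL such as http://example.com/page")
    # Keep the scheme and host, and replace path, query and fragment.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


if __name__ == "__main__":
    print(robots_txt_url("http://example.com/blog/post?id=1"))
```

The file always lives at the root of the host, so the path, query string, and fragment of the page you started from are simply thrown away.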

There are several other directives that can optionally be included; you can take a look at them here. If you are concerned about organizing your content and making sure Google doesn’t pick up irrelevant information from your WordPress site, here is a great video that describes what you can do.

  • 7
  1. Krystal McManus

    awesome site! i bookmarked it 🙂

    Aug 19th, 2007 from South Carolina

  2. Joey Primiani

    Thanks Krystal!

    Aug 20th, 2007 from Silicon Valley/NYC/The Future

  3. Kate

    I like your site. Thanks for your tips in google and robots.txt

    Aug 21st, 2007 from oslo/boston/brooklyn

  4. Joey Primiani

    Thanks Kate. 🙂

    Aug 21st, 2007 from Silicon Valley/NYC/The Future

  5. Landon Morelli

    Numerous thanks for creating the work to talk about this, I really feel very more than it as well as love studying read a lot more about this topic matter.

    Jul 26th, 2012

  6. Maribeth Kozeliski

    I’m truly warm the theme/design with the net web website. Do you come upon virtually any web browser being compatible troubles?

    Aug 14th, 2012

  7. Richard Skorupa

    1st time We’ve frequented your web website. I really like that. Thank you for anything you do.

    Aug 15th, 2012