Sub-domain vs Sub-directory for crawler blocking

I have googled a lot and read many articles, but they give mixed answers.

I'm a little confused about which option is best if I want a certain section of my site to be blocked from indexing by search engines. I update my site often and also develop for clients, and I don't want the "test data" I upload for preview to be indexed, so that I avoid duplicate content.

  • Should I use a subdomain and block the entire subdomain, or

  • create a subdirectory and lock it with robots.txt?

I'm new to web design and a bit unsure about using subdomains. I read somewhere that setting one up is a somewhat advanced procedure and that even a small mistake can have big consequences. Moreover, Matt Cutts mentioned something similar (source):

"I'd recommend using sub directories until you start to feel pretty confident with your site's architecture. At that point, you'll be better off making the right decision for your own site."

But on the other hand, I'm hesitant to use robots.txt as well, since anyone can access that file.

What are the pros and cons of both?

At this point I am under the impression that Google treats both the same and that it would be better to go with a subdirectory and robots.txt, but I would like to get a second opinion before "taking the plunge".



1 answer


Either you ask bots not to index your content (→ robots.txt), or you block everyone (→ password protection).

For this decision, it doesn't matter whether you use a separate subdomain or a folder: robots.txt and password protection both work in either case. Note that the robots.txt file must always be placed at the document root of the host it applies to, so a subdomain needs its own robots.txt.
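To make the two layouts concrete, here is a minimal sketch that feeds two robots.txt rule sets to Python's standard `urllib.robotparser` and asks what a rule-following bot would be allowed to fetch. The host names (`example.com`, `preview.example.com`) and the `/preview/` folder are made up for illustration; they are not from the question.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt at the root of the main site:
# only the /preview/ folder is off limits.
MAIN_SITE_RULES = """\
User-agent: *
Disallow: /preview/
"""

# Hypothetical robots.txt at the root of a separate preview subdomain:
# everything on that host is off limits.
SUBDOMAIN_RULES = """\
User-agent: *
Disallow: /
"""


def build_parser(rules: str) -> RobotFileParser:
    """Parse robots.txt text without fetching anything over the network."""
    parser = RobotFileParser()
    parser.parse(rules.splitlines())
    return parser


main_site = build_parser(MAIN_SITE_RULES)
subdomain = build_parser(SUBDOMAIN_RULES)

# A bot that honors the rules skips the preview folder but crawls the rest.
print(main_site.can_fetch("Googlebot", "https://example.com/preview/client-a/"))
print(main_site.can_fetch("Googlebot", "https://example.com/blog/"))

# On the subdomain, every path is disallowed.
print(subdomain.can_fetch("Googlebot", "https://preview.example.com/client-a/"))
```

Either way the result is the same kind of request to crawlers; the only difference is which robots.txt file carries the rule.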



Using robots.txt guarantees nothing; it is only a polite request. Polite bots will respect it, others won't. Human users will still be able to visit your "forbidden" pages. Even bots that honor your robots.txt (like Google) may still link to your "forbidden" content in their search results (they won't index the content itself, though).

Using a login mechanism protects your pages from all bots and visitors.
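As a rough illustration of what such a login check involves, the sketch below validates an HTTP Basic `Authorization` header in Python. In practice this is usually handled by the web server's own configuration rather than hand-written code; the function name `is_authorized` and the credentials are hypothetical.

```python
import base64
import hmac


def is_authorized(authorization_header, expected_user, expected_password):
    """Return True only if the header carries the expected Basic credentials."""
    if not authorization_header:
        return False
    scheme, _, encoded = authorization_header.partition(" ")
    if scheme.lower() != "basic":
        return False
    try:
        decoded = base64.b64decode(encoded.strip(), validate=True).decode("utf-8")
    except ValueError:  # covers malformed base64 and invalid UTF-8
        return False
    user, separator, password = decoded.partition(":")
    if not separator:
        return False
    # compare_digest avoids leaking information through comparison timing.
    user_ok = hmac.compare_digest(user.encode(), expected_user.encode())
    password_ok = hmac.compare_digest(password.encode(), expected_password.encode())
    return user_ok and password_ok


# Example: a client sends "client:s3cret" encoded as a Basic auth header.
header = "Basic " + base64.b64encode(b"client:s3cret").decode("ascii")
print(is_authorized(header, "client", "s3cret"))
print(is_authorized(None, "client", "s3cret"))
```

A request without valid credentials (which includes every crawler) would get a 401 response instead of the page, so there is nothing for a bot to index.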
