Understanding how Googlebot crawls hashbang (#!) URLs

I have gone through Google's documentation and countless blog posts on the subject, and depending on the date and the source there seems to be some conflicting information. Please share your wisdom with this humble peasant and all will be well.

I am building a pro-bono site whose audience is mostly in African countries with poor internet connectivity, and the client cannot afford decent infrastructure. So I decided to serve everything as static HTML files and, when JavaScript is available, to load page content directly into the DOM when the user clicks a navigation link, so the entire page does not have to be reloaded.
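A minimal sketch of that client-side idea, assuming each page also exists as a static HTML fragment under a hypothetical /fragments/ path (the hashToFragmentUrl helper and the #content element are illustrative names, not from the question):

```javascript
// Map a hashbang location like "#!/about" to the URL of a static
// HTML fragment; "/fragments/about.html" is an assumed layout.
function hashToFragmentUrl(hash) {
  if (!hash.startsWith('#!/')) return null; // not a hashbang route
  var page = hash.slice(3) || 'home';       // bare "#!/" falls back to a default page
  return '/fragments/' + page + '.html';
}

// Progressive enhancement: this only runs when JavaScript (and fetch)
// are available; otherwise the plain static pages are served as-is.
if (typeof window !== 'undefined' && window.fetch) {
  window.addEventListener('hashchange', function () {
    var url = hashToFragmentUrl(location.hash);
    if (!url) return;
    fetch(url)
      .then(function (res) { return res.text(); })
      .then(function (html) {
        // Swap only the content area instead of reloading the whole page.
        document.getElementById('content').innerHTML = html;
      });
  });
}
```

The pure hashToFragmentUrl function carries the routing logic, so the same mapping can also be reused server-side if needed.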

My client-side routes look like this:

    //domain.tld/#!/page

My first question is: does Googlebot translate this into

    //domain.tld/_escaped_fragment_/page

or into

    //domain.tld/?_escaped_fragment_=/page

?

I made a simple server-side router in PHP that builds the requested pages for Googlebot, and my plan was to redirect //d.tld/_escaped_fragment_/page to //d.tld/router/page.

But when using Google's "Fetch as Googlebot" (for the first time, I might add), it doesn't seem to recognize any links on the page. It just returns "Success" and shows me the HTML of the homepage. (Update: when I point Fetch as Googlebot at //d.tld/#!/page, it also just returns the content of the homepage, without any _escaped_fragment_ magic.) This brings me to my second question:

Do I need to follow a specific syntax when writing hashbang links so that Googlebot can crawl them?

My links look like this:

    <a href="#!/page">Page Headline</a>

      


Update 1: When I ask Fetch as Googlebot to fetch //d.tld/#!/page, this shows up in the access log:

    "GET /_escaped_fragment_/page HTTP/1.1" 301 502 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"

But Fetch as Googlebot doesn't seem to follow the 301 redirect, and displays the master page instead. Should I be using a 302 instead? This is the rule I'm using:

    RedirectMatch 301 /_escaped_fragment_/(.*) /router/$1

Update 2: I have changed my plans and will treat Googlebot as part of my non-JavaScript fallback tactics. So now all links point to the router, /router/page, and are rewritten to /#!/page onload using JavaScript. I'll leave the question open for a while in case anyone has a brilliant solution that might help others.
