How do I read the html content of a specific url using a Firefox addon?

Question

I want to create an addon that loads the HTML content of a specific URL, saves a specific line of that page, and then navigates to that URL. I've read a lot on mozilla.org about web page content, but I don't understand how to read the HTML content.

javascript html firefox-addon

Ali Mohyudin 03 Aug 14 at 22:12

3 answers

Noitidart · Answer 1 · 2014-08-04T06:38:47+0000

Here is a simple snippet that makes an XHR request WITHOUT cookies. You don't need to worry about cross-origin restrictions, because this code runs in the addon's privileged scope rather than in a web page.

// Bind the usual short aliases: Components.utils -> Cu, .classes -> Cc, .interfaces -> Ci
var {utils: Cu, classes: Cc, interfaces: Ci} = Components;
Cu.import('resource://gre/modules/Services.jsm');
function xhr(url, cb) {
    let xhr = Cc["@mozilla.org/xmlextras/xmlhttprequest;1"].createInstance(Ci.nsIXMLHttpRequest);

    let handler = ev => {
        evf(m => xhr.removeEventListener(m, handler, !1));
        switch (ev.type) {
            case 'load':
                if (xhr.status == 200) {
                    cb(xhr.response);
                    break;
                }
            // a 'load' with a non-200 status deliberately falls through to here
            default:
                Services.prompt.alert(null, 'XHR Error', 'Error Fetching Package: ' + xhr.statusText + ' [' + ev.type + ':' + xhr.status + ']');
                break;
        }
    };

    let evf = f => ['load', 'error', 'abort'].forEach(f);
    evf(m => xhr.addEventListener(m, handler, false));

    xhr.mozBackgroundRequest = true;
    xhr.open('GET', url, true);
    xhr.channel.loadFlags |= Ci.nsIRequest.LOAD_ANONYMOUS | Ci.nsIRequest.LOAD_BYPASS_CACHE | Ci.nsIRequest.INHIBIT_PERSISTENT_CACHING;
    //xhr.responseType = "arraybuffer"; // leave this unset so the response is returned as a string; you only want an arraybuffer if the url points to a zip or other file you want to download and, for example, wrap in an nsIArrayBufferInputStream
    xhr.send(null);
}

An example using this snippet:

var href = 'http://www.bing.com/';
xhr(href, data => {
    Services.prompt.alert(null, 'XHR Success', data);
});
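
Since the question asks to save one specific line of the page, the callback could pass the response text through a small helper before doing anything else with it. This is only a minimal sketch, assuming the wanted line can be identified by a substring; `findLine` and the sample markup are hypothetical and not part of the snippet above.

```javascript
// Hypothetical helper: return the first line of `html` that contains `needle`,
// trimmed of surrounding whitespace, or null when no line matches.
function findLine(html, needle) {
    var lines = html.split(/\r?\n/);
    for (var i = 0; i < lines.length; i++) {
        if (lines[i].indexOf(needle) !== -1) {
            return lines[i].trim();
        }
    }
    return null;
}

// An inline string stands in for the `data` value passed to the xhr callback.
var sample = '<html>\n  <title>Example</title>\n  <a href="/next.php">Next</a>\n</html>';
var line = findLine(sample, 'href=');
```

Inside the callback, the same call would be `findLine(data, 'href=')`, with the substring chosen to match the line you want to keep.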

Kwebble · Answer 2 · 2014-08-04T10:26:39+0000

Without knowing the page and url to find on it, I cannot create a complete solution, but here is a Greasemonkey script example I wrote that does something similar.

This script is for Java articles on DZone. When an article has a link to its source, the script redirects the browser to that source page:

// ==UserScript==
// @name        DZone source
// @namespace   com.kwebble
// @description Directly go to the source of a DZone article.
// @include     http://java.dzone.com/*
// @version     1
// @grant       none
// ==/UserScript==

var node = document.querySelector('a[target="_blank"]');

if (node !== null) {
    document.location = node.getAttribute('href');
}

Usage:

Install Greasemonkey if you haven't already.
Create a script similar to mine. Set the value of @include to the page that contains the URL you want to find.
Determine what identifies the part of the page holding the target URL, and change the script so that it finds that URL. In my script, this is a link with the target "_blank".

After saving the script, navigate to the page with the link. Greasemonkey should execute your script and redirect the browser.

[edit] This version searches the page's script tags for the text you described and redirects to the URL it finds.

// ==UserScript==
// @name        Test
// @namespace   com.kwebble
// @include     your_page
// @version     1
// @grant       none
// ==/UserScript==

var nodes = document.getElementsByTagName('script'),
    i, matches;

for (i = 0; i < nodes.length; i++) {
    if (nodes.item(i).innerHTML !== '') {
        // capture the full URL including ".php"; the property is window.location
        matches = nodes.item(i).innerHTML.match(/window\.location = "(.*?\.php)";/);

        if (matches !== null){
            document.location = matches[1];
        }
    }
}

The regex to search for a URL may need some tweaking to match the exact content of the page.
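
One way to tune the pattern is to run it against a copy of the script text before putting it in the user script. The sketch below assumes the page's inline script assigns a quoted URL ending in .php to window.location; `extractRedirect` and the sample text are hypothetical, so adjust both to the real page.

```javascript
// Hypothetical helper: pull the redirect target out of inline script text.
// The pattern assumes the form: window.location = "<something>.php";
function extractRedirect(scriptText) {
    var matches = scriptText.match(/window\.location\s*=\s*"([^"]+\.php)"\s*;/);
    return matches !== null ? matches[1] : null;
}

// Made-up sample standing in for the innerHTML of a script tag.
var sampleScript = 'var x = 1; window.location = "http://example.com/target.php";';
var target = extractRedirect(sampleScript);
```

Using `\s*` around the equals sign makes the match tolerant of spacing differences, and `[^"]+` keeps the capture inside the quotes.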

erosman · Answer 3 · 2014-08-04T04:31:16+0000

An addon and a Greasemonkey script take a similar approach, but an addon can also use Firefox's native APIs (although writing an addon is much more complicated than writing a script).

Basically, the process is as follows (without knowing your exact requirements):

Get the content of the remote url with XMLHttpRequest()
Get the data you need with RegEx or DOMParser()
Change the current url to this target with location.replace()
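
The three steps above could be sketched roughly as follows. This is an illustration under assumptions, not a complete solution: `extractTarget` and `fetchAndGo` are hypothetical names, the extraction uses a regular expression with one capture group, and `fetchAndGo` only works where `XMLHttpRequest` and `location` exist. In a plain page or user script context, requests to another origin are also subject to the same-origin policy.

```javascript
// Step 2 as a pure function: return the first capture group of `pattern`
// found in `html`, or null when nothing matches.
function extractTarget(html, pattern) {
    var m = pattern.exec(html);
    return m !== null ? m[1] : null;
}

// Steps 1 and 3: fetch the page, extract the target, then navigate to it.
function fetchAndGo(url, pattern) {
    var req = new XMLHttpRequest();
    req.open('GET', url, true);
    req.onload = function () {
        if (req.status !== 200) {
            return; // give up quietly on a failed request
        }
        var target = extractTarget(req.responseText, pattern);
        if (target !== null) {
            location.replace(target); // replaces the current history entry
        }
    };
    req.send(null);
}
```

A call might look like `fetchAndGo('http://example.com/page', /href="([^"]+)"/)`, where both the URL and the pattern are placeholders.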
