PHP fatal error: cannot use object of type simple_html_dom as array

I am working on a web scraping app using simple_html_dom. I need to extract all images in a web page. The possibilities are listed below:

  • images in <img> tags
  • images referenced from CSS inside a <style> tag on the page
  • images referenced from inline styling on a <div> or a different tag
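
The second and third cases both come down to pulling url(...) references out of CSS text, whether that text comes from a <style> block or from a style attribute. A minimal sketch, assuming a hypothetical helper named extract_css_urls (it is not part of simple_html_dom):

```php
<?php
// Hypothetical helper: collect every url(...) target from a CSS string.
// The input can be the contents of a <style> block or of a style="..." attribute.
function extract_css_urls(string $css): array
{
    $urls = array();
    if (preg_match_all('/url\(\s*(.*?)\s*\)/i', $css, $matches)) {
        foreach ($matches[1] as $raw) {
            // Strip the optional quotes around the URL.
            $url = trim($raw, "\"'");
            if ($url !== '') {
                $urls[] = $url;
            }
        }
    }
    // Drop duplicates and reindex from 0.
    return array_values(array_unique($urls));
}

$css = 'div { background: url("img/bg.png") } p { background-image: url(/img/p.gif) }';
print_r(extract_css_urls($css));
```

Each returned URL can then go through the same rel2abs / download_file steps as the <img> sources.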

I can download all images using the following code.

function download_images($html, $page_url , $local_url){

    foreach($html->find('img') as $element) {
        $img_url = $element->src;
        $img_url = rel2abs($img_url, $page_url);
        $parts   = parse_url($img_url);
        $img_path=  $parts['path'];
        $url_to_be_change = $GLOBALS['website_server_root'].$img_path;
        download_file($img_url, $GLOBALS['website_local_root'].$img_path);  
        $element->src=$url_to_be_change;            
    }

    $css_inline = $html->find("style");

    $matches = array();
    preg_match_all( "/url\((.*?)\)/", $css_inline, $matches, PREG_SET_ORDER );
    foreach ( $matches as $match )    {
        $img_url = trim( $match[1], "\"'" );
        $img_url = rel2abs($img_url, $page_url);
        $parts   = parse_url($img_url);
        $img_path=  $parts['path'];
        $url_to_be_change = $GLOBALS['website_server_root'].$img_path  ;
        download_file($img_url , $GLOBALS['website_local_root'].$img_path); 
        $html = str_replace($img_url , $url_to_be_change , $html );
    }

    return $html;
}

$html = download_images($html , $page_url , $dir); // working fine
$html = str_get_html ($html);
$html->save($dir. "/" . $ff);    

      

Please note that I am also changing the HTML after the images are downloaded.

The download works fine, but when I try to save the HTML, it gives the following error:

PHP fatal error: cannot use object of type simple_html_dom as array

Important: it works fine if I don't use the str_replace in the second loop.

Fatal error: Cannot use object of type simple_html_dom as array in /var/www/html/app/framework/cache/includes/simple_html_dom.php on line 1167



3 answers


Guess #1

I see a possible error here:

$html = str_get_html($html);

      

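It looks like you are passing an object to the str_get_html() function, which takes a string as its argument. Let's fix it by serializing the object back to markup first; save() called without a file path returns the document as a string (plaintext would not do here, because it strips the tags):

$html = str_get_html($html->save());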

      



We can only guess what the $html variable contains when it reaches this piece of code.
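
One way to make that call safe whatever $html holds is to normalize it to a string first. The sketch below assumes a hypothetical helper, to_html_string, which is not part of the library; FakeDom is only a stand-in so the example runs without simple_html_dom.

```php
<?php
// Hypothetical helper: return HTML as a string whether it is given
// a string or a parsed DOM object.
function to_html_string($html): string
{
    if (is_string($html)) {
        return $html;
    }
    if (is_object($html) && method_exists($html, 'save')) {
        return $html->save();   // serialize the DOM back to markup
    }
    if (is_object($html) && method_exists($html, '__toString')) {
        return (string) $html;
    }
    throw new InvalidArgumentException('Expected an HTML string or a DOM object');
}

// Stand-in for a parsed document (assumption: save() returns the markup).
class FakeDom
{
    private $markup;
    public function __construct(string $markup) { $this->markup = $markup; }
    public function save(): string { return $this->markup; }
}

var_dump(to_html_string('<p>a</p>'));
var_dump(to_html_string(new FakeDom('<p>b</p>')));
```

With such a helper, the call would become $html = str_get_html(to_html_string($html)); and it would no longer matter which type download_images returned.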

Guess #2

Or maybe we just need a separate variable inside the download_images function, so that your code is correct in both cases:

function download_images($html, $page_url, $local_url) {

    // Rewrite the <img> sources on the DOM object.
    foreach ($html->find('img') as $element) {
        $img_url  = rel2abs($element->src, $page_url);
        $parts    = parse_url($img_url);
        $img_path = $parts['path'];
        download_file($img_url, $GLOBALS['website_local_root'] . $img_path);
        $element->src = $GLOBALS['website_server_root'] . $img_path;
    }

    // Serialize once, so the function returns a string on every path.
    $result_html = $html->save();

    // find() returns an array of elements; join their contents into one string.
    $css_inline = '';
    foreach ($html->find('style') as $style) {
        $css_inline .= $style->innertext;
    }

    $matches = array();
    preg_match_all('/url\((.*?)\)/', $css_inline, $matches, PREG_SET_ORDER);
    foreach ($matches as $match) {
        $raw_url  = trim($match[1], "\"'");
        $img_url  = rel2abs($raw_url, $page_url);
        $parts    = parse_url($img_url);
        $img_path = $parts['path'];
        $url_to_be_change = $GLOBALS['website_server_root'] . $img_path;
        download_file($img_url, $GLOBALS['website_local_root'] . $img_path);
        // Replace the URL as written in the markup, accumulating into the result.
        $result_html = str_replace($raw_url, $url_to_be_change, $result_html);
    }

    return $result_html;
}

$html = download_images($html, $page_url, $dir); // always a string now
$html = str_get_html($html);
$html->save($dir . "/" . $ff);

      

Explanation: if there is no match (the $matches array is empty), we never enter the second loop, so the $html variable still holds the same value as at the beginning of the function: the simple_html_dom object, not a string. This is a common mistake: reusing one variable where two different variables are needed.
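
The effect can be reproduced without the library. The sketch below is an illustration under assumptions: StubDocument is a hypothetical stand-in for any object with __toString, and rewrite_urls mirrors only the shape of the second loop.

```php
<?php
// Stand-in for a parsed document: an object that can be cast to a string.
class StubDocument
{
    public function __toString(): string { return '<img src="a.png">'; }
}

// Same shape as the loop in question: $doc is reassigned only inside the loop.
function rewrite_urls($doc, array $urls)
{
    foreach ($urls as $url) {
        // str_replace() casts the object to a string and returns a string.
        $doc = str_replace($url, 'local/' . $url, $doc);
    }
    return $doc;
}

var_dump(gettype(rewrite_urls(new StubDocument(), array('a.png')))); // one URL to replace
var_dump(gettype(rewrite_urls(new StubDocument(), array())));        // nothing to replace
```

The return type depends on whether the loop body ever runs, which is why the caller works for some pages and fails for others.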



As the error message states, you are dealing with an object where an array is expected. You can try typecasting your object:

$array =  (array) $yourObject;

      



This should fix the problem.



I had this error and solved it by using (in my case) return $html->save(); at the end of the function. I cannot explain why two instances with different variable names and scopes in different functions caused this error. I guess this is how the "simple html dom" class works.

Just to be clear, try calling $html->save() before doing anything else with the result.

I hope this information helps someone :)
