How to parse / load a stylesheet from HTML

I am loading part of the HTML page:

require 'nokogiri'
require 'open-uri'

doc = Nokogiri::HTML(URI.open('https://example.com/index.html'))
wiki = doc.xpath('//*[@id="wiki"]/div[1]')

      

and I need the stylesheets to display it correctly. They are included in the head as follows:

<!DOCTYPE html>
<html lang="en" class="">
    <head>
    ...
    <link href="https://example.com/9f40a.css" media="all" rel="stylesheet" />
    <link href="https://example.com/4e5fb.css" media="all" rel="stylesheet" />
    ...
  </head>
  ...

      

and their names can change. How can I parse the page and save local copies of the stylesheets?


2 answers


Something like this:



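require 'open-uri'

# Save every stylesheet referenced in <head> to /tmp.
doc.css("head link").each do |tag|
  link = tag["href"]
  next unless link && link.end_with?(".css")
  File.open("/tmp/#{File.basename(link)}", "w") do |f|
    # URI.open is needed on Ruby 3.0+, where Kernel#open no longer opens URLs.
    content = URI.open(link) { |g| g.read }
    f.write(content)
  end
end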
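
The loop above assumes every href is an absolute URL with no query string. As a minimal sketch of how that assumption could be relaxed, the hypothetical helper below (stylesheet_target is not part of Nokogiri or of the answer) resolves an href against the page URL with URI.join and derives the local file name from the path only. The /assets/9f40a.css?v=2 href is an invented example.

```ruby
require 'uri'

# Hypothetical helper: resolve a possibly relative href against the page URL
# and choose a local file name based on the URL path (query string ignored).
def stylesheet_target(page_url, href, dir = '/tmp')
  absolute = URI.join(page_url, href)
  name = File.basename(absolute.path)
  [absolute.to_s, File.join(dir, name)]
end

# Invented example href: root-relative, with a cache-busting query string.
url, path = stylesheet_target('https://example.com/index.html', '/assets/9f40a.css?v=2')
puts url
puts path
```

Inside the loop, the returned URL would be the one to download and the returned path the one to write to.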

      


I am not a Ruby expert, but you can follow the steps below.



  • You can use the .scan(...) method provided by the String type to parse the page and get the .css filenames. scan returns an array of the stylesheet file names; see the String#scan documentation for more information.
  • Then download and save the files using Net::HTTP.get(...).
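
The two steps above can be sketched roughly as follows, using only the standard library. The helper names css_links and save_stylesheets are made up for illustration, and the regular expression is a loose assumption about how the link tags are written (href in quotes, ending in .css); an HTML parser is more robust for real pages. Net::HTTP.get does not follow redirects, so this assumes the stylesheet URLs respond directly.

```ruby
require 'net/http'
require 'uri'

# Step 1: collect the href values of <link> tags that end in .css.
# With one capture group, String#scan returns an array of one-element
# arrays, so flatten turns it into a flat list of URLs.
def css_links(html)
  html.scan(/<link[^>]*\bhref=["']([^"']+\.css)["']/i).flatten
end

# Step 2: download each stylesheet with Net::HTTP.get and save it under dir.
def save_stylesheets(links, dir = '/tmp')
  links.each do |link|
    uri = URI(link)
    File.write(File.join(dir, File.basename(uri.path)), Net::HTTP.get(uri))
  end
end

html = <<~HTML
  <head>
    <link href="https://example.com/9f40a.css" media="all" rel="stylesheet" />
    <link href="https://example.com/4e5fb.css" media="all" rel="stylesheet" />
  </head>
HTML

links = css_links(html)
p links
# save_stylesheets(links)  # uncomment to download; performs network requests
```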