How can I use rake to insert / replace an HTML section in each file?

I am using rake to create a Table of Contents from a bunch of static HTML files.

The question is: how do I insert it into all of the files from rake?

Every file has a <ul id="toc"> section to aim for. I want to replace its entire content.

I was thinking about using Nokogiri or a similar library to parse the document and replace the ul#toc DOM node. However, I don't like the idea of writing the parser's serialized DOM back to the HTML file. What if it changes my layout, indentation, etc.?

Any thoughts / ideas? Or maybe links to working examples?



3 answers


I ended up with an idea similar to what Mike Woodhouse suggested, just without erb templates (since I wanted the source files to stay freely editable by non-Ruby tools as well).



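  # Returns the contents of filename with the existing TOC block replaced by @toc.
  def update_toc(filename)
    raise "FATAL: Requires self.toc= ... before replacing TOC in files!" if @toc.nil?
    content = File.read(filename)
    # /m lets "." match newlines, so a multi-line TOC is found; the block form
    # inserts @toc literally (no backslash sequences are interpreted).
    content.gsub(/<h2 class="toc">.+?<\/ul>/m) { @toc }
  end

  # Rewrites every file in @file_names in place with the updated TOC.
  def replace_toc_in_all_files
    @file_names.each do |name|
      content = update_toc(name)
      File.open(name, "w") do |io|
        io.write content
      end
    end
  end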
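For anyone wondering how this hooks into rake itself, here is a minimal sketch. The task name `toc`, the `dir` argument, the `replace_toc` helper, and the `TOC_HTML` placeholder are all made up for illustration; it only assumes the same `<h2 class="toc">` ... `</ul>` markers as the code above.

```ruby
require 'rake'

# Hypothetical TOC markup; in practice this would be generated from the headings.
TOC_HTML = %(<h2 class="toc">Contents</h2>\n<ul>\n  <li>Intro</li>\n</ul>)

# Replaces the span from <h2 class="toc"> up to the first </ul> with toc.
def replace_toc(content, toc)
  content.sub(/<h2 class="toc">.*?<\/ul>/m) { toc }
end

# Shell usage would look like: rake "toc[site]" (the dir argument is an assumption).
task :toc, [:dir] do |_task, args|
  dir = args[:dir] || '.'
  FileList[File.join(dir, '*.html')].each do |name|
    original = File.read(name)
    updated = replace_toc(original, TOC_HTML)
    # Only touch files whose TOC actually changed.
    File.write(name, updated) unless updated == original
  end
end
```

Because only the matched span is swapped, everything outside the TOC stays byte-for-byte identical, which sidesteps the worry about a parser reformatting the rest of the file.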

      



Could you convert the files to .rhtml, where

<ul id="toc">

is replaced by an erb directive such as

<%= get_toc() %>

and get_toc() is defined in some library module? Write the rendered files out as .html (to a different directory if you like) and you're in business, and the process is repeatable.
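A rough sketch of that erb idea, using only the standard library. The `TocHelper` module, the `render_rhtml` helper, and the placeholder TOC entries are assumptions for illustration; only the `get_toc` name comes from the suggestion above.

```ruby
require 'erb'

# Hypothetical helper module standing in for "some library module".
module TocHelper
  def get_toc
    items = ['Introduction', 'Usage'] # placeholder entries
    "<ul id=\"toc\">\n" + items.map { |t| "  <li>#{t}</li>\n" }.join + "</ul>"
  end
end

# Renders one .rhtml template string to HTML with get_toc in scope.
def render_rhtml(source)
  context = Object.new.extend(TocHelper)
  ERB.new(source).result(context.instance_eval { binding })
end

template = "<body>\n<%= get_toc() %>\n<h1>Introduction</h1>\n</body>\n"
html = render_rhtml(template)
puts html
```

In a real setup the template string would come from `File.read` on each .rhtml file and the result would be written to the matching .html path.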

Or, come to that, why not just use gsub? Something like:

File.open(out_filename,'w+') do |output_file|
    output_file.puts File.read(filename).gsub(/\<ul id="toc"\>/, get_toc())
end
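Note that the pattern above matches only the opening tag. If the goal is to swap the whole element, including its old items, a variation along these lines might do it. This is a sketch: `replace_toc_element` is a made-up name, and it assumes the TOC list contains no nested `<ul>`, since the match stops at the first `</ul>`.

```ruby
# Swaps the whole <ul id="toc">...</ul> element for toc, leaving all other text untouched.
# Assumes no nested <ul> inside the TOC (the non-greedy match ends at the first </ul>).
def replace_toc_element(html, toc)
  # /m lets "." span newlines; the block form inserts toc literally.
  html.sub(%r{<ul id="toc">.*?</ul>}m) { toc }
end

page = <<~HTML
  <body>
      <ul id="toc">
        <li>stale</li>
      </ul>
    <h1>Title</h1>
  </body>
HTML

new_toc = %(<ul id="toc"><li>Title</li></ul>)
result = replace_toc_element(page, new_toc)
puts result
```

The indentation in front of the element and everything after it is carried over unchanged, because only the matched text is replaced.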

      



You can directly manipulate the document and save the result. If you restrict your manipulations to a specific element, you won't change the overall structure and should be fine.

A library like Nokogiri or Hpricot will only correct your document if it is malformed. I know Hpricot can be told to use a more relaxed parsing mode or to work in a stricter XML / XHTML manner.

Simple example:

require 'rubygems'
require 'hpricot'

document = <<END
<html>
<body>
<ul id="tag">
</ul>
<h1 class="indexed">Item 1</h1>
<h2 class="indexed">Item 1.1</h2>
<h1 class="indexed">Item 2</h1>
<h2 class="indexed">Item 2.1</h2>
<h2 class="indexed">Item 2.2</h2>
<h1>Remarks</h1>
<!-- Test Comment -->
</body>
</html>
END

parsed = Hpricot(document)

ul_tag = (parsed / 'ul#tag').first

sections = (parsed / '.indexed')

# join concatenates the items; Array#to_s would return inspect-style output on Ruby 1.9+
ul_tag.inner_html = sections.collect { |i| "<li>#{i.inner_html}</li>" }.join

puts parsed.to_html

      

This will give:

<html>
<body>
<ul id="tag"><li>Item 1</li><li>Item 1.1</li><li>Item 2</li><li>Item 2.1</li><li>Item 2.2</li></ul>
<h1 class="indexed">Item 1</h1>
<h2 class="indexed">Item 1.1</h2>
<h1 class="indexed">Item 2</h1>
<h2 class="indexed">Item 2.1</h2>
<h2 class="indexed">Item 2.2</h2>
<h1>Remarks</h1>
<!-- Test Comment -->
</body>
</html>

      
