How can I use Rake to insert or replace an HTML section in each file?
I am using rake to create a Table of Contents from a bunch of static HTML files.
The question is how to insert it into all of the files from Rake.
Every file has a <ul id="toc"> to aim for. I want to replace the entire content of that element.
I was thinking about using Nokogiri or something similar to parse the document and replace the DOM node ul#toc. However, I don't like the idea of writing the parser's DOM back out to the HTML file. What if it changes my layout, indentation, etc.?
Any thoughts / ideas? Or maybe links to working examples?
I ended up with an approach similar to what Mike Woodhouse suggested, just without ERB templates (I wanted the source files to stay freely editable outside of Ruby as well):
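# Returns the content of the file with its TOC block replaced by @toc.
def update_toc(filename)
  raise "FATAL: Requires self.toc= ... before replacing TOC in files!" if @toc.nil?
  content = File.read(filename)
  # /m lets "." match newlines, so a TOC spanning several lines is matched too.
  # The block form inserts @toc literally (backslashes are not interpreted).
  content.gsub(/<h2 class="toc">.+?<\/ul>/m) { @toc }
end

# Rewrites every file listed in @file_names in place.
def replace_toc_in_all_files
  @file_names.each do |name|
    content = update_toc(name)
    File.open(name, "w") do |io|
      io.write content
    end
  end
end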
Could you convert the files to .rhtml, where <ul id="toc"> is replaced by an erb directive such as <%= get_toc() %>, with get_toc() defined in some library module? Write the converted files out as .html (to a different directory if you like) and you're in business, and the process is repeatable.
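A minimal sketch of that .rhtml idea, using only Ruby's standard erb library. The module name TocHelper, the function render_page, and the placeholder body of get_toc are assumptions made for illustration; only the get_toc() name comes from the suggestion above.

```ruby
require 'erb'
require 'fileutils'

# Hypothetical helper module. The real get_toc would return the generated
# table of contents; this body is only a placeholder.
module TocHelper
  def get_toc
    %(<ul id="toc">\n  <li><a href="intro.html">Introduction</a></li>\n</ul>)
  end
end

# Renders one .rhtml template into out_dir as .html and returns the output path.
def render_page(template_path, out_dir)
  # An object extended with the helper, so <%= get_toc() %> resolves in the template.
  context = Object.new.extend(TocHelper)
  html = ERB.new(File.read(template_path)).result(context.instance_eval { binding })

  FileUtils.mkdir_p(out_dir)
  out_path = File.join(out_dir, File.basename(template_path, '.rhtml') + '.html')
  File.write(out_path, html)
  out_path
end
```

Calling render_page for each entry of Dir.glob('*.rhtml') would regenerate every page. The template text outside the directive is copied through as-is, so the layout of the source files is not reformatted.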
Or, for that matter, why not just use gsub? Something like:
File.open(out_filename, 'w+') do |output_file|
  output_file.puts File.read(filename).gsub(/<ul id="toc">/, get_toc())
end
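Note that the pattern above matches only the opening tag. If the goal is to replace the whole list, as the question describes, a variation along these lines might work. It is a sketch under the assumption that the TOC list contains no nested <ul>; the names TOC_PATTERN, replace_toc, and replace_toc_in_file are made up for this example.

```ruby
# Matches the whole <ul id="toc"> element up to the first closing </ul>.
# Assumption: no nested <ul> inside the TOC, otherwise the match ends early.
TOC_PATTERN = /<ul id="toc">.*?<\/ul>/m

# Returns html with the TOC element replaced by new_toc.
# Text outside the match is returned unchanged.
def replace_toc(html, new_toc)
  # Block form: new_toc is inserted literally, so backslashes in it are not
  # treated as back-references.
  html.sub(TOC_PATTERN) { new_toc }
end

# Rewrites one file in place; returns true if the file was changed.
def replace_toc_in_file(path, new_toc)
  original = File.read(path)
  updated = replace_toc(original, new_toc)
  return false if updated == original
  File.write(path, updated)
  true
end
```

Because only the matched span is swapped out, indentation and layout elsewhere in the file stay exactly as they were, which addresses the concern about a parser rewriting the document.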
You can directly manipulate the document and save the result. If you restrict your manipulations to a specific element, you won't change the overall structure and should be fine.
A library like Nokogiri or Hpricot will only adjust your document if it is malformed. I know Hpricot can be coaxed into a more relaxed parsing mode, or made to work in a stricter XML/XHTML fashion.
Simple example:
require 'rubygems'
require 'hpricot'
document = <<END
<html>
<body>
<ul id="tag">
</ul>
<h1 class="indexed">Item 1</h1>
<h2 class="indexed">Item 1.1</h2>
<h1 class="indexed">Item 2</h1>
<h2 class="indexed">Item 2.1</h2>
<h2 class="indexed">Item 2.2</h2>
<h1>Remarks</h1>
<!-- Test Comment -->
</body>
</html>
END
parsed = Hpricot(document)
ul_tag = (parsed / 'ul#tag').first
sections = (parsed / '.indexed')
# join concatenates the items; Array#to_s only did that on Ruby 1.8.
ul_tag.inner_html = sections.collect { |i| "<li>#{i.inner_html}</li>" }.join
puts parsed.to_html
This will give:
<html>
<body>
<ul id="tag"><li>Item 1</li><li>Item 1.1</li><li>Item 2</li><li>Item 2.1</li><li>Item 2.2</li></ul>
<h1 class="indexed">Item 1</h1>
<h2 class="indexed">Item 1.1</h2>
<h1 class="indexed">Item 2</h1>
<h2 class="indexed">Item 2.1</h2>
<h2 class="indexed">Item 2.2</h2>
<h1>Remarks</h1>
<!-- Test Comment -->
</body>
</html>