High CPU usage when generating PDFs in Rails with the Wicked_PDF gem
I'm trying to generate a PDF with Rails, but when I do, my system CPU maxes out. It starts at around 2.5%, climbs to roughly 65%-80% over a sustained period, and finally maxes out just before the PDF is displayed in the iframe on my page. Here are some of the messages I get while monitoring my system:
Warning or critical alerts (lasts 9 entries)
2017-06-09 14:58:07 (0:00:04) - CRITICAL on CPU_SYSTEM (100.0)
2017-06-09 14:58:04 (0:00:13) - CRITICAL on CPU_USER (Min:72.8 Mean:83.3 Max:93.7)
2017-06-09 14:47:39 (0:00:06) - CRITICAL on CPU_USER (93.0)
2017-06-09 14:47:29 (0:00:04) - WARNING on CPU_SYSTEM (74.7)
2017-06-09 14:36:48 (0:00:04) - CRITICAL on CPU_SYSTEM (100.0)
2017-06-09 14:36:45 (0:00:10) - CRITICAL on CPU_IOWAIT (Min:78.6 Mean:85.7 Max:97.4)
2017-06-09 14:18:06 (0:00:04) - CRITICAL on CPU_SYSTEM (94.3)
2017-06-09 14:18:06 (0:00:07) - CRITICAL on CPU_USER (91.0)
2017-06-09 14:17:44 (0:00:04) - WARNING on CPU_SYSTEM (73.8)
The gems I installed for PDF generation are wicked_pdf (1.0.6) and wkhtmltopdf-binary-edge (0.12.4.0). The code for each step is as follows:
controllers/concerns/pdf_player_reports.rb
def director_report_pdf
  @players = Player.where(id: params['player_ids'])
  respond_to do |format|
    format.html
    format.pdf do
      render pdf: params['pdf_title'].to_s,
             template: 'players/director_summary_report.pdf.erb',
             layout: 'print',
             show_as_html: params.key?('debug'),
             window_status: 'Loading...',
             disable_internal_links: true,
             disable_external_links: true,
             dpi: 75,
             disable_javascript: true,
             margin: { top: 7, bottom: 7, left: 6, right: 0 },
             encoding: 'utf8'
    end
  end
end
players/director_summary_report.pdf.erb
<div class="document" style="margin-top: -63px;">
  <% @players.each do |player| %>
    <% reports = player.reports.order(created_at: :desc) %>
    <% if player.is_college_player? %>
      <%= render partial: 'college_director_report.html.erb', locals: { player: player } %>
    <% else %>
      <%= render partial: 'pro_director_report.html.erb', locals: { player: player } %>
    <% end %>
    <%= "<div class='page-break'></div>".html_safe %>
  <% end %>
</div>
college_director_report.html.erb
<%= wicked_pdf_stylesheet_link_tag "application", media: "all" %>
<%= wicked_pdf_javascript_include_tag "application" %>
<% provide(:title, "#{player.football_name}") %>
<% self.formats = [:html, :pdf, :css, :coffee, :scss] %>
<style>
  thead { display: table-row-group; page-break-inside: avoid }
  tfoot { display: table-row-group; }
  /*thead:before, thead:after { display: none; }*/
  table { page-break-inside: avoid; }
  tr { page-break-inside: avoid; }
  .page-break {
    display: block; clear: both; page-break-after: always;
  }
  .keep-together { page-break-before: always !important; }
  .table-striped>tbody>tr:nth-child(odd)>td,
  tr.found {
    background-color: #e2e0e0 !important;
  }
</style>
<div class="row">
  <div class="col-xs-6">
    <span>DIRECTOR SUMMARY</span>
  </div>
  <div class="col-xs-6 text-right">
    <%= "#{player.full_name} / #{player.school.short_name}".upcase %>
    <h1><%= "#{player.full_name(true)} (#{player.school.code})".upcase %></h1>
  </div>
</div>
<div class="row">
  <div class="col-xs-12">
    <%= render 'directors_report_player_header', player: player %>
    <%= render 'directors_report_workouts', player: player %>
    <%= render 'directors_report_grades', player: player %>
    <%= render 'legacy_directors_report_contacts', player: player %>
  </div>
</div>
directors_report_player_header.html.erb
<table class="table table-condensed table-bordered">
  <thead>
    <tr>
      <th>Name</th>
      <th>School</th>
      <th>#</th>
      <th>Position</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td><%= player.full_name(true) %></td>
      <td><%= player.school.short_name %></td>
      <td><%= player.jersey %></td>
      <td><%= player.position.abbreviation %></td>
    </tr>
  </tbody>
</table>
UPDATE
I ran a sample PDF generation using the following template to see where the CPU usage ends up:
<table class="table table-condensed">
  <thead>
    <th>Number</th>
  </thead>
  <tbody>
    <% (1..60000).each do |number| %>
      <tr>
        <td><%= number %></td>
      </tr>
    <% end %>
  </tbody>
</table>
Doing this in a controller seems counterproductive, because as soon as this request comes in, it will take a significant amount of time to complete and will block other incoming requests for other pages.
You should split this into two concerns: one step that generates the HTML (which can be this controller), and a background job that converts that HTML to PDF.
In your controller, enqueue the job with DelayedJob or similar, then render a page that polls for the job's completion.
Then, in the background job, you are only dealing with the task of rendering HTML to PDF, not with a web request. Something like this:
class RendersReportPdf
  # Renders the report HTML outside a web request, converts it to PDF,
  # and returns the path of the temporary file holding the result.
  def self.call(player_ids)
    html = ReportsController.render :director_report_pdf, assigns: { players: Player.where(id: player_ids) }
    pdf = WickedPdf.new.pdf_from_string(html)
    temp = Tempfile.new([Time.now.to_i.to_s, '.pdf'])
    temp.binmode # the PDF is binary data
    temp.write(pdf)
    temp.close
    # Probably upload this to S3 or similar at this point
    # Notify the user that it is now available somehow
    temp.path
  end
end
If you do this, you can rule out that the problem is caused by WickedPDF running inside your controller action, and you also make sure that your site stays responsive even when PDF generation takes a long time.
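To make the "enqueue, return immediately, poll for completion" flow concrete, here is a minimal plain-Ruby sketch. It is not DelayedJob or ActiveJob: the `ReportJobs` class, its in-memory status hash, and the block standing in for the HTML-to-PDF step are all hypothetical, and a real app would keep job state in the database or in the job backend.

```ruby
require 'securerandom'

# In-memory stand-in for a job backend (hypothetical, for illustration only).
class ReportJobs
  def initialize(&work)
    @work   = work        # the slow step, e.g. converting HTML to PDF
    @queue  = Queue.new
    @status = {}
    @lock   = Mutex.new
    @worker = Thread.new { run }
  end

  # Called from the controller action: returns a job id right away.
  def enqueue(player_ids)
    id = SecureRandom.hex(4)
    @lock.synchronize { @status[id] = { state: :queued, result: nil } }
    @queue << [id, player_ids]
    id
  end

  # Called from the polling endpoint.
  def status(id)
    @lock.synchronize { @status.fetch(id).dup }
  end

  # Finish the jobs already queued, then stop the worker thread.
  def shutdown
    @queue << :stop
    @worker.join
  end

  private

  def run
    loop do
      job = @queue.pop
      break if job == :stop
      id, player_ids = job
      set(id, :working, nil)
      begin
        set(id, :done, @work.call(player_ids))
      rescue StandardError => e
        set(id, :failed, e.message)
      end
    end
  end

  def set(id, state, result)
    @lock.synchronize { @status[id] = { state: state, result: result } }
  end
end

# Usage: the block is a placeholder for the real rendering work.
jobs = ReportJobs.new { |ids| "report-#{ids.join('-')}.pdf" }
job_id = jobs.enqueue([1, 2, 3]) # the request can respond now
jobs.shutdown                    # demo only: wait for the worker to finish
puts jobs.status(job_id)[:state]
```

The point of the design is that the web request only pays for `enqueue`, while the page asks `status` every few seconds until the state is `:done` or `:failed`.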
So, I wanted to post my solution for future visitors, but it is based on @stef's solution - so thanks stef!
controllers/concerns/players_controller.rb
def generate_report_pdf
  players = print_settings(params)
  pdf_title = "#{params['pdf_title']} - #{Time.now.strftime("%c")}"
  GeneratePdfJob.perform_later(players.pluck(:id), pdf_title, current_user.code, params["format"])
end
app/jobs/generate_pdf_job.rb
def perform(player_ids, pdf_title, user_code, report_type)
  generate_pdf_document(player_ids, pdf_title, user_code, report_type)
end
def generate_pdf_document(ids, pdf_title, user_code, report_type)
  # select the proper template by the report type specified
  case report_type
  when "Labels"
    html = ApplicationController.new.render_to_string(
      template: 'players/board_labels.pdf.erb',
      locals: { player_ids: ids },
      margin: { top: 6, bottom: 0, left: 32, right: 32 }
    )
  when "Reports"
    # ...
  end
end
def save_to_pdf(html, pdf_title, user_code)
  pdf = WickedPdf.new.pdf_from_string(
    html,
    pdf: "#{pdf_title}",
    layout: 'print',
    disable_internal_links: true,
    disable_external_links: true,
    disable_javascript: true,
    encoding: 'utf-8'
  )
  pdf_name = "#{pdf_title}.pdf"
  pdf_dir = Rails.root.join('public', 'uploads', 'reports', "#{user_code}")
  pdf_path = Rails.root.join(pdf_dir, pdf_name)
  # create the folder if it doesn't exist
  FileUtils.mkdir_p(pdf_dir) unless File.directory?(pdf_dir)
  # create a new file
  File.open(pdf_path, 'wb') do |file|
    file.binmode
    file << pdf.force_encoding("UTF-8")
  end
end
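One thing to watch in the code above: the file name is built from `pdf_title`, which comes from request params plus a `%c` timestamp, so it can contain colons, spaces, or slashes. A minimal sketch of cleaning it up before writing to disk follows; `safe_pdf_filename` is a hypothetical helper, not part of wicked_pdf or Rails, and the whitelist is just one reasonable choice.

```ruby
# Hypothetical helper: turn a user-supplied title into a file name that is
# safe to place inside the per-user reports directory.
def safe_pdf_filename(title)
  base = title.to_s
              .gsub(/[^A-Za-z0-9._ -]/, '_') # replace anything outside a small whitelist
              .gsub(/\s+/, '_')              # turn runs of whitespace into one underscore
              .gsub(/_+/, '_')               # collapse repeated underscores
              .sub(/\A[._]+/, '')            # no leading dots, so no "..", no hidden files
  base = 'report' if base.empty?
  "#{base[0, 100]}.pdf"                      # cap the length of the base name
end

puts safe_pdf_filename("Draft Board - Fri Jun  9 14:58:07 2017")
```

In `save_to_pdf` this would replace the line that builds `pdf_name`, assuming the listing page shows the sanitized name rather than the original title.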
I then use an Ajax call to poll the user-specific directory for new files and update the partial that lists the files in that directory. The only thing I don't like is that I now have to maintain a table listing each user's files. I would prefer to have the file delivered to the client's browser as a download instead, but I haven't figured out how to do that yet.
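For the listing and download part, a rough plain-Ruby sketch of the two lookups involved is below. Both helper names are hypothetical. In a Rails controller, the path returned by `report_path` could then be handed to `send_file` with `type: 'application/pdf'` and `disposition: 'attachment'`, which should make the browser download the file; I have not wired that up against this app, so treat it as a starting point.

```ruby
# Hypothetical helper: names of the PDFs in a user's report directory,
# newest first. This is what the polled partial would render.
def list_reports(dir)
  Dir.glob(File.join(dir, '*.pdf'))
     .sort_by { |path| File.mtime(path) }
     .reverse
     .map { |path| File.basename(path) }
end

# Hypothetical helper: resolve a requested file name to a path inside dir.
# Returns nil for names containing directory parts, for non-PDF names,
# and for files that do not exist.
def report_path(dir, name)
  name = name.to_s
  return nil unless name == File.basename(name) && name.end_with?('.pdf')
  path = File.join(dir, name)
  File.file?(path) ? path : nil
end
```

Rejecting any name that is not its own basename keeps a request such as `../other_user/report.pdf` from escaping the user's directory.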