Can I create reports with Python Pandas?
I am currently using MS Access to create reports, but I am somewhat limited by some of the calculations I need to do.
I was looking into using Python to run the reports instead: one report per row of data, with the column fields placed into text fields that look like this: [Image: example report layout]
How is this possible with Python?
This goes slightly beyond what Pandas itself offers, but you can easily create a PDF report from each row of your Pandas DataFrame using jinja2 (a templating engine) and xhtml2pdf (which converts HTML to PDF).
First, define the structure and appearance of the report in report_template.html:
<html>
<head>
<style type="text/css">
html, body {
width: 500px;
font-size: 12px;
background: #fff;
padding: 0px;
}
#my-custom-table {
width: 500px;
border: 0;
margin-top: 20px;
}
#my-custom-table td {
padding: 5px 0px 1px 5px;
text-align: left;
}
</style>
</head>
<body>
<table cellspacing="0" border="0" style="width:500px; border:0; font-size: 14px;">
<tr>
<td style="text-align:left;">
<b><span>Title of the PDF report - Row {{ row_ix + 1 }}</span></b>
</td>
<td style="text-align:right;">
<b><span>{{ date }}</span></b>
</td>
</tr>
</table>
<table cellspacing="0" border="0" id="my-custom-table">
<tr style="border-top: 1px solid black;
border-bottom: 1px solid black;
font-weight: bold;">
<td>Variable name</td>
<td>Variable value</td>
</tr>
{# The header row sits outside the loop so that the first variable is not skipped. #}
{# Series.iteritems() was removed in pandas 2.0; items() works in old and new versions. #}
{% for variable_name, variable_value in df.items() %}
<tr>
<td>{{ variable_name }}</td>
<td>{{ variable_value }}</td>
</tr>
{% endfor %}
</table>
</body>
</html>
Then run this Python 3 code, which renders each DataFrame row to an HTML string via jinja2 and then converts that HTML to a PDF via xhtml2pdf:
from datetime import date
import jinja2
import pandas as pd
from xhtml2pdf import pisa
df = pd.DataFrame({
"Average Introducer Score": [9, 9.1, 9.2],
"Reviewer Scores": ["Academic: 6, 6, 6", "Something", "Content"],
"Average Academic Score": [5.7, 5.8, 5.9],
"Average User Score": [1.2, 1.3, 1.4],
"Applied for (RC)": [9.2, 9.3, 9.4],
"Applied for (FEC)": [5.5, 5.6, 5.7],
"Duration (Months)": [36, 37, 38]})
# Load the template once from the current directory; it is reused for every row
template = jinja2.Environment(
    loader=jinja2.FileSystemLoader(searchpath='.')).get_template(
        'report_template.html')

for row_ix, row in df.iterrows():
    # Render one DataFrame row (a Series) to an HTML string
    html = template.render(date=date.today().strftime('%d, %b %Y'),
                           row_ix=row_ix, df=row)
    # Convert HTML to PDF
    with open('report_row_%s.pdf' % (row_ix + 1), "w+b") as out_pdf_file_handle:
        pisa.CreatePDF(
            src=html,                   # HTML to convert
            dest=out_pdf_file_handle)   # File handle to receive result
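For the DataFrame specified in the Python code, three PDFs will be output, one per row (report_row_1.pdf, report_row_2.pdf and report_row_3.pdf). The first PDF looks like this (converted to PNG to show here): [Image: screenshot of the first PDF report]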
It is certainly possible, but I don't think Pandas provides that kind of functionality. You can have a look at LaTeX, where you "program" and compile documents (LaTeX itself has nothing to do with Python). You can create a LaTeX template, populate it dynamically with content from Python, and then compile the PDF, but it will probably take some effort to find your way into LaTeX.
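As a rough illustration of the "template plus dynamic content" idea, here is a minimal sketch that fills a LaTeX template from Python using only the standard library. Everything in it is an assumption made for illustration: the template text, the helper names `escape_latex` and `render_report`, and the sample records (with Pandas you could obtain a similar list via `df.to_dict("records")`). It only builds the .tex source as strings and does not compile anything.

```python
from string import Template

# Hypothetical LaTeX template; $row_number and $table_rows are placeholders.
LATEX_TEMPLATE = Template(r"""\documentclass{article}
\begin{document}
\section*{Report for row $row_number}
\begin{tabular}{ll}
\textbf{Variable name} & \textbf{Variable value} \\
$table_rows
\end{tabular}
\end{document}
""")

# Characters that have a special meaning in LaTeX and their escaped forms.
LATEX_SPECIALS = {
    "\\": r"\textbackslash{}", "&": r"\&", "%": r"\%", "$": r"\$",
    "#": r"\#", "_": r"\_", "{": r"\{", "}": r"\}",
    "~": r"\textasciitilde{}", "^": r"\textasciicircum{}",
}


def escape_latex(value):
    """Escape LaTeX special characters, one character at a time."""
    return "".join(LATEX_SPECIALS.get(ch, ch) for ch in str(value))


def render_report(row_number, record):
    """Return the LaTeX source for one record (a dict of column -> value)."""
    table_rows = "\n".join(
        f"{escape_latex(name)} & {escape_latex(value)} \\\\"
        for name, value in record.items()
    )
    return LATEX_TEMPLATE.substitute(row_number=row_number,
                                     table_rows=table_rows)


# Sample data standing in for DataFrame rows.
records = [
    {"Average Academic Score": 5.7, "Duration (Months)": 36},
    {"Average Academic Score": 5.8, "Duration (Months)": 37},
]

# One LaTeX document per record.
documents = [render_report(i + 1, rec) for i, rec in enumerate(records)]
```

Each string in `documents` could then be written to its own .tex file and compiled with a LaTeX distribution (for example `pdflatex`, assuming one is installed); that step is left out here.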
Reading CSV files with Pandas: yes, definitely possible. See: http://pandas.pydata.org/pandas-docs/stable/io.html#io-read-csv-table
Reporting with Pandas: Depends on what exactly you are looking for. Pandas has many different output writing functions, but their focus is on creating tables rather than creating entire documents. The closest output of type "document" you can get directly from Pandas is probably the HTML table output: http://pandas.pydata.org/pandas-docs/stable/io.html#io-html
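To make the two points above concrete, here is a small sketch that reads CSV data with `pd.read_csv` and turns it into an HTML table with `DataFrame.to_html`. The CSV content and the surrounding page markup are made up for illustration; in practice you would pass a file path to `read_csv` instead of the in-memory buffer used here.

```python
import io

import pandas as pd

# In-memory stand-in for a CSV file (illustrative data only).
csv_text = io.StringIO(
    "Average Academic Score,Duration (Months)\n"
    "5.7,36\n"
    "5.8,37\n"
)
df = pd.read_csv(csv_text)

# to_html returns the table markup as a string; index=False drops the row index.
table_html = df.to_html(index=False)

# Wrap the table in a minimal page; to_html only produces the <table> element.
page = f"<html><body><h1>Report</h1>{table_html}</body></html>"
```

The result is a table rather than a laid-out document, so anything beyond that (titles, per-row pages, PDF output) still needs a templating or conversion step on top.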