How to convert an XHTML nested list to PDF using iText?
I have XHTML content and I need to create a PDF file from it on the fly. I am using iText for the conversion. I tried the simple approach, but I always get bad results after calling the XMLWorkerHelper parser.
Expected output:
- First
  - Second
  - Second
- First
Result in the PDF:
- First Second Second
- First
As you can see, the result contains no nested list. I need a solution that works by calling the parser, not by instantiating the iText Document.
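Before looking at the parser itself, it may be worth confirming that the XHTML really nests its lists (a ul or ol inside an li), since some editors emit a flat list with indent classes instead. The following is a minimal sketch using only the JDK's DOM parser; the class name ListDepthCheck and the method maxListDepth are hypothetical, and it assumes a well-formed fragment without a DOCTYPE.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Node;

// Hypothetical helper: reports how deeply ul/ol elements are nested in an XHTML fragment.
final class ListDepthCheck {

    // Returns the maximum ul/ol nesting depth, or -1 if the input is not well-formed XML.
    static int maxListDepth(String xhtml) {
        try {
            org.w3c.dom.Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xhtml.getBytes(StandardCharsets.UTF_8)));
            return depth(doc.getDocumentElement());
        } catch (Exception e) {
            return -1;
        }
    }

    // Depth of the deepest list chain at or below this node.
    private static int depth(Node node) {
        int deepestChild = 0;
        for (Node c = node.getFirstChild(); c != null; c = c.getNextSibling()) {
            deepestChild = Math.max(deepestChild, depth(c));
        }
        String name = node.getNodeName();
        boolean isList = "ul".equalsIgnoreCase(name) || "ol".equalsIgnoreCase(name);
        return deepestChild + (isList ? 1 : 0);
    }

    public static void main(String[] args) {
        String nested = "<ul><li>First<ul><li>Second</li><li>Second</li></ul></li><li>First</li></ul>";
        System.out.println(maxListDepth(nested));
    }
}
```

A depth of 1 for input that should be nested suggests the flattening happens before the PDF step, in the markup itself.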
Please take a look at the NestedListHtml example.
For this example, I take the list.html code snippet:
And I parse it into an ElementList:
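// CSS (the right-hand side was cut off in the post; the default resolver is assumed here)
CSSResolver cssResolver =
        XMLWorkerHelper.getInstance().getDefaultCssResolver(true);
// HTML
HtmlPipelineContext htmlContext = new HtmlPipelineContext(null);
htmlContext.setTagFactory(Tags.getHtmlTagProcessorFactory());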
// Pipelines
ElementList elements = new ElementList();
ElementHandlerPipeline end = new ElementHandlerPipeline(elements, null);
HtmlPipeline html = new HtmlPipeline(htmlContext, end);
CssResolverPipeline css = new CssResolverPipeline(cssResolver, html);
// XML Worker
XMLWorker worker = new XMLWorker(css, true);
XMLParser p = new XMLParser(worker);
p.parse(new FileInputStream(HTML));
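Now I can add this list to the Document:
// "document" is assumed to be an open com.itextpdf.text.Document
for (Element e : elements) {
    document.add(e);
}
Or I can wrap this list in a Paragraph:
Paragraph para = new Paragraph();
for (Element e : elements) {
    para.add(e);
}
document.add(para);
You will get the desired output, as shown in nested_list.pdf.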
You cannot add nested lists to a PdfPCell or to a ColumnText. For example, this won't work:
PdfPTable table = new PdfPTable(2);
table.addCell("Nested lists don't work in a cell");
PdfPCell cell = new PdfPCell();
for (Element e : elements) {
    cell.addElement(e);
}
table.addCell(cell);
document.add(table);
This is due to a limitation in the ColumnText class that has been around for many years. We have evaluated the problem, and the only way to fix it would be to rewrite ColumnText completely. This is not an item on our current technical roadmap.
Here's a workaround for nested ordered and unordered lists.
The rich text editor I am using sets a class attribute such as "ql-indent-1" or "ql-indent-2" on the li tags instead of nesting the lists. Based on that attribute, the code below adds the opening and closing ul/ol tags.
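import java.util.HashMap;
import java.util.HashSet;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import org.jsoup.Jsoup;
import org.jsoup.select.Elements;
public String replaceIndentSubList(String htmlContent) {
    org.jsoup.nodes.Document document = Jsoup.parseBodyFragment(htmlContent);
    Elements element_UL = document.select("ul");
    Elements element_OL = document.select("ol");
    if (!element_UL.isEmpty()) {
        htmlContent = replaceIndents(htmlContent, element_UL, "ul");
    }
    if (!element_OL.isEmpty()) {
        htmlContent = replaceIndents(htmlContent, element_OL, "ol");
    }
    return htmlContent;
}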
public String replaceIndents(String htmlContent, Elements element, String tagType) {
    String attributeKey = "class";
    String startingULTgas = "<" + tagType + ">";
    String endingULTags = "</" + tagType + ">";
    int lengthOfQLIndenet = "ql-indent-".length();
    // First and last <li> (as serialized by jsoup) seen for each indent class
    HashMap<String, String> startingLiTagMap = new HashMap<String, String>();
    HashMap<String, String> lastLiTagMap = new HashMap<String, String>();
    Pattern regex = Pattern.compile("ql-indent-\\d");
    HashSet<String> hash_Set = new HashSet<String>();
    Elements element_Tag = element.select("li");
    for (org.jsoup.nodes.Element element2 : element_Tag) {
        org.jsoup.nodes.Attributes att = element2.attributes();
        if (att.hasKey(attributeKey)) {
            String attributeValue = att.get(attributeKey);
            Matcher matcher = regex.matcher(attributeValue);
            if (matcher.find()) {
                // Assumed: this statement was lost from the post; without it
                // the loop below would never run.
                hash_Set.add(attributeValue);
                if (!startingLiTagMap.containsKey(attributeValue)) {
                    startingLiTagMap.put(attributeValue, element2.toString());
                }
                if (!startingLiTagMap.get(attributeValue)
                        .equalsIgnoreCase(element2.toString())) {
                    lastLiTagMap.put(attributeValue, element2.toString());
                }
            }
        }
    }
    for (String liAttributeKey : hash_Set) {
        // Assumed reconstruction of a truncated line: the indent level is the
        // number after "ql-indent-", so the class must be exactly "ql-indent-N".
        int noOfIndentes = Integer
                .parseInt(liAttributeKey.substring(lengthOfQLIndenet));
        if (noOfIndentes > 1) {
            // One extra pair of list tags per additional indent level
            for (int i = 1; i < noOfIndentes; i++) {
                startingULTgas = startingULTgas + "<" + tagType + ">";
                endingULTags = endingULTags + "</" + tagType + ">";
            }
        }
        // Open the nested list(s) before the first <li> of this indent level
        htmlContent = htmlContent.replace(startingLiTagMap.get(liAttributeKey),
                startingULTgas + startingLiTagMap.get(liAttributeKey));
        if (lastLiTagMap.get(liAttributeKey) != null) {
            System.out.println("Inside last Li Map");
            // Close them after the last <li> of this indent level
            htmlContent = htmlContent.replace(lastLiTagMap.get(liAttributeKey),
                    lastLiTagMap.get(liAttributeKey) + endingULTags);
        } else {
            // Only one <li> at this level: close right after it
            htmlContent = htmlContent.replace(startingLiTagMap.get(liAttributeKey),
                    startingLiTagMap.get(liAttributeKey) + endingULTags);
        }
        // Reset the tags for the next indent class
        startingULTgas = "<" + tagType + ">";
        endingULTags = "</" + tagType + ">";
    }
    System.out.println(htmlContent);
    return htmlContent;
}
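To illustrate the kind of markup the workaround aims for, here is a minimal sketch of a different technique from the string replacement above: it walks the li items of one flat list in order and tracks the current depth, so that each nested list ends up inside its parent li. It uses only java.util.regex rather than jsoup. The class name QlIndentNester and the method nest are hypothetical, and the sketch assumes each li carries at most a class attribute and contains no li of its own.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: turns a flat list with ql-indent-N classes into nested lists.
final class QlIndentNester {
    // Matches <li> or <li class="..."> and captures the class value and the item body.
    private static final Pattern LI =
            Pattern.compile("<li(?:\\s+class=\"([^\"]*)\")?\\s*>(.*?)</li>", Pattern.DOTALL);
    private static final Pattern INDENT = Pattern.compile("ql-indent-(\\d+)");

    static String nest(String flatList, String tagType) {
        String open = "<" + tagType + ">";
        String close = "</" + tagType + ">";
        StringBuilder out = new StringBuilder();
        int depth = -1; // -1 means no list has been opened yet
        Matcher li = LI.matcher(flatList);
        while (li.find()) {
            int level = 0; // items without an indent class sit at the top level
            if (li.group(1) != null) {
                Matcher m = INDENT.matcher(li.group(1));
                if (m.find()) {
                    level = Integer.parseInt(m.group(1));
                }
            }
            if (level > depth) {
                // Going deeper: open lists inside the still-open parent item.
                while (depth < level) {
                    out.append(open);
                    depth++;
                    if (depth < level) {
                        out.append("<li>"); // wrapper item for a skipped level
                    }
                }
            } else {
                // Same level or shallower: close the previous item and any deeper lists.
                out.append("</li>");
                while (depth > level) {
                    out.append(close).append("</li>");
                    depth--;
                }
            }
            out.append("<li>").append(li.group(2));
        }
        if (depth >= 0) {
            // Close whatever is still open at the end of the list.
            out.append("</li>");
            while (depth > 0) {
                out.append(close).append("</li>");
                depth--;
            }
            out.append(close);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String flat = "<ul><li>First</li><li class=\"ql-indent-1\">Second</li>"
                + "<li class=\"ql-indent-1\">Second</li><li>First</li></ul>";
        // prints <ul><li>First<ul><li>Second</li><li>Second</li></ul></li><li>First</li></ul>
        System.out.println(nest(flat, "ul"));
    }
}
```

The output could then be passed to the XML Worker parsing code shown in the first answer. Mixed ul/ol nesting and attributes other than class are not handled here.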
