Modifying the XSSFWorkbook stylesheet to remove duplicate CellStyleXfs
After having trouble viewing some xlsx files on an iPad, I found that the problem is that the styles.xml file inside the xlsx archive is too big. I was able to view a file after manually deleting the duplicate records and letting Excel repair it, but to fix the other files I would rather have a programmatic solution using POI. Unfortunately, I am having problems saving the workbook after I modify the styles table.
I tried to mimic the approach of optimiseCellStyles(HSSFWorkbook workbook), which exists for HSSFWorkbook, but the internals are different and the offending styles are all of one particular type. After inspecting a number of XSSFWorkbook properties, I found that
XSSFWorkbook wb; // with proper initialization
wb.getStylesSource().getCTStylesheet().getCellStyleXfs();
returned about 40,000 records, the bulk of which were
<main:xf numFmtId="0" fontId="0" fillId="0" borderId="0"/>
So, I tried to determine the location of the duplicates in the workbook and remove them by calling
XSSFWorkbook wb; // with proper initialization
wb.getStylesSource().getCTStylesheet().getCellStyleXfs().removeXf(i);
but after removing the styles, saving the workbook fails in StylesTable.java with org.apache.xmlbeans.impl.values.XmlValueDisconnectedException. I tried several other approaches and got similar errors.
Looking at the code in StylesTable.java, it seems that the class keeps its own copy of the style records, separate from the CTStylesheet, and that the size mismatch after my change is causing the problem. I suspect I am misunderstanding how these entries are tracked, especially because
XSSFCell cell; //with proper initialization
cell.getCellStyle().getStyleXf().getXfId();
always returns a number between 1 and 5, despite thousands of entries in the XML file.
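To make the intended clean-up concrete, here is a minimal sketch in plain Java, with no POI involved, of the operation being attempted: collapse identical xf records and keep a map from old index to new index, so that anything referring to an entry by position (as xfId does) could be rewritten afterwards. The XfDedupSketch class, its dedupe helper, and the representation of an xf as a list of four ids are all hypothetical stand-ins for illustration, not POI classes, and the sketch says nothing about how StylesTable itself would need to be updated.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; none of these names are part of POI.
class XfDedupSketch {

    // Unique records in first-seen order, plus a map from old index to new index.
    static final class Result {
        final List<List<Integer>> unique;
        final int[] remap;

        Result(List<List<Integer>> unique, int[] remap) {
            this.unique = unique;
            this.remap = remap;
        }
    }

    // Each xf is modelled as [numFmtId, fontId, fillId, borderId] (an assumption).
    static Result dedupe(List<List<Integer>> xfs) {
        Map<List<Integer>, Integer> seen = new LinkedHashMap<>();
        int[] remap = new int[xfs.size()];
        for (int i = 0; i < xfs.size(); i++) {
            List<Integer> xf = xfs.get(i);
            Integer newIndex = seen.get(xf);
            if (newIndex == null) {
                // First time this record is seen: it gets the next free slot.
                newIndex = seen.size();
                seen.put(xf, newIndex);
            }
            remap[i] = newIndex;
        }
        return new Result(new ArrayList<>(seen.keySet()), remap);
    }

    public static void main(String[] args) {
        List<Integer> plain = Arrays.asList(0, 0, 0, 0);
        List<Integer> other = Arrays.asList(0, 1, 0, 0);
        Result r = dedupe(Arrays.asList(plain, plain, other, plain));
        // prints "2 unique, remap [0, 0, 1, 0]"
        System.out.println(r.unique.size() + " unique, remap " + Arrays.toString(r.remap));
    }
}
```

The point of returning the remap array alongside the unique list is that deleting entries alone shifts every later index, so any reference by position would silently point at the wrong record unless it is rewritten in the same pass.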
Is there a more standard way to clean up this section of the workbook that I'm missing?
Am I badly misunderstanding the workbook internals or how POI works?
Any help would be greatly appreciated.