Prevent Word 2010 from saving o:gfxdata (base64 or uuencoded VML)?

I am working with .docx files containing several drawing canvases with inserted images and some lines and arrows drawn in Word 2010. I am using 2010 format without compatibility mode.

Word inserts an o:gfxdata attribute on each v:shape and v:group element and fills it with some encoded ASCII. From what I've read, it may be a copy of the VML describing the v:shape or v:group. I don't know if I just don't know what to look for, but I cannot determine what this data is for, since deleting it does not noticeably affect my ability to read or edit a document in Word 2003, 2007 or 2010.
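One way to narrow down the base64-versus-uuencode question is to try decoding a single attribute value and look at the first bytes. The following is only a rough sketch: `probeGfxData` is a hypothetical helper name, and it assumes that line breaks inside the attribute appear as whitespace or as numeric character references such as `&#xA;`. It reports what it finds and does not claim to know what Word actually stores there.

```php
<?php
// Hypothetical helper (not part of Word or OpenTBS): inspect one o:gfxdata value.
function probeGfxData(string $value): array
{
    // Assumption: line breaks inside the attribute show up as whitespace or
    // as numeric character references such as &#xA;, so drop both.
    $compact = preg_replace('/&#x?[0-9A-Fa-f]+;|\s+/', '', $value) ?? '';

    // Strict mode returns false if a character outside the base64 alphabet remains.
    $raw = base64_decode($compact, true);
    if ($raw === false || $raw === '') {
        return ['base64' => false, 'bytes' => 0, 'magic' => '', 'looksLikeZip' => false];
    }

    return [
        'base64'       => true,
        'bytes'        => strlen($raw),
        // First four decoded bytes as hex, handy for spotting a file signature.
        'magic'        => bin2hex(substr($raw, 0, 4)),
        // "PK\x03\x04" is the signature of a ZIP local file header.
        'looksLikeZip' => strncmp($raw, "PK\x03\x04", 4) === 0,
    ];
}
```

If `base64` comes back false, the value is not plain base64 and uuencoding (or something else) is worth checking next.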

It swells document.xml to almost twice the (apparent) required size. This greatly slows down processing with OpenTBS, so I would like to remove it, if possible. Does anyone know how to tell Word 2010 not to save this extra data? Or what it is for? I have really struggled to find any documentation on it beyond this post.

Edit:

Here is a sample .docx. Its document.xml is ~141KB, and OpenTBS takes an average of 10.35 seconds to create a file that includes it as a subtemplate 21 times. If I remove all o:gfxdata attributes, the file size is reduced to ~37KB and OpenTBS takes only 2.99 seconds to create the same file.
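To reproduce this kind of size comparison on another document, something like the following could be used. It is a minimal sketch: `gfxDataShare` is a hypothetical helper name, and it relies on a regular expression rather than an XML parser, so it assumes the attribute only ever appears inside start tags.

```php
<?php
// Hypothetical helper: report how much of an XML string is taken up by
// o:gfxdata attributes (regex-based, not a real XML parser).
function gfxDataShare(string $xml): array
{
    // Matches the attribute together with its leading whitespace and quoted value.
    $pattern = '/\s+o:gfxdata\s*=\s*(?:"[^"]*"|\'[^\']*\')/';

    $count = preg_match_all($pattern, $xml, $matches);
    if ($count === false) {
        throw new RuntimeException('PCRE error code ' . preg_last_error());
    }

    $attrBytes = array_sum(array_map('strlen', $matches[0]));
    $total     = strlen($xml);

    return [
        'attributes'     => $count,
        'attributeBytes' => $attrBytes,
        'totalBytes'     => $total,
        // Fraction of the document occupied by the attributes (0.0 to 1.0).
        'share'          => $total > 0 ? $attrBytes / $total : 0.0,
    ];
}
```

The XML string itself could be read from the .docx with `ZipArchive::getFromName('word/document.xml')`, provided the zip extension is available.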

Edit 2:

Upon further investigation, it seems that removing o:gfxdata may cause Word 2003 with an older Compatibility Pack installed to object with the following error:

"This is a preview of the Compatibility Pack and can open pre-released Office 2007 files. Would you like to test the new version of the Compatibility Pack?"

I was able to open the file after installing the newer Compatibility Pack, although it warns the user about the incompatibility and converts the file in order to open it. This does not break my file, but it is something to watch out for.



1 answer


The o:gfxdata attribute is poorly documented on the web. Based on your research, it appears to be some additional compatibility information.

You can remove these attributes from your template using OpenTBS. The cleanup can be done once on your template, without merging, after which you save the cleaned result as a new template. Or you can do the cleanup every time you open the template.

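DOCX file cleanup:

// Repeat until no start tag carrying an o:gfxdata attribute is left.
while ($x = clsTbsXmlLoc::FindStartTagHavingAtt($TBS->Source, 'o:gfxdata', 0)) {
  // Empty the attribute value first...
  $x->ReplaceAtt('o:gfxdata', '');
  // ...then drop the now-empty attribute from the template source.
  $TBS->Source = str_replace(' o:gfxdata=""', '', $TBS->Source);
}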

Note that the clsTbsXmlLoc class is provided with OpenTBS and is undocumented. The code should work with OpenTBS 1.8.0 (which is currently in stable beta).
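If you would rather not depend on the undocumented class, a plain regular expression over the XML string is a possible alternative. This is only a sketch under assumptions: `stripGfxData` is a hypothetical helper name, not part of OpenTBS, and the regex assumes the attribute only appears inside start tags with a quoted value.

```php
<?php
// Hypothetical helper (not part of OpenTBS): remove every o:gfxdata
// attribute from an XML string with a single regex pass.
function stripGfxData(string $xml): string
{
    // Matches the attribute together with its leading whitespace and quoted value.
    $pattern = '/\s+o:gfxdata\s*=\s*(?:"[^"]*"|\'[^\']*\')/';

    $clean = preg_replace($pattern, '', $xml);
    if ($clean === null) {
        // preg_replace returns null on a PCRE failure; do not lose the input silently.
        throw new RuntimeException('PCRE error code ' . preg_last_error());
    }

    return $clean;
}
```

With a template already loaded, this would be applied as `$TBS->Source = stripGfxData($TBS->Source);` in place of the loop above.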

I noticed that once the o:gfxdata attributes are removed, they do not come back right away after editing the docx.
