Loading text from .docx into MySQL using Python-docx

I am currently using Python-docx to convert the text of a .docx file into a single string.

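import os
from docx import opendocx, getdocumenttext  # legacy python-docx module (docx.py)

path = os.path.expanduser("~/documents/myFile.docx")
document = opendocx(path)  # accepts the file path, so no separate open() is needed
docString = ''.join(getdocumenttext(document))  # getdocumenttext() returns one string per paragraph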

Then I parse the string using Python's built-in string methods. Once the string has been parsed into a list, I load that list into a MySQL database. This works well, except for one problem: I want to keep the special characters.
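
A load step of this kind might look roughly like the sketch below. It is not the exact code in use here. It assumes a DB-API 2.0 connection (for MySQL, for example one opened by a driver such as PyMySQL with charset="utf8mb4"), and the table name `paragraphs`, the column `body`, and the function name `load_paragraphs` are hypothetical.

```python
def load_paragraphs(conn, paragraphs, placeholder="%s"):
    """Insert each string in `paragraphs` as one row, using a parameterized query.

    `conn` is any DB-API 2.0 connection. MySQL drivers use "%s" as the
    placeholder; sqlite3 uses "?". The table `paragraphs` with a text
    column `body` is a hypothetical schema and must already exist.
    """
    sql = "INSERT INTO paragraphs (body) VALUES ({})".format(placeholder)
    cur = conn.cursor()
    # Passing the text as a parameter lets the driver handle quoting and
    # encoding, so non-ASCII characters are not mangled by string formatting.
    cur.executemany(sql, [(text,) for text in paragraphs])
    conn.commit()
    return len(paragraphs)
```

With a MySQL driver, the connection's character set also has to match the table's (utf8mb4 covers the full Unicode range); otherwise characters can be lost between Python and the database even if the string itself is correct.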

The database supports these special characters (it uses UTF-8), but converting the .docx to a string loses many characters and all of the formatting (italic, bold, etc.).

I want to be able to parse and load text with intact formatting from a .docx file.
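
One possible direction, sketched with the standard library only: a .docx file is a zip archive whose main text lives in word/document.xml, where each run (`w:r`) carries its own bold (`w:b`) and italic (`w:i`) flags. The names `read_runs` and `_is_on` below are illustrative, and the sketch assumes Python 3. It ignores formatting inherited from styles, as well as tabs and line breaks, so treat it as a starting point and not a complete reader.

```python
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML namespace used by word/document.xml
W = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"


def _is_on(rpr, tag):
    """Return True if run properties `rpr` switch on the toggle `tag` ('b' or 'i')."""
    if rpr is None:
        return False
    el = rpr.find(W + tag)
    if el is None:
        return False
    # <w:b/> means on; <w:b w:val="0"/> or "false"/"off" means explicitly off.
    return el.get(W + "val", "true") not in ("0", "false", "off")


def read_runs(path_or_file):
    """Return a list of paragraphs, each a list of (text, bold, italic) tuples."""
    with zipfile.ZipFile(path_or_file) as archive:
        root = ET.fromstring(archive.read("word/document.xml"))
    paragraphs = []
    for p in root.iter(W + "p"):
        runs = []
        for r in p.iter(W + "r"):
            # A run may hold several <w:t> elements; join them into one string.
            text = "".join(t.text or "" for t in r.iter(W + "t"))
            if text:
                rpr = r.find(W + "rPr")
                runs.append((text, _is_on(rpr, "b"), _is_on(rpr, "i")))
        paragraphs.append(runs)
    return paragraphs
```

The text comes back as Unicode strings, so special characters survive as long as the database connection is also UTF-8. Each (text, bold, italic) tuple could then be stored as its own row, or the runs could be serialized to light markup before loading. Newer python-docx releases expose similar run-level data through `paragraph.runs`, `run.bold`, and `run.italic`.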
