Extracting text in a specific font from a docx file

I am using Python 3.4 along with the python-docx library to work with .docx files. I was able to extract text from the document, but my goal is to extract only the text set in a specific font (and change it).

I've searched for this in the docs for the past two days with no results.

If anyone has experience with this library, could you point me in the right direction?



1 answer


Currently, python-docx can apply a font using a style. You can find the runs that have a particular style:

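from docx import Document

style_I_want = 'Emphasis'  # placeholder: name of the character style to match

document = Document('having-fonts.docx')
for paragraph in document.paragraphs:
    for run in paragraph.runs:
        # run.style is a style object, so compare by its name
        if run.style.name == style_I_want:
            print(run.text)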

      

If special fonts are being applied using a paragraph style, you can use this:



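from docx import Document

style_I_want = 'Heading 1'  # placeholder: name of the paragraph style to match

document = Document('having-fonts.docx')
for paragraph in document.paragraphs:
    if paragraph.style.name == style_I_want:
        print(paragraph.text)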
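
If the font was applied directly to the text rather than through a style, more recent python-docx releases expose it as `run.font.name`. The following is only a sketch under that assumption: the helper names `runs_with_font` and `replace_font`, the file paths, and the font names are all hypothetical. Note that `run.font.name` is `None` when the typeface is inherited from a style, so this will not catch style-based fonts.

```python
def runs_with_font(document, font_name):
    """Yield every run whose directly applied typeface equals font_name.

    `document` is expected to look like a python-docx Document, i.e. it has
    .paragraphs, each paragraph has .runs, and each run has .font.name.
    """
    for paragraph in document.paragraphs:
        for run in paragraph.runs:
            # font.name is None when the typeface comes from a style
            if run.font.name == font_name:
                yield run


def replace_font(path, old_font, new_font, out_path):
    """Print the text set in old_font, switch it to new_font, and save a copy."""
    # Imported here so runs_with_font stays usable without python-docx installed
    from docx import Document

    document = Document(path)
    for run in runs_with_font(document, old_font):
        print(run.text)
        run.font.name = new_font
    document.save(out_path)
```

A call such as `replace_font('having-fonts.docx', 'Courier New', 'Arial', 'changed.docx')` would then print the matching text and write the modified copy to a new file, leaving the original untouched.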

      

If you can share more details, I can be more specific.
