Extracting the text of a specific face of a font from a docx file

Question

I am using Python 3.4 along with the python-docx library to work with .docx files. I was able to extract text from the document, but my goal is to extract only the text set in a specific font (and change it).

I've searched for this in the docs for the past two days with no results.

If anyone has experience with this library, could you point me in the right direction?

python python-3.x docx python-docx

Sulav Malla 01 Sep 14 at 9:52

1 answer

scanny · Answer 1 · 2014-09-01T18:32:31+0000

Currently, python-docx can apply a font using a style. You can find the runs that have a particular style like this:

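from docx import Document

# style_I_want is a placeholder for the character style to match
document = Document('having-fonts.docx')
for paragraph in document.paragraphs:
    for run in paragraph.runs:
        if run.style == style_I_want:
            print(run.text)  # print() is a function in Python 3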

If special fonts are being applied using a paragraph style, you can use this:

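from docx import Document

# style_I_want is a placeholder for the paragraph style to match
document = Document('having-fonts.docx')
for paragraph in document.paragraphs:
    if paragraph.style == style_I_want:
        print(paragraph.text)  # print() is a function in Python 3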
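
If the font is applied directly to the runs rather than through a style, a rough sketch along the following lines may help. It assumes a python-docx release in which each run exposes a font object (run.font.name). The helper names runs_with_font and replace_text_in_font are hypothetical, made up for this illustration, and are not part of the library.

```python
def runs_with_font(document, font_name):
    """Yield the runs whose directly applied font name equals font_name.

    run.font.name is None when the run inherits its font from a style,
    so runs styled only through a paragraph or character style are not
    matched by this check.
    """
    for paragraph in document.paragraphs:
        for run in paragraph.runs:
            if run.font.name == font_name:
                yield run


def replace_text_in_font(path, font_name, new_text, out_path):
    """Hypothetical helper: overwrite the text of every matching run."""
    from docx import Document  # python-docx, imported only when needed

    document = Document(path)
    for run in runs_with_font(document, font_name):
        run.text = new_text  # assigning to run.text replaces the run's text
    document.save(out_path)
```

Something like replace_text_in_font('having-fonts.docx', 'Courier New', 'replacement', 'changed.docx') would then write a modified copy. Note that document.paragraphs only walks the body paragraphs, so text inside tables would need a separate loop.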

If you can share more details, I can be more specific.
