How to programmatically change the URLs of images in word processing documents

I have a set of word processing documents containing many non-embedded (linked) images. The URLs those images point to no longer exist. I would like to programmatically change the domain name of each URL to something else. How can I do this in Java or Python?


4 answers


This is what VBA is for:



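Sub HlinkChanger()
    ' Placeholder prefixes: substitute your own old and new URLs.
    Const OLD_PREFIX As String = "http://old.example.com"
    Const NEW_PREFIX As String = "http://new.example.com"
    Dim oRange As Word.Range
    Dim oFld As Word.Field
    Dim link As Variant
    With ActiveDocument
        ' Let Word turn plain-text URLs into hyperlink fields first.
        .Range.AutoFormat
        For Each oRange In .StoryRanges
            Do
                For Each oFld In oRange.Fields
                    If oFld.Type = wdFieldHyperlink Then
                        For Each link In oFld.Result.Hyperlinks
                            ' The hyperlink is stored in link.Address.
                            ' Strip the old prefix from the URL
                            ' and replace it with the new one.
                            If Left$(link.Address, Len(OLD_PREFIX)) = OLD_PREFIX Then
                                link.Address = NEW_PREFIX & Mid$(link.Address, Len(OLD_PREFIX) + 1)
                            End If
                        Next link
                    End If
                Next oFld
                ' Follow any linked stories of the same type.
                Set oRange = oRange.NextStoryRange
            Loop Until oRange Is Nothing
        Next oRange
    End With
End Sub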
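
For anyone who wants the URL rewrite itself in Python (as the question asks), the step described in the comments above could be sketched roughly like this. It swaps only the host part of a URL instead of counting characters. The function name `replace_domain` and the example hosts are made up for illustration; they are not part of any Word API.

```python
from urllib.parse import urlsplit, urlunsplit


def replace_domain(url, old_host, new_host):
    """Return url with old_host swapped for new_host.

    URLs that point at any other host are returned unchanged.
    """
    parts = urlsplit(url)
    # urlsplit lowercases the hostname, so compare case-insensitively.
    if parts.hostname != old_host.lower():
        return url
    # Keep an explicit port if there was one. Any user:password@ part
    # is dropped in this sketch.
    netloc = new_host if parts.port is None else f"{new_host}:{parts.port}"
    return urlunsplit(parts._replace(netloc=netloc))


if __name__ == "__main__":
    print(replace_domain("http://old.example.com/img/a.png",
                         "old.example.com", "new.example.com"))
    # prints http://new.example.com/img/a.png
```

Matching on the parsed host rather than on a raw prefix avoids accidentally rewriting a URL such as `http://old.example.com.evil.test/`, which merely starts with the old domain.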

      



Perhaps the Microsoft Office Word Binary File Format Specification could help you here, although someone who has done similar things before might come up with a better answer.





Since you want to do it in Java or Python, try OpenOffice. In OpenOffice, you can write macros in Java or Python.

I'm sure there will be an option to change the URLs of the images.



The VBA answer is the closest one, because this is best done using the Microsoft Word COM API. However, you can use that API just as well from Python. I have used it myself to import data into a database from hundreds of forms that were Word documents.

This article explains the basics. Note that although it wraps the COM WordDocument object, you don't need to do that if you don't want to. You can just access the COM API directly.

To explore the WordDocument API, open a Word document, press Alt-F11 to open the VBA editor, and then press F2 to view the Object Browser. This lets you browse all the objects and the methods they provide. Here's an introduction to Python and the COM object model.
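
As a rough idea of what driving Word over COM from Python can look like, here is a minimal sketch. It assumes Windows, an installed copy of Word, and the pywin32 package; the function names and the prefixes passed in are hypothetical. It mirrors the VBA answer by rewriting `Hyperlink.Address`; linked pictures may be stored differently (for example as INCLUDEPICTURE fields), so check the Object Browser for what your documents actually contain.

```python
def rewrite_hyperlinks(doc, old_prefix, new_prefix):
    """Swap old_prefix for new_prefix in every hyperlink of a Word document.

    `doc` is a Word Document COM object (or anything exposing a
    Hyperlinks collection whose items have an Address attribute).
    Returns the number of hyperlinks that were changed.
    """
    changed = 0
    for link in doc.Hyperlinks:
        address = link.Address or ""
        if address.startswith(old_prefix):
            link.Address = new_prefix + address[len(old_prefix):]
            changed += 1
    return changed


def process_file(path, old_prefix, new_prefix):
    """Open one document in Word, rewrite its hyperlinks, and save it."""
    # Imported here so rewrite_hyperlinks can be used without pywin32.
    import win32com.client

    word = win32com.client.Dispatch("Word.Application")
    word.Visible = False
    try:
        doc = word.Documents.Open(path)  # path should be absolute
        try:
            count = rewrite_hyperlinks(doc, old_prefix, new_prefix)
            doc.Save()
        finally:
            doc.Close()
    finally:
        word.Quit()
    return count
```

Calling `process_file` for each path in a folder would handle a whole batch of documents; it is worth working on copies first, since the sketch saves changes in place.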
