Writing only the XML declaration to a file using Python

I am writing multiple XML files in Python by looping through lists of strings. Suppose I have:

from xml.etree.ElementTree import ElementTree, Element, SubElement, tostring

parent = Element('parent')
child = SubElement(parent, 'child')
f = open('file.xml', 'wb')  # binary mode: with encoding='utf-8', write() emits bytes
document = ElementTree(parent)
l = ['a', 'b', 'c']
for ch in l:
    child.text = ch
    document.write(f, encoding='utf-8', xml_declaration=True)

      

Output:

<?xml version='1.0' encoding='utf-8'?>
<parent><child>a</child></parent><?xml version='1.0' encoding='utf-8'?>
<parent><child>b</child></parent><?xml version='1.0' encoding='utf-8'?>
<parent><child>c</child></parent>

      

Desired output:

<?xml version='1.0' encoding='utf-8'?>
<parent>
<child>a</child>
<child>b</child>
<child>c</child>
</parent>

      

I want the XML declaration to appear only once, at the top of the file. I should probably write the declaration to the file before the loop, but when I try to do that, I get empty elements. I do not want to do this:

f.write("<?xml version='1.0' encoding='utf-8'?>")

      

How can I write only the XML declaration to the file?

Edit: desired output

+3




3 answers


You need to add all the SubElements to the tree before writing the file. Your code overwrites the text of the same element and writes the entire XML document on every iteration of the loop.

In addition, well-formed XML needs a single top-level "root" element, which is missing here.

from xml.etree.ElementTree import ElementTree, Element, SubElement, tostring

root = Element('root')

l = ['a', 'b', 'c']
for ch in l:
    parent = SubElement(root,'parent')
    child = SubElement(parent, 'child')
    child.text = ch

document = ElementTree(root)
document.write('file.xml', encoding='utf-8', xml_declaration=True)

      



and the output will be (indented here for readability; ElementTree itself writes the elements on a single line):

<?xml version='1.0' encoding='utf-8'?>
<root>
  <parent><child>a</child></parent>
  <parent><child>b</child></parent>
  <parent><child>c</child></parent>
</root> 
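If an indented file is actually wanted, one option is the `indent` helper that `xml.etree.ElementTree` gained in Python 3.9. The following is only a sketch under that version assumption; the function name `build_document` is made up for illustration, and the layout it produces puts each child on its own line, so it is similar to but not identical to the listing above.

```python
import io
from xml.etree.ElementTree import Element, ElementTree, SubElement, indent


def build_document(values):
    """Build <root> with one <parent><child>value</child></parent> per value."""
    root = Element('root')
    for value in values:
        parent = SubElement(root, 'parent')
        child = SubElement(parent, 'child')
        child.text = value
    # Inserts whitespace into the tree in place (requires Python 3.9+).
    indent(root, space='  ')
    return ElementTree(root)


# Write to an in-memory bytes buffer instead of a real file.
buffer = io.BytesIO()
build_document(['a', 'b', 'c']).write(buffer, encoding='utf-8', xml_declaration=True)
xml_text = buffer.getvalue().decode('utf-8')
print(xml_text)
```

The declaration still appears only once, because the tree is written a single time after all elements have been added.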

      

+5




I'm not familiar with the Python XML libraries, but I'll take a step back. If you get what you are asking for, your output will be invalid XML. XML must have exactly one root element.

So you could wrap everything in a single root element:



<?xml version='1.0' encoding='utf-8'?>
<uberparent>
<parent><child>a</child></parent>
<parent><child>b</child></parent>
<parent><child>c</child></parent>
</uberparent>

      

If you are creating a Google sitemap, for example, their schema says the root element is "urlset".
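The single-root rule can be checked with Python's standard parser. This is a minimal sketch, not something from the original thread; the helper name `is_well_formed` is hypothetical.

```python
from xml.etree.ElementTree import ParseError, fromstring


def is_well_formed(xml_text):
    """Return True if xml_text parses as one XML document."""
    try:
        fromstring(xml_text)
    except ParseError:
        return False
    return True


# Two top-level elements in a row: not a single document.
multiple_roots = (
    "<parent><child>a</child></parent>"
    "<parent><child>b</child></parent>"
)
# The same content wrapped in one root element.
single_root = "<uberparent>" + multiple_roots + "</uberparent>"

print(is_well_formed(multiple_roots))  # False
print(is_well_formed(single_root))     # True
```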

+1




This is technically doable by simply passing a boolean flag as xml_declaration:

from xml.etree.ElementTree import ElementTree, Element, SubElement

parent = Element('parent')
child = SubElement(parent, 'child')
document = ElementTree(parent)
l = ['a', 'b', 'c']
# open in binary mode: with encoding='utf-8', write() emits bytes
with open('file.xml', 'wb') as f:
    # enumerate yields (index, element) pairs, starting from index 0
    for i, ch in enumerate(l):
        child.text = ch
        # i == 0 is True only on the first pass, so the declaration
        # is written exactly once, at the top of the file
        document.write(f, encoding='utf-8', xml_declaration=(i == 0))

      

Updated:

Since the OP realized that the desired output was not well-formed and changed the requirement, here is the usual way to handle this XML:

from xml.etree.ElementTree import ElementTree, Element, SubElement

parent = Element('parent')
document = ElementTree(parent)
l = ['a', 'b', 'c']
# add one <child> per string to the same parent
for ch in l:
    child = SubElement(parent, 'child')
    child.text = ch
# write the whole tree once; binary mode because encoding='utf-8' emits bytes
with open('file.xml', 'wb') as f:
    document.write(f, encoding='utf-8', xml_declaration=True)
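Since the question mentions writing several files from lists of strings, the same build-then-write-once pattern could be wrapped in a small helper. This is a sketch only: the function name `write_children`, the file names, and the sample lists are all made up for illustration.

```python
import os
import tempfile
from xml.etree.ElementTree import Element, ElementTree, SubElement


def write_children(path, values, parent_tag='parent', child_tag='child'):
    """Write one XML file: a single root with one child element per value."""
    parent = Element(parent_tag)
    for value in values:
        SubElement(parent, child_tag).text = value
    # Passing a filename lets ElementTree open and close the file itself.
    ElementTree(parent).write(path, encoding='utf-8', xml_declaration=True)


# Hypothetical input: one list of strings per output file.
lists_by_name = {'first.xml': ['a', 'b', 'c'], 'second.xml': ['d', 'e']}
out_dir = tempfile.mkdtemp()
for name, values in lists_by_name.items():
    write_children(os.path.join(out_dir, name), values)

# Read one file back to inspect the result.
with open(os.path.join(out_dir, 'first.xml'), encoding='utf-8') as f:
    first_text = f.read()
print(first_text)
```

Each file gets its own declaration exactly once, because each tree is written a single time.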

      

0

