Formatting invalid XML into nice format

Suppose an XML message contains errors:

Well Formed

<Person><Name>Attila</Name><ID>001</ID><Age>45</Age></Person>

      

Not well formed

<Pxxxon><Name>Attila</9327><ID>001</ID><Age>45</Age></Person>

      

Are there any Java libraries or code that can format a non-well-formed XML message like this?

<Pxxxon>
    <Name>Attila</9327>
    <ID>001</ID>
    <Age>45</Age>
</Person>

      

As I understand it, current Java libraries can only format valid XML messages into this indented layout.

+3




2 answers


No, because what you call "invalid" is really not well-formed.

Well-formed and valid are not the same.

  • Well-formed means that the textual object meets the W3C requirements for XML.
  • Valid means that well-formed XML also meets the additional requirements specified by a particular schema.


See Well-formed vs Valid XML for more details. The short version: if the data is not well-formed, it is not XML at all, and no conforming XML parser will be able to read it in order to reformat it.
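To see this with the JDK's built-in parser, here is a minimal sketch. The class and method names (`WellFormedCheck`, `isWellFormed`) are made up for illustration; the JAXP calls are standard. It only reports whether a string parses; it does not try to repair or format anything.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper, not part of any library.
class WellFormedCheck {

    // Returns true if the text parses as XML, false if the parser rejects it.
    static boolean isWellFormed(String xml) {
        try {
            DocumentBuilder builder =
                    DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // DefaultHandler rethrows fatal errors and keeps the parser
            // from printing its own message to stderr.
            builder.setErrorHandler(new DefaultHandler());
            builder.parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            // Well-formedness violations surface as SAXException.
            return false;
        } catch (ParserConfigurationException | IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String good = "<Person><Name>Attila</Name><ID>001</ID><Age>45</Age></Person>";
        String bad = "<Pxxxon><Name>Attila</9327><ID>001</ID><Age>45</Age></Person>";
        System.out.println(isWellFormed(good));
        System.out.println(isWellFormed(bad));
    }
}
```

The second sample from the question is rejected during parsing, so a parser-based pretty-printer never gets a document tree to indent.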

Then you might ask: what about non-XML parsers? That raises the question of what format the data is in, if it is not XML. For any parser to read any data, the syntax of that data must be defined. Simply saying that the data is "like XML" does not specify the format precisely enough, which is why you won't find a tool that can pretty-print the sample data you provided.

+2




Sorry to post this as an answer, but I can't comment yet. I don't know of a library for this, but you could try replacing each "><" with ">\n<" and then walking through the resulting lines to add indentation. For example, after each "\n", check whether the next two characters are "</", the start of a closing tag. If they are not, add indentation.
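A rough sketch of that idea in Java follows. The class name `NaiveTagIndenter`, the four-space indent, and the depth-counting rule are my own assumptions about how to fill in the details. It is a text heuristic, not a parser: it will misbehave on attributes that contain ">", on comments, on CDATA sections, and on mixed content.

```java
// Hypothetical heuristic indenter; it never checks that tags match.
class NaiveTagIndenter {

    static String indent(String text) {
        // Put every tag that directly follows another tag on its own line.
        String[] lines = text.trim().replaceAll(">\\s*<", ">\n<").split("\n");
        StringBuilder out = new StringBuilder();
        int depth = 0;
        for (String line : lines) {
            boolean closing = line.startsWith("</");
            // A line that starts with a closing tag moves one level out.
            if (closing && depth > 0) {
                depth--;
            }
            if (out.length() > 0) {
                out.append('\n');
            }
            for (int i = 0; i < depth; i++) {
                out.append("    ");
            }
            out.append(line);
            // A line that opens a tag without closing anything moves one level in.
            boolean opensOnly = line.startsWith("<") && !closing
                    && !line.startsWith("<?") && !line.startsWith("<!")
                    && !line.contains("</") && !line.endsWith("/>");
            if (opensOnly) {
                depth++;
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(indent(
                "<Pxxxon><Name>Attila</9327><ID>001</ID><Age>45</Age></Person>"));
    }
}
```

For the sample in the question this produces the five-line layout the asker wants, because the heuristic only counts opening and closing markers and never compares tag names.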



-1

