Is there a way to make QXmlStreamReader handle malformed XML?

I am trying to parse some values from a website. For this I use QXmlStreamReader. After I started parsing, I got this XML error: "Expected '=', but got '>'." It breaks on this malformed element:

<tbody pageStartAt >

      

I guess the reason is that the XML standard requires every attribute after the tag name to have a value attached, like this:

<tbody pageStartAt="2" > - this works.

      

My question is: is there a way to prevent this? I just want to ignore attributes that have no values. I would rather avoid using QtWebKit; I think it is overkill for this.



1 answer


The easiest way I've found is to use HTMLTidy (thanks to @MrEricSir for the advice). It fixes the broken XML. One downside is that it adds unnecessary tags such as <body>.
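
If pulling in HTMLTidy feels too heavy for this one problem, a narrower option is to patch the markup yourself before handing it to QXmlStreamReader. The sketch below is not HTMLTidy and not part of Qt: addEmptyAttributeValues is a hypothetical helper that uses only the C++ standard library and gives every attribute without a value an empty one. It assumes the only defect is the missing value; it does not quote unquoted values, close unclosed tags, or treat script or CDATA content specially.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical helper (not a Qt or HTMLTidy API): rewrites start tags so that
// an attribute without a value, e.g. <tbody pageStartAt >, gets an empty one:
// <tbody pageStartAt="" >. Everything else is copied through unchanged.
std::string addEmptyAttributeValues(const std::string &in)
{
    auto isSpace = [](char c) { return std::isspace(static_cast<unsigned char>(c)) != 0; };
    auto isAlpha = [](char c) { return std::isalpha(static_cast<unsigned char>(c)) != 0; };

    const std::size_t n = in.size();
    std::string out;
    out.reserve(n + 16);
    std::size_t i = 0;

    while (i < n) {
        // Copy comments verbatim so markup inside them is left alone.
        if (in.compare(i, 4, "<!--") == 0) {
            std::size_t end = in.find("-->", i + 4);
            end = (end == std::string::npos) ? n : end + 3;
            out.append(in, i, end - i);
            i = end;
            continue;
        }
        // Text, end tags, and declarations are copied as-is.
        if (in[i] != '<' || i + 1 >= n || !isAlpha(in[i + 1])) {
            out += in[i++];
            continue;
        }
        // Copy '<' and the tag name.
        out += in[i++];
        while (i < n && !isSpace(in[i]) && in[i] != '>' && in[i] != '/')
            out += in[i++];

        // Walk the attribute list up to the closing '>'.
        while (i < n && in[i] != '>') {
            if (isSpace(in[i]) || in[i] == '/') {
                out += in[i++];
                continue;
            }
            // Copy the attribute name.
            while (i < n && !isSpace(in[i]) && in[i] != '=' && in[i] != '>' && in[i] != '/')
                out += in[i++];

            // Look past optional whitespace for '='.
            std::size_t j = i;
            while (j < n && isSpace(in[j])) ++j;

            if (j < n && in[j] == '=') {
                // Copy '=' and any whitespace around it.
                ++j;
                while (j < n && isSpace(in[j])) ++j;
                out.append(in, i, j - i);
                i = j;
                if (i < n && (in[i] == '"' || in[i] == '\'')) {
                    // Quoted value: copy through the matching quote.
                    std::size_t close = in.find(in[i], i + 1);
                    close = (close == std::string::npos) ? n : close + 1;
                    out.append(in, i, close - i);
                    i = close;
                } else {
                    // Unquoted value: copy it unchanged (still not valid XML).
                    while (i < n && !isSpace(in[i]) && in[i] != '>')
                        out += in[i++];
                }
            } else {
                // No '=' follows the name: supply an empty value.
                out += "=\"\"";
            }
        }
    }
    return out;
}
```

The patched string can then be passed to QXmlStreamReader, for example via QByteArray::fromStdString. For pages with more than this one kind of breakage, HTMLTidy remains the safer choice.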


