XDocument.Validate does not catch all XSD errors

I have a really weird issue validating an XML document against a valid XSD using C# XDocument.Validate or XmlReaderSettings with the required configuration. When the XML document contains errors, the validation process fails to report all of them under certain conditions, and I cannot find a pattern for this anomaly.

Here is my XSD:

<?xml version="1.0" encoding="utf-8"?>
<xs:schema attributeFormDefault="unqualified" elementFormDefault="qualified"
			  targetNamespace="http://www.somesite.com/somefolder/messages"
			  xmlns:xs="http://www.w3.org/2001/XMLSchema">
   <xs:element name="Message">
    <xs:complexType>
     <xs:sequence>
      <xs:element name="Header">
         <xs:complexType>
          <xs:sequence>
           <xs:element name="MessageId" type="xs:string" />
           <xs:element name="MessageSource" type="xs:string" />
          </xs:sequence>
       </xs:complexType>
    </xs:element>
    <xs:element name="Body">
       <xs:complexType>
          <xs:sequence>
             <xs:element name="Abc001">
                <xs:complexType>
                   <xs:sequence>
                    <xs:element name="Abc002" type="xs:string" />
                    <xs:element name="Abc003" type="xs:string" minOccurs="0" />
                    <!--<xs:element name="Abc004" type="xs:string" />-->
                    <xs:element name="Abc004">
                       <xs:simpleType>
                         <xs:restriction base="xs:string">
                           <xs:maxLength value="200"/>
                         </xs:restriction>
                      </xs:simpleType>
                    </xs:element>
                      <xs:element name="Abc005">
                         <xs:complexType>
                            <xs:sequence>
                              <xs:element name="Abc006" type="xs:unsignedShort" />
                              <xs:element name="Abc007">
                                <xs:complexType>
                                  <xs:sequence>
                                    <xs:element name="Abc008" type="xs:string"/>
                                    <xs:element name="Abc009" type="xs:string" minOccurs="0"/>
                                    <xs:element name="Abc010" type="xs:string"/>
                                  </xs:sequence>
                                </xs:complexType>
                              </xs:element>
                              <xs:element name="Abc011" type="xs:date" />
                              <xs:element name="Abc012">
                                <xs:complexType>
                                  <xs:sequence>
                                    <xs:element name="Abc013" type="xs:string" />
                                    <xs:element name="Abc014" type="xs:string" />
                                  </xs:sequence>
                                </xs:complexType>
                              </xs:element>
                            </xs:sequence>
                         </xs:complexType>
                      </xs:element>
                   </xs:sequence>
                </xs:complexType>
             </xs:element>
          </xs:sequence>
       </xs:complexType>
    </xs:element>
   </xs:sequence>
  </xs:complexType>
 </xs:element>
</xs:schema>
      



And here is the XML document that is being validated against this XSD:

<?xml version="1.0" encoding="utf-8"?>
<Message xmlns="http://www.somesite.com/somefolder/messages">
	<Header>
		<MessageId>Lorem</MessageId>
		<MessageSource>Ipsum</MessageSource>
	</Header>
	<Body>
		<Abc001>
			<Abc002>dolor</Abc002>
			<Abc003>sit amet</Abc003>
			<Abc004>consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua.</Abc004>
			<Abc005>
				<Abc006>1234</Abc006>
				<Abc007>
					<Abc008>Ut enim</Abc008>
					<Abc009>ad</Abc009>
					<Abc010>minim</Abc010>
				</Abc007>
				<Abc011>1982-10-17</Abc011>
				<Abc012>
					<Abc013>veniam</Abc013>
					<Abc014>nostrud</Abc014>
				</Abc012>
			</Abc005>
		</Abc001>
	</Body>
</Message>
      



Now when I inject some validation errors into the XML and validate it against the XSD, it finds all the errors as expected. Here is the XML with the errors injected (comments mark where each error was introduced):

<?xml version="1.0" encoding="utf-8"?>
<Message xmlns="http://www.somesite.com/somefolder/messages">
	<Header>
		<MessageId>Lorem</MessageId>
		<MessageSource>Ipsum</MessageSource>
	</Header>
	<Body>
		<Abc001>
			<Abc002>dolor</Abc002>
			<Abc003>sit amet</Abc003>
			
			<!--The value for Abc004 is increased beyond the allowed 200 characters-->
			
			<Abc004>Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum.</Abc004>
			<Abc005>
				<Abc006>1234</Abc006>
				<Abc007>
					<Abc008>Ut enim</Abc008>
					<ABC009>AD</ABC009>
					
					<!--<Abc010>minim</Abc010>  Required element removed-->
				</Abc007>
				
				<!--The date value below is invalid-->
				<Abc011>1982-10-37</Abc011>
				
				<Abc012>
					<Abc013>veniam</Abc013>
					<Abc014>nostrud</Abc014>
				</Abc012>
			</Abc005>

			<!--the element below is not allowed-->
			<Abc15>Not allowed</Abc15>
		</Abc001>
	</Body>
</Message>
      



And here is the resulting response XML, which reports all four errors:

<MessageResponse xmlns="http://www.somesite.com/somefolder/messages">
    <Result>false</Result>
    <Status>Failed</Status>
    <FaultCount>4</FaultCount>
    <Faults>
        <Fault>
            <FaultCode>ERR01</FaultCode>
            <FaultMessage>The 'http://www.somesite.com/somefolder/messages:Abc004' element is invalid - The value 'Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum.' is invalid according to its datatype 'String' - The actual length is greater than the MaxLength value.</FaultMessage>
        </Fault>
        <Fault>
            <FaultCode>ERR02</FaultCode>
            <FaultMessage>The element 'Abc007' in namespace 'http://www.somesite.com/somefolder/messages' has invalid child element 'ABC009' in namespace 'http://www.somesite.com/somefolder/messages'. List of possible elements expected: 'Abc009, Abc010' in namespace 'http://www.somesite.com/somefolder/messages'.</FaultMessage>
        </Fault>
        <Fault>
            <FaultCode>ERR03</FaultCode>
            <FaultMessage>The 'http://www.somesite.com/somefolder/messages:Abc011' element is invalid - The value '1982-10-37' is invalid according to its datatype 'http://www.w3.org/2001/XMLSchema:date' - The string '1982-10-37' is not a valid Date value.</FaultMessage>
        </Fault>
        <Fault>
            <FaultCode>ERR04</FaultCode>
            <FaultMessage>The element 'Abc001' in namespace 'http://www.somesite.com/somefolder/messages' has invalid child element 'Abc15' in namespace 'http://www.somesite.com/somefolder/messages'.</FaultMessage>
        </Fault>
    </Faults>
</MessageResponse>
      



Here's the weird part. When I introduce another error at the beginning of the "Abc001" element while keeping all the existing errors, only the new error is reported and the others disappear. Here is the XML with the newly introduced error:

<?xml version="1.0" encoding="utf-8"?>
<Message xmlns="http://www.somesite.com/somefolder/messages">
	<Header>
		<MessageId>Lorem</MessageId>
		<MessageSource>Ipsum</MessageSource>
	</Header>
	<Body>
		<Abc001>
			<!--newly introduced error - removed the following element-->
			<!--<Abc002>dolor</Abc002>-->
			<Abc003>sit amet</Abc003>
			<!--The value for Abc004 is increased beyond the allowed 200 characters-->
			<Abc004>Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum.</Abc004>
			<Abc005>
				<Abc006>1234</Abc006>
				<Abc007>
					<Abc008>Ut enim</Abc008>
					<ABC009>AD</ABC009>
					<!--<Abc010>minim</Abc010>-->
				</Abc007>
				<Abc011>1982-10-37</Abc011>
				<Abc012>
					<Abc013>veniam</Abc013>
					<Abc014>nostrud</Abc014>
				</Abc012>
			</Abc005>
			<!--the element below is not allowed-->
			<Abc15>Not allowed</Abc15>
		</Abc001>
	</Body>
</Message>
      



And finally, here is the validation result:

<MessageResponse xmlns="http://www.somesite.com/somefolder/messages">
    <Result>false</Result>
    <Status>Failed</Status>
    <FaultCount>1</FaultCount>
    <Faults>
        <Fault>
            <FaultCode>ERR01</FaultCode>
            <FaultMessage>The element 'Abc001' in namespace 'http://www.somesite.com/somefolder/messages' has invalid child element 'Abc003' in namespace 'http://www.somesite.com/somefolder/messages'. List of possible elements expected: 'Abc002' in namespace 'http://www.somesite.com/somefolder/messages'.</FaultMessage>
        </Fault>
    </Faults>
</MessageResponse>
      



Here is the C# code that I am using to test:

public async Task<IMIDPreValidationAckMessage> ValidateXmlMessage( XDocument doc )
    {
        var result = new PreValidationAckMessage();
        result.Result = true;
        result.Status = "Succeeded";

        var xsd = HttpContext.Current.Server.MapPath( "~/message01.xsd" );

        try
        {
            var uri = new System.Uri(xsd);

            var localPath = uri.LocalPath;

            var docNameSpace = doc.Root.Name.Namespace.NamespaceName;

            XmlSchemaSet schemas = new XmlSchemaSet();
            schemas.Add( docNameSpace, localPath );

            XmlReaderSettings xrs = new XmlReaderSettings();
            xrs.ValidationType = ValidationType.Schema;
            xrs.ValidationFlags |= XmlSchemaValidationFlags.ReportValidationWarnings;
            xrs.Schemas = schemas;

            result.XSDNamespace = doc.Root.GetDefaultNamespace().NamespaceName;
            var errCode = 1;

            xrs.ValidationEventHandler += ( s, e ) =>
            {
                var msg = e.Message;
                result.Result = false;
                result.Status = "Failed";
                result.FaultCount++;
                result.Faults.Add( new Fault
                {
                    FaultCode = "ERR" + errCode++.ToString().PadLeft( 2, '0' ),
                    FaultMessage = e.Message
                } );
            };

            using ( XmlReader xr = XmlReader.Create( doc.CreateReader(), xrs ) )
            {
                while ( xr.Read() ) { }
            }
        }
        catch ( System.Exception ex )
        {
            result.Result = false;
            result.Status = "Unknown Error";
        }
        return result;
    }

      

Can someone please tell me what is wrong here?



1 answer


It seems that XmlReader stops validating the content of an element at the first error it encounters in that element. Here is the description from the documentation of the old (deprecated) XmlValidatingReader ValidationEventHandler:

If an element reports a validation error, the rest of the content model for that element is not validated, but its children are validated. The reader only reports the first error for a given element.

It looks like the same applies to a regular XmlReader (although its documentation doesn't mention it explicitly).

In the first examples, the errors occur either inside inner elements (such as an invalid text value of an element) or in the last child, so all of them are reported and nothing is missed. However, in the last example, you introduce an error at the beginning of the Abc001 element, so the rest of the content of Abc001 is skipped along with any errors in it.
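
The behaviour described above can be reproduced on a much smaller document. The following is only a minimal sketch, not the asker's code: the schema (Root with children A, B, C), the class name ValidationDemo and the method CollectErrors are all made up for illustration. It validates two documents with the same invalid leaf values, once with a complete content model and once with the first required child missing, and prints how many errors the handler received in each case.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;
using System.Xml.Schema;

public static class ValidationDemo
{
    // Hypothetical schema: Root must contain A, B (at most 3 characters) and C (a date), in that order.
    const string Xsd = @"<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>
  <xs:element name='Root'>
    <xs:complexType>
      <xs:sequence>
        <xs:element name='A' type='xs:string'/>
        <xs:element name='B'>
          <xs:simpleType>
            <xs:restriction base='xs:string'>
              <xs:maxLength value='3'/>
            </xs:restriction>
          </xs:simpleType>
        </xs:element>
        <xs:element name='C' type='xs:date'/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>";

    // Validates the given XML against the schema above and returns every
    // message passed to the ValidationEventHandler.
    public static List<string> CollectErrors(string xml)
    {
        var schemas = new XmlSchemaSet();
        using (var schemaReader = XmlReader.Create(new StringReader(Xsd)))
        {
            // null: use the target namespace declared in the schema (none here).
            schemas.Add(null, schemaReader);
        }

        var errors = new List<string>();
        var settings = new XmlReaderSettings
        {
            ValidationType = ValidationType.Schema,
            Schemas = schemas
        };
        settings.ValidationEventHandler += (sender, e) => errors.Add(e.Message);

        using (var reader = XmlReader.Create(new StringReader(xml), settings))
        {
            while (reader.Read()) { }
        }
        return errors;
    }

    public static void Main()
    {
        // Invalid leaf values only: the sequence A, B, C itself is intact.
        var leafOnly = CollectErrors(
            "<Root><A>x</A><B>too long</B><C>1982-10-37</C></Root>");

        // Same invalid values, but A is missing, so the content model of
        // Root already fails at its first child.
        var earlyError = CollectErrors(
            "<Root><B>too long</B><C>1982-10-37</C></Root>");

        Console.WriteLine("Invalid leaf values only: " + leafOnly.Count + " error(s)");
        foreach (var message in leafOnly) Console.WriteLine("  " + message);

        Console.WriteLine("Missing first child as well: " + earlyError.Count + " error(s)");
        foreach (var message in earlyError) Console.WriteLine("  " + message);
    }
}
```

If the explanation above is right, the second document should report fewer errors than the first even though it contains more problems, which mirrors the difference between the two results in the question.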
