Select elements by an attribute value on self or an ancestor, but not when "overridden" (for example, the lang attribute)

I am trying to emulate the interpretation of a lang attribute, similar to lang in HTML or xml:lang.

Given the following XML snippet:

<xml lang="c">
    c#0
    <para>c#1</para>
    <para>c#2</para>
    <para lang="d">
        d#0
        <para>d#1</para>
        <para lang="c">c#3</para>
        <para lang="d">
            d#2
            <para>d#3</para>
            <para lang="c">c#4</para>
        </para>
        <para lang="c">
            c#5
            <para>c#6</para>
        </para>
    </para>
</xml>

      

I am having trouble formulating an XPath 1.0 expression that returns all nodes of a specific language, e.g. c. Similar to how the XPath function lang() works for the attribute xml:lang, a node matches if:

  • It has an attribute lang with value c ( //*[@lang = "c"] )
  • -OR-
    • One of its ancestors has an attribute lang with value c ( //*[ancestor::*/@lang = "c"] )
    • -AND- the node itself has no attribute lang at all
    • -AND NOT- if any of its ancestors has an attribute lang with a value other than c and is "nearer" than the ancestor with lang = c (the ancestor match from the first sub-point is then "canceled").
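Put procedurally, the rules above say that the language in effect for a node is the lang value on its nearest ancestor-or-self that carries the attribute. A minimal PHP DOM sketch of that reading follows; effectiveLang() is a hypothetical helper name introduced only for illustration, and the small inline document is a made-up sample, not the XML from this question.

```php
<?php
// Hypothetical helper: returns the lang value in effect for an element,
// i.e. the value on the nearest ancestor-or-self that has a lang attribute.
function effectiveLang(DOMElement $element): ?string
{
    // Walk upwards; the loop stops at the document node, which is not a DOMElement.
    for ($node = $element; $node instanceof DOMElement; $node = $node->parentNode) {
        if ($node->hasAttribute('lang')) {
            return $node->getAttribute('lang');
        }
    }
    return null; // no lang attribute anywhere up the tree
}

$doc = new DOMDocument();
// Made-up sample: a "c" root, a "d" subtree, and a "c" subtree nested inside "d".
$doc->loadXML(
    '<xml lang="c"><para/><para lang="d"><para/><para lang="c"><para/></para></para></xml>'
);

$matches = 0;
$total = 0;
foreach ($doc->getElementsByTagName('*') as $element) {
    $total++;
    if (effectiveLang($element) === 'c') {
        $matches++;
    }
}
printf("%d of %d elements are in language c\n", $matches, $total);
```

This only restates the matching rules as a loop; it is not the XPath 1.0 expression being asked for.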

For example, matching the XML above with c for lang gives 7 nodes: c#0 to c#6.

<xml lang="c"> c#0 ...              (direct match, lang="c")
<para>c#1</para>                    (parent has lang="c")
<para>c#2</para>                    (parent has lang="c")
<para lang="c">c#3</para>           (direct match, lang="c")
<para lang="c">c#4</para>           (direct match, lang="c")
<para lang="c"> c#5 ...             (direct match, lang="c")
<para>c#6</para>                    (parent has lang="c"; that parent is itself a descendant
                                     of an ancestor with lang="d")

      

I have trouble expressing this as an XPath query. Even though I have gotten better with XPath over the last year, this one really has me stumped.

No matter what I try, I cannot express how an ancestor with a matching predicate overrides an ancestor with a non-matching predicate.

The examples given are only half of the problem, since I need to match not only full attribute values but also prefixes:

 starts-with(@lang, concat("c", "-"))

      

But I would be happy to see the basic problem solved first. I am testing with PHP (online demo):

<?php
header('Content-Type: text/plain');
$xml = <<<XML
<xml lang="c">
    c#0
    <para>c#1</para>
    <para>c#2</para>
    <para lang="d">
        d#0
        <para>d#1</para>
        <para lang="c">c#3</para>
        <para lang="d">
            d#2
            <para>d#3</para>
            <para lang="c">c#4</para>
        </para>
        <para lang="c">
            c#5
            <para>c#6</para>
        </para>
    </para>
</xml>
XML;

$doc = new DOMDocument();
$doc->loadXML($xml);
$xp = new DOMXPath($doc);

$expression = '
//*[
    ancestor-or-self::*/@lang = "c"
    and (
        not(ancestor-or-self::*/@lang != "c")
        or (
            count(ancestor-or-self::*[@lang != "c"])
            < count(ancestor-or-self::*[@lang = "c"])
        )
    )
]';

$result = $xp->query($expression);
printResult($result);

function printResult($result)
{
    global $xp;

    if ($result) {
        printf("Result (%d Nodes):\n", $result->length);
        foreach ($result as $index => $node) {
            $depth = $xp->evaluate('count(ancestor::*)', $node);
            printf("#%d (%d): %s\n", $index, $depth, $node->ownerDocument->saveXML($node));
        }
    } else {
        printf("No Result, query failed.\n");
    }
}

      



2 answers


Use:

//*[@lang='c'
  or
   not(@lang) and ancestor::*[@lang][1]/@lang = 'c'
   ]

      

This selects any element in the XML document that either has a lang attribute with the value "c", or has no lang attribute and whose nearest ancestor with a lang attribute has the value "c" for it.
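As a quick check, the expression can be run against the XML from the question with DOMXPath. This is only a sketch: the $labels array and the way each label is read from the element's first text node are choices made here for illustration.

```php
<?php
$xml = <<<XML
<xml lang="c">
    c#0
    <para>c#1</para>
    <para>c#2</para>
    <para lang="d">
        d#0
        <para>d#1</para>
        <para lang="c">c#3</para>
        <para lang="d">
            d#2
            <para>d#3</para>
            <para lang="c">c#4</para>
        </para>
        <para lang="c">
            c#5
            <para>c#6</para>
        </para>
    </para>
</xml>
XML;

$doc = new DOMDocument();
$doc->loadXML($xml);
$xp = new DOMXPath($doc);

// Either the element itself says lang="c", or it has no lang attribute
// and the nearest ancestor that has one says "c".
$expression = '//*[@lang = "c" or not(@lang) and ancestor::*[@lang][1]/@lang = "c"]';

$labels = [];
foreach ($xp->query($expression) as $node) {
    // In this sample the first child of every element is its label text.
    $labels[] = trim($node->firstChild->nodeValue);
}

printf("%d nodes: %s\n", count($labels), implode(', ', $labels));
// Prints: 7 nodes: c#0, c#1, c#2, c#3, c#4, c#5, c#6
```

The [1] applies along the reverse ancestor axis, so it picks the nearest ancestor with a lang attribute rather than the outermost one.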

A simpler, equivalent XPath expression:



//*[ancestor-or-self::*[@lang][1]/@lang='c']
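The same pattern should extend to the prefix case mentioned in the question by putting the test on the nearest lang-bearing element in one more predicate. The sketch below is an assumption about how that could look, not part of the original answer; the sample document and its one-letter labels are made up for the demonstration.

```php
<?php
// Made-up sample: "c-1" should count as language c, "cd" should not.
$xml = <<<XML
<xml lang="c-1">
    a
    <para>b</para>
    <para lang="cd">
        e
        <para>f</para>
        <para lang="c">g</para>
    </para>
</xml>
XML;

$doc = new DOMDocument();
$doc->loadXML($xml);
$xp = new DOMXPath($doc);

// Take the nearest ancestor-or-self with a lang attribute, then require that
// its value is exactly "c" or starts with "c-".
$expression = '//*[ancestor-or-self::*[@lang][1][@lang = "c" or starts-with(@lang, "c-")]]';

$labels = [];
foreach ($xp->query($expression) as $node) {
    // In this sample the first child of every element is its label text.
    $labels[] = trim($node->firstChild->nodeValue);
}

printf("%d nodes: %s\n", count($labels), implode(', ', $labels));
// Prints: 3 nodes: a, b, g
```

Elements e and f are excluded because their nearest lang value, "cd", is neither "c" nor prefixed with "c-".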

      

[Image: snapshot of the selection, taken with the XPath Visualizer]



Expected XPath

//*[(descendant-or-self::*/@lang = 'c' and not(descendant-or-self::*/@lang != 'c')) or (ancestor-or-self::*/@lang = 'c' and not(ancestor-or-self::*/@lang != 'c'))]

      



Output

xml     c#0 (lang: c)
para    c#1 (lang: c)
para    c#2 (lang: c)
para    c#3 (lang: c)

      
