XPath: Find the following siblings that don't follow the ordering pattern
This is for C code analysis. I'm trying to flag case statements that don't have a break. A case may be followed by several statements before its break statement, as in this C example:
switch (x) {
case 1:
if (...) {...}
int y = 0;
for (...) {...}
break;
case 2:
It is represented as XML roughly like this:
<switch>
<case>...</case>
<if>...</if>
<expression>...</expression>
<for>...</for>
<break>...</break>
<case>...</case>
</switch>
I need to find each <case> where a <break> exists after any number of lines, but before the next <case>.
This code helps me find those where the break does not immediately follow the case:
//case [name(following-sibling::*[1]) != 'break']
...but when I try to use following-sibling::* it finds a break, though not necessarily one that comes before the next case.
How can I do this?
Select any case that has a following break and either no following case, or where the position of the following break is less than the position of the following case. The positions are determined by using count() on the preceding siblings.
//case
[
following-sibling::break and
(
not(following-sibling::case) or
(
count(following-sibling::break[1]/preceding-sibling::*) <
count(following-sibling::case[1]/preceding-sibling::*)
)
)
]
To capture the other cases, those with no break, just wrap the whole predicate in a big old not():
//case
[not(
following-sibling::break and
(
not(following-sibling::case) or
(
count(following-sibling::break[1]/preceding-sibling::*) <
count(following-sibling::case[1]/preceding-sibling::*)
)
)
)]
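As a way to cross-check what these two expressions should return, the same rule can be sketched outside XPath as a single pass over the children of the switch. This is only an illustrative sketch in Python using the standard library; the sample document and the function name split_cases are made up for the example, and, like the XPath above, it only looks at breaks that are direct children of the switch.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: case 2 falls through into case 3.
SAMPLE = """
<switch>
  <case>1</case>
  <if/>
  <expression/>
  <for/>
  <break/>
  <case>2</case>
  <expression/>
  <case>3</case>
  <break/>
</switch>
"""

def split_cases(switch):
    """Return (with_break, without_break) lists of <case> elements.

    A case counts as having a break when a <break> sibling appears
    after it and before the next <case> sibling.
    """
    with_break, without_break = [], []
    current = None  # most recent <case> that has not yet met a <break>
    for child in switch:
        if child.tag == "case":
            if current is not None:
                # The previous case reached the next case with no break.
                without_break.append(current)
            current = child
        elif child.tag == "break" and current is not None:
            with_break.append(current)
            current = None
    if current is not None:
        # The last case ran to the end of the switch with no break.
        without_break.append(current)
    return with_break, without_break

switch = ET.fromstring(SAMPLE)
with_break, without_break = split_cases(switch)
print([c.text for c in with_break])     # ['1', '3']
print([c.text for c in without_break])  # ['2']
```

The first list should correspond to the first XPath expression and the second list to the not() version.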
I think you are struggling because your XML format does not model the problem very well. It would be much easier if the other statements were nested inside the <case> elements instead of being siblings; then you could just use switch/case[break].
With your current structure, it's easiest to start by finding each <break> and then go back to find the corresponding <case>. As @LarsH pointed out, my original expression also matched some extra cases. That cannot be fixed except by restricting it to just the first preceding case:
switch/break/preceding-sibling::case[1]
@derp's answer is better and can find both cases with and without breaks.
I agree with @PeterHall. It would be better to rearrange the XML into a more accurate representation of the abstract syntax tree of the C grammar. You can do this fairly easily (for this case) with XSLT 2.0 grouping:
<xsl:for-each-group select="*" group-starting-with="case">
<case>
<xsl:copy-of select="current-group()[not(self::case)]"/>
</case>
</xsl:for-each-group>
Then you can find the cases without a break with switch/case[not(break)].
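If XSLT 2.0 is not available, the same regrouping idea can be sketched in a general-purpose language. The following is a rough Python illustration using only the standard library, not a translation of the stylesheet: the function name nest_cases and the label attribute are invented for the example, and unlike group-starting-with it leaves any statements before the first case unwrapped.

```python
import xml.etree.ElementTree as ET

# Hypothetical flat input: case 1 has a break, case 2 does not.
FLAT = "<switch><case>1</case><if/><break/><case>2</case><for/></switch>"

def nest_cases(flat_switch):
    """Build a new <switch> whose statements are nested inside their <case>."""
    nested = ET.Element("switch")
    current = None  # the <case> currently collecting statements
    for child in flat_switch:
        if child.tag == "case":
            # 'label' is an illustrative attribute that keeps the case text.
            current = ET.SubElement(nested, "case", label=child.text or "")
        elif current is not None:
            current.append(child)
        else:
            nested.append(child)  # statements before the first case
    return nested

nested = nest_cases(ET.fromstring(FLAT))
# Equivalent in spirit to switch/case[not(break)] on the nested tree.
missing = [c.get("label") for c in nested.findall("case")
           if c.find("break") is None]
print(missing)  # ['2']
```

Once the statements are nested, the check becomes a simple child lookup on each case.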
Derp's answer is correct. But I'll just add another one. This selects the case elements that have a break:
//case[generate-id(.) =
generate-id(following-sibling::break[1]/preceding-sibling::case[1])]
In other words, this selects the case elements for which the following is true: the context element is identical to the first case element preceding the next break element (among siblings only).
If you have many case elements, this option may be faster than using count(). But you never know for sure unless you test it with representative data on the XPath processor you actually use.
BTW, the . in generate-id(.) is not required, since the default argument is the context node. But I prefer to make it explicit for readability.