Talend tXMLMap gets all shards of an element without loop

I am looking into Talend and I need XML to CSV mapping which I cannot do right.

I have a document like this (really simplified):

<Report>
    <Payment>
        <Reference Type="1" Value="wathever11"/>
        <Reference Type="2" Value="wathever12"/>
        <Amount value="222"/>
    </Payment>
    <Payment>
        <Reference Type="1" Value="wathever22"/>
        <Reference Type="3" Value="wathever23"/>
        <Amount value="123242"/>
    </Payment>
    <Payment>
        <Reference Type="3" Value="wathever33"/>
        <Reference Type="2" Value="wathever32"/>
        <Amount value="12976"/>
    </Payment>
    <Payment>
        <Reference Type="1" Value="wathever41"/>
        <Reference Type="4" Value="wathever44"/>
        <Amount value="1456"/>
    </Payment>
    <Payment>
        <Amount value="12976"/>
    </Payment>
</Report>

      

So I can't have any of the multiple links that can contain up to 4 different types.

I need to convert this to CSV like this:

ReferenceType1, ReferenceType2, ReferenceType3, ReferenceType4, Amount
whatever11    , whatever12    ,               ,               , 222
              , whatever22    , whatever23    ,               , 123242
              , whatever32    , whatever33    ,               , 12976
whatever41    ,               ,               , whatever44    , 1456
              ,               ,               ,               , 12976

      

To achieve this, I created this simple structure:

Talend xml Map

In XMLFileInput, this is the config: enter image description here

Where payment is a document

In tXMLMap ive set the document structure like this:

Report
└──Payment (loop) First case, see below
      β”œβ”€β”€ Reference (loop) Second case, see below
      |      β”œβ”€β”€ @Type
      |      └── @Value
      └── Amount
             └── @Value

      

In the output table, I create a column for each link with an expression like this:

[row1.Payment:/Payment/Reference/@Type].equals("1")? [row1.Payment:/Payment/Reference/@Value]: "" 

      

But I cannot match all links in one line. If I set the cycle at the pay level, I get this:

ReferenceType1, ReferenceType2, ReferenceType3, ReferenceType4, Amount
whatever11    ,               ,               ,               , 222
              ,               ,               ,               , 123242
              ,               ,               ,               , 12976
whatever41    ,               ,               ,               , 1456
              ,               ,               ,               , 12976

      

If the loop is set to basic, I get a new line for each link, like this:

ReferenceType1, ReferenceType2, ReferenceType3, ReferenceType4, Amount
whatever11    ,               ,               ,               , 222
              , whatever12    ,               ,               , 222
              , whatever22    ,               ,               , 123242
              ,               , whatever23    ,               , 123242
              ,               , whatever33    ,               , 12976
              , whatever32    ,               ,               , 12976
whatever41    ,               ,               ,               , 1456
              ,               ,               , whatever44    , 1456

      

And there is no line for an unreferenced element.

There must be a way to achieve the expected result. I tried to create an intermediate step by mapping each reference to another table along with a sequential number to further combine the columns into one row.

Although I am not finished, I think this might be the way to go. But the link is not the only element I need it for. I have a lot of elements where this happens in real xml and the number of intermediate tables as it would be would be large.

So I prefer to ask experts to try to find an easier way to do this.

Any ideas?

+3


source to share


1 answer


Good news: this is possible with tExtractXMLField. Check out my suggested solution:

job overview

tExtractXMLField



My output was based on your input XML file, but it is different from the expected output, please double check carefully:

.-------------------+-------------------+-------------------+-------------------+------.
|                                      tLogRow_1                                       |
|=------------------+-------------------+-------------------+-------------------+-----=|
|referenceTypeValue1|referenceTypeValue2|referenceTypeValue3|referenceTypeValue4|amount|
|=------------------+-------------------+-------------------+-------------------+-----=|
|wathever11         |wathever12         |null               |null               |222   |
|wathever22         |null               |wathever23         |null               |123242|
|null               |wathever32         |wathever33         |null               |12976 |
|wathever41         |null               |null               |wathever44         |1456  |
|null               |null               |null               |null               |12976 |
'-------------------+-------------------+-------------------+-------------------+------'

      

Bad news: This is not possible with tXMLMap in Talend 5.3.1 for me. I tried to put this in the expression field in tXMLMap: [row2.doc: / Report / Payment / Reference [@Type = 1] / @ Value] But then my work won't be generated anymore due to extra brackets

+1


source







All Articles