1

Take the following simplified XML:

(An excerpt from a very expensive (and large) XML product feed from an 3rd party industry wholesaler data provider - i.e. I have no control over its schema / format / content)

<Company>
    <Code>7786</Code>
    <Brand>
        <!-- /../ -->
        <Groups>
            <Group>
                <!-- /../ -->
                <Group>
                    <!-- /../ -->
                    <Group>
                        <!-- /../ -->
                        <Product>
                            <Pip_code>3623949</Pip_code>
                            <!-- /../ -->
                            <Other_Codes>
                                <Other_Code>
                                    <Code_Description>EAN</Code_Description>
                                    <Code_Value>5013158781351</Code_Value>
                                </Other_Code>
                                <Other_Code>
                                    <Code_Description>shipper EAN</Code_Description>
                                    <Code_Value>503158781443</Code_Value>
                                </Other_Code>
                                <Other_Code>
                                    <Code_Description>AMPP</Code_Description>
                                    <Code_Value>19192411000001107</Code_Value>
                                </Other_Code>
                                <Other_Code>
                                    <Code_Description>AMPP Manf</Code_Description>
                                    <Code_Value>2061801000001104</Code_Value>
                                </Other_Code>
                            </Other_Codes>
                        </Product>
                    </Group>
                </Group>
            </Group>
        </Groups>
    </Brand>
    <Brand>
        <!-- /../ -->
        <Groups>
            <Group>
                <!-- /../ -->
                <Product>
                    <Pip_code>3265725</Pip_code>
                    <!-- /../ -->
                    <Other_Codes>
                        <Other_Code>
                            <Code_Description>Outer EAN</Code_Description>
                            <Code_Value>5013158776531</Code_Value>
                        </Other_Code>
                        <Other_Code>
                            <Code_Description>AMPP</Code_Description>
                            <Code_Value>11521811000001106</Code_Value>
                        </Other_Code>
                        <Other_Code>
                            <Code_Description>AMPP Manf</Code_Description>
                            <Code_Value>2061801000001104</Code_Value>
                        </Other_Code>
                        <Other_Code>
                            <Code_Description>EAN</Code_Description>
                            <Code_Value>5013158776500</Code_Value>
                        </Other_Code>
                    </Other_Codes>
                </Product>
            </Group>
        </Groups>
    </Brand>
</Company>

I have been tasked with producing the following output:

Target output

This is proving to be extremely difficult for two reasons:

  1. You will notice that <product> 3623949 exists within Groups/Group/Group/Group/ (3 levels of Group) whereas <product> 3265725 exists within Groups/Group/ (1 level of Group) - I need to select all <product> nodes regardless of their nesting level (ideally whenever parent node = <Group>)

  2. Note the <Other_Codes> node within <product> - I need to select the repeating child nodes (Code_Description & Code_Value) as two seperate columns, but note also that there could be any number of <Other_Code> nodes within <Other_Codes> and I need to select them all (rendering [i] notation useless)

I've tried my best to get this data using TSQL OPENXML, but the only way I can think of doing it requires me to duplicate code for the various Group/Group/ levels and I also have to hard code a set number of occurrences of Other_Code using [0] notation - which given their variable nature; is wrong.

SELECT  *
FROM    OPENXML (@idoc, 'Company/Brand/Groups/Group/Product',2)
                        WITH (
                          PIP_code CHAR (20) 'Pip_code',
                          OtherCodeType CHAR (20) 'Other_Codes/Other_Code[1]/Code_Description',
                          OtherCodeValue CHAR (30) 'Other_Codes/Other_Code[1]/Code_Value',
                          OtherCodeType2 CHAR (20) 'Other_Codes/Other_Code[2]/Code_Description',
                          OtherCodeValue2 CHAR (30) 'Other_Codes/Other_Code[2]/Code_Value',
                          OtherCodeType3 CHAR (20) 'Other_Codes/Other_Code[3]/Code_Description',
                          OtherCodeValue3 CHAR (30) 'Other_Codes/Other_Code[3]/Code_Value'
                  )

We would prefer to do this in TSQL (as this XML file is already being processed there for other nodes & paths not relevant here), we have SQL Server 2008, 2008 R2 & SQL Server 2014 at our disposal, but non-SQL solutions would also be helpful - we really have hit a brick wall on this one.

4

1 回答 1

2

So what you really want to do is have your selection statement be //Other_Codes - that double forward slash says you want every Other_Codes node, regardless of its location in the hierarchy, giving you one row for each Other_Codes node. Then, you can specify that PipCode is ../../Pip_code, assuming of course this hierarchy doesn't change all the time for this portion as well (but how would that even semantically make sense?).

This:

DECLARE @doc xml = '<Company>
    <Code>7786</Code>
    <Brand>
        <!-- /../ -->
        <Groups>
            <Group>
                <!-- /../ -->
                <Group>
                    <!-- /../ -->
                    <Group>
                        <!-- /../ -->
                        <Product>
                            <Pip_code>3623949</Pip_code>
                            <!-- /../ -->
                            <Other_Codes>
                                <Other_Code>
                                    <Code_Description>EAN</Code_Description>
                                    <Code_Value>5013158781351</Code_Value>
                                </Other_Code>
                                <Other_Code>
                                    <Code_Description>shipper EAN</Code_Description>
                                    <Code_Value>503158781443</Code_Value>
                                </Other_Code>
                                <Other_Code>
                                    <Code_Description>AMPP</Code_Description>
                                    <Code_Value>19192411000001107</Code_Value>
                                </Other_Code>
                                <Other_Code>
                                    <Code_Description>AMPP Manf</Code_Description>
                                    <Code_Value>2061801000001104</Code_Value>
                                </Other_Code>
                            </Other_Codes>
                        </Product>
                    </Group>
                </Group>
            </Group>
        </Groups>
    </Brand>
    <Brand>
        <!-- /../ -->
        <Groups>
            <Group>
                <!-- /../ -->
                <Product>
                    <Pip_code>3265725</Pip_code>
                    <!-- /../ -->
                    <Other_Codes>
                        <Other_Code>
                            <Code_Description>Outer EAN</Code_Description>
                            <Code_Value>5013158776531</Code_Value>
                        </Other_Code>
                        <Other_Code>
                            <Code_Description>AMPP</Code_Description>
                            <Code_Value>11521811000001106</Code_Value>
                        </Other_Code>
                        <Other_Code>
                            <Code_Description>AMPP Manf</Code_Description>
                            <Code_Value>2061801000001104</Code_Value>
                        </Other_Code>
                        <Other_Code>
                            <Code_Description>EAN</Code_Description>
                            <Code_Value>5013158776500</Code_Value>
                        </Other_Code>
                    </Other_Codes>
                </Product>
            </Group>
        </Groups>
    </Brand>
</Company>';

DECLARE @idoc int;

exec sp_xml_preparedocument @idoc OUTPUT, @doc;
SELECT  *
FROM    OPENXML(@idoc, '//Other_Code',1)
WITH (
    PipCode CHAR (20) '../../Pip_code',
    Code_Description CHAR(20) 'Code_Description',
    Code_Value CHAR (30) 'Code_Value'
)

Produces this:

PipCode                 Code_Description        Code_Value
-----------------------------------------------------------------
3623949                 EAN                     5013158781351                 
3623949                 shipper EAN             503158781443                  
3623949                 AMPP                    19192411000001107             
3623949                 AMPP Manf               2061801000001104              
3265725                 Outer EAN               5013158776531                 
3265725                 AMPP                    11521811000001106             
3265725                 AMPP Manf               2061801000001104              
3265725                 EAN                     5013158776500                 
于 2015-06-05T21:01:39.720 回答