How can I combine multiple XML documents into one without accumulating namespace?
I have several SVG segments in a table stored as an XML document. Now I need to select all the elements from this table and combine them into one XML document.
This is the T-SQL code I have:
declare @xml table (xmldocument xml)
insert @xml select '
<svg xmlns="http://www.w3.org/2000/svg" otherattrib="x">
<path id="789" data-objid="0000X2"></path>
</svg>'
insert @xml select '
<svg xmlns="http://www.w3.org/2000/svg" otherattrib="x">
<admin>
<g>
<path></path>
<path data-objid="0000X1"></path>
<path id="123" data-objid="0000X2"></path>
<path id="456" data-objid="0000X3"></path>
</g>
</admin>
<g>
<path></path>
<path data-objid="0000X1"></path>
<path id="789" data-objid="0000X2"></path>
<path id="abc" data-objid="0000X3"></path>
</g>
</svg>'
insert @xml select '
<svg xmlns="http://www.w3.org/2000/svg" otherattrib="x">
<path></path>
</svg>'
insert @xml select '
<svg xmlns="http://www.w3.org/2000/svg" otherattrib="x">
<path id="abc" data-objid="0000X3"></path>
</svg>'
--;WITH XMLNAMESPACES ('http://www.w3.org/2000/svg' AS svg)
;WITH XMLNAMESPACES (default 'http://www.w3.org/2000/svg')
--SELECT
--(
SELECT
--xmldocument
--,c.p.value('.', 'nvarchar(MAX)')
c.p.query('declare default element namespace "http://www.w3.org/2000/svg";.')
FROM @xml AS t
OUTER APPLY t.xmldocument.nodes('/svg//*') AS c(p)
FOR XML PATH(''), root('svg')
-- ) AS merged
But this creates
<svg xmlns="http://www.w3.org/2000/svg">
<path xmlns="http://www.w3.org/2000/svg" id="789" data-objid="0000X2" />
<admin xmlns="http://www.w3.org/2000/svg">
<g>
<path />
<path data-objid="0000X1" />
<path id="123" data-objid="0000X2" />
<path id="456" data-objid="0000X3" />
</g>
</admin>
<g xmlns="http://www.w3.org/2000/svg">
<path />
<path data-objid="0000X1" />
<path id="123" data-objid="0000X2" />
<path id="456" data-objid="0000X3" />
</g>
<path xmlns="http://www.w3.org/2000/svg" />
<path xmlns="http://www.w3.org/2000/svg" data-objid="0000X1" />
<path xmlns="http://www.w3.org/2000/svg" id="123" data-objid="0000X2" />
<path xmlns="http://www.w3.org/2000/svg" id="456" data-objid="0000X3" />
<g xmlns="http://www.w3.org/2000/svg">
<path />
<path data-objid="0000X1" />
<path id="789" data-objid="0000X2" />
<path id="abc" data-objid="0000X3" />
</g>
<path xmlns="http://www.w3.org/2000/svg" />
<path xmlns="http://www.w3.org/2000/svg" data-objid="0000X1" />
<path xmlns="http://www.w3.org/2000/svg" id="789" data-objid="0000X2" />
<path xmlns="http://www.w3.org/2000/svg" id="abc" data-objid="0000X3" />
<path xmlns="http://www.w3.org/2000/svg" />
<path xmlns="http://www.w3.org/2000/svg" id="abc" data-objid="0000X3" />
</svg>
instead
<svg xmlns="http://www.w3.org/2000/svg">
<path id="789" data-objid="0000X2" />
<admin>
<g>
<path />
<path data-objid="0000X1" />
<path id="123" data-objid="0000X2" />
<path id="456" data-objid="0000X3" />
</g>
</admin>
<g>
<path />
<path data-objid="0000X1" />
<path id="123" data-objid="0000X2" />
<path id="456" data-objid="0000X3" />
</g>
<path />
<path data-objid="0000X1" />
<path id="123" data-objid="0000X2" />
<path id="456" data-objid="0000X3" />
<g>
<path />
<path data-objid="0000X1" />
<path id="789" data-objid="0000X2" />
<path id="abc" data-objid="0000X3" />
</g>
<path />
<path data-objid="0000X1" />
<path id="789" data-objid="0000X2" />
<path id="abc" data-objid="0000X3" />
<path />
<path id="abc" data-objid="0000X3" />
</svg>
What am I missing? What am I doing wrong?
How do I fix this without resorting to varchar and then search and replace with "xmlns = ..."?
First of all: Duplicate namespaces are not wrong, just annoying and inflating the size of your result ...
Unfortunately, there is no clean way to get rid of them. Your idea of โโcasting to text and doing it at the line level isn't all that bad if you really need to (but keep in mind that casting has to go to N VARCHAR, and re-converting to XML might change your XML structurally (the order of the attributes, CDATA
- sections ...) Try this:
Brief approach
... if you really don't need anything other than the namespace at the first root <svg>
...
DECLARE @NewXML NVARCHAR(MAX)=
(
SELECT t.xmldocument.query('declare default element namespace "http://www.w3.org/2000/svg";svg/*')
FROM @xml AS t
FOR XML PATH('')
);
SELECT CAST(N'<svg xmlns="http://www.w3.org/2000/svg">'
+ REPLACE(@NewXML,' xmlns="http://www.w3.org/2000/svg"','')
+ N'</svg>' AS XML);
A more flexible approach
You can try this:
DECLARE @NewXML XML=
(
SELECT t.xmldocument.query('declare default element namespace "http://www.w3.org/2000/svg";svg/*')
FROM @xml AS t
FOR XML PATH(''),TYPE
);
SET @NewXML =CAST(REPLACE(CAST(@NewXML AS NVARCHAR(MAX)),' xmlns="http://www.w3.org/2000/svg"','') AS XML);
- You need to repeat CAST
and REPLACE
otherwise you end up with a lot xmlns=""
defining an empty default namespace for internal nodes, which was wrong ...
WITH XMLNAMESPACES (DEFAULT 'http://www.w3.org/2000/svg')
SELECT @NewXML=CAST(REPLACE(CAST((SELECT @NewXML FOR XML PATH('svg'),TYPE) AS NVARCHAR(MAX)),' xmlns=""','') AS XML);
SELECT @NewXML;
You can use STUFF()
to represent namespaces at the row level:
(Don't use WITH XMLNAMESPACES
here ...
SELECT @NewXML=CAST(STUFF(CAST((SELECT @NewXML FOR XML PATH('svg'),TYPE) AS NVARCHAR(MAX)),5,0,' xmlns="http://www.w3.org/2000/svg" ') AS XML);
SELECT @NewXML;
Result in all cases
<svg xmlns="http://www.w3.org/2000/svg">
<path id="789" data-objid="0000X2" />
<admin>
<g>
<path />
<path data-objid="0000X1" />
<path id="123" data-objid="0000X2" />
<path id="456" data-objid="0000X3" />
</g>
</admin>
<g>
<path />
<path data-objid="0000X1" />
<path id="789" data-objid="0000X2" />
<path id="abc" data-objid="0000X3" />
</g>
<path />
<path id="abc" data-objid="0000X3" />
</svg>
source to share