How to split XML array into separate rows (while upholding consistency)



























I am working with the database dump of this exact Stack Exchange site, and I have run into one issue that I am currently unable to solve.



In the XML file Posts.xml, the contents look like this:

(screenshot of one row from Posts.xml; image not included)



There are of course multiple rows, but that's what one looks like. The dump also includes a Tags.xml file, which makes it even more obvious that the "Tags" attribute in that picture is supposed to be its own table (many-to-many).



So right now I am trying to figure out how to extract the tags. Here's what I tried:



CREATE TABLE #TestingIdea (
    Id int PRIMARY KEY IDENTITY (1,1),
    PostId int NULL,
    Tag nvarchar (MAX) NULL
)
GO


↑ The table I created to test my code. I have already filled it with the tags and post IDs.



SELECT T1.PostId,
       S.SplitTag
FROM (
    SELECT T.PostId,
           CAST('<X>' + REPLACE(T.Tag, '>', '</X><X>') + '</X>' AS XML) AS NewTag
    FROM #TestingIdea AS T
) AS T1
CROSS APPLY (
    SELECT tData.value('.', 'nvarchar(30)') SplitTag
    FROM T1.NewTag.nodes('X') AS T(tData)
) AS S
GO


Yet this code returns this error:



XML parsing: line 1, character 37, illegal qualified name character


After googling this error (including here), none of the causes other people had (like extra " marks or different character sets) applied to my case, so I am kind of stuck. Maybe I missed something extremely obvious in the previous answers I found. In any case, I appreciate any help and advice on how to tackle this. It's the last table I have yet to normalize.



Small sample data from the XML file: https://pastebin.com/AW0Z8Be2
For anyone interested, the program I use to view XML files (so they are much easier to read, as in the picture above) is called FOXE XML Reader (Free XML Editor - First Object).



































  • So do you need the tags one by one as a result? What is the exact result you need? Do you have sample data to work with?

    – Randi Vertongen
    Dec 15 '18 at 19:22








  • Yes, I need them one by one. This is how the data looks in my database right now: i.gyazo.com/6b7408201f18ebcbf888cc0f8b36cb27.png All I have is the XML file for my data.

    – Chessbrain
    Dec 15 '18 at 19:28


















sql-server sql-server-2017 xml xquery string-splitting














edited Jan 30 at 5:50









Paul White











asked Dec 15 '18 at 19:16









Chessbrain

























1 Answer
































Does something like this produce the result set you need?



Table & Data



CREATE TABLE #TestingIdea (
    Id int PRIMARY KEY IDENTITY (1,1),
    PostId int NULL,
    Tag nvarchar (MAX) NULL
)

INSERT INTO #TestingIdea (PostId, Tag)
VALUES (1, '<mysql><innodb><myisam>')

GO


Query



SELECT PostId, RIGHT(value, LEN(value) - 1) AS SplitTag
FROM #TestingIdea
CROSS APPLY STRING_SPLIT(Tag, '>')
WHERE value != ''


Result



PostId  SplitTag
1       mysql
1       innodb
1       myisam
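
To get from the split rows to the many-to-many layout the question describes, one option is to look up each split tag in the tag list and insert the resulting pairs into a junction table. The following is only a sketch under stated assumptions: the `#PostsSample`, `#Tags` and `#PostTags` tables, their column names and the `nvarchar(50)` tag length are made up for illustration and are not taken from the dump, and `STRING_SPLIT` needs SQL Server 2016 or later.

```sql
-- Hypothetical tables for illustration; names, columns and sizes are assumptions.
CREATE TABLE #PostsSample (PostId int NOT NULL, Tag nvarchar(max) NULL);
CREATE TABLE #Tags (Id int NOT NULL PRIMARY KEY, TagName nvarchar(50) NOT NULL UNIQUE);
CREATE TABLE #PostTags (
    PostId int NOT NULL,
    TagId  int NOT NULL,
    PRIMARY KEY (PostId, TagId)
);

INSERT INTO #PostsSample (PostId, Tag) VALUES (1, N'<mysql><innodb><myisam>');
INSERT INTO #Tags (Id, TagName) VALUES (1, N'mysql'), (2, N'innodb'), (3, N'myisam');

-- Split each tag string on '>', drop the leading '<' with STUFF,
-- and match the remaining text against the tag list to get the tag's Id.
INSERT INTO #PostTags (PostId, TagId)
SELECT DISTINCT p.PostId, t.Id
FROM #PostsSample AS p
CROSS APPLY STRING_SPLIT(p.Tag, '>') AS s
JOIN #Tags AS t
    ON t.TagName = STUFF(s.value, 1, 1, '')
WHERE s.value <> '';

SELECT PostId, TagId
FROM #PostTags
ORDER BY PostId, TagId;
```

`STUFF(s.value, 1, 1, '')` is used here instead of `RIGHT(value, LEN(value) - 1)` so that an empty piece yields NULL rather than a negative length argument. Tags that have no match in `#Tags` are silently skipped by the inner join, which may or may not be what you want.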


























  • Indeed it does O_O First of all, thank you! It makes my "complicated" code look bad, lol. Would you be kind enough to explain how that query works? I am obviously still new to XQuery. I understand that you removed the last '>' on the far right, but I don't understand how the '<' at the beginning got removed.

    – Chessbrain
    Dec 15 '18 at 19:51






  • Hey, no problem! I was lucky that you are using SQL Server 2016 or later. STRING_SPLIT, as you saw, gets rid of the '>' character and returns one row per piece when you CROSS APPLY the function. If I return just the 'value' column from the split, without applying RIGHT(), it shows as <mysql, for example. So what I did (and there might be better solutions) is take the value and its length minus 1 and pass both to RIGHT(). That drops the first character of each value, resulting in 'mysql' instead of <mysql.

    – Randi Vertongen
    Dec 15 '18 at 19:55








  • I am assuming this means that versions prior to SQL Server 2016 didn't have the STRING_SPLIT function?

    – Chessbrain
    Dec 15 '18 at 20:02






  • Indeed, before that you had to create a custom function like the one in this link: stackoverflow.com/questions/10914576/t-sql-split-string

    – Randi Vertongen
    Dec 15 '18 at 20:05
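
For versions without `STRING_SPLIT`, the XML approach from the question can probably be made to work as well. A likely cause of the `illegal qualified name character` error is that only `>` was replaced, so the leftover `<` characters ended up inside the generated XML (for example `<X><mysql</X>`), which is not well-formed. The sketch below works under that assumption: it removes the `<` characters first, and it assumes tag names contain no other XML-special characters such as `&`. The table name `#TagSample` is made up for this example.

```sql
-- Hypothetical sample table; the name is an assumption for illustration.
CREATE TABLE #TagSample (PostId int NOT NULL, Tag nvarchar(max) NULL);
INSERT INTO #TagSample (PostId, Tag) VALUES (1, N'<mysql><innodb><myisam>');

SELECT T1.PostId,
       S.SplitTag
FROM (
    -- '<mysql><innodb>' first becomes 'mysql>innodb>' and then
    -- '<X>mysql</X><X>innodb</X><X></X>', which is well-formed XML.
    SELECT T.PostId,
           CAST('<X>' + REPLACE(REPLACE(T.Tag, '<', ''), '>', '</X><X>') + '</X>' AS xml) AS NewTag
    FROM #TagSample AS T
) AS T1
CROSS APPLY (
    SELECT N.tData.value('.', 'nvarchar(30)') AS SplitTag
    FROM T1.NewTag.nodes('X') AS N(tData)
) AS S
WHERE S.SplitTag <> '';  -- discard the empty trailing element
```

The trailing `>` in each tag string produces one empty `<X></X>` element, which is why the final filter is needed.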

















edited Dec 15 '18 at 19:50

























answered Dec 15 '18 at 19:42









Randi Vertongen







