Parsing JSON with SQL: how to extract a value from a JSON object?
I am looking at about 13,000 rows in a SQL Server table and am trying to parse specific values in a single column which is stored as JSON.
The JSON column values look something like this:
..."http://www.companyurl.com","FoundedYear":"2007","Status":"Private","CompanySize":"51-200","TagLine":"We build software we believe in","Origi...
I would like to extract the value for "CompanySize", but not all rows include this attribute. Other complicating factors:
- I'm not sure how many possible values are in the "CompanySize" parameter.
- "CompanySize" is not always followed by the "TagLine" parameter.
The only rule I know for sure is that the CompanySize value is always a string of unknown length that follows the varchar string "CompanySize":" and ends before the next string ",".
Ideally, we would be fully upgraded to SQL Server 2016 so that I could use SQL Server's JSON support, but that's not the case.
You can do this with CHARINDEX, since you can pass it a start position, which lets you find the closing ". You probably shouldn't look for "," because if CompanySize is the final property, its value won't be followed by ",". Doing this as an inline table-valued function (iTVF) will be quite efficient (especially since 13k rows is almost nothing); you just need to use it with CROSS APPLY or OUTER APPLY:
USE [tempdb];
GO
CREATE FUNCTION dbo.GetCompanySize(@JSON NVARCHAR(MAX))
RETURNS TABLE
AS RETURN
  WITH SearchStart AS
  (
    SELECT '"CompanySize":"' AS [Fragment]
  ), Search AS
  (
    SELECT CHARINDEX(ss.Fragment, @JSON) AS [Start],
           LEN(ss.Fragment) AS [FragmentLength]
    FROM   SearchStart ss
  )
  SELECT CASE Search.Start
           WHEN 0 THEN NULL
           ELSE SUBSTRING(@JSON,
                          (Search.Start + Search.FragmentLength),
                          CHARINDEX('"',
                                    @JSON,
                                    Search.Start + Search.FragmentLength
                                   ) - (Search.Start + Search.FragmentLength)
                         )
         END AS [CompanySize]
  FROM   Search;
GO
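To make the start-position idea easier to follow, here is the core calculation in isolation. This is only a sketch: the variable names and the shortened sample string are made up for illustration, and it does not handle a missing fragment the way the function above does.

```sql
-- Hypothetical sample string; the real column values are longer.
DECLARE @JSON     NVARCHAR(MAX) = N'"Status":"Private","CompanySize":"51-200","TagLine":"x"';
DECLARE @Fragment NVARCHAR(50)  = N'"CompanySize":"';

-- Position of the first character of the value (just past the fragment).
DECLARE @ValueStart INT = CHARINDEX(@Fragment, @JSON) + LEN(@Fragment);

SELECT @ValueStart AS ValueStart,
       -- The third argument tells CHARINDEX to start searching at the value,
       -- so this finds the quote that closes it rather than an earlier one.
       CHARINDEX('"', @JSON, @ValueStart) AS ClosingQuote,
       SUBSTRING(@JSON,
                 @ValueStart,
                 CHARINDEX('"', @JSON, @ValueStart) - @ValueStart
                ) AS CompanySize;
```

For this sample string the CompanySize column comes back as 51-200.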
Set up your test:
CREATE TABLE #tmp (JSON NVARCHAR(MAX));
INSERT INTO #tmp (JSON) VALUES
('"http://www.companyurl.com","FoundedYear":"2007","Status":"Private","CompanySize":"51-200","TagLine":"We build software we believe in","Origi..');
INSERT INTO #tmp (JSON) VALUES
('"http://www.companyurl.com","FoundedYear":"2009","Status":"Public","TagLine":"We build software we believe in","Origi..');
INSERT INTO #tmp (JSON) VALUES (NULL);
Run the test:
SELECT comp.CompanySize
FROM #tmp tmp
CROSS APPLY tempdb.dbo.GetCompanySize(tmp.JSON) comp
Returns:
CompanySize
-----------
51-200
NULL
NULL
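Since you mentioned not knowing how many possible values CompanySize can have, the same function can feed a simple aggregate. A minimal sketch, assuming dbo.GetCompanySize and #tmp from the test above already exist (the NumRows alias is just an illustrative name):

```sql
-- Lists each distinct CompanySize value and how many rows have it.
-- Rows without the attribute (or with a NULL column) are grouped under NULL.
SELECT comp.CompanySize,
       COUNT(*) AS [NumRows]
FROM   #tmp tmp
CROSS APPLY tempdb.dbo.GetCompanySize(tmp.JSON) comp
GROUP BY comp.CompanySize
ORDER BY [NumRows] DESC;
```

Against the three test rows this should give one group for 51-200 and one NULL group covering the other two rows. On the real table you would replace #tmp and the JSON column with your own names.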
Building on @srutzky's answer, the following solution avoids creating a UDF (although you didn't say that was a constraint, it might be useful for some).
select
    c.Id,
    -- A start position of 0 makes SUBSTRING return [length] - 1 characters,
    -- which drops the closing quote located by i3.
    substring(i2.jsontail, 0, i3.[length]) as CompanySize
from
    Companies c
    cross apply ( select charindex('CompanySize":"', c.json) as [start] ) i1
    cross apply ( select substring(c.json, i1.[start] + len('CompanySize":"'), len(c.json) - i1.[start]) as jsontail ) i2
    cross apply ( select charindex('"', i2.jsontail) as [length] ) i3
where
    i1.[start] != 0
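The WHERE clause above drops rows that have no CompanySize at all. If you would rather keep them and get NULL instead, one possible variant is to turn a "not found" position into NULL with NULLIF and let it propagate through the later steps. This is a sketch only: #Companies, its Id and json columns, and the shortened sample strings are stand-ins made up for the demo.

```sql
-- Hypothetical stand-in for the Companies table.
CREATE TABLE #Companies (Id INT, json NVARCHAR(MAX));
INSERT INTO #Companies (Id, json) VALUES
    (1, N'"FoundedYear":"2007","Status":"Private","CompanySize":"51-200","TagLine":"x"'),
    (2, N'"FoundedYear":"2009","Status":"Public","TagLine":"x"'),
    (3, NULL);

SELECT c.Id,
       -- Start 0 with length [length] returns the text before the closing quote.
       SUBSTRING(i2.jsontail, 0, i3.[length]) AS CompanySize
FROM   #Companies c
-- NULLIF turns "not found" (0) into NULL, which the later steps pass through.
CROSS APPLY (SELECT NULLIF(CHARINDEX('"CompanySize":"', c.json), 0) AS [start]) i1
CROSS APPLY (SELECT SUBSTRING(c.json, i1.[start] + LEN('"CompanySize":"'), LEN(c.json)) AS jsontail) i2
CROSS APPLY (SELECT CHARINDEX('"', i2.jsontail) AS [length]) i3
ORDER BY c.Id;

DROP TABLE #Companies;
```

With these three rows, Id 1 should come back as 51-200 and Ids 2 and 3 as NULL. Note that this variant also searches for the fragment with its leading quote, so a property whose name merely ends in CompanySize is not matched by accident.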