4

我有一个非常长且复杂的字符串,带有新的换行符 - 我很难解析。我需要能够为以下每个字段创建一个包含一列的选择查询。

理想的情况是找到new line break- 对于每一行 - 回到:冒号之前的所有内容应该是列的名称,并且之间的所有内容都:应该new ling break是字段中的数据。

所有数据都作为字符串返回,所以我只是为下面的每一行构建一个 select 语句。我不确定这是否可能。

第二种选择,硬编码并说出类似CHARINDEX ( 'Home Phone:' ,notes, 0)Where I find the home phone string 之类的内容,然后在指定字符串之后拉出 the:和after 之间的所有内容。new ling break

在这种情况下,我的查询中的每个选择项都会说 - 查找字符串“家庭电话”并提取冒号后面的内容,或者查找字符串“学校名称”等。

这就是数据的样子(在一个名为 的所有字符串中notes):

Home Phone: 1234567890  
Cell Phone: 1234567890  
Date of Birth: 01/01/1971 
School Name: James Jones High  School 
Address:123 Main Street 
School City: Queens  
School State: PA  
School Zip: 32112 
Years Teaching: 12  
Grade Levels: Middle School  
Total Students: 120  
Subject: Music:   
How did they hear:  Other, provide more info: Former partner teacher in the Middle School 
Type: Public/Charter   
Question 1: aaaaaaaa aaaaaaaaaaaaaaaaa aaaaaaaaaaaaaa aaaaaaaaaaaaaa aaaaaaaaa aaaaaaa aaaa aaa aaaaaaaa aaaaaa aaaaaaaa  aaaaaaaaaaaaaaaaaaaaaaaa aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa aaaaaaaa aaaaaaa aaaaaaaa aaaaaaaa aaaaaaaaaaaaaaaaaaaaaaaa aaaaaa aaaaaaaaaaaaaa aaaaaaaaaaaaaaaaaaaaa aaaaaaaa aaaaaaaaaaaaaaaaaaaa aaaaaaaaaaaa aaaaaaaaaaaaaaa aaaaaaaaaaa aaaaaaaa aaaaaaaaaaaaa aaaaaaaaaaaaaaaaa aaaaaaaaaaaaaaaaaaaaa aaaaaa aaaaaa aaaaaaa aaaaaaaa aaaaaaaaaaaaaa aaaaaaaaaaa aaaaa aaaaaa aaaaaa aaaaaaaaaaaa aaaaaaaaaaaa aaa aaaa aaaaa aaaaaaaaaa aaaaaaaaaaaaaaaaa aaaaaaaaaaaaaaaaaaaaaa aaaaaaaaaaaaaaa aaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaaa aaaaaaaaaaa aaaaaaaaa aaaaaaaaaaaa.   
Question 2: bbb bbbbbbbbbb bbbbbbbbbbbbbbbbb bbbbbbbbbbbbbbbbbbb bbbbbbbbbbbb bbbbbbb bbb bbbbbbbbbb bbbbbbbbbbbbbbbbb bbbbbbbbbbbbbbbbbbb bbb bbbbbbbbb bbbbbbb bbbbbb bbbbbb bbbbbbb  bbb bbbbbbbbbb bbbbbbbbbbbbbbbbb bbbbbbbbbbbbbbbbbbb bbbbbbbbbbbb bbbbbbb bbbbbbbbbbbb bbbbbbb bbbbbbbbbb bbbbbbbbbbbbbbbbb bbbbbbbbbbbbbbbbbbb bbbbbbbbbbbb bbbbbbbbbbbb bbbbbbb bbb bbbbbbbbbb bbbbbbbbbbbbbbbbb bbbbbbbbbbbbbb 
Question 3: ccccccccccccccccccccccc cccccccc ccccccccccc cccccccccccccccccccccc ccc ccccccccc cccccccccccccc ccccccccccccccccccccc cccccccccccccccccccccc cccccccccccccccccc ccccccccccc ccccccccccccc ccccccccccccccccc cccccccc

所以输出看起来像这样(在每个字段中都回答了所有长问题)。

Home Phone  Cell Phone  Date of Birth:  …   Type:               Question 1 :                Question 2:    Question 3: 
1234567890  1234567890  1/1/1971            Public/Charter      aaaaaaaa aaaaaaaaaaaaa.     bbb bbbbbbbbbb ccccccccccccccccccccccc 

我不确定这是否有意义 - 但任何和所有建议都非常感谢。

提取子字符串和新行字符的代码——但这是硬编码的。我不知道如何动态地做到这一点。

SELECT  ltrim(rtrim(CHARINDEX ( 'Home Phone:' ,notes, 0) + LEN('Home Phone: '))) as 'beggining',
        ltrim(rtrim(CHARINDEX ( CHAR(10) ,notes, 0)))   as 'ending',
        SUBSTRING(notes,(CHARINDEX ( 'Home Phone:' ,notes, 0) + LEN('Home Phone: ')),(LEN('Home Phone: '))) as 'home phone',    
FROM    table a 

谢谢!

4

3 回答 3

2

大部分功劳 (90%) 应归功于 Alex K,他提供了有关查找第 n 次出现的字符的深入答案

SQL Server - 在字符串中查找第 n 个出现

我接受了该答案,针对您的问题进行了调整,然后应用 PIVOT 将其分解为所需的行/列。如果它们始终具有相同的逻辑(每个问题/答案由换行符分隔),则此方法应该能够根据需要为尽可能多的独特问题集创建所需的输出。

--Creates temporary table for testing, ID column and second set of data
--used to ensure query works for each unique set of questions
IF OBJECT_ID('tempdb..#Results') IS NOT NULL
    DROP TABLE #Results

CREATE TABLE #Results 
    (ID INT IDENTITY(1,1) NOT NULL,
    Notes NVARCHAR(4000) NOT NULL)
INSERT INTO #Results
    (Notes)
VALUES
    ('Home Phone: 1234567890  
    Cell Phone: 1234567890  
    Date of Birth: 01/01/1971 
    School Name: James Jones High  School 
    Address:123 Main Street 
    School City: Queens  
    School State: PA  
    School Zip: 32112 
    Years Teaching: 12  
    Grade Levels: Middle School  
    Total Students: 120  
    Subject: Music:   
    How did they hear:  Other, provide more info: Former partner teacher in the Middle School 
    Type: Public/Charter '),
    ('Home Phone: test  
    Cell Phone: test 
    Date of Birth: test
    School Name: test
    Address:test 
    School City: test 
    School State: test  
    School Zip: test 
    Years Teaching: test 
    Grade Levels: test 
    Total Students: test
    Subject: test   
    How did they hear:  test 
    Type: test ');

--Recursive CTE to determine the position of each successive line break
--Used CHARINDEX to search CHAR(13) and CHAR(10) and find line breaks and carriage returns
WITH cte
AS

    (SELECT ID, Notes, 1 AS Starts, CHARINDEX(CHAR(13)+CHAR(10),Notes) AS Pos
    FROM #Results
    UNION ALL
    SELECT ID, Notes, Pos +1, CHARINDEX(CHAR(13)+CHAR(10),Notes,Pos+1) AS Pos
    FROM cte
    WHERE
        pos >0),

--2nd CTE breaks each question set into it's own row
cte2
AS
    (SELECT ID, Notes,Starts, Pos,
        SUBSTRING(Notes, Starts,
            CASE
                WHEN pos > 0 THEN (pos - starts)
                ELSE LEN(notes)
            END) AS Token
    FROM cte),

--3rd CTE cleans up the data, separating the Questions/Answers into separate columns
--REPLACE is used to remove Line Break (CHAR(10)), output was then showing a TAB so used
--double REPLACE and removed CHAR(9) (tab)
--LTRIM removes leading space
cte3
AS
    (SELECT ID, 
        LTRIM(REPLACE(REPLACE(SUBSTRING(Token,CHARINDEX(CHAR(13)+CHAR(10),Token),CHARINDEX(':',Token)),CHAR(10),''),CHAR(9),'')) AS Question, 
        LTRIM(SUBSTRING(Token,CHARINDEX(':',Token)+1,4000)) AS Answer
    FROM cte2)

--Pivot separates each Question/Answer row into it's own column
SELECT *
FROM
    (SELECT ID, Question, Answer
    FROM cte3) AS a
PIVOT
    (MAX(Answer)
    FOR [Question] IN([Address],[Cell Phone],[Date of Birth],[Grade Levels],[Home Phone],[How did they hear],
                        [School City],[School Name],[School State],[School Zip],[Subject],[Total Students],[Type],[Years Teaching])) AS pvt

我对每个部分都发表了评论,希望能解释我的逻辑,但如果您有任何问题,请告诉我。

编辑:动态枢轴

可以使用动态 SQL 创建一个 PIVOT,该 PIVOT 将自动选择所有“问题”列并进行相应调整。我不相信它可以一步完成,因为我必须使用多个 CTE。我要做的是采取上述步骤用于创建 CTE、CTE2 和 CTE3(基本上是 PIVOT 查询之前的所有内容)并创建这些步骤的视图,然后使用该视图执行以下操作(对于我的示例,该视图称为“问卷”)

DECLARE @columns AS NVARCHAR(MAX)
DECLARE @query AS NVARCHAR(MAX)

SET @columns =  STUFF((SELECT DISTINCT ',' + QUOTENAME(q.question)
        FROM questionaire AS q
        FOR XML PATH(''), TYPE
        ).value('.','NVARCHAR(MAX)')
        ,1,1,'')

SET @query =    'SELECT ID, '+ @columns +' FROM
        (
            SELECT ID, Answer, Question
            FROM questionaire
        ) AS a
        PIVOT
        (
            MAX(Answer)
            FOR Question IN(' +@columns+')
        ) AS p'
EXECUTE(@query)
于 2016-02-29T21:37:20.927 回答
0

我知道这里很多人不喜欢这个分离器,但它是我更喜欢的。它只能处理最大 8000 的输入值,并且分隔符只有一个字符。然而,它有一些其他分离器没有的好东西,除非你有大量的输入,否则几乎所有东西都足够了。你可以在这里找到代码。http://www.sqlservercentral.com/articles/Tally+Table/72993/评论(需要登录)运行了许多页面,并且对该拆分器进行了非常冗长的讨论。

然后其他人更喜欢使用枢轴这种东西,我更喜欢交叉表(也称为条件聚合),因为我发现语法远没有那么迟钝。

我冒昧地稍微修改了您的示例数据。我改变了手机的价值,所以它和家里的电话不一样。我还缩短了问题的答案,因为它们不需要数百个字符来演示该技术。

declare @SomeValue varchar(8000)

set @SomeValue = 'Home Phone: 1234567890  
Cell Phone: 3344556677
Date of Birth: 01/01/1971 
School Name: James Jones High  School 
Address:123 Main Street 
School City: Queens  
School State: PA  
School Zip: 32112 
Years Teaching: 12  
Grade Levels: Middle School  
Total Students: 120  
Subject: Music:   
How did they hear:  Other, provide more info: Former partner teacher in the Middle School 
Type: Public/Charter   
Question 1: aaaaaaaa aaaaaaaaaaaaaaaaa aaaaaaaaaaaaaa.
Question 2: bbb bbbbbbbbbb bbbbbbbbbbbbbbbbb
Question 3: ccccccccccccccccccccccc cccccccc';

select 
    MAX(case when s.ItemNumber = 1 then x.Item end) as HomePhone
    , MAX(case when s.ItemNumber = 2 then x.Item end) as DOB
    , MAX(case when s.ItemNumber = 3 then x.Item end) as DOB
    , MAX(case when s.ItemNumber = 4 then x.Item end) as SchoolName
    , MAX(case when s.ItemNumber = 5 then x.Item end) as SchoolAddress
    , MAX(case when s.ItemNumber = 6 then x.Item end) as SchoolCity
    , MAX(case when s.ItemNumber = 7 then x.Item end) as SchoolState
    , MAX(case when s.ItemNumber = 8 then x.Item end) as SchoolZip
    , MAX(case when s.ItemNumber = 9 then x.Item end) as YearsTeaching
    , MAX(case when s.ItemNumber = 10 then x.Item end) as GradeLevels
    , MAX(case when s.ItemNumber = 11 then x.Item end) as TotalStudents
    , MAX(case when s.ItemNumber = 12 then x.Item end) as Subject
    , MAX(case when s.ItemNumber = 13 then x.Item end) as HowHeard
    , MAX(case when s.ItemNumber = 14 then x.Item end) as SchoolType
    , MAX(case when s.ItemNumber = 15 then x.Item end) as Question1
    , MAX(case when s.ItemNumber = 16 then x.Item end) as Question2
    , MAX(case when s.ItemNumber = 17 then x.Item end) as Question3
from dbo.DelimitedSplit8K(@SomeValue, CHAR(10)) s
cross apply dbo.DelimitedSplit8K(s.Item, ':') x
于 2016-02-29T21:24:17.813 回答
0

你可以试试这样,但我在and之后xml去掉了多余的东西。:musicprovide more info

DECLARE @string nvarchar(max) = '
Home Phone: 1234567890  
Cell Phone: 1234567890  
Date of Birth: 01/01/1971 
School Name: James Jones High  School 
Address:123 Main Street 
School City: Queens  
School State: PA  
School Zip: 32112 
Years Teaching: 12  
Grade Levels: Middle School  
Total Students: 120  
Subject: Music   
How did they hear:  Other, provide more info, Former partner teacher in the Middle School 
Type: Public/Charter   
Question 1: aaaaaaaa aaaaaaaaaaaaaaaaa aaaaaaaaaaaaaa aaaaaaaaaaaaaa aaaaaaaaa aaaaaaa aaaa aaa aaaaaaaa aaaaaa aaaaaaaa  aaaaaaaaaaaaaaaaaaaaaaaa aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa aaaaaaaa aaaaaaa aaaaaaaa aaaaaaaa aaaaaaaaaaaaaaaaaaaaaaaa aaaaaa aaaaaaaaaaaaaa aaaaaaaaaaaaaaaaaaaaa aaaaaaaa aaaaaaaaaaaaaaaaaaaa aaaaaaaaaaaa aaaaaaaaaaaaaaa aaaaaaaaaaa aaaaaaaa aaaaaaaaaaaaa aaaaaaaaaaaaaaaaa aaaaaaaaaaaaaaaaaaaaa aaaaaa aaaaaa aaaaaaa aaaaaaaa aaaaaaaaaaaaaa aaaaaaaaaaa aaaaa aaaaaa aaaaaa aaaaaaaaaaaa aaaaaaaaaaaa aaa aaaa aaaaa aaaaaaaaaa aaaaaaaaaaaaaaaaa aaaaaaaaaaaaaaaaaaaaaa aaaaaaaaaaaaaaa aaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaaa aaaaaaaaaaa aaaaaaaaa aaaaaaaaaaaa.   
Question 2: bbb bbbbbbbbbb bbbbbbbbbbbbbbbbb bbbbbbbbbbbbbbbbbbb bbbbbbbbbbbb bbbbbbb bbb bbbbbbbbbb bbbbbbbbbbbbbbbbb bbbbbbbbbbbbbbbbbbb bbb bbbbbbbbb bbbbbbb bbbbbb bbbbbb bbbbbbb  bbb bbbbbbbbbb bbbbbbbbbbbbbbbbb bbbbbbbbbbbbbbbbbbb bbbbbbbbbbbb bbbbbbb bbbbbbbbbbbb bbbbbbb bbbbbbbbbb bbbbbbbbbbbbbbbbb bbbbbbbbbbbbbbbbbbb bbbbbbbbbbbb bbbbbbbbbbbb bbbbbbb bbb bbbbbbbbbb bbbbbbbbbbbbbbbbb bbbbbbbbbbbbbb 
Question 3: ccccccccccccccccccccccc cccccccc ccccccccccc cccccccccccccccccccccc ccc ccccccccc cccccccccccccc ccccccccccccccccccccc cccccccccccccccccccccc cccccccccccccccccc ccccccccccc ccccccccccccc ccccccccccccccccc cccccccc'
,@xml as xml

SELECT @xml = REPLACE ('<mystring><fieldname id="'+REPLACE(REPLACE(right(@string,LEN(@string)-2),':','" >'),CHAR(10),'</fieldname><fieldname id="')+'</fieldname></mystring>' ,CHAR(13),'')

SELECT
    n.v.value('(fieldname[@id="Home Phone"])[1]','NVARCHAR(11)') AS 'Home Phone',
    n.v.value('(fieldname[@id="Cell Phone"])[1]','NVARCHAR(11)') AS 'Cell Phone',
    n.v.value('(fieldname[@id="Date of Birth"])[1]','NVARCHAR(12)') AS 'Date of Birth',
    n.v.value('(fieldname[@id="School Name"])[1]','NVARCHAR(30)') AS 'School Name',
    n.v.value('(fieldname[@id="Address"])[1]','NVARCHAR(30)') AS 'Address',
    n.v.value('(fieldname[@id="School City"])[1]','NVARCHAR(15)') AS 'School City',
    n.v.value('(fieldname[@id="School State"])[1]','NVARCHAR(10)') AS 'School State',
    n.v.value('(fieldname[@id="School Zip"])[1]','NVARCHAR(6)') AS 'School Zip',
    n.v.value('(fieldname[@id="Years Teaching"])[1]','NVARCHAR(5)') AS 'Years Teaching',
    n.v.value('(fieldname[@id="Grade Levels"])[1]','NVARCHAR(15)') AS 'Grade Levels',
    n.v.value('(fieldname[@id="Total Students"])[1]','NVARCHAR(5)') AS 'Total Students',
    n.v.value('(fieldname[@id="How did they hear"])[1]','NVARCHAR(100)') AS 'How did they hear',
    n.v.value('(fieldname[@id="Type"])[1]','NVARCHAR(25)') AS 'Type',
    n.v.value('(fieldname[@id="Question 1"])[1]','NVARCHAR(128)') AS 'Question 1',
    n.v.value('(fieldname[@id="Question 2"])[1]','NVARCHAR(128)') AS 'Question 2',
    n.v.value('(fieldname[@id="Question 3"])[1]','NVARCHAR(128)') AS 'Question 3'
FROM @xml.nodes('mystring') as n(v);

结果:

    Home Phone  Cell Phone  Date of Birth School Name                    Address                        School City     School State School Zip Years Teaching Grade Levels    Total Students How did they hear                                                                                    Type                      Question 1                                                                                                                       Question 2                                                                                                                       Question 3
----------- ----------- ------------- ------------------------------ ------------------------------ --------------- ------------ ---------- -------------- --------------- -------------- ---------------------------------------------------------------------------------------------------- ------------------------- -------------------------------------------------------------------------------------------------------------------------------- -------------------------------------------------------------------------------------------------------------------------------- --------------------------------------------------------------------------------------------------------------------------------
 1234567890  1234567890  01/01/1971    James Jones High  School      123 Main Street                 Queens          PA           32112      12             Middle School   120             Other, provide more info, Former partner teacher in the Middle School                               Public/Charter            aaaaaaaa aaaaaaaaaaaaaaaaa aaaaaaaaaaaaaa aaaaaaaaaaaaaa aaaaaaaaa aaaaaaa aaaa aaa aaaaaaaa aaaaaa aaaaaaaa  aaaaaaaaaaaaaaaaa  bbb bbbbbbbbbb bbbbbbbbbbbbbbbbb bbbbbbbbbbbbbbbbbbb bbbbbbbbbbbb bbbbbbb bbb bbbbbbbbbb bbbbbbbbbbbbbbbbb bbbbbbbbbbbbbbbbbbb   ccccccccccccccccccccccc cccccccc ccccccccccc cccccccccccccccccccccc ccc ccccccccc cccccccccccccc ccccccccccccccccccccc cccccccc

(1 row(s) affected)
于 2016-03-01T05:46:06.463 回答