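I have a table with a column named Description. The column is populated with text data. I want to create a query that returns the number of words in each description.
My idea is to create a function that takes a value and returns the number of words found in the input text.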
SELECT dbo.GetWordCount(Description) FROM TABLE
For example, if the description is "Hello World! Have a nice day.", the query should return 6.
How can I get the word count of the Description column?
See this suggested solution: http://www.sql-server-helper.com/functions/count-words.aspx
CREATE FUNCTION [dbo].[WordCount] ( @InputString VARCHAR(4000) )
RETURNS INT
AS
BEGIN
    DECLARE @Index INT
    DECLARE @Char CHAR(1)
    DECLARE @PrevChar CHAR(1)
    DECLARE @WordCount INT
    SET @Index = 1
    SET @WordCount = 0
    WHILE @Index <= LEN(@InputString)
    BEGIN
        SET @Char = SUBSTRING(@InputString, @Index, 1)
        SET @PrevChar = CASE WHEN @Index = 1 THEN ' '
                             ELSE SUBSTRING(@InputString, @Index - 1, 1)
                        END
        IF @PrevChar = ' ' AND @Char != ' '
            SET @WordCount = @WordCount + 1
        SET @Index = @Index + 1
    END
    RETURN @WordCount
END
GO
Usage example:
DECLARE @String VARCHAR(4000)
SET @String = 'Health Insurance is an insurance against expenses incurred through illness of the insured.'
SELECT [dbo].[WordCount] ( @String )
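Since the question asks for a count per row, here is a minimal sketch of calling the function on a column. The table variable @Items and its sample rows are made up for illustration, and the sketch assumes dbo.WordCount has been created as shown above.

```sql
-- Hypothetical sample data standing in for the real table.
DECLARE @Items TABLE (Id INT, Description VARCHAR(4000));
INSERT INTO @Items (Id, Description)
VALUES (1, 'Hello World! Have a nice day.'),
       (2, '  leading and   repeated spaces  ');

-- One word count per row.
SELECT Id, Description, dbo.WordCount(Description) AS WordCount
FROM @Items;
-- Id 1 -> 6, Id 2 -> 4
```

Because this is a scalar function with a loop, it runs once per row, which can be slow on large tables.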
This is a little clumsy, but it handles whitespace well, and it is fast and inline, with no UDF.
DECLARE @Term VARCHAR(100) = ' this is pretty fast '
SELECT @Term,
       LEN(REPLACE(REPLACE(REPLACE(' ' + @Term, ' ', ' ' + CHAR(1)), CHAR(1) + ' ', ''), CHAR(1), ''))
     - LEN(REPLACE(REPLACE(REPLACE(REPLACE(' ' + @Term, ' ', ' ' + CHAR(1)), CHAR(1) + ' ', ''), CHAR(1), ''), ' ', '')) AS [Word Count]
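To make the nested REPLACE calls easier to follow, the sketch below splits the same expression into steps. The variable names @Marked and @Collapsed and the sample string are my own, not part of the answer; the logic is meant to be identical.

```sql
DECLARE @Term VARCHAR(100) = 'this  is   pretty fast';

-- Step 1: prepend a space, then tag every space with a marker character.
DECLARE @Marked VARCHAR(300) = REPLACE(' ' + @Term, ' ', ' ' + CHAR(1));

-- Step 2: removing each "marker + space" pair collapses a run of spaces
-- down to one; step 3 then drops the remaining markers.
DECLARE @Collapsed VARCHAR(300) =
    REPLACE(REPLACE(@Marked, CHAR(1) + ' ', ''), CHAR(1), '');

-- Step 4: every word is now preceded by exactly one space, and LEN ignores
-- trailing spaces, so the number of spaces equals the number of words.
SELECT LEN(@Collapsed) - LEN(REPLACE(@Collapsed, ' ', '')) AS WordCount;
```

The sample string has four words, so the final subtraction should come out as 4 even though the words are separated by runs of two and three spaces.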
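In addition to Mortalus's answer, I would use an inline table-valued function rather than a scalar one. Note that the first version below uses LAG and therefore requires SQL Server 2012 or later; a version for earlier releases follows it.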
/* SQL Server 2012 and up (uses LAG) */
CREATE FUNCTION dbo.udf_WordCount
(
    @str VARCHAR(8000)
)
RETURNS TABLE AS RETURN
WITH Tally (n) AS
(
    -- One row per character position: 8 * 10 * 10 * 10 = 8000 candidate rows.
    SELECT TOP (LEN(@str)) ROW_NUMBER() OVER (ORDER BY (SELECT NULL))
    FROM (VALUES (0),(0),(0),(0),(0),(0),(0),(0)) a(n)
    CROSS JOIN (VALUES (0),(0),(0),(0),(0),(0),(0),(0),(0),(0)) b(n)
    CROSS JOIN (VALUES (0),(0),(0),(0),(0),(0),(0),(0),(0),(0)) c(n)
    CROSS JOIN (VALUES (0),(0),(0),(0),(0),(0),(0),(0),(0),(0)) d(n)
)
, BreakChar AS
(
    SELECT SUBSTRING(@str, n, 1) AS [Char], n
    FROM Tally
)
, Analize AS
(
    -- The default ' ' makes the first character look as if it follows a space,
    -- so empty strings and leading spaces are not miscounted.
    SELECT *, LAG([Char], 1, ' ') OVER (ORDER BY n) AS PrevChar
    FROM BreakChar
)
-- A word starts wherever a non-space character follows a space.
SELECT WordCount = COUNT(1)
FROM Analize
WHERE [Char] != PrevChar
  AND PrevChar = ' '
Usage:
DECLARE @str varchar(1000) = 'It''s now or never I ain''t gonna live forever'
SELECT * FROM dbo.udf_WordCount(@str) --> 9
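To apply the inline function to every row of a table, as the question asks, CROSS APPLY is the usual route. This is only a sketch: @Items and its rows are invented sample data, and it assumes dbo.udf_WordCount exists as defined above.

```sql
-- Hypothetical sample data standing in for the real table.
DECLARE @Items TABLE (Id INT, Description VARCHAR(8000));
INSERT INTO @Items (Id, Description)
VALUES (1, 'Hello World! Have a nice day.'),
       (2, 'It''s now or never');

-- CROSS APPLY evaluates the table-valued function once per outer row.
SELECT i.Id, i.Description, wc.WordCount
FROM @Items AS i
CROSS APPLY dbo.udf_WordCount(i.Description) AS wc;
-- Id 1 -> 6, Id 2 -> 4
```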
SQL Server 2008 and 2008 R2 (LAG is not available there, so a self-join supplies the previous character; the VALUES row constructor itself needs 2008 or later):
/* SQL Server 2008 (no LAG) */
CREATE FUNCTION dbo.udf_WordCount_2008
(
    @str VARCHAR(8000)
)
RETURNS TABLE AS RETURN
WITH Tally (n) AS
(
    SELECT TOP (LEN(@str)) ROW_NUMBER() OVER (ORDER BY (SELECT NULL))
    FROM (VALUES (0),(0),(0),(0),(0),(0),(0),(0)) a(n)
    CROSS JOIN (VALUES (0),(0),(0),(0),(0),(0),(0),(0),(0),(0)) b(n)
    CROSS JOIN (VALUES (0),(0),(0),(0),(0),(0),(0),(0),(0),(0)) c(n)
    CROSS JOIN (VALUES (0),(0),(0),(0),(0),(0),(0),(0),(0),(0)) d(n)
)
, BreakChar AS
(
    SELECT SUBSTRING(@str, n, 1) AS [Char], n
    FROM Tally
)
, Analize AS
(
    -- LEFT JOIN keeps the first character; ISNULL treats it as following a space.
    SELECT a.*, ISNULL(b.[Char], ' ') AS PrevChar
    FROM BreakChar a
    LEFT JOIN BreakChar b
        ON a.n = b.n + 1
)
-- A word starts wherever a non-space character follows a space.
SELECT WordCount = COUNT(1)
FROM Analize
WHERE [Char] != PrevChar
  AND PrevChar = ' '
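General syntax (this uses LENGTH, as in MySQL, PostgreSQL, or Oracle; the SQL Server equivalent is LEN). It counts the spaces and adds 1, so it is only accurate when words are separated by exactly one space and the text has no leading or trailing spaces:
SELECT (LENGTH(column_name) - LENGTH(REPLACE(column_name, ' ', '')) + 1) AS word_count, column_name1, column_name2 FROM table_name;
To count the words in the single "address" column of a table named "employeeDetails":
SELECT (LENGTH(address) - LENGTH(REPLACE(address, ' ', '')) + 1) AS word_count, address, employee_name FROM employeeDetails;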
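On SQL Server the same idea can be written with LEN instead of LENGTH. The sketch below uses a made-up table variable in place of employeeDetails, adds 1 to turn the number of spaces into a number of words, and includes a row with repeated spaces to show where the space-counting shortcut overcounts.

```sql
-- Hypothetical sample rows; the real table would be queried directly.
DECLARE @employeeDetails TABLE (employee_name VARCHAR(50), address VARCHAR(200));
INSERT INTO @employeeDetails (employee_name, address)
VALUES ('Ann', '12 Main Street'),   -- single spaces only
       ('Bob', '7  Elm   Road');    -- repeated spaces

SELECT employee_name, address,
       LEN(address) - LEN(REPLACE(address, ' ', '')) + 1 AS word_count
FROM @employeeDetails;
-- Ann -> 3 (correct); Bob -> 6 (overcounted: the address has 3 words)
```

If the data may contain repeated spaces, one of the other answers that collapses or skips them is the safer choice.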
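This answer is based on the same code used in Mortalus's answer.
This solution is a leaner and more concise version of that code (one SUBSTRING call per iteration instead of two). I have also added some explanation to the code, which I hope makes the answer clearer for future readers.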
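The user-defined function below takes a text string and loops over each character of the input. If the previous character is a space and the current character is not, the word count is increased by one.
Because words are counted by the spaces between them, there is always one fewer space than there are words. To compensate, @PrevChar starts with the value ' '. On the first pass through the loop, the check IF @PrevChar = ' ' is therefore true, so the first word is counted as long as the first character is not itself a space. This also works when the text has length 0, because in that case the @Index <= LEN(@InputString) check fails, the loop never runs, and the word count is never increased. (This replaces the CASE statement used in the linked answer.)
AND @CurrentChar != ' ' deals with double spaces being counted as extra words. If the previous character is a space but the current character is also a space, the loop moves on to the next index without increasing the word count. That iteration only sets @PrevChar to ' ' again, so a run of spaces increases the word count only once.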
CREATE FUNCTION [dbo].[WordCount] (@InputString VARCHAR(MAX))
RETURNS INT
AS
BEGIN
    DECLARE @Index INT = 1
    DECLARE @CurrentChar CHAR(1)
    --Initialize the previous character as a space.
    DECLARE @PrevChar CHAR(1) = ' '
    DECLARE @WordCount INT = 0
    WHILE @Index <= LEN(@InputString)
    BEGIN
        --Set the current character to equal the character in the index
        --position of the inputted text.
        SET @CurrentChar = SUBSTRING(@InputString, @Index, 1)
        --If the previous character was a space and the current character
        --is not a space, increase the wordcount by 1.
        IF @PrevChar = ' ' AND @CurrentChar != ' '
            SET @WordCount = @WordCount + 1
        --Increase the index counter by 1.
        SET @Index = @Index + 1
        --Now that we are done with the current character, set the previous
        --character to equal the current character.
        SET @PrevChar = @CurrentChar
    END
    RETURN @WordCount
END
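A few quick checks of the edge cases discussed above, as a sketch that assumes the function has been created as dbo.WordCount. The sample strings are my own.

```sql
SELECT dbo.WordCount('Hello World! Have a nice day.') AS Typical,      -- 6
       dbo.WordCount('  two   words  ')               AS ExtraSpaces,  -- 2
       dbo.WordCount('')                              AS EmptyString,  -- 0
       dbo.WordCount(NULL)                            AS NullInput;    -- 0
```

The NULL case returns 0 because LEN(NULL) is NULL, so the WHILE condition is never true and the loop is skipped. Only the space character is treated as a separator here; tabs and line breaks are not, so words split only by those would be counted as one.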