1

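I have a table with a column named Description. The column is populated with text data. I would like to create a query that returns the number of words in each description.

My idea was to create a function that takes a value and returns the number of words found in the input text.

SELECT dbo.GetWordCount(Description) FROM TABLE

For example, if a description is "Hello World! Have a nice day.", the query should return 6.

How can I get the word count of the Description column?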

6 Answers

2

See this proposed solution: http://www.sql-server-helper.com/functions/count-words.aspx

CREATE FUNCTION [dbo].[WordCount] ( @InputString VARCHAR(4000) ) 
RETURNS INT
AS
BEGIN

DECLARE @Index          INT
DECLARE @Char           CHAR(1)
DECLARE @PrevChar       CHAR(1)
DECLARE @WordCount      INT

SET @Index = 1
SET @WordCount = 0

WHILE @Index <= LEN(@InputString)
BEGIN
    SET @Char     = SUBSTRING(@InputString, @Index, 1)
    SET @PrevChar = CASE WHEN @Index = 1 THEN ' '
                         ELSE SUBSTRING(@InputString, @Index - 1, 1)
                    END

    IF @PrevChar = ' ' AND @Char != ' '
        SET @WordCount = @WordCount + 1

    SET @Index = @Index + 1
END

RETURN @WordCount

END
GO

Usage example:

DECLARE @String VARCHAR(4000)
SET @String = 'Health Insurance is an insurance against expenses incurred through illness of the insured.'

SELECT [dbo].[WordCount] ( @String )
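
To get one count per row, as the question asks, the function can be applied to the column directly. A minimal sketch, assuming a hypothetical table dbo.Products that has a Description column (the table name does not come from the question):

```sql
-- dbo.Products is a hypothetical table name used for illustration only.
-- The function's parameter is VARCHAR(4000), so longer values are
-- truncated to 4000 characters before they are counted.
SELECT  p.Description,
        dbo.WordCount(p.Description) AS WordCount
FROM    dbo.Products AS p;
```

A NULL description yields 0 here, because LEN(NULL) is NULL and the WHILE condition never becomes true.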
answered 2013-03-12T03:55:00.500
2

It's a bit of a mouthful, but it handles the whitespace problem well, and it is fast and inline, with no UDF.

DECLARE @Term VARCHAR(100) = '  this   is   pretty fast '

SELECT @Term, LEN(REPLACE(REPLACE(REPLACE(' '+@Term,'  ',' '+CHAR(1)) ,CHAR(1)+' ',''),CHAR(1),'')) -  LEN(REPLACE(REPLACE(REPLACE(REPLACE(' '+@Term,'  ',' '+CHAR(1)) ,CHAR(1)+' ',''),CHAR(1),''),' ','')) [Word Count]
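
The one-liner above is easier to follow when split into its two steps. The sketch below is the same logic with an intermediate variable (@Collapsed is a name introduced here for illustration); it assumes CHAR(1) never occurs in the data, since it is used as a temporary marker:

```sql
DECLARE @Term VARCHAR(100) = '  this   is   pretty fast ';

-- Step 1: prepend a space, then collapse every run of spaces into a single
-- space. '  ' becomes ' ' + marker, marker + ' ' pairs are dropped, and any
-- leftover markers are removed.
DECLARE @Collapsed VARCHAR(200) =
    REPLACE(REPLACE(REPLACE(' ' + @Term, '  ', ' ' + CHAR(1)), CHAR(1) + ' ', ''), CHAR(1), '');

-- Step 2: every word is now preceded by exactly one space, so the number of
-- spaces equals the number of words. LEN ignores trailing spaces, which
-- discards the one space that may remain at the end.
SELECT  @Collapsed                                            AS Collapsed,
        LEN(@Collapsed) - LEN(REPLACE(@Collapsed, ' ', ''))   AS WordCount;  -- 4 for this input
```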
answered 2014-05-31T01:28:36.403
0

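In addition to Mortalus's answer, I would use an inline table-valued function instead of a scalar one. (Note: this function works on SQL Server 2012 and later; for earlier versions of SQL Server, see further below.)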

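    /* SQL Server 2012 and up */
    CREATE FUNCTION dbo.udf_WordCount
    (
        @str VARCHAR(8000)
    )
    RETURNS TABLE AS RETURN

    -- Tally: the numbers 1..LEN(@str); 8 * 10 * 10 * 10 = 8000 rows at most.
    WITH Tally (n) AS
    (
        SELECT TOP (LEN(@str)) ROW_NUMBER() OVER (ORDER BY (SELECT NULL))
        FROM (VALUES (0),(0),(0),(0),(0),(0),(0),(0)) a(n)
        CROSS JOIN (VALUES(0),(0),(0),(0),(0),(0),(0),(0),(0),(0)) b(n)
        CROSS JOIN (VALUES(0),(0),(0),(0),(0),(0),(0),(0),(0),(0)) c(n)
        CROSS JOIN (VALUES(0),(0),(0),(0),(0),(0),(0),(0),(0),(0)) d(n)
    )
    -- One row per character of the input.
    , BreakChar AS
    (
        SELECT SUBSTRING(@str, n, 1) [Char], n
        FROM Tally
    )
    -- Pair each character with the one before it. The third argument of LAG
    -- makes the first character behave as if it were preceded by a space.
    , Analize AS
    (
        SELECT *, LAG([Char], 1, ' ') OVER (ORDER BY n) PrevChar
        FROM BreakChar
    )
        -- A word starts wherever a non-space character follows a space.
        -- Counting word starts (instead of transitions + 1) also gives the
        -- right result for empty strings and strings with leading spaces.
        SELECT WordCount = COUNT(1)
        FROM Analize
        WHERE [Char] != ' '
        AND PrevChar = ' '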

How to use:

    DECLARE @str varchar(1000) = 'It''s now or never I ain''t gonna live forever'
    SELECT * FROM dbo.udf_WordCount(@str) --> 9 
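
Because the function is an inline table-valued function, it is applied to a table column with CROSS APPLY rather than called like a scalar. A minimal sketch, assuming a hypothetical table dbo.Products with a Description column:

```sql
-- dbo.Products and its Description column are hypothetical names.
-- ISNULL is a precaution: the function uses TOP (LEN(@str)), and how it
-- behaves for a NULL argument has not been verified here.
SELECT  p.Description,
        wc.WordCount
FROM    dbo.Products AS p
CROSS APPLY dbo.udf_WordCount(ISNULL(p.Description, '')) AS wc;
```

The function ends in an aggregate without GROUP BY, so it returns exactly one row per call and CROSS APPLY keeps every row of the outer table.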

SQL Server 2008 and earlier (no LAG):

    /* SQL Server 2008 and down */
    -- Note: the VALUES table constructor used below itself requires
    -- SQL Server 2008.
    CREATE FUNCTION dbo.udf_WordCount_2008
    (
        @str VARCHAR(8000)
    )
    RETURNS TABLE AS RETURN

    WITH Tally (n) AS
    (
        SELECT TOP (LEN(@str)) ROW_NUMBER() OVER (ORDER BY (SELECT NULL))
        FROM (VALUES (0),(0),(0),(0),(0),(0),(0),(0)) a(n)
        CROSS JOIN (VALUES(0),(0),(0),(0),(0),(0),(0),(0),(0),(0)) b(n)
        CROSS JOIN (VALUES(0),(0),(0),(0),(0),(0),(0),(0),(0),(0)) c(n)
        CROSS JOIN (VALUES(0),(0),(0),(0),(0),(0),(0),(0),(0),(0)) d(n)
    )
    , BreakChar AS
    (
        SELECT SUBSTRING(@str, n, 1) [Char], n
        FROM Tally
    )
    , Analize AS
    (
        -- LAG is unavailable before 2012, so each character is self-joined
        -- to its predecessor. The LEFT JOIN keeps the first character, which
        -- is treated as if it were preceded by a space.
        SELECT a.*, ISNULL(b.[Char], ' ') PrevChar
        FROM BreakChar a
        LEFT JOIN BreakChar b
        ON a.n = b.n + 1
    )
        -- A word starts wherever a non-space character follows a space.
        SELECT WordCount = COUNT(1)
        FROM Analize
        WHERE [Char] != ' '
        AND PrevChar = ' '
answered 2016-06-20T06:23:07.757
0

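General syntax. This counts the spaces and adds one, so it assumes non-empty text whose words are separated by single spaces. LENGTH is the MySQL/Oracle/PostgreSQL spelling; SQL Server uses LEN.

SELECT (LENGTH(column_name) - LENGTH(REPLACE(column_name, ' ', '')) + 1),column_name1,column_name2 FROM table_name;

If you want to count how many words are in the "address" column of a table named "employeeDetails", then:

SELECT (LENGTH(address) - LENGTH(REPLACE(address, ' ', '')) + 1),address,employee_name FROM employeeDetails ;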
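
The other answers in this thread target SQL Server, where LENGTH is not available and LEN is used instead. A minimal sketch of the same space-counting idea in T-SQL, using a made-up sample value and assuming non-empty, single-spaced text:

```sql
-- @address holds a hypothetical sample value.
DECLARE @address VARCHAR(200) = '12 Main Street Springfield';

-- Three spaces separate four words, hence the + 1.
SELECT LEN(@address) - LEN(REPLACE(@address, ' ', '')) + 1 AS WordCount;  -- 4
```

Repeated or leading spaces inflate this count, which is the problem the other answers work around.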
answered 2017-01-05T05:57:12.977
0

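This answer is based on the same code used in Mortalus's answer, which I originally found here.

This solution is a more efficient and more concise version of that code. I have also added some explanation of the code, which will hopefully make the answer clearer to future readers.


The user-defined function below takes a text string and loops over each character of the input text. If the previous character is a space and the current character is not, the word count is incremented by one.

Because the word count is derived from the spaces between words, the number of spaces is always one less than the actual number of words. To solve this, @PrevChar starts with the value ' '. On the first pass through the loop, the check IF @PrevChar = ' ' then evaluates to true and the word count is incremented by one. This works even if the text has a length of 0, because in that case the @Index <= LEN(@InputString) check never passes and the word count is never incremented. (This replaces the CASE statement used in the linked answer.)

AND @CurrentChar != ' ' handles the problem of double spaces being counted as multiple words. If the previous character is a space but the current character is also a space, the loop moves on to the next index without incrementing the word count. The next iteration simply has @PrevChar set to ' ' again, so a run of spaces increases the word count only once.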

CREATE FUNCTION [dbo].[WordCount] (@InputString VARCHAR(MAX))
RETURNS INT
AS
BEGIN
    DECLARE @Index INT = 1
    DECLARE @CurrentChar CHAR(1)

    --Initialize the previous character as a space.
    DECLARE @PrevChar CHAR(1) = ' '

    DECLARE @WordCount INT = 0

    WHILE @Index <= LEN(@InputString)
    BEGIN
        --Set the current character to equal the character in the index 
        --position of the inputted text.
        SET @CurrentChar= SUBSTRING(@InputString, @Index, 1)

        --If the previous character was a space and the current character
        --is not a space, increase the wordcount by 1.
        IF @PrevChar = ' ' AND @CurrentChar != ' '
            SET @WordCount = @WordCount + 1

        --Increase the index counter by 1.
        SET @Index = @Index + 1

        --Now that we are done with the current character, set the previous
        --character to equal the current character.
        SET @PrevChar = @CurrentChar
    END

    RETURN @WordCount
END
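
A few quick checks of the edge cases described above, assuming the dbo.WordCount function just defined has been created (the column aliases are illustrative names):

```sql
SELECT  dbo.WordCount('Hello World! Have a nice day.') AS Sentence,     -- 6
        dbo.WordCount('  two   words  ')               AS ExtraSpaces,  -- 2
        dbo.WordCount('')                              AS EmptyString;  -- 0
```

Only the space character is treated as a separator, so words separated solely by tabs or line breaks would be counted as one word.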
answered 2017-04-25T21:02:50.447
0

Prerequisite: SQL Server 2016 or later

I use this in my stored procedure, which receives a sentence, so that I can handle internal spaces.

SELECT value from STRING_SPLIT(@oracion1,' ')

Now I keep only the values that contain text and count them:

SELECT count(value) from STRING_SPLIT(@oracion1,' ') where len(value)>0

@oracion1 could be N'JUAN ES CARPINTERO' or, with repeated internal spaces, N'JUAN   ES   CARPINTERO'.
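
The same idea can be applied per row to answer the original question. A minimal sketch, assuming a hypothetical table dbo.Products with a Description column:

```sql
-- dbo.Products and Description are hypothetical names for illustration.
-- STRING_SPLIT requires database compatibility level 130 or higher.
SELECT  p.Description,
        wc.WordCount
FROM    dbo.Products AS p
CROSS APPLY (
    SELECT COUNT(*) AS WordCount
    FROM   STRING_SPLIT(p.Description, ' ') AS s
    WHERE  LEN(s.value) > 0          -- skip the empty pieces between repeated spaces
) AS wc;
```

The inner query is an aggregate, so it returns one row even when the description is empty or NULL, and no outer rows are lost.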

answered 2021-03-04T20:10:49.883