
I have many strings in my database (PostgreSQL), for example:

with mystrings as (
    select 'H e l l o, how are you'::varchar string union all
    select 'I am fine, t h a n k you'::varchar string union all
    select 'This is s t r a n g e text'::varchar string union all
    select 'With c r a z y space b e t w e e n characters'::varchar string 
)
select * from mystrings

Is there a way to remove the spaces between the characters within a word? For my example, the result should be:

Hello, how are you
I am fine, thank you
This is strange text
With crazy space between characters

I started with replace, but there are so many of these words with spaces between their characters that I cannot even find them all.

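Because it may be hard to join the characters meaningfully, it might be better to just get a list of join candidates. With the sample data, the result should be: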

H e l l o
t h a n k
s t r a n g e
c r a z y
b e t w e e n

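Such a query should find and return every substring of a string in which at least three individual characters are separated by two spaces (continuing for as long as the [space] individual character pattern occurs):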

He l l o how are you --> llo
H e l l o how are you --> Hello
C r a z y space b e t w e e n --> {crazy, between}

3 Answers


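Based on your edited question, the following query returns all possible candidates with at least three individual characters separated by two spaces. It assumes the strings are stored in a column named data of the table mystrings: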

SELECT 
    data || ' --> {' || replace_candidates || '}'
FROM(
SELECT 
    data,
    ( SELECT 
            array_to_string( array_agg( data ),',' )  
        FROM (
            SELECT 
                data,
                length( data ) 
            FROM ( 
                SELECT 
                    replace( data, ' ', '' ) AS data 
                FROM 
                    regexp_split_to_table( data, '\S{2,}' ) AS data 
                ) t
            WHERE length( data ) > 2
        ) t ) AS replace_candidates
    FROM
        mystrings
) T
WHERE 
  replace_candidates IS NOT NULL

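How it works

Start by looking at the innermost query (the one with regexp_split_to_table):

  1. The regex matches every run of two or more consecutive characters that are not separated by a space.
  2. regexp_split_to_table returns the inverse of those matches, that is, the pieces of the string between them.
  3. The spaces in each piece are replaced with the empty string, and only records with a length greater than 2 are kept.

What remains is the array aggregate function, which takes care of the formatting as per your requirement.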
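
To see what the innermost step produces, here is a minimal sketch that runs the same split on a single literal string instead of the mystrings table. The column aliases piece, squeezed and squeezed_length are made up for illustration, and the pattern is written assuming standard_conforming_strings is on, so the backslash is taken literally.

```sql
-- Split one sample string on runs of two or more non-space characters,
-- then show each leftover piece before and after its spaces are removed.
SELECT
    piece,
    replace(piece, ' ', '')         AS squeezed,
    length(replace(piece, ' ', '')) AS squeezed_length
FROM regexp_split_to_table(
         'With c r a z y space b e t w e e n characters',
         '\S{2,}'
     ) AS piece;
```

For this input, the pieces ' c r a z y ' and ' b e t w e e n ' are the only ones that would pass the length > 2 filter, since they squeeze down to 5 and 7 characters.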

Result

H e l l o how are you --> {Hello}
I am fine, t h a n k you --> {thank}
This is s t r a n g e text --> {strange}
With c r a z y space b e t w e e n characters --> {crazy,between}
SOME MORE TEST T E X T --> {TEXT}

SQLFIDDLE

Note: it only considers characters that fit the [space][char][space] pattern; however, you can modify it to suit your needs, for example [space][space][char][space][space][char][special_char][space]...
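
As one possible variation (not the query from this answer), the sketch below uses regexp_matches with PostgreSQL's word-boundary escapes \m and \M, so that a single character followed by punctuation, such as the "o," in the question's first string, is still treated as an individual character. It assumes PostgreSQL 9.3 or later for LATERAL and standard_conforming_strings being on; the inline sample rows and the aliases s, m and candidate are illustrative only.

```sql
-- Illustrative sample rows; the column name data mirrors the query above.
WITH mystrings(data) AS (
    VALUES ('H e l l o, how are you'),
           ('With c r a z y space b e t w e e n characters')
)
SELECT
    s.data,
    -- m[1] holds one run of single characters; drop its spaces to join it.
    replace(m[1], ' ', '') AS candidate
FROM mystrings AS s
CROSS JOIN LATERAL regexp_matches(
    s.data,
    -- one single-character word, then two or more " <single-character word>"
    '(\m\w\M(?: \m\w\M){2,})',
    'g'   -- return every match in the string, not just the first
) AS m;
```

Each match becomes its own row, so a string with several runs yields several candidates, and strings without any run drop out of the result.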

Hope this helps ;p

answered 2013-04-05T18:38:56.197

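You could check each word against a resource such as an online dictionary: if the word exists, you do not have to remove the spaces; otherwise, remove them. Alternatively, you could keep a table in which you store every string that exists and check against that table. I hope you get my point.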
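
One way to read this suggestion is as a lookup against a word table, applied to the joined candidates. The sketch below is only an illustration of that reading: the dictionary and candidates tables and their contents are hypothetical, defined inline so the query is self-contained.

```sql
-- Hypothetical table of known words (in practice this would be a real table).
WITH dictionary(word) AS (
    VALUES ('hello'), ('thank'), ('strange'), ('crazy'), ('between')
),
-- Hypothetical joined candidates, such as those produced by a candidate query.
candidates(candidate) AS (
    VALUES ('Hello'), ('thank'), ('llo')
)
SELECT
    c.candidate,
    -- true when the lower-cased candidate appears in the dictionary
    EXISTS (
        SELECT 1
        FROM dictionary AS d
        WHERE d.word = lower(c.candidate)
    ) AS is_known_word
FROM candidates AS c;
```

With these sample rows, 'Hello' and 'thank' are flagged as known words and 'llo' is not, so only the first two would be joined.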

answered 2013-04-04T10:44:26.493

The following finds possible join candidates:

 with mystrings as (
    select 'H e l l o, how are you'::varchar string union all
    select 'I am fine, t h a n k you'::varchar string union all
    select 'This is s t r a n g e text'::varchar string union all
    select 'With c r a z y space b e t w e e n characters'::varchar string 
)

, u as (
select string, strpart[rn] as strpart, rn
from  (
   select *, generate_subscripts(strpart, 1) as rn
   from  (
      select string, string_to_array(replace(string,',',''), ' ') as strpart
      from   mystrings
      ) x
   ) y
)

,w as (
select 
   string,strpart,rn, 
   case when length(strpart) = 1 then 1 else 0 end as indchar ,
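   -- order the window explicitly so lag/lead see the neighbouring word of the same string
   case when coalesce(length(lag(strpart) over (partition by string order by rn)),0) <> 1 and length(strpart) = 1 then 1 else 0 end as strstart,
   case when coalesce(length(lead(strpart) over (partition by string order by rn)),0) <> 1 and length(strpart) = 1 then 1 else 0 end as strend   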
from u
) 


,x as (
   select 
      string,rn,strpart,indchar,strstart,
      sum(strstart) over (order by string, rn) as strid 
   from w 
   where indchar = 1 and not (strstart = 1 and strend = 1)
    )

-- aggregate in word order so the characters are joined in their original sequence
select string, array_to_string(array_agg(strpart order by rn),'') as candidate from x group by string, strid 

answered 2013-04-04T13:21:09.390