
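I have a table with 500k rows where the address is stored in a single field, delimited by Char(13)+Char(10). I have added 5 fields to the table and would like to split the address into them.

I found this split function online that seems to perform well. I can't use PARSENAME because the address has 5 parts (PARSENAME handles at most 4), and a "." may also appear in the field.

It is a table-valued function, so I would have to loop over the rows and update the records. In the past I would have used a cursor, a SQL WHILE loop, or possibly even C# for this, but I feel there must be a CTE-based or set-based way to do it.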


2 Answers


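You have a couple of options:

You could create a temp table, parse the addresses into it, and then update the original table by joining it to the temp table.

Or

You could write your own T-SQL functions and call them in an UPDATE statement, like this: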

-- Scalar user-defined functions must be called with their schema name (dbo. here).
UPDATE myTable
   SET address1 = dbo.myGetAddress1Function(address),
       address2 = dbo.myGetAddress2Function(address)....
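
The temp-table option could look roughly like the sketch below. It is only an illustration under stated assumptions: #myTable, its id column, and #parts are hypothetical names, only two of the five target columns are shown, and sys.all_objects is used as a makeshift numbers source, which is enough for short strings such as a VARCHAR(255) address.

```sql
-- Hypothetical demo table standing in for the real one.
CREATE TABLE #myTable
(
  id       INT IDENTITY(1,1) PRIMARY KEY,
  address  VARCHAR(255),
  address1 VARCHAR(255),
  address2 VARCHAR(255)
);

INSERT #myTable(address)
VALUES ('foo' + CHAR(13) + CHAR(10) + 'bar');

DECLARE @d CHAR(2) = CHAR(13) + CHAR(10);

-- Step 1: parse every address into one row per part, numbered in order.
SELECT t.id,
       rn   = ROW_NUMBER() OVER (PARTITION BY t.id ORDER BY n.Number),
       item = SUBSTRING(t.address, n.Number,
                CHARINDEX(@d, t.address + @d, n.Number) - n.Number)
INTO #parts
FROM #myTable AS t
INNER JOIN
(
  -- Makeshift numbers list: 1, 2, 3, ... one per catalog object.
  SELECT Number = ROW_NUMBER() OVER (ORDER BY [object_id])
  FROM sys.all_objects
) AS n
  ON n.Number <= LEN(t.address)
 AND SUBSTRING(@d + t.address, n.Number, 2) = @d;  -- Number is the start of a part

-- Step 2: update the original table by joining it to the parsed parts.
UPDATE t
  SET address1 = p1.item,
      address2 = p2.item
FROM #myTable AS t
LEFT JOIN #parts AS p1 ON p1.id = t.id AND p1.rn = 1
LEFT JOIN #parts AS p2 ON p2.id = t.id AND p2.rn = 2;
```

The LEFT JOINs leave a column NULL when an address has fewer parts than columns. The remaining columns would follow the same pattern with rn = 3, 4 and 5.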
answered 2013-03-13T15:06:06.150

So, given some source data:

CREATE TABLE dbo.Addresses
(
  AddressID INT IDENTITY(1,1),
  [Address] VARCHAR(255),
  Address1  VARCHAR(255),
  Address2  VARCHAR(255),
  Address3  VARCHAR(255),
  Address4  VARCHAR(255),
  Address5  VARCHAR(255)
);

INSERT dbo.Addresses([Address])
SELECT 'foo
bar'
UNION ALL SELECT 'add1
add2
add3
add4
add5';

Let's create a function that returns the parts of an address in order:

CREATE FUNCTION dbo.SplitAddressOrdered
(
    @AddressID  INT,
    @List       VARCHAR(MAX),
    @Delimiter  VARCHAR(32)
)
RETURNS TABLE
AS
    RETURN 
    (
      SELECT  
          AddressID = @AddressID, 
          rn = ROW_NUMBER() OVER (ORDER BY Number), 
          AddressItem = Item 
        FROM (SELECT Number, Item = LTRIM(RTRIM(SUBSTRING(@List, Number, 
          CHARINDEX(@Delimiter, @List + @Delimiter, Number) - Number)))
        FROM (SELECT ROW_NUMBER() OVER (ORDER BY [object_id])
          FROM sys.all_objects) AS n(Number)
        WHERE Number <= CONVERT(INT, LEN(@List))
        AND SUBSTRING(@Delimiter + @List, Number, LEN(@Delimiter)) = @Delimiter
      ) AS y
    );
GO
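
As a quick sanity check, the function can be called on its own. This is a minimal usage sketch that assumes the function above has been created; the literal list is made up for illustration.

```sql
DECLARE @list VARCHAR(MAX) = 'foo' + CHAR(13) + CHAR(10) + 'bar';

-- The first argument is simply echoed back as AddressID.
SELECT AddressID, rn, AddressItem
FROM dbo.SplitAddressOrdered(1, @list, CHAR(13) + CHAR(10))
ORDER BY rn;

-- Expected rows:
-- AddressID  rn  AddressItem
-- 1          1   foo
-- 1          2   bar
```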

Now you can do this (the update has to run 5 times, once per target column):

DECLARE 
  @i INT = 1, 
  @sql NVARCHAR(MAX),
  @src NVARCHAR(MAX) = N';WITH x AS 
    (
      SELECT a.*, Original = s.AddressID, s.rn, s.AddressItem
      FROM dbo.Addresses AS a
      CROSS APPLY dbo.SplitAddressOrdered(a.AddressID, a.Address, 
        CHAR(13) + CHAR(10)) AS s WHERE rn = @i
    )';
WHILE @i <= 5
BEGIN
   SET @sql = @src + N'UPDATE x SET Address' + RTRIM(@i)
     + ' = CASE WHEN AddressID = Original AND rn = ' 
     + RTRIM(@i) + ' THEN AddressItem END;';

   EXEC sp_executesql @sql, N'@i INT', @i;

   SET @i += 1;
END
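
Only five target columns exist, so any sixth or later part of an address would be silently discarded. A check along these lines, sketched on the assumption that dbo.Addresses and dbo.SplitAddressOrdered exist as defined above, lists the rows that would lose data:

```sql
-- Addresses with more than 5 parts; an empty result means nothing is lost.
SELECT a.AddressID, PartCount = COUNT(*)
FROM dbo.Addresses AS a
CROSS APPLY dbo.SplitAddressOrdered(a.AddressID, a.[Address],
  CHAR(13) + CHAR(10)) AS s
GROUP BY a.AddressID
HAVING COUNT(*) > 5;
```

It has to run while the [Address] column still exists.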

Then you can drop the Address column:

ALTER TABLE dbo.Addresses DROP COLUMN [Address];

The table then contains:

AddressID  Address1  Address2  Address3  Address4  Address5
---------  --------  --------  --------  --------  --------
1          foo       bar       NULL      NULL      NULL
2          add1      add2      add3      add4      add5

I'm sure someone smarter than me will show how to use the function without the loop.
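
One possible loop-free variant is sketched below, again assuming dbo.Addresses and dbo.SplitAddressOrdered exist as defined above (and that the [Address] column has not been dropped yet). It pivots the split rows into one row per address with MAX(CASE ...) and applies everything in a single UPDATE; the CTE name parts is made up for illustration, and the approach has not been benchmarked against the loop.

```sql
;WITH parts AS
(
  -- One row per address, with each numbered part moved into its own column.
  SELECT s.AddressID,
         Address1 = MAX(CASE WHEN s.rn = 1 THEN s.AddressItem END),
         Address2 = MAX(CASE WHEN s.rn = 2 THEN s.AddressItem END),
         Address3 = MAX(CASE WHEN s.rn = 3 THEN s.AddressItem END),
         Address4 = MAX(CASE WHEN s.rn = 4 THEN s.AddressItem END),
         Address5 = MAX(CASE WHEN s.rn = 5 THEN s.AddressItem END)
  FROM dbo.Addresses AS a
  CROSS APPLY dbo.SplitAddressOrdered(a.AddressID, a.[Address],
    CHAR(13) + CHAR(10)) AS s
  GROUP BY s.AddressID
)
UPDATE a
  SET a.Address1 = p.Address1,
      a.Address2 = p.Address2,
      a.Address3 = p.Address3,
      a.Address4 = p.Address4,
      a.Address5 = p.Address5
FROM dbo.Addresses AS a
INNER JOIN parts AS p
  ON p.AddressID = a.AddressID;
```

Addresses with fewer than five parts get NULL in the unused columns, because the CASE expressions have no matching row to pick up.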

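I can also envision a slight change to the function that would let you simply pull out a specific element... hold on...

EDIT

Here is a scalar function. It is inherently more expensive, but it lets you make one pass over the table instead of 5: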

CREATE FUNCTION dbo.ElementFromOrderedList
(
    @List       VARCHAR(MAX),
    @Delimiter  VARCHAR(32),
    @Index      SMALLINT
)
RETURNS VARCHAR(255)
AS
BEGIN
    RETURN 
    (
      SELECT Item 
        FROM (SELECT rn = ROW_NUMBER() OVER (ORDER BY Number),
          Item = LTRIM(RTRIM(SUBSTRING(@List, Number, 
          CHARINDEX(@Delimiter, @List + @Delimiter, Number) - Number)))
        FROM (SELECT ROW_NUMBER() OVER (ORDER BY [object_id])
          FROM sys.all_objects) AS n(Number)
        WHERE Number <= CONVERT(INT, LEN(@List))
        AND SUBSTRING(@Delimiter + @List, Number, LEN(@Delimiter)) = @Delimiter
      ) AS y WHERE rn = @Index
    );
END
GO
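
A short usage sketch, assuming the scalar function above has been created and using a made-up two-part list. It shows what happens when the requested index is past the end of the list:

```sql
DECLARE @list VARCHAR(MAX) = 'foo' + CHAR(13) + CHAR(10) + 'bar';

SELECT
  Part2 = dbo.ElementFromOrderedList(@list, CHAR(13) + CHAR(10), 2),  -- bar
  Part3 = dbo.ElementFromOrderedList(@list, CHAR(13) + CHAR(10), 3);  -- NULL, no third part
```

An out-of-range index returns NULL because the inner query finds no row with that rn.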

Now, starting from the table above (before the update and before the column is dropped), the update is simple:

UPDATE dbo.Addresses
  SET Address1 = dbo.ElementFromOrderedList([Address], CHAR(13) + CHAR(10), 1),
      Address2 = dbo.ElementFromOrderedList([Address], CHAR(13) + CHAR(10), 2),
      Address3 = dbo.ElementFromOrderedList([Address], CHAR(13) + CHAR(10), 3),
      Address4 = dbo.ElementFromOrderedList([Address], CHAR(13) + CHAR(10), 4),
      Address5 = dbo.ElementFromOrderedList([Address], CHAR(13) + CHAR(10), 5);
answered 2013-03-13T15:32:44.363