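I am using SSIS to loop through a folder and read the contents of every .txt file into a database. First, I check whether the file has already been processed, using this stored procedure: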
ALTER PROC [dbo].[CheckForDuplicateFileEntry]
(
    @TaskID INT,
    @Filename VARCHAR(50),
    @FileAlreadyExists BIT OUTPUT
)
AS
BEGIN
    DECLARE @TaskTypeID INT

    -- Default: assume the file has not been seen before
    SET @FileAlreadyExists = 0

    -- Look up the task type of the current task
    SELECT @TaskTypeID = TaskTypeID FROM Tasks WHERE TaskID = @TaskID

    -- Is @Filename among (up to) 30 filenames recorded for this task type?
    IF EXISTS ( SELECT TaskID
                FROM TaskSteps
                WHERE @Filename IN (
                        SELECT TOP ( 30 )
                                TaskSteps.Filename
                        FROM TaskSteps
                        INNER JOIN Tasks ON TaskSteps.TaskID = Tasks.TaskID
                        WHERE ( Tasks.TaskTypeID = @TaskTypeID
                                AND [Filename] IS NOT NULL
                              )
                              AND IsValid = 1
                              AND ProcessStatusID = 2 ) )
    BEGIN
        -- Log the duplicate and report it to the caller
        INSERT INTO TaskSteps ( TaskID, StepDesc )
        VALUES (
            @TaskID,
            'Duplicate filename. (' + @Filename + ') Already exists.'
        )
        SET @FileAlreadyExists = 1
    END
END
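For reference, calling the procedure by hand from a query window can be sketched roughly as follows. This is only an illustration: the task ID of 1 and the file name are placeholder values, and the SSIS package would normally pass these in through its own variables and read the OUTPUT parameter back.

```sql
-- Hypothetical values: TaskID 1 and the sample file name are placeholders.
DECLARE @Exists BIT;

EXEC dbo.CheckForDuplicateFileEntry
    @TaskID = 1,
    @Filename = 'Songs_120501_175535.txt',
    @FileAlreadyExists = @Exists OUTPUT;

-- 1 if the procedure flagged the file as a duplicate, otherwise 0
SELECT @Exists AS FileAlreadyExists;
```

Note that when the procedure does flag a duplicate, it also inserts a row into TaskSteps, so a manual test run like this writes to the table.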
I also tried the following check:
IF EXISTS ( SELECT TOP 30 Filename
            FROM TaskSteps
            INNER JOIN Tasks ON TaskSteps.TaskID = Tasks.TaskID
            WHERE ( SUBSTRING(TaskSteps.Filename, 18, 13) = SUBSTRING(@Filename, 18, 13) )
                  AND IsValid = 1
                  AND ProcessStatusID = 2 )
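But as it loops through the files, it processes the first one and reports the next as a duplicate, then processes the third and reports the fourth as a duplicate, and so on. The file names are very similar, for example Songs_120501_175535.txt. The prefix stays the same; the date and time portion is the only part that changes, and it may differ by just a single digit, for example Songs_120502_175535.txt.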
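To make the SUBSTRING comparison in the second attempt concrete, here is a small sketch of what it evaluates to for the two sample names. It assumes @Filename holds only the bare file name with no folder path; if the package passes a full path, the offsets shift and the results will differ. The offset 7 used for the date and time stamp is a hypothetical value chosen for these sample names, not something taken from the package.

```sql
-- Assumption: the value is the bare file name, with no folder path in front.
DECLARE @a VARCHAR(50) = 'Songs_120501_175535.txt';
DECLARE @b VARCHAR(50) = 'Songs_120502_175535.txt';

SELECT
    SUBSTRING(@a, 18, 13) AS TailA,   -- '35.txt'
    SUBSTRING(@b, 18, 13) AS TailB,   -- '35.txt'
    SUBSTRING(@a, 7, 13)  AS StampA,  -- '120501_175535'
    SUBSTRING(@b, 7, 13)  AS StampB;  -- '120502_175535'
```

Under that assumption, starting at position 18 leaves only the last two digits of the time plus the extension, which is identical for both sample files, whereas starting at position 7 covers the full date and time stamp.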