
I'm trying to pull data from several spreadsheets that reside in a single folder and write it all to a single CSV file, along with column headings.

I have a Foreach Loop container set up to iterate over the filenames in the folder and append each file's data to a RAW file. However, as many others seem to have found, there does not appear to be a built-in option to simply truncate the RAW file before entering the loop container.

Jamie Thompson described a similar situation in his blog here, but the links to the examples do not seem to work. Does anyone have an easy way to truncate the RAW file in a standalone step before entering the Foreach Loop?


2 Answers


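The approach I've always used is to create a data flow that has the appropriate metadata format but no actual rows, and route it to a RAW file destination set to Create new.

In my existing data flow, I look at the metadata that populates the RAW file and then write a SELECT statement that mimics it.

For example: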

SELECT
    CAST(NULL AS varchar(70)) AS AddressLine1
,   CAST(NULL AS bigint) AS SomeBigInt
,   CAST(NULL AS nvarchar(max)) AS PerformanceLOL
WHERE
    1 = 0; -- always false, so only the column metadata comes back, no rows
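
A variation on the same idea, offered only as a sketch: if the columns that feed the RAW file already exist in a table or view, TOP (0) returns the column metadata with zero rows, so the types do not have to be cast by hand. The table name dbo.SourceTable is hypothetical, and this assumes its column types match what the loop's data flow writes to the RAW file.

```sql
-- dbo.SourceTable is a placeholder name, not something from the question.
-- TOP (0) yields the result set's column names and types but no rows,
-- which is what a "Create new" RAW file destination needs to reset the file.
SELECT TOP (0)
    AddressLine1
,   SomeBigInt
,   PerformanceLOL
FROM
    dbo.SourceTable;
```

If the real source is a spreadsheet rather than a table, the explicit CAST form is probably the safer choice, since it pins each type exactly.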
answered 2013-11-01T16:03:44.537

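Here is what I would do:

  • Create your initial RAW file.
  • Make a copy of that RAW file.
  • Use a File System Task at the beginning of the package/job to replace the staging file with that copy on every run.

In my use case, I have 20 foreach threads writing to their own files simultaneously. No thread can create the file and then append to it, so you simply "recreate" each file before invoking the threads by copying an "empty" RAW file that already has the metadata assigned.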


answered 2015-12-16T19:40:58.500