I am extracting data from Excel files in which the data accumulates in the same file each week: the file for week X contains data from week 1 through week X, and the file for week X+1 contains data from week 1 through week X+1. What is the most efficient way to load this accumulated data into a database? Currently, I clear the entire database and then reload everything from week 1 through the current week. This is clearly wasteful, since I delete data only to load the same data back in.

Could someone help me decide which of the following ideas is the better route? If you have a better idea, please let me know. All help is appreciated!

  1. Is there a way to efficiently compute the "set difference" between Excel files? Then I could load only the difference between the current week's file and last week's file.
  2. I could keep track of the weeks I have already loaded, and then "query" the Excel file for the weeks that are not yet in the database. I would hope this query could be made efficient through hashing.
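To make idea 1 concrete, here is a minimal sketch of a hash-based set difference. It assumes each spreadsheet row has already been read into a hashable tuple; the `(week, item, amount)` layout and the `new_rows` helper are hypothetical and only for illustration, not part of SSIS.

```python
# Hypothetical rows: (week, item, amount) tuples read from each weekly file.
last_week = [(1, "A", 10), (2, "A", 12)]
this_week = [(1, "A", 10), (2, "A", 12), (3, "A", 9)]


def new_rows(previous, current):
    """Return the rows of `current` that do not appear in `previous`, in file order."""
    seen = set(previous)  # hashing makes each membership check O(1) on average
    return [row for row in current if row not in seen]


delta = new_rows(last_week, this_week)
print(delta)
```

One caveat with this approach: if a row from an earlier week is edited in a later file, it would show up in the difference as a new row rather than as an update.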

I think a necessary question for either idea to work is: in what ways can I manipulate Excel data through SSIS?

1 Answer

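  • Keep the last full date you processed in a control table
  • Read that date into a package variable
  • In the Excel Source Editor, change the Data access mode to SQL command
  • Enter the SQL statement, including the worksheet name, and use ? as the parameter placeholder. For example: SELECT * FROM [Sheet1$] WHERE extractdate > ?
  • Click the Parameters button and map the package variable (from the second step) to the parameter in the SQL statement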
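The steps above amount to a "high-water mark" incremental load. As a rough sketch of the same logic outside SSIS, the following uses Python's built-in `sqlite3` module. The `load_control` and `facts` tables, the `load_increment` helper, and the sample rows are all hypothetical; the list filter stands in for the parameterized Excel query, and dates are assumed to be ISO-formatted strings so that they compare correctly.

```python
import sqlite3


def load_increment(conn, source_rows):
    """Insert only rows newer than the stored watermark, then advance it."""
    cur = conn.cursor()
    # Read the last fully processed date from the control table.
    (last_date,) = cur.execute("SELECT last_date FROM load_control").fetchone()
    # Stand-in for: SELECT * FROM [Sheet1$] WHERE extractdate > ?
    fresh = [row for row in source_rows if row[0] > last_date]
    cur.executemany("INSERT INTO facts (extractdate, amount) VALUES (?, ?)", fresh)
    if fresh:
        # Move the watermark forward to the newest date just loaded.
        newest = max(row[0] for row in fresh)
        cur.execute("UPDATE load_control SET last_date = ?", (newest,))
    conn.commit()
    return len(fresh)


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE load_control (last_date TEXT)")
conn.execute("INSERT INTO load_control VALUES ('2012-11-04')")
conn.execute("CREATE TABLE facts (extractdate TEXT, amount REAL)")

# Hypothetical cumulative file: every week so far, including already-loaded ones.
week_file = [("2012-11-04", 10.0), ("2012-11-11", 12.0), ("2012-11-18", 9.0)]
first = load_increment(conn, week_file)   # loads only rows after the watermark
second = load_increment(conn, week_file)  # re-running the same file loads nothing
```

Because the watermark is updated after each load, running the package again on the same file is harmless, which avoids clearing and reloading the whole table.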
Answered 2012-11-19T12:55:18.687