I want to convert generated txt files to UTF8-encoded files so that they can be loaded into my Azure SQL DW via Polybase, which requires the source files to be UTF8.

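MSDN has an "IO Streaming example" that works well for a single job. However, I am trying to build an SSIS solution for about 30 tables. I believe this approach would cause a race condition: the PowerShell script would be locked by one SSIS package while another SSIS package needs it.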

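I am a SQL developer, not a .NET developer, so please bear with me. Assuming I know how to pass parameters into a Script Task, how would I convert the MSDN example into an SSIS C# Script Task?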

PowerShell code from MSDN

#Static variables
$ascii = [System.Text.Encoding]::ASCII
$utf16le = [System.Text.Encoding]::Unicode
$utf8 = [System.Text.Encoding]::UTF8
$ansi = [System.Text.Encoding]::Default
$append = $False

#Set source file path and file name
$src = [System.IO.Path]::Combine("<MySrcFolder>","<MyUtf8stage>.txt")

#Set source file encoding (using list above)
$src_enc = $ascii

#Set target file path and file name
$tgt = [System.IO.Path]::Combine("<MyDestFolder>","<MyFinalstage>.txt")

#Set target file encoding (using list above)
$tgt_enc = $utf8

$read = New-Object System.IO.StreamReader($src,$src_enc)
$write = New-Object System.IO.StreamWriter($tgt,$append,$tgt_enc)

while ($read.Peek() -ne -1)
{
    $line = $read.ReadLine();
    $write.WriteLine($line);
}
$read.Close()
$read.Dispose()
$write.Close()
$write.Dispose()

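Update

I found a similar post that I was able to adapt to my needs; I swear I searched high and low before posting. Anyway, this is what works for me. If you see a way to improve it, please share: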

// Requires "using System.IO;" and "using System.Text;" at the top of the script file.
public void Main()
{
    // $Package::SourceSQLObject = table name
    // $Package::StageFile_DestinationFolderPath = root path, e.g. "C:\temp\"

    string path = (string)Dts.Variables["$Package::StageFile_DestinationFolderPath"].Value;
    string name = (string)Dts.Variables["$Package::SourceSQLObject"].Value;
    string from = Path.Combine(path, name) + ".csv";
    string to = Path.ChangeExtension(from, "txt");
    Dts.Log("Starting " + to.ToUpper(), 0, null);
    // Read the staged .csv as ASCII and rewrite it line by line as UTF-8.
    using (StreamReader reader = new StreamReader(from, Encoding.ASCII, false, 10))
    using (StreamWriter writer = new StreamWriter(to, false, Encoding.UTF8, 10))
    {
        while (reader.Peek() >= 0)
        {
            writer.WriteLine(reader.ReadLine());
        }
    }
    Dts.TaskResult = (int)ScriptResults.Success;
}
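
One possible refinement, offered as a sketch rather than a definitive implementation: the conversion loop can be pulled into a small helper that takes the source encoding as a parameter and lets the caller choose whether a byte order mark is written. The class and method names (`EncodingConverter`, `ConvertToUtf8`) and the sample file names are hypothetical. Two facts motivate it: `Encoding.ASCII` decodes any byte above 0x7F as `?`, so a file that is really ANSI or Latin-1 loses characters, and `Encoding.UTF8` makes `StreamWriter` emit a BOM, whereas `new UTF8Encoding(false)` does not. Whether the downstream loader tolerates a BOM is something to verify, not something this sketch asserts.

```csharp
using System;
using System.IO;
using System.Text;

public static class EncodingConverter
{
    // Rewrites sourcePath as UTF-8 at targetPath and returns the number of lines copied.
    // sourceEncoding must match how the file was actually written.
    public static long ConvertToUtf8(string sourcePath, string targetPath,
                                     Encoding sourceEncoding, bool emitBom)
    {
        var targetEncoding = new UTF8Encoding(emitBom);
        long lines = 0;
        using (var reader = new StreamReader(sourcePath, sourceEncoding))
        using (var writer = new StreamWriter(targetPath, false, targetEncoding))
        {
            string line;
            // ReadLine returns null at end of file, so no Peek call is needed.
            while ((line = reader.ReadLine()) != null)
            {
                writer.WriteLine(line);
                lines++;
            }
        }
        return lines;
    }

    // Console demo with a throwaway Latin-1 file; in a Script Task the paths
    // would come from the package parameters instead.
    public static void Main()
    {
        string dir = Path.Combine(Path.GetTempPath(), Path.GetRandomFileName());
        Directory.CreateDirectory(dir);
        string from = Path.Combine(dir, "MyTable.csv");
        string to = Path.ChangeExtension(from, "txt");

        Encoding latin1 = Encoding.GetEncoding("iso-8859-1");
        File.WriteAllText(from, "id,name\r\n1,Zo\u00EB\r\n", latin1);

        long copied = ConvertToUtf8(from, to, latin1, false);
        Console.WriteLine("Lines copied: " + copied);

        Directory.Delete(dir, true);
    }
}
```

Because each package derives its own source and target paths from its parameters, 30 packages calling a helper like this never contend for a shared script file; they would only collide if two of them pointed at the same table name.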

3 Answers

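Your code indicates that you are trying to convert an ASCII file to UTF-8; however, that article also states the following:

As UTF-8 uses the same character encoding as ASCII, PolyBase will also support loading data that is ASCII encoded.

So my advice is to try the file with PolyBase first and check for any conversion issues before you spend any time converting the files.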
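
To make that check concrete, here is a minimal sketch (the class name `AsciiCheck` and method `IsPureAscii` are hypothetical, not part of any library). It shows that 7-bit text produces identical bytes under ASCII and UTF-8, and it scans a file for any byte of 0x80 or above. A file that passes the scan is already valid UTF-8 and needs no conversion; a file that fails it is not really ASCII, and its true encoding needs to be identified before converting.

```csharp
using System;
using System.IO;
using System.Linq;
using System.Text;

public static class AsciiCheck
{
    // Returns true when every byte in the file is below 0x80, i.e. the file
    // is pure 7-bit ASCII and therefore already valid UTF-8.
    public static bool IsPureAscii(string filePath)
    {
        using (var stream = File.OpenRead(filePath))
        {
            var buffer = new byte[64 * 1024];
            int read;
            while ((read = stream.Read(buffer, 0, buffer.Length)) > 0)
            {
                for (int i = 0; i < read; i++)
                {
                    if (buffer[i] >= 0x80) return false;
                }
            }
        }
        return true;
    }

    public static void Main()
    {
        // The same 7-bit text encodes to the same bytes either way.
        string sample = "id,name\r\n1,Alice\r\n";
        byte[] asAscii = Encoding.ASCII.GetBytes(sample);
        byte[] asUtf8 = Encoding.UTF8.GetBytes(sample);
        Console.WriteLine("Same bytes: " + asAscii.SequenceEqual(asUtf8));

        // Scan a throwaway file the way a staged export could be scanned.
        string path = Path.GetTempFileName();
        File.WriteAllBytes(path, asAscii);
        Console.WriteLine("Pure ASCII: " + IsPureAscii(path));
        File.Delete(path);
    }
}
```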

Answered 2016-10-11T20:58:33.147

// Requires "using System.IO;" and "using System.Text;" at the top of the script file.
public void Main()
{
    // $Package::SourceSQLObject = table name
    // $Package::StageFile_DestinationFolderPath = root path, e.g. "C:\temp\"

    string path = (string)Dts.Variables["$Package::StageFile_DestinationFolderPath"].Value;
    string name = (string)Dts.Variables["$Package::SourceSQLObject"].Value;
    string from = Path.Combine(path, name) + ".csv";
    string to = Path.ChangeExtension(from, "txt");
    Dts.Log("Starting " + to.ToUpper(), 0, null);
    // Read the staged .csv as ASCII and rewrite it line by line as UTF-8.
    using (StreamReader reader = new StreamReader(from, Encoding.ASCII, false, 10))
    using (StreamWriter writer = new StreamWriter(to, false, Encoding.UTF8, 10))
    {
        while (reader.Peek() >= 0)
        {
            writer.WriteLine(reader.ReadLine());
        }
    }
    Dts.TaskResult = (int)ScriptResults.Success;
}
Answered 2016-10-25T03:44:05.993

var mySrcFolder = ""; // something from user variables?
var myDestFolder = ""; // something from user variables? (the PowerShell version writes to a separate destination folder)
var myUtf8stage = ""; // something from user variables?
var myFinalstage = ""; // something from user variables?

// Static variables
var ascii = System.Text.Encoding.ASCII;
var utf16le = System.Text.Encoding.Unicode;
var utf8 = System.Text.Encoding.UTF8;
var ansi = System.Text.Encoding.Default;
var append = false;

// Set source file path and file name
var src = System.IO.Path.Combine(
    mySrcFolder,
    String.Format("{0}.txt", myUtf8stage));

// Set source file encoding (using list above)
var src_enc = ascii;

// Set target file path and file name
var tgt = System.IO.Path.Combine(
    myDestFolder,
    String.Format("{0}.txt", myFinalstage));

// Set target file encoding (using list above)
var tgt_enc = utf8;

// The using blocks close and dispose both streams, replacing the explicit
// Close()/Dispose() calls in the PowerShell version.
using (var read = new System.IO.StreamReader(src, src_enc))
using (var write = new System.IO.StreamWriter(tgt, append, tgt_enc))
{
    while (read.Peek() != -1)
    {
        var line = read.ReadLine();
        write.WriteLine(line);
    }
}
Answered 2016-10-11T20:58:16.113