1

I have a .csv file (I have no control over the data), and for some reason everything in it is wrapped in quotes.

"Date","Description","Original Description","Amount","Type","Category","Name","Labels","Notes"
"2/02/2012","ac","ac","515.00","a","b","","javascript://"
"2/02/2012","test","test","40.00","a","d","c",""," "

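I am using FileHelpers, and I'm wondering what the best way to remove all these quotes is. Is there something that says "if I see quotes, remove them; if no quotes are found, do nothing"?

This messes up the data, because I end up with "\"515.00\"", with extra quotes I don't need (especially since in this case I want the value to be a decimal rather than a string).

I'm also not sure what the "javascript" is about or why it is generated, but it comes from a service I have no control over.

Edit: this is how I consume the CSV file.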

    using (TextReader textReader = new StreamReader(stream))
    {
        engine.ErrorManager.ErrorMode = ErrorMode.SaveAndContinue;

        object[] transactions = engine.ReadStream(textReader);
    }
4 Answers

9

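You can use the FieldQuoted attribute, which is best described on the attributes page. Note that the attribute can be applied to any FileHelpers field, even one of type Decimal. (Remember, the FileHelpers class describes the spec for your import file. So when you mark a Decimal field as FieldQuoted, you are saying that in the file this field will be quoted.)

You can even specify whether the quotes are optional with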

[FieldQuoted('"', QuoteMode.OptionalForBoth)] 

Here is a console application that works with your data:

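using System;
using FileHelpers;
using NUnit.Framework; // assumption: Assert comes from NUnit; the answer does not name the test framework

class Program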
{
    [DelimitedRecord(",")]
    [IgnoreFirst(1)]
    public class Format1
    {
        [FieldQuoted]
        [FieldConverter(ConverterKind.Date, "d/M/yyyy")]
        public DateTime Date;
        [FieldQuoted]
        public string Description;
        [FieldQuoted]
        public string OriginalDescription;
        [FieldQuoted]
        public Decimal Amount;
        [FieldQuoted]
        public string Type;
        [FieldQuoted]
        public string Category;
        [FieldQuoted]
        public string Name;
        [FieldQuoted]
        public string Labels;
        [FieldQuoted]
        [FieldOptional]
        public string Notes;
    }

    static void Main(string[] args)
    {
        var engine = new FileHelperEngine(typeof(Format1));

        // read in the data   
        object[] importedObjects = engine.ReadString(@"""Date"",""Description"",""Original Description"",""Amount"",""Type"",""Category"",""Name"",""Labels"",""Notes""
""2/02/2012"",""ac"",""ac"",""515.00"",""a"",""b"","""",""javascript://""
""2/02/2012"",""test"",""test"",""40.00"",""a"",""d"",""c"","""","" """);

        // check that 2 records were imported
        Assert.AreEqual(2, importedObjects.Length);

        // check the values for the first record
        Format1 customer1 = (Format1)importedObjects[0];
        Assert.AreEqual(DateTime.Parse("2/02/2012"), customer1.Date);
        Assert.AreEqual("ac", customer1.Description);
        Assert.AreEqual("ac", customer1.OriginalDescription);
        Assert.AreEqual(515.00, customer1.Amount);
        Assert.AreEqual("a", customer1.Type);
        Assert.AreEqual("b", customer1.Category);
        Assert.AreEqual("", customer1.Name);
        Assert.AreEqual("javascript://", customer1.Labels);
        Assert.AreEqual("", customer1.Notes);

        // check the values for the second record
        Format1 customer2 = (Format1)importedObjects[1];
        Assert.AreEqual(DateTime.Parse("2/02/2012"), customer2.Date);
        Assert.AreEqual("test", customer2.Description);
        Assert.AreEqual("test", customer2.OriginalDescription);
        Assert.AreEqual(40.00, customer2.Amount);
        Assert.AreEqual("a", customer2.Type);
        Assert.AreEqual("d", customer2.Category);
        Assert.AreEqual("c", customer2.Name);
        Assert.AreEqual("", customer2.Labels);
        Assert.AreEqual(" ", customer2.Notes);
    }
}

(Note that your first line of data seems to have 8 fields instead of 9, so I marked the Notes field with FieldOptional.)
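
To tie this back to the stream-based code in the question, here is a minimal sketch that combines the optional-quote mode with `ReadStream` and the `SaveAndContinue` error mode. The `AmountRecord` class, its two-column layout, and the `StreamImport.Read` helper are hypothetical names chosen for illustration; they are not the asker's real file layout.

```csharp
using System;
using System.IO;
using FileHelpers;

// Hypothetical two-column record used only for this sketch.
[DelimitedRecord(",")]
[IgnoreFirst(1)]
public class AmountRecord
{
    // Quotes around the value may be present or absent.
    [FieldQuoted('"', QuoteMode.OptionalForBoth)]
    public string Description;

    // The quotes are removed before the text is converted to a decimal.
    [FieldQuoted('"', QuoteMode.OptionalForBoth)]
    public decimal Amount;
}

public static class StreamImport
{
    // Reads all records from the reader and reports rows that failed to parse.
    public static AmountRecord[] Read(TextReader reader)
    {
        var engine = new FileHelperEngine<AmountRecord>();
        engine.ErrorManager.ErrorMode = ErrorMode.SaveAndContinue;

        AmountRecord[] records = engine.ReadStream(reader);

        // With SaveAndContinue, bad rows are collected here instead of throwing.
        foreach (ErrorInfo error in engine.ErrorManager.Errors)
        {
            Console.WriteLine("Line {0}: {1}", error.LineNumber, error.ExceptionInfo.Message);
        }

        return records;
    }
}
```

Using the generic `FileHelperEngine<T>` returns typed records, so the `Amount` field can be used as a decimal without casting from `object`.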

Answered 2012-02-06T11:12:06.710
0

Here is one way to do it:

string[] lines = new string[]
{
    "\"Date\",\"Description\",\"Original Description\",\"Amount\",\"Type\",\"Category\",\"Name\",\"Labels\",\"Notes\"",
    "\"2/02/2012\",\"ac\",\"ac\",\"515.00\",\"a\",\"b\",\"\",\"javascript://\"",
    "\"2/02/2012\",\"test\",\"test\",\"40.00\",\"a\",\"d\",\"c\",\"\",\" \"",
};

string[][] values =
    lines.Select(line =>
        line.Trim('"')
            .Split(new string[] { "\",\"" }, StringSplitOptions.None)
            .ToArray()
        ).ToArray();

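The lines array represents the lines in your sample. Each " character has to be escaped as \" inside a C# string literal.

For each line, we first strip the leading and trailing " characters, then split the line into a collection of substrings using the character sequence "," as the delimiter.

Note that the code above will not work if the " character occurs naturally in your values (even if it is escaped).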
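
One more edge case: `Trim('"')` removes every leading and trailing quote, not just one, so a line whose last field is empty (ending in `""`) loses both quotes and the final `","` delimiter no longer matches. A minimal sketch that strips exactly one quote from each end instead; `QuotedLineSplitter` and `Split` are hypothetical names, not part of the answer above.

```csharp
using System;

public static class QuotedLineSplitter
{
    // Strips exactly one quote from each end, then splits on the "," sequence.
    public static string[] Split(string line)
    {
        if (line.Length >= 2 && line[0] == '"' && line[line.Length - 1] == '"')
        {
            line = line.Substring(1, line.Length - 2);
        }

        return line.Split(new[] { "\",\"" }, StringSplitOptions.None);
    }

    public static void Main()
    {
        // The last field is empty, so the line ends with two quote characters.
        string[] fields = Split("\"2/02/2012\",\"test\",\"\"");
        Console.WriteLine(fields.Length); // prints 3
    }
}
```

The same limitation still applies: a value that itself contains a quote character will be split incorrectly.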

Edit: if you want to read the CSV from a stream, all you need to do is:

var lines = new List<string>();
using (var streamReader = new StreamReader(stream))
    while (!streamReader.EndOfStream)
        lines.Add(streamReader.ReadLine());

The rest of the code above stays the same.

Edit: given your new code, check whether you are looking for something like this:

for (int i = 0; i < transactions.Length; ++i)
{
    object oTrans = transactions[i]; 
    string sTrans = oTrans as string;
    if (sTrans != null && 
        sTrans.StartsWith("\"") &&
        sTrans.EndsWith("\""))
    {
        transactions[i] = sTrans.Substring(1, sTrans.Length - 2);
    }
}
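
Note that the loop above only changes elements that are strings; if the engine returns record objects, `oTrans as string` is null and nothing is modified. Since the question wants the amount as a decimal rather than a string, the quote stripping can be combined with parsing. A minimal sketch, assuming the raw field text is available as a string; `QuotedDecimal.TryParse` is a hypothetical helper, and the invariant culture is an assumption based on the `515.00` format in the sample.

```csharp
using System;
using System.Globalization;

public static class QuotedDecimal
{
    // Removes one pair of surrounding quotes, if present, then parses the rest.
    public static bool TryParse(string raw, out decimal value)
    {
        value = 0m;
        if (raw == null)
        {
            return false;
        }

        string text = raw.Trim();
        if (text.Length >= 2 &&
            text.StartsWith("\"", StringComparison.Ordinal) &&
            text.EndsWith("\"", StringComparison.Ordinal))
        {
            text = text.Substring(1, text.Length - 2);
        }

        // Assumption: the file uses '.' as the decimal separator.
        return decimal.TryParse(text, NumberStyles.Number, CultureInfo.InvariantCulture, out value);
    }
}
```

Unquoted input such as `40.00` goes through the same path, which matches the "remove quotes if present, otherwise do nothing" behavior the question asks for.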
Answered 2012-02-03T19:09:07.773
0

I had the same dilemma, and I replaced the quotes as I loaded the values into my list object:

using System;
using System.Collections.Generic;
using System.IO;
using System.Windows.Forms;

namespace WindowsFormsApplication6
{
    public partial class Form1 : Form
    {
        public Form1()
        {
            InitializeComponent();
        }

        private void Form1_Load(object sender, EventArgs e)
        {
            LoadCSV();
        }

        private void LoadCSV()
        {
            List<string> Rows = new List<string>();
            string m_CSVFilePath = "<Path to CSV File>";

            using (StreamReader r = new StreamReader(m_CSVFilePath))
            {
                string row;

                while ((row = r.ReadLine()) != null)
                {
                    Rows.Add(row.Replace("\"", ""));
                }

                foreach (var Row in Rows)
                {
                    if (Row.Length > 0)
                    {
                        string[] RowValue = Row.Split(',');

                        //Do something with values here
                    }
                }
            }
        }

    }
}
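
One caveat with removing every quote and then calling `Split(',')`: a quoted value that contains a comma is split into two fields. A minimal sketch of a quote-aware splitter that keeps such values together and treats a doubled quote inside a quoted field as a literal quote; `CsvLine.Parse` is a hypothetical helper, and it does not handle quoted fields that span several lines.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public static class CsvLine
{
    // Splits one CSV line into fields, honoring quotes around values.
    public static List<string> Parse(string line)
    {
        var fields = new List<string>();
        var current = new StringBuilder();
        bool inQuotes = false;

        for (int i = 0; i < line.Length; i++)
        {
            char c = line[i];

            if (inQuotes)
            {
                if (c != '"')
                {
                    current.Append(c);
                }
                else if (i + 1 < line.Length && line[i + 1] == '"')
                {
                    // A doubled quote inside a quoted field is a literal quote.
                    current.Append('"');
                    i++;
                }
                else
                {
                    inQuotes = false; // closing quote
                }
            }
            else if (c == '"')
            {
                inQuotes = true; // opening quote
            }
            else if (c == ',')
            {
                // A comma outside quotes ends the current field.
                fields.Add(current.ToString());
                current.Clear();
            }
            else
            {
                current.Append(c);
            }
        }

        fields.Add(current.ToString());
        return fields;
    }
}
```

In the loop above, `Row.Split(',')` could then be replaced by `CsvLine.Parse(row)` applied to the original, unmodified row.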
Answered 2012-02-03T20:23:26.563
0

This code, which I developed, might help:

// Requires: using System; using System.IO; using System.Text;
using (StreamReader r = new StreamReader("C:\\Projects\\Mactive\\Audience\\DrawBalancing\\CSVFiles\\Analytix_ABC_HD.csv"))
{
    string row;

    while ((row = r.ReadLine()) != null)
    {
        StringBuilder line = new StringBuilder();

        for (int innerCount = 0; innerCount < row.Length; innerCount++)
        {
            char chr = row[innerCount];

            if (chr != '"')
            {
                // Outside quotes: copy the character (including delimiter commas) unchanged.
                line.Append(chr);
            }
            else
            {
                // Opening quote: collect everything up to the closing quote.
                StringBuilder token = new StringBuilder();
                for (innerCount++; innerCount < row.Length; innerCount++)
                {
                    chr = row[innerCount];
                    if (chr == '"')
                    {
                        break;
                    }

                    token.Append(chr);
                }

                // Drop commas inside the quoted value so they are not mistaken for delimiters.
                line.Append(token.ToString().Replace(",", ""));
            }
        }

        Console.WriteLine(line.ToString());
    }
}
Answered 2013-10-30T02:46:13.193