0

Thanks in advance for any solution.

My words.txt file looks like this:

await   -1

awaited -1

award   3

awards  3

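The values are tab-separated. First, I want to read results such as await = -1 points, and then use words.txt to give a score to each sentence in my comment.txt file. The program's output should look like this (for example):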

-1.0

2.0

0.0

5.0

I'm stuck and don't know what to do next. So far I have only managed to read the words.txt file.

    const char DELIM = '\t'; 
    const string FILENAME = @"words.txt"; 

    string record;  
    string[] fields; 

    FileStream inFile; 
    StreamReader reader; 


    inFile = new FileStream(FILENAME, FileMode.Open, FileAccess.Read);

    reader = new StreamReader(inFile);

    record = reader.ReadLine();

    // Splitting up a string using the delimiter and
    // storing the split strings in a string array
    fields = record.Split(DELIM);

    double values = double.Parse(fields[1]);
    string words = fields[0];

4 Answers

1

If you want to use a regular-expression approach, try this:

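using (FileStream fileStream = new FileStream(FILENAME, FileMode.Open, FileAccess.Read)) {
  using (StreamReader streamReader = new StreamReader(fileStream)) {
    String record = streamReader.ReadLine();
    foreach (String str in record.Split('\t')) {
      // Remove everything except '-' and digits. Inside [...] the characters
      // '?' and '+' are literals, not quantifiers, so they are left out of the class.
      Console.WriteLine(Regex.Replace(str, @"[^-\d]", String.Empty));
    }
  } // the using blocks close the reader and the stream; no explicit Close() is needed
}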

Tested with this words.txt:

await -1    awaited -1  awaited -1  award 3 award 2 award 1 award 3 awards 3
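
An alternative to stripping unwanted characters is to match the number directly. This is only a minimal sketch, assuming each tab-separated item holds a word followed by an integer score, as in the test line above. `ScoreExtractor`, `FirstInteger`, and `RegexDemo` are hypothetical names that do not come from the answer.

```csharp
using System;
using System.Globalization;
using System.Text.RegularExpressions;

static class ScoreExtractor
{
    // Returns the first signed integer found in the text, or null when there is none.
    public static int? FirstInteger(string text)
    {
        // Optional minus sign followed by one or more ASCII digits.
        Match match = Regex.Match(text, @"-?[0-9]+");
        return match.Success
            ? int.Parse(match.Value, CultureInfo.InvariantCulture)
            : (int?)null;
    }
}

class RegexDemo
{
    static void Main()
    {
        // Assumed sample data: items separated by tabs, word and score by a space.
        string record = "await -1\tawaited -1\taward 3\tawards 3";
        foreach (string pair in record.Split('\t'))
            Console.WriteLine(ScoreExtractor.FirstInteger(pair));
        // Prints -1, -1, 3, 3 on separate lines.
    }
}
```

Matching the number leaves the surrounding text untouched, so the word is still available if you need it for a lookup later.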
answered 2013-06-13T17:02:16.803
1

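You should take a look at Dictionary: you can map each word you want to score to its value in the dictionary. That way you can loop over all the words you read and output their values.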

using System;
using System.Collections.Generic;

class Program
{
    static void Main()
    {
        Dictionary<string, int> dictionary = new Dictionary<string, int>();
        dictionary.Add("await", -1);
        dictionary.Add("awaited", -1);
        dictionary.Add("award", 3);
        dictionary.Add("awards", 3);

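        // Read your file and split its content on the separator (tab) into an array.
        // The hard-coded line below is only a stand-in for the file content.
        string[] array = "await\taward\tunknown".Split('\t');

        for (int i = 0; i < array.Length; i++)
        {
            // Output the value if the word has a score
            int value;
            if (dictionary.TryGetValue(array[i], out value))
                Console.WriteLine(value);
        }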
    }
}
answered 2013-06-13T16:45:46.697
0

A working solution without a dictionary:

using System;
using System.IO;
using System.Linq;
using System.Text.RegularExpressions;

class Program
{
    static void Main(string[] args)
    {
        foreach (var comment in File.ReadAllLines(@"..\..\comments.txt"))
            Console.WriteLine(GetRating(comment));

        Console.ReadLine();
    }

    static double GetRating(string comment)
    {
        // Start at 0 so a comment with no listed words scores 0.0 instead of NaN
        double rating = 0.0;

        var wordsLines = from line in File.ReadAllLines(@"..\..\words.txt")
                         where !String.IsNullOrWhiteSpace(line)
                         select Regex.Replace(line.Trim(), @"\s+", " ");

        var wordRatings = from wLine in wordsLines
                          let parts = wLine.Split()
                          select new { Word = parts[0], Rating = Double.Parse(parts[1]) };

        // Tokenize the comment once instead of once per word
        var tokens = comment.ToLower().Split(new Char[] { ' ', ',', '.', ':', ';' });

        foreach (var wr in wordRatings)
        {
            // Each listed word contributes its rating once per comment
            if (tokens.Contains(wr.Word))
                rating += wr.Rating;
        }

        return rating;
    }
}
answered 2013-06-15T19:49:48.607
0

Combining vadz's answer and im_a_noob's answer, you should be able to read your words.txt file and load it into a dictionary.

    Dictionary<string, double> wordDictionary = new Dictionary<string, double>();
    using (FileStream fileStream = new FileStream(FILENAME, FileMode.Open, FileAccess.Read))
    using (StreamReader reader = new StreamReader(fileStream))
    {
        int lineCount = 0;
        int skippedLine = 0;
        while (!reader.EndOfStream)
        {
            string[] fields = reader.ReadLine().Split('\t');
            string word = fields[0];
            double value = 0;
            lineCount++;

            // This check verifies there are two elements, tries to parse the second value,
            // and checks that the word is not already in the dictionary
            if (fields.Length == 2 && double.TryParse(fields[1], out value) && !wordDictionary.ContainsKey(word))
            {
                wordDictionary.Add(word, value);
            }
            else
            {
                skippedLine++;
            }
        }

        Console.WriteLine(string.Format("Total Lines Read: {0}", lineCount));
        Console.WriteLine(string.Format("Lines Skipped: {0}", skippedLine));
        Console.WriteLine(string.Format("Expected Entries in Dictionary: {0}", lineCount - skippedLine));
        Console.WriteLine(string.Format("Actual Entries in Dictionary: {0}", wordDictionary.Count));

        // No explicit Close() calls are needed: the using statements dispose both objects
    }

To score the sentences, you could use something like the following.

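    string fileText = File.ReadAllText(COMMENTSTEXT); //COMMENTSTEXT = comments.txt
    // Assumes sentences end with a period; any other period in a sentence will also split it
    var sentences = fileText.Split('.');

    foreach (string sentence in sentences)
    {
        // Skip the empty piece left after the final period
        if (string.IsNullOrWhiteSpace(sentence)) continue;

        // Split once per sentence; Count(...) below requires "using System.Linq;"
        string[] words = sentence.Split(new[] { ' ', '\t', '\r', '\n' }, StringSplitOptions.RemoveEmptyEntries);
        double sentenceScore = 0;

        foreach (KeyValuePair<string, double> word in wordDictionary)
        {
            sentenceScore += words.Count(w => w == word.Key) * word.Value;
        }

        Console.WriteLine(string.Format("Sentence Score = {0}", sentenceScore));
    }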
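
The two steps can also be combined into one small program. The following is only a minimal sketch under a few assumptions: it uses in-memory sample data instead of words.txt and comments.txt, and `SentenceScorer`, `LoadScores`, `Score`, and `ScoringDemo` are hypothetical names. Unlike the snippets above, it looks words up case-insensitively, treats common punctuation as separators, and lets the last duplicate entry win.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

static class SentenceScorer
{
    // Builds the lookup from tab-separated "word<TAB>score" lines.
    // Malformed lines are skipped; a repeated word overwrites the earlier entry.
    public static Dictionary<string, double> LoadScores(IEnumerable<string> lines)
    {
        var scores = new Dictionary<string, double>(StringComparer.OrdinalIgnoreCase);
        foreach (string line in lines)
        {
            string[] fields = line.Split('\t');
            double value;
            if (fields.Length == 2 &&
                double.TryParse(fields[1], NumberStyles.Float, CultureInfo.InvariantCulture, out value))
            {
                scores[fields[0].Trim()] = value;
            }
        }
        return scores;
    }

    // Sums the scores of every listed word in the sentence; unknown words count as 0.
    public static double Score(string sentence, Dictionary<string, double> scores)
    {
        char[] separators = { ' ', '\t', '\r', '\n', ',', ';', ':', '!', '?' };
        double total = 0;
        foreach (string token in sentence.Split(separators, StringSplitOptions.RemoveEmptyEntries))
        {
            double value;
            if (scores.TryGetValue(token, out value))
                total += value;
        }
        return total;
    }
}

class ScoringDemo
{
    static void Main()
    {
        // Assumed sample data standing in for words.txt and comments.txt.
        string[] wordLines = { "await\t-1", "awaited\t-1", "award\t3", "awards\t3" };
        string comments = "We await the award. Nothing to score here. Awards were awaited.";

        var scores = SentenceScorer.LoadScores(wordLines);
        foreach (string sentence in comments.Split(new[] { '.' }, StringSplitOptions.RemoveEmptyEntries))
        {
            double score = SentenceScorer.Score(sentence, scores);
            Console.WriteLine(score.ToString("0.0", CultureInfo.InvariantCulture));
        }
        // Prints 2.0, 0.0, 2.0 on separate lines.
    }
}
```

Looking each token up with `TryGetValue` keeps the work proportional to the sentence length, whereas looping over the whole dictionary for every sentence grows with the size of words.txt.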
answered 2013-06-13T20:53:16.190