
I'm trying to parse a fairly complicated but structured file using C++.

011 FirstName MiddleName LastName age(int) date(4/6/2001) position status ...
012 FirstName MiddleName LastName age(int) date(4/6/2001) position status ...
...

That's what the file format looks like. I'm trying to store each record in the individual fields of a struct, but the first, middle, and last names vary in length, and the middle name may be missing entirely. How would you distinguish those cases?

For example,

014 Jon Smith ...
015 Jon J Smith, Jr. ...

I want to store the whole name in a single name field rather than splitting it into parts. Say we have

struct{
    std::string name;
    int id;
    int age;
    std::string position;
    ...
};

How would I go about parsing everything?


2 Answers


For your purposes, if you're using C++11, you could adapt the std::regex match example to accomplish what you want.

If you're not, you can use boost::regex instead.

Here's an example of a regular expression you could use:

^\d+ (\w+) ?(\w*) (\w+),? ?(\w+\.)? age\((\d+)\) date\((\d\/\d\/\d+)\) (\w+) (\w+)

To find out what that regular expression means and how it matches things, check out this link.

To learn more about regular expressions, I'd highly recommend this book by Jeffrey Friedl (Mastering Regular Expressions).

It would match the following:

014 Jon Smith age(32) date(4/6/2001) position status
014 Jon J Smith, Jr. age(16) date(4/6/2001) position status
015 FirstName MiddleName LastName, Title. age(45) date(4/6/2001) position status
016 FirstName MiddleName LastName age(7) date(4/6/2001) position status
039 FirstName MiddleName LastName age(100) date(4/6/2001) position status
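
As a rough sketch of how that could look with std::regex: the code below uses a variant of the expression above, not the exact one, so that the id is captured and the whole name lands in a single group, which is what the question asks for. It assumes the file literally contains the `age(...)` and `date(...)` wrappers shown in the sample lines, and it uses `\d+` for month and day so two-digit values are accepted. The `Record` type and the `parseLine` helper are hypothetical names made up for this illustration.

```cpp
#include <regex>
#include <string>

// Hypothetical record type; fields beyond those in the question are assumptions.
struct Record {
    int id = 0;
    std::string name;
    int age = 0;
    std::string date;
    std::string position;
    std::string status;
};

// Tries to parse one line of the assumed layout:
//   <id> <name words>[,][ Suffix.] age(<int>) date(<m/d/y>) <position> <status> ...
// Returns true and fills `out` on success; leaves `out` untouched otherwise.
bool parseLine(const std::string& line, Record& out) {
    static const std::regex pattern(
        R"re(^(\d+) (\w+(?: \w+)*,?(?: \w+\.)?) age\((\d+)\) date\((\d+/\d+/\d+)\) (\w+) (\w+))re");

    std::smatch m;
    // regex_search (not regex_match) so that trailing fields are ignored.
    if (!std::regex_search(line, m, pattern)) {
        return false;
    }
    out.id = std::stoi(m[1].str());   // "014" is read as decimal 14
    out.name = m[2].str();            // whole name, including any ", Jr." part
    out.age = std::stoi(m[3].str());
    out.date = m[4].str();            // kept as text; splitting it is left out here
    out.position = m[5].str();
    out.status = m[6].str();
    return true;
}
```

Calling `parseLine` on each line read from the file and pushing the successful records into a `std::vector<Record>` would cover the basic case. Names containing characters outside `\w` (hyphens, apostrophes) would need a wider character class.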
answered 2013-02-01T00:56:30.673

Well, you could simply use an fstream and take it one word at a time: first into an int (the id), then into a string, and keep going until the next value is an int (the age). If I recall correctly from my infinite loops, a stream input to an int when the next word isn't a number doesn't take the word from the stream; it puts the stream into a failed state, so you have to call .clear() before reading that word as a string. So you could do >> int, >> string, >> int, and so on until you know you have the age.

etc etc. you get the point :)
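
Here is a minimal sketch of that word-at-a-time idea. It assumes the age appears as a bare integer right after the name (not wrapped as `age(32)`), and that no word in the name starts with a digit or a sign, since such a word would be partly read as a number. The `Entry` type and the `parseTokens` helper are hypothetical names used only for illustration.

```cpp
#include <sstream>
#include <string>

// Hypothetical holder for the leading fields of one line.
struct Entry {
    int id = 0;
    std::string name;
    int age = 0;
};

// Reads "<id> <name words...> <age>" from the start of `line`.
// Returns false if the id is missing or no age is found after the name.
bool parseTokens(const std::string& line, Entry& out) {
    std::istringstream in(line);
    if (!(in >> out.id)) {
        return false;
    }
    out.name.clear();
    while (true) {
        int maybeAge = 0;
        if (in >> maybeAge) {
            // First token that parses as an int is taken to be the age.
            out.age = maybeAge;
            return !out.name.empty();
        }
        if (in.eof()) {
            return false;  // ran out of input before finding an age
        }
        in.clear();        // reset the fail state; the word is still unread
        std::string word;
        if (!(in >> word)) {
            return false;
        }
        if (!out.name.empty()) {
            out.name += ' ';
        }
        out.name += word;  // accumulate the whole name in one field
    }
}
```

The remaining fields could be read from the same stream with further `>>` calls once the age has been found; this sketch stops at the age to keep the example short.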

PS: remember to test the read itself (the result of .get(), >> or getline) in your input loops, and not .eof() :)

answered 2013-01-31T22:36:43.903