I need help with a regex in C#.
A group of words in parentheses (round, square, or curly) should be treated as a single word. The part outside the parentheses should be split on whitespace (' ').
A) Test Case –
Input - Andrew. (The Great Musician) John Smith-Lt.Gen3rd
Result (array of strings) –
1. Andrew.
2. The Great Musician
3. John
4. Smith-Lt.Gen3rd
B) Test Case –
Input - Andrew. John
Result (array of strings) –
1. Andrew.
2. John
C) Test Case –
Input - Andrew {The Great} Pirate
Result (array of strings) –
1. Andrew
2. The Great
3. Pirate
The input is the name of a person or some other entity. The current system is very old and written in Access; it does the split by scanning the input character by character. I am replacing it with C#.
I thought of doing it in two steps – first a split on the parenthesized groups, then a word split on whatever is left outside them.
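To make the two-step idea concrete, here is a rough sketch of what I have in mind, not working production code. The names `NameSplitter`, `Split`, and `AddWords` are made up for illustration. It assumes the groups are flat (no parentheses inside a group) and it does no validation, so malformed input would simply leave stray parentheses inside the words.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper, names are illustrative only.
public static class NameSplitter
{
    // One parenthesized group: (...), [...] or {...}, with no bracket characters inside.
    private static readonly Regex Group = new Regex(
        @"\([^()\[\]{}]*\)|\[[^()\[\]{}]*\]|\{[^()\[\]{}]*\}");

    public static List<string> Split(string input)
    {
        var tokens = new List<string>();
        int pos = 0;
        foreach (Match m in Group.Matches(input))
        {
            // Text before this group is outside parentheses: split it on spaces.
            AddWords(tokens, input.Substring(pos, m.Index - pos));
            // The group itself becomes one token, without its delimiters.
            tokens.Add(m.Value.Substring(1, m.Value.Length - 2).Trim());
            pos = m.Index + m.Length;
        }
        // Whatever follows the last group is also split on spaces.
        AddWords(tokens, input.Substring(pos));
        return tokens;
    }

    private static void AddWords(List<string> tokens, string text)
    {
        tokens.AddRange(text.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries));
    }
}
```

With test case A, `NameSplitter.Split("Andrew. (The Great Musician) John Smith-Lt.Gen3rd")` should give the four strings listed above.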
I want to reject these cases as bad input -
Only an opening or only a closing parenthesis is present
Nested parentheses
Overall, I want to split only well-formed input: if there is an opening parenthesis, there must be a matching closing one.
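The well-formedness check I am picturing could look something like the sketch below. `BracketCheck` and `IsWellFormed` are hypothetical names. Two points here are my own assumptions rather than firm requirements: a group must be closed by the same kind of parenthesis that opened it (so `(abc]` is rejected), and any opening parenthesis inside an open group counts as nesting, even if it is of a different kind.

```csharp
using System;

// Hypothetical validator, names are illustrative only.
public static class BracketCheck
{
    // Returns true when every opening parenthesis is closed by its matching
    // closing parenthesis and no group is opened inside another group.
    public static bool IsWellFormed(string input)
    {
        const string openers = "([{";
        const string closers = ")]}";
        char expected = '\0'; // closing character we are waiting for, '\0' if none

        foreach (char c in input)
        {
            int open = openers.IndexOf(c);
            if (open >= 0)
            {
                if (expected != '\0') return false; // nested group (assumption: any kind)
                expected = closers[open];
            }
            else if (closers.IndexOf(c) >= 0)
            {
                if (c != expected) return false; // stray or mismatched closing parenthesis
                expected = '\0';
            }
        }
        return expected == '\0'; // false when a group was opened but never closed
    }
}
```

The idea would be to call `IsWellFormed` first and only split the input when it returns true.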