You can use a regular expression:
void Main()
{
    string ex = "aabbbbchhhhaaaacc";
    var re = new Regex(@"(.)\1*");
    var sequences =
        from Match ma in re.Matches(ex)
        let value = ma.Value
        group value by value[0] into g
        select g.Aggregate((s1, s2) => s1 + s2);
    foreach (var sequence in sequences)
        Debug.WriteLine(sequence);
}
You can test this in LINQPad and it'll output exactly what you want.
Here's an explanation of the expression:
(.)\1*
^-^^-^
 |  |
 |  +- says "0 or more times of the exact same character as in group 1"
 +- first group (captures any single character)
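To see what the expression matches on its own, before any grouping, here is a minimal sketch that just lists the raw matches. The class and method names (`RunMatcher`, `Runs`) are made up for illustration, and it is written as a plain console program rather than a LINQPad query:

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class RunMatcher
{
    // Hypothetical helper: returns each run of consecutive identical
    // characters, in order of appearance.
    public static string[] Runs(string input) =>
        Regex.Matches(input, @"(.)\1*")
             .Cast<Match>()          // MatchCollection is non-generic on older frameworks
             .Select(m => m.Value)
             .ToArray();

    public static void Main()
    {
        Console.WriteLine(string.Join(", ", Runs("aabbbbchhhhaaaacc")));
        // prints: aa, bbbb, c, hhhh, aaaa, cc
    }
}
```

Note that "a" and "c" each show up in two separate matches, which is why a grouping step is still needed afterwards.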
The LINQ query will do this:
- Find each run of consecutive identical characters (one regex match per run)
- Pick out the first (possibly the only) character of each run
- Group all the runs by their first character (so that all runs of "a"s are in one group)
- Concatenate the runs in each group to get each complete sequence
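The steps above can also be sketched in method syntax, which makes each stage explicit. This is only an illustration: the names `RegexGrouping` and `MergeRuns` are hypothetical, and `string.Concat` is swapped in for the `Aggregate` call as an equivalent way to join the runs.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class RegexGrouping
{
    // Hypothetical helper: merges all runs of the same character into one string.
    public static List<string> MergeRuns(string input)
    {
        return Regex.Matches(input, @"(.)\1*")
            .Cast<Match>()
            .Select(m => m.Value)           // each run, e.g. "aa", "bbbb", "c"
            .GroupBy(run => run[0])         // group the runs by their first character
            .Select(g => string.Concat(g))  // join the runs within each group
            .ToList();
    }

    public static void Main()
    {
        foreach (var sequence in MergeRuns("aabbbbchhhhaaaacc"))
            Console.WriteLine(sequence);
        // prints aaaaaa, bbbb, ccc, hhhh (one per line)
    }
}
```

`GroupBy` yields the groups in the order in which each key first appears, so the output follows the order of first occurrence in the input.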
Now, the above solution works, though there are other ways to do this. If the pattern to detect were more complicated than just "the same character N times", a regular expression might be the best way.
However, you can exploit the fact that it is always the same character:
void Main()
{
    string ex = "aabbbbchhhhaaaacc";
    var groups = ex
        .ToCharArray()
        .GroupBy(c => c);
    foreach (var group in groups)
        Debug.WriteLine(new string(group.Key, group.Count()));
}
This simply creates groups containing all occurrences of each character, where group.Key is the character and group.Count() is the number of occurrences, and then constructs a new string from each group.