
I have a regex that parses a TNS names file, but it hangs on certain tnsnames files. I have tracked the problem down to whether the string being matched has a space after the equals sign in the HOST= part. Setting aside the appropriateness of the pattern and how to fix the issue (that has been dealt with), what I want to know is why this change in input causes the application to hang: the regex.Match(invalid) call never returns.

string valid = "SOMENAME = (DESCRIPTION= " + 
                "(ADDRESS= (PROTOCOL=TCP) (HOST = localhost) (PORT=1521) ) " + 
                "(CONNECT_DATA= (SERVICE_NAME=ABC)))";

string invalid = "SOMENAME = (DESCRIPTION= " + 
                "(ADDRESS= (PROTOCOL=TCP) (HOST =localhost) (PORT=1521) ) " + 
                "(CONNECT_DATA= (SERVICE_NAME=ABC)))";

Regex regex = new Regex("SOMENAME" + @"[^=]*=(\s|[^H]*)*HOST\s*=\s(?<host>[^\)]*)\s*\)", RegexOptions.Multiline | RegexOptions.IgnoreCase);

// This line is fine.
Match match = regex.Match(valid);

// This line causes Visual Studio to hang.
match = regex.Match(invalid);

1 Answer


This is definitely caused by catastrophic backtracking, and the culprit is

(\s|[^H]*)*

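since \s and [^H] can match the same characters, and since you are nesting two unbounded quantifiers. Any run of text before HOST can therefore be split between the two alternatives, and across iterations of the outer *, in a huge number of ways. When the space after = is present, the first attempt succeeds; when it is missing, the \s after = can never match, so the engine tries every one of those splits before it can report failure.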
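One way to observe this without freezing the debugger is to give the regex a match timeout. The sketch below is only an illustration: the class BacktrackingProbe, the helper Probe, and the two-second limit are made up for this example, and it assumes a .NET version that has the Regex constructor overload taking a TimeSpan (4.5 or later). The valid input matches on the first attempt; on a backtracking engine the invalid input would be expected to end in a timeout instead of hanging.

```csharp
using System;
using System.Text.RegularExpressions;

public static class BacktrackingProbe
{
    // The pattern from the question, unchanged.
    private const string Pattern =
        "SOMENAME" + @"[^=]*=(\s|[^H]*)*HOST\s*=\s(?<host>[^\)]*)\s*\)";

    // Hypothetical helper: reports how a single match attempt ends.
    public static string Probe(string input, TimeSpan timeout)
    {
        // The three-argument constructor makes the engine give up after 'timeout'.
        var regex = new Regex(
            Pattern, RegexOptions.Multiline | RegexOptions.IgnoreCase, timeout);
        try
        {
            return regex.Match(input).Success ? "matched" : "no match";
        }
        catch (RegexMatchTimeoutException)
        {
            // Thrown when the match attempt exceeds the configured timeout.
            return "timed out";
        }
    }

    public static void Main()
    {
        string valid = "SOMENAME = (DESCRIPTION= " +
                       "(ADDRESS= (PROTOCOL=TCP) (HOST = localhost) (PORT=1521) ) " +
                       "(CONNECT_DATA= (SERVICE_NAME=ABC)))";
        // Same text, but without the space after the equals sign.
        string invalid = valid.Replace("HOST = localhost", "HOST =localhost");

        TimeSpan limit = TimeSpan.FromSeconds(2); // arbitrary limit for the demo
        Console.WriteLine("valid:   " + Probe(valid, limit));
        Console.WriteLine("invalid: " + Probe(invalid, limit));
    }
}
```

A timeout does not remove the backtracking; it only bounds how long a failing match is allowed to run.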

[^H]* on its own matches exactly the same text and is not prone to this backtracking, so try this:

Regex regex = new Regex("SOMENAME" + @"[^=]*=([^H]*)HOST\s*=\s(?<host>[^\)]*)\s*\)", RegexOptions.Multiline | RegexOptions.IgnoreCase);
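As a quick check, the pattern above can be run against both strings from the question. This is a minimal sketch: the class FixedPatternCheck and the helper TryExtractHost are hypothetical names introduced here. Because the pattern still requires whitespace after the equals sign, the second string is expected to fail to match, but it now fails promptly rather than hanging.

```csharp
using System;
using System.Text.RegularExpressions;

public static class FixedPatternCheck
{
    // The pattern from the answer, with the nested quantifier removed.
    private static readonly Regex Fixed = new Regex(
        "SOMENAME" + @"[^=]*=([^H]*)HOST\s*=\s(?<host>[^\)]*)\s*\)",
        RegexOptions.Multiline | RegexOptions.IgnoreCase);

    // Hypothetical helper: true and the captured host on success, false otherwise.
    public static bool TryExtractHost(string input, out string host)
    {
        Match match = Fixed.Match(input);
        host = match.Success ? match.Groups["host"].Value : string.Empty;
        return match.Success;
    }

    public static void Main()
    {
        string valid = "SOMENAME = (DESCRIPTION= " +
                       "(ADDRESS= (PROTOCOL=TCP) (HOST = localhost) (PORT=1521) ) " +
                       "(CONNECT_DATA= (SERVICE_NAME=ABC)))";
        // Same text, but without the space after the equals sign.
        string invalid = valid.Replace("HOST = localhost", "HOST =localhost");

        foreach (string input in new[] { valid, invalid })
        {
            string host;
            Console.WriteLine(TryExtractHost(input, out host) ? host : "(no match)");
        }
    }
}
```

If inputs without a space after = should also be accepted, the \s after = would presumably need to become \s*; that is an assumption about the intended behavior, not part of the pattern above.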
answered 2012-05-01T08:38:36.817