c# - Using Regex to find values after a specific string of variable length and drop into array

Question

I'm trying to parse text files that contain several pieces of data, which I need to add to a growing array. Specifically, I need to find and store the measurements that follow the string "P1".

The values can be positive or negative (and I need the sign to be kept with each value), so I can't just set the regex to grab a fixed number of characters after the string, because the length may vary. Also, the section named "Error" may contain more lines depending on the text file...

Mind you, I have never used regex for anything other than finding specific text and returning true or false, so I'm very new at this. I was thinking that something could be done to find and store the text occurring after "P1:", but I don't know how to express it.

Any help would be great...I am lost and learning...but it's not happening as fast as I want it to and I'm a little stuck.

Thanks! I appreciate the help.

Elle

score 0 · Accepted Answer

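It depends on the form of the output the machine generates. If it is uniform, you probably only need to find the line ^[^\S\n]*P1:. The pattern can be extended to use landmarks, such as the text around it, which further disambiguates the line's position.

Modifiers to use: no dot-all ('.' matches everything except a newline) and multiline ('^' matches at the start of a line, '$' at the end of a line).

This assumes you read the entire file into a string. With modern memory sizes there is no reason not to. If you want to process the file line by line instead, split the regex into an alternation in which each alternative is a level that eventually leads to a valid line: Centration: (level1), E0: (level2), P1: (level3). The line is valid if level1 && level2 && level3. Whatever.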
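For the line-by-line route, one possible reading of the "levels" idea is a small state tracker. The C# sketch below is only an assumption about how that could look: the class name, method name, group names and sample lines are all hypothetical. It is also slightly looser than the whole-file regex, since it tolerates whitespace-only lines between the three levels.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

static class LineByLineParser
{
    // One alternation; the named group that participates identifies the level.
    static readonly Regex LineRx = new Regex(
        @"^[^\S\n]*(?:(?<l1>Centration:)|(?<l2>E0:)|(?<l3>P1:)(?<rest>.*))");

    // Returns the text after "P1:" for every P1 line preceded by Centration and E0 lines.
    public static List<string> P1Payloads(IEnumerable<string> lines)
    {
        var found = new List<string>();
        bool level1 = false, level2 = false;
        foreach (string line in lines)
        {
            Match m = LineRx.Match(line);
            if (!m.Success)
            {
                // Blank lines are tolerated; any other line breaks the sequence.
                if (!string.IsNullOrWhiteSpace(line)) level1 = level2 = false;
                continue;
            }
            if (m.Groups["l1"].Success) { level1 = true; level2 = false; }
            else if (m.Groups["l2"].Success) { level2 = level1; }
            else
            {
                // P1 line (level3): valid only if both earlier levels were seen.
                if (level1 && level2) found.Add(m.Groups["rest"].Value.Trim());
                level1 = level2 = false;
            }
        }
        return found;
    }

    static void Main()
    {
        // Hypothetical sample lines; File.ReadLines(path) would supply real ones.
        string[] lines = { "Centration: ok", "E0: 0.00", "P1: 1.25, -0.40, 0.15", "Error: none" };
        foreach (string payload in P1Payloads(lines))
            Console.WriteLine(payload);
    }
}
```

The payload is returned as raw text here; splitting it into numbers can reuse the same number sub-pattern as the whole-file regex.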

Compressed

^[^\S\n]*Centration:.*\n+^[^\S\n]*E0:.*\n+^[^\S\n]*P1:[^\S\n]*[+-]?[\d.]+[^\S\n]*,[^\S\n]*([+-]?[\d.]+)[^\S\n]*,[^\S\n]*([+-]?[\d.]+)[^\S\n]*$

Expanded (requires the extended / ignore-pattern-whitespace option)

^ [^\S\n]*  Centration: .* \n+
^ [^\S\n]*  E0: .* \n+
^ [^\S\n]*  P1: [^\S\n]*  [+-]?[\d.]+  [^\S\n]* ,   # distance
                [^\S\n]* ([+-]?[\d.]+) [^\S\n]* ,   # x  - capture grp 1
                [^\S\n]* ([+-]?[\d.]+) [^\S\n]* $   # y  - capture grp 2
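In C#, the compressed pattern could be applied to the whole file roughly as follows. This is a sketch under assumptions, not the answerer's code: the class and method names and the sample text are hypothetical, and it assumes the numbers use '.' as the decimal separator. `RegexOptions.Multiline` supplies the multiline behaviour, and leaving out `RegexOptions.Singleline` gives the no-dot-all behaviour.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Text.RegularExpressions;

static class P1BlockParser
{
    // The compressed pattern, copied verbatim.
    const string Pattern =
        @"^[^\S\n]*Centration:.*\n+^[^\S\n]*E0:.*\n+^[^\S\n]*P1:[^\S\n]*[+-]?[\d.]+[^\S\n]*,[^\S\n]*([+-]?[\d.]+)[^\S\n]*,[^\S\n]*([+-]?[\d.]+)[^\S\n]*$";

    // Returns one (x, y) pair per Centration/E0/P1 block found in the text.
    public static List<(double X, double Y)> Parse(string text)
    {
        var results = new List<(double X, double Y)>();
        foreach (Match m in Regex.Matches(text, Pattern, RegexOptions.Multiline))
        {
            // [\d.]+ also accepts malformed text such as "1.2.3", so parse defensively.
            if (double.TryParse(m.Groups[1].Value, NumberStyles.Float, CultureInfo.InvariantCulture, out double x) &&
                double.TryParse(m.Groups[2].Value, NumberStyles.Float, CultureInfo.InvariantCulture, out double y))
            {
                results.Add((x, y));
            }
        }
        return results;
    }

    static void Main()
    {
        // Hypothetical sample; a real program might call File.ReadAllText(path) instead.
        string text = "Centration: ok\nE0: 0.00\n  P1: 1.25, -0.40, +0.15\nError: none\n";
        foreach (var (x, y) in Parse(text))
            Console.WriteLine(x.ToString(CultureInfo.InvariantCulture) + ", " + y.ToString(CultureInfo.InvariantCulture));
    }
}
```

Because `[^\S\n]` also matches a carriage return, the pattern should tolerate Windows line endings as well; the list grows with each match, which fits the "growing array" requirement.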

score 0 · Accepted Answer

This matches the P1 line and captures the three numeric values:

P1: (-?\d+\.\d+), (-?\d+\.\d+), (-?\d+\.\d+)

(…) is a capturing group
-? matches an optional -
\d+ matches one or more digits (0-9)
\. matches a literal .
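A minimal C# sketch of how this pattern could fill an array, one row of three values per P1 line. The class name, method name and sample text are hypothetical. Note that the pattern as written expects exactly one space after each separator and a decimal point in every number.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Text.RegularExpressions;

static class P1Triples
{
    static readonly Regex P1Rx = new Regex(@"P1: (-?\d+\.\d+), (-?\d+\.\d+), (-?\d+\.\d+)");

    // Returns one double[3] per P1 line; negative values keep their sign.
    public static double[][] Extract(string text)
    {
        var rows = new List<double[]>();
        foreach (Match m in P1Rx.Matches(text))
        {
            var row = new double[3];
            for (int i = 0; i < 3; i++)
            {
                // Groups[0] is the whole match, so the captures start at index 1.
                row[i] = double.Parse(m.Groups[i + 1].Value, CultureInfo.InvariantCulture);
            }
            rows.Add(row);
        }
        return rows.ToArray();
    }

    static void Main()
    {
        // Hypothetical sample text.
        string text = "P1: 0.10, -0.05, 0.02\nError: none\nP1: -1.50, 2.25, -3.00\n";
        foreach (double[] row in Extract(text))
            Console.WriteLine(string.Join(" | ", row));
    }
}
```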

score 0 · Accepted Answer

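Here is your regex: ^\s*P1:\s+[\-\d\.]+,\s+([\-\d\.]+,\s+[\-\d\.]+)\s*$

Let's break it down so you can learn from this example:

^ marks the start of the line

\s* stands for any leading whitespace characters, which you want to ignore but which may be present (just good practice)

P1: is what you are looking for

\s+ allows any amount of whitespace (at least one character) between P1: and the number that follows

[\-\d\.]+ is the simplest way to detect your numbers. I am assuming the input provided here is well formed; otherwise I would make it a bit more complex, like \-?\d+(\.\d+)? (this matches numbers such as 12, -1, 12.01, -11.21 and -0)

,\s+ is the comma after the first number, followed by one or more whitespace characters

([\-\d\.]+,\s+[\-\d\.]+) captures the second and third numbers, which are the ones you are looking for; they are separated by a comma and some whitespace.

If you are sure there is always exactly one space there, you may not need \s+. Use a literal space instead, like this: ([\-\d\.]+, [\-\d\.]+)

\s* lets you ignore any trailing whitespace.

$ marks the end of the line

Here is your code (Perl):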
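The difference between the simple class `[\-\d\.]+` and the stricter `\-?\d+(\.\d+)?` from the walkthrough can be tried out in C# with a sketch like the one below. It is an illustration under assumptions: the class name, method name and sample text are hypothetical, and the optional fraction is written as a non-capturing group `(?:…)` so that the second and third numbers remain in capture group 1.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

static class P1LinePattern
{
    const string Loose = @"[\-\d\.]+";           // the simple class from the walkthrough
    const string Strict = @"\-?\d+(?:\.\d+)?";   // the stricter alternative, non-capturing

    // Assembles ^\s*P1:\s+N,\s+(N,\s+N)\s*$ for the chosen number sub-pattern N.
    static Regex Build(string number) => new Regex(
        @"^\s*P1:\s+" + number + @",\s+(" + number + @",\s+" + number + @")\s*$",
        RegexOptions.Multiline);

    // Returns the second and third values of every P1 line, kept as strings
    // so the sign is preserved exactly as written in the file.
    public static List<string[]> LastTwo(string text, bool strict)
    {
        var pairs = new List<string[]>();
        foreach (Match m in Build(strict ? Strict : Loose).Matches(text))
            pairs.Add(Regex.Split(m.Groups[1].Value, @",\s+"));
        return pairs;
    }

    static void Main()
    {
        // Hypothetical sample: the second line is deliberately malformed.
        string text = "P1: 12, -1, 12.01\nP1: 1.0, -.-, 2..3\n";
        Console.WriteLine("loose matches:  " + LastTwo(text, false).Count);
        Console.WriteLine("strict matches: " + LastTwo(text, true).Count);
    }
}
```

The loose form is fine when the input is known to be well formed; the strict form is the safer choice when the files may contain stray dots or dashes.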

while (<>)
{
    chomp;
    s/^\s+|\s+$//g;
    print "$1\n" if ($_ =~ m/^\s*P1:\s+[\-\d\.]+,\s+([\-\d\.]+,\s+[\-\d\.]+)\s*$/);
}

To make it a bit more generic, turn P1 into a parameter:

my $pattern="P1"; # or $pattern = shift;

while (<>)
{
    chomp;
    # \Q...\E quotes regex metacharacters in case the label contains any
    print "$1\n" if ($_ =~ m/^\s*\Q$pattern\E:\s+[\-\d\.]+,\s+([\-\d\.]+,\s+[\-\d\.]+)\s*$/);
}

Enjoy!
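Since the question is about C#, the parameterised version might be sketched there as follows. This is an assumed port rather than the answerer's code: the class name, method name and sample text are hypothetical. `Regex.Escape` plays the role of quoting the label so that a label containing regex metacharacters is matched literally.

```csharp
using System;
using System.Text.RegularExpressions;

static class LabelledValues
{
    // Builds the pattern for an arbitrary label such as "P1" or "P2".
    public static Regex ForLabel(string label) => new Regex(
        @"^\s*" + Regex.Escape(label) + @":\s+[\-\d\.]+,\s+([\-\d\.]+,\s+[\-\d\.]+)\s*$",
        RegexOptions.Multiline);

    static void Main(string[] args)
    {
        // The label comes from the command line when given, otherwise defaults to P1.
        string label = args.Length > 0 ? args[0] : "P1";

        // Hypothetical sample text.
        string text = "P1: 1.25, -0.40, 0.15\nP2: 2.00, 0.10, -0.30\n";
        foreach (Match m in ForLabel(label).Matches(text))
            Console.WriteLine(m.Groups[1].Value);
    }
}
```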
