The condition NR~2
is checking whether the record number, NR, matches 2. For a 2 or 3 line input file, the expression is equivalent to:
if (NR == 2)
Similarly with NR~3
, of course. Try:
awk '/2/'
That will print all lines where the text of the line ($0
) contains a 2. By default, a regular expression matches against the whole line; you could limit it to a particular field with $3 ~ /3/
, for example.
An awk
program consists of patterns and actions, where either the pattern or the action may be omitted.
awk '{ if ($0 ~ /2/) print }
/2/
/2/ { if ($0 ~ /a.*z/) print "Matches a.*z"; }'
The first line has no pattern; the action in the { ... }
is executed for each input line (but only some input lines will generate output because of the conditional. All lines that contain a 2 will be printed. (If there is no argument to print
, it prints $0
followed by a newline.)
The second line has a pattern but no action; all lines that contain a 2 will be printed again. (The missing action is equivalent to { print }
.)
The third line has both a pattern and an action; all lines that both contain a 2 and also contain an 'a' followed by a 'z' will be remarked upon.
How are these two commands different?
`awk '{if(NR~2)print}' sample.txt`
`awk '{if(NR~/^2$/)print}' sample.txt`
The first command will print line numbers 2, 12, 20..29, 32, 42, ... 102, 112, 120..129, ... 200..299, ...; all lines where the line number contains a 2.
The second command will print only line number 2 because the /^2$/
constrains the value to contain start of string, digit 2 and end of string.
I take it that means that the source is wrong?
Now I've looked at the YouTube resource, I think you must have misunderstood what it is trying to teach. When it talks about {if (NR~2) print}
, it should be saying it will print any line number which contains a 2; the video cites line numbers 2, 12, 20, 21, 22, etc. It should not be saying any line which contains a 2; I think the video does say that, but the video misspoke (but the text was accurate). The comparison against NR is not actually wrong, but it is aconventional; I'm not sure that I'd include regexes against NR in an introductory video describing awk
. So, the video appears to have a glitch in the audio but the text on screen is accurate, I think. I may still have missed something.
The command awk '{ if ($0 ~ /2/) print }
against the file say sample.txt
with the contents I mentioned would only result in the output 2 will be matched. Is that correct?
That command, given the input:
2 will be matched
This will not match 2
2 may be matched
will print all three lines; they all contain the digit 2.
I also thought that the action was print
and the pattern was $0 ~ /2/
.
No; the pattern was empty (because there was nothing before the open brace) — so all lines match it — and the action was the part in braces { if ($0 ~ /2/) print }
. Now, the action contains a conditional, but that's a separate issue.
Now the command awk '/2/' sample.txt
would print all three lines. Is that correct?
Yes.