It seems that Perl does not recognize the accented Ó as uppercase:

#!/usr/bin/env perl
use strict;
use warnings;
use 5.14.0;
use utf8;
use feature 'unicode_strings';

" SIMÓN " =~ /^\s+(\p{Upper}+)/u;
print "$1\n";

prints

SIM

Perl should be able to use the Unicode data, which already tags Ó as uppercase. From Emacs describe-char:

character code properties: customize what to show
  name: LATIN CAPITAL LETTER O WITH ACUTE
  old-name: LATIN CAPITAL LETTER O ACUTE
  general-category: Lu (Letter, Uppercase)
  decomposition: (79 769) ('O' '́')
1 Answer

9

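You're missing use open ':std', ':locale'; to properly encode your output. Without an encoding layer on STDOUT, print writes Ó as the single byte 0xD3, which is not valid UTF-8 on its own, so a UTF-8 terminal cannot display it correctly.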

If that doesn't work, your file isn't encoded using UTF-8 even though use utf8; tells Perl it is.
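
For reference, here is a minimal sketch of the script with that pragma added. It assumes the source file is saved as UTF-8 and that the terminal's locale is a UTF-8 one. The helper name first_upper_run is hypothetical, introduced only so the match result can be checked before it is printed.

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use 5.14.0;                   # also enables the unicode_strings feature
use utf8;                     # the source code itself is UTF-8 (assumed)
use open ':std', ':locale';   # encode STDIN/STDOUT/STDERR per the current locale

# Hypothetical helper: returns the first run of uppercase letters that
# follows leading whitespace, or undef if the pattern does not match.
sub first_upper_run {
    my ($text) = @_;
    return $text =~ /^\s+(\p{Upper}+)/ ? $1 : undef;
}

my $word = first_upper_run(" SIMÓN ");
print "$word\n" if defined $word;
```

Under a UTF-8 locale this should print the whole word, accented letter included. If the locale is not UTF-8 (for example C or POSIX), the output layer may be unable to encode Ó, which is a locale problem rather than a regex one.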

answered 2012-06-05T05:13:23.333