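If I get a string in a Perl method, but I don't know at that point if it is in a specific encoding or not and want to convert it to a specific encoding, how do I do that?
For example, something like the following (it could just as well be UTF-8 instead of ISO8859):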

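sub func {
  my ($arg) = @_;   # list assignment; "my $arg = @_" would store the argument count
  if ($arg not ISO8859) {   # pseudocode: "if $arg is not already ISO-8859 encoded"
     $arg = Encode::encode("ISO-8859-1", $arg);
  }
  # use $arg
}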

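Update:
Is the following correct? (The intention is that regardless of what $arg was passed to the method, I make it utf8 and then encode it to iso8859, and so get a single representation regardless of input.)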

$arg = Encode::decode("utf8", $arg);
$arg = Encode::encode("iso-8859-1", $arg);

The perldoc seems to say that what I need is covered.

1 Answer

Is the byte 80 (hex) a € (as in Windows-1252) or a Ђ (as in Windows-1251)? Is it even text?

You have to decode inputs in order to do anything with them, and you have to know an input's encoding to decode it.
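
To make the ambiguity concrete, here is a minimal sketch that decodes the same single byte, 0x80, under two different assumed encodings. The choice of cp1252 and cp1251 is only an illustration; nothing in the byte itself says which one, if either, is right.

```perl
use strict;
use warnings;
use Encode qw(decode);

# One raw byte with no information about its encoding.
my $byte = "\x80";

# The same byte decodes to different characters depending on
# which encoding we assume it was written in.
my $as_cp1252 = decode('cp1252', $byte);   # U+20AC EURO SIGN
my $as_cp1251 = decode('cp1251', $byte);   # U+0402 CYRILLIC CAPITAL LETTER DJE

printf "cp1252: U+%04X\n", ord $as_cp1252;   # prints "cp1252: U+20AC"
printf "cp1251: U+%04X\n", ord $as_cp1251;   # prints "cp1251: U+0402"
```

Both calls succeed, which is exactly the problem: decoding cannot tell you that you picked the wrong encoding.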


"I don't know at that point if it is a specific encoding or not and want to convert it to a specific encoding how do I do that?"

Generally speaking, you can't. How do you expect to instruct decode how to decode it if you don't know what it is?

At best you can use heuristics. The more you know about the input, the better heuristics you can use.

For example, if you know a string is encoded with either UTF-8 or iso-8859-1, then you could guess nearly perfectly which one it is. In fact, you could even decode a file that's a mix of both!
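
A minimal sketch of that heuristic, assuming the input really is one of those two encodings: try a strict UTF-8 decode first and fall back to iso-8859-1 if it fails. The helper names decode_utf8_or_latin1 and to_latin1 are hypothetical, made up for this illustration.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Hypothetical helper: guess between UTF-8 and ISO-8859-1 only.
# Assumes $bytes holds raw octets in one of those two encodings.
sub decode_utf8_or_latin1 {
    my ($bytes) = @_;

    # Work on a copy: with a CHECK argument, decode may modify its input.
    my $copy = $bytes;

    # Strict UTF-8 decoding dies on malformed input (FB_CROAK).
    my $text = eval { decode('UTF-8', $copy, Encode::FB_CROAK()) };
    return $text if defined $text;

    # Not valid UTF-8, so fall back to ISO-8859-1, which accepts any byte.
    return decode('ISO-8859-1', $bytes);
}

# Hypothetical helper: normalize either input to ISO-8859-1 octets.
# Characters outside ISO-8859-1 are replaced by encode's default
# substitution character.
sub to_latin1 {
    my ($bytes) = @_;
    return encode('ISO-8859-1', decode_utf8_or_latin1($bytes));
}

# "é" as UTF-8 (two bytes) and as ISO-8859-1 (one byte) both
# normalize to the single ISO-8859-1 byte 0xE9.
printf "%vX\n", to_latin1("\xC3\xA9");   # prints "E9"
printf "%vX\n", to_latin1("\xE9");       # prints "E9"
```

This is still a guess. The bytes C3 A9 are also valid iso-8859-1 (they would read "Ã©"), and the sketch always prefers the UTF-8 reading. For a file that mixes both encodings, the same helper could be applied line by line.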

"Is the following correct? (the intention is that regardless of what is the $arg that was passed in the method I make it utf8 and then I encode it to iso8859 and get a single representation regardless of input)"

No. Those two lines only work when $arg holds text encoded using UTF-8. You can't decode something without knowing the encoding that was used to encode it.

answered 2013-06-21T06:15:54.927