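I have installed Eggdrop on a new Debian server with Tcl 8.5 and the latest version of eggdrop. Unfortunately, my scripts have trouble handling special characters such as é and the apostrophe in J'aime.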

An example probably shows it best:

13:41 <@me> test
13:41 <@me> !tr nl This is a test
13:41 < bot> Dit is een test
13:41 <@me> !tr fr I am a stranger
13:41 < bot> Je suis un étranger
13:41 <@me> !tr fr I love you
13:42 < bot> Je t&#39;aime

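I have added a line that converts the output to utf-8, and eggdrop itself is also running in utf-8. That seems to make étranger readable in my IRC client, but most other characters (Chinese, Arabic) do not come out even close. The Tcl code is as follows: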

namespace eval gTranslator {

bind pub - !tr gTranslator::translate

proc translate { nick uhost handle chan text } {
  package require http
  package require json
  set lngto [string tolower [lindex [split $text] 0]]
  set text [::http::formatQuery q [join [lrange [split $text] 1 end]]]
  set dturl "http://ajax.googleapis.com/ajax/services/language/detect?v=1.0&q=$text"
  set res [::json::json2dict [::http::data [::http::geturl $dturl]]]
  set lng [dict get $res responseData language]
  if { $lng == $lngto } {
    putserv "PRIVMSG $chan :\002Error\002 translating $lng to $lngto."
    return 0
  }
  set trurl "http://ajax.googleapis.com/ajax/services/language/translate?v=1.0&langpair=$lng%7c$lngto&$text"
  putlog $trurl
  set res [::json::json2dict [::http::data [::http::geturl $trurl]]]
  putlog $res
  #putserv "PRIVMSG $chan :Language detected: $lng"
  set translated [dict get $res responseData translatedText]
  putserv "PRIVMSG $chan :[encoding convertto utf-8 $translated]"
}
}

Connecting via telnet gives the following additional information:

*** Me joined the party line.
[13:49:34] http://ajax.googleapis.com/ajax/services/language/translate?v=1.0&langpair=en%7cfr&q=I%20like%20cookies
[13:49:34] responseData {translatedText {J&#39;aime les cookies}} responseDetails null responseStatus 200
[13:50:11] http://ajax.googleapis.com/ajax/services/language/translate?v=1.0&langpair=en%7cfr&q=I%20am%20a%20stranger
[13:50:11] responseData {translatedText {Je suis un étranger}} responseDetails null responseStatus 200
1 Answer

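There are several problems here. First, Google is returning strings that have an entity encoding applied to them, independently of the JSON encoding; you have to decode that yourself. Second, you have a memory leak (the token returned by http::geturl has to be cleaned up manually), which is best solved by writing a helper: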

namespace eval gTranslator {

# Factor this out into a helper
proc getJson url {
  set tok [http::geturl $url]
  set res [json::json2dict [http::data $tok]]
  http::cleanup $tok
  return $res
}
# How to decode _decimal_ entities; WARNING: high magic factor within!
proc decodeEntities str {
  set str [string map {\[ {\[} \] {\]} \$ {\$} \\ \\\\} $str]
  subst [regsub -all {&#(\d+);} $str {[format %c \1]}]
}

bind pub - !tr gTranslator::translate
proc translate { nick uhost handle chan text } {
  package require http
  package require json
  set lngto [string tolower [lindex [split $text] 0]]
  set text [http::formatQuery q [join [lrange [split $text] 1 end]]]
  # $text already has the form "q=...", so append it without a second "q="
  set dturl "http://ajax.googleapis.com/ajax/services/language/detect?v=1.0&$text"

  set lng [dict get [getJson $dturl] responseData language]

  if { $lng == $lngto } {
    putserv "PRIVMSG $chan :\002Error\002 translating $lng to $lngto."
    return 0
  }
  set trurl "http://ajax.googleapis.com/ajax/services/language/translate?v=1.0&langpair=$lng%7c$lngto&$text"
  putlog $trurl

  set res [getJson $trurl]

  putlog $res
  #putserv "PRIVMSG $chan :Language detected: $lng"

  set translated [decodeEntities [dict get $res responseData translatedText]]

  putserv "PRIVMSG $chan :[encoding convertto utf-8 $translated]"
}
}

(You have already applied encoding convertto utf-8, which works around Eggdrop's lack of a proper understanding of encodings.)
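For anyone unsure what that conversion actually does, here is a minimal sketch that runs in plain tclsh, outside Eggdrop. The variable names are only illustrative. It shows that encoding convertto turns a string of characters into the bytes of the chosen encoding, which is why é grows from one character to two bytes.

```tcl
# Tcl strings are sequences of Unicode characters; "encoding convertto"
# produces the byte sequence for the named encoding.
set text "\u00e9tranger"                     ;# "étranger", 8 characters
set bytes [encoding convertto utf-8 $text]   ;# é becomes two bytes: 0xC3 0xA9

puts [string length $text]    ;# 8
puts [string length $bytes]   ;# 9

# "encoding convertfrom" reverses the conversion.
puts [expr {[encoding convertfrom utf-8 $bytes] eq $text}]   ;# 1
```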

I've checked the results of querying for an Arabic response, and what comes back appears to be correct UTF-8. As such, any problems you're having with it are in your client. (There may be an issue with some Chinese characters, because Tcl currently handles only the Basic Multilingual Plane, or BMP, of Unicode. This is a known issue.)
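The decodeEntities helper above only handles decimal entities and relies on subst. If the service ever returns hexadecimal or named entities, something along the following lines might do. This is only a sketch: decodeEntitiesSafe is a made-up name, the list of named entities is deliberately short, and whether Google actually emits such entities is an assumption. It scans the string with regexp instead of subst, so brackets and dollar signs in the text need no escaping.

```tcl
# Hypothetical alternative to decodeEntities: handles decimal (&#39;) and
# hexadecimal (&#x27;) entities plus a few named ones, without subst.
proc decodeEntitiesSafe {str} {
  set out ""
  set pos 0
  # Find numeric entities one at a time, copying the text between them.
  while {[regexp -indices -start $pos {&#(?:[xX]([0-9a-fA-F]+)|([0-9]+));} \
            $str all hex dec]} {
    lassign $all first last
    append out [string range $str $pos [expr {$first - 1}]]
    if {[lindex $hex 0] >= 0} {
      scan [string range $str {*}$hex] %x code
    } else {
      scan [string range $str {*}$dec] %d code
    }
    # On Tcl 8.5 this is limited to characters inside the BMP.
    append out [format %c $code]
    set pos [expr {$last + 1}]
  }
  append out [string range $str $pos end]
  # string map makes a single pass, so "&amp;lt;" becomes "&lt;", not "<".
  return [string map [list "&lt;" "<" "&gt;" ">" "&quot;" "\"" \
                           "&apos;" "'" "&amp;" "&"] $out]
}
```

With that in place, decodeEntitiesSafe {Je t&#39;aime} returns Je t'aime, and it could be swapped in for decodeEntities in the translate procedure.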

answered 2011-05-16T07:20:49.193