5

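I'm working on a web project that will (hopefully) be available in several languages one day. I say "hopefully" because we only have an English site planned today, but my company's other products are multilingual, and I hope we are successful enough to need that too.

I understand that the best practice (I'm using Java, Spring MVC, and Velocity here) is to put all text the user will see in external files and refer to it by name in the UI files, for example: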

#in messages_en.properties:
welcome.header = Welcome to AppName!

#in the markup
<title>#springMessage("welcome.header")</title>

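However, I have never been through this process on a project myself, and I'm curious what the best way to handle it is when parts of the UI are markup-heavy, for example: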

<p>We are excited to announce that Company1 has been acquired by
<a href="http://www.companydivisionx.com" class="boldLink">Division X</a>,
a fast-growing division of <a href="http://www.company2.com" class="boldLink">Company 2</a>, Inc. 
(Nasdaq: <a href="http://finance.google.com/finance?q=blah" class="boldLink">BLAH</a>), based in...

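One option I can think of is to store this "low-level" markup inside the messages in messages.properties itself, but this seems like the worst possible option.

Other options that I can think of are:

  • Store each non-markup inner fragment in messages.properties, such as acquisitionAnnounce1, acquisitionAnnounce2, acquisitionAnnounce3. This seems very tedious, though.
  • Break this message down into more reusable components, such as Company1.name, Company2.name, Company2.ticker, etc., since each of these is likely to be reused in many other messages. This would probably account for 80% of the words in this particular message.

Are there any best practices for dealing with internationalized text that is heavy with markup like this? Do you just have to bite the bullet and bear the pain of breaking up every piece of text? What is the best solution from any projects you have personally dealt with?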

4

4 Answers

6

Typically, if you use a template engine such as Sitemesh or Velocity, you can manage these smaller HTML building blocks more effectively as subtemplates.

By doing so, you can incrementally boil the purely internationalized strings down into groups and tie each group to the markup subtemplate it belongs to. Having done this sort of work using templates for an app that spanned multiple languages within the same locale, as well as multiple locales, I can say we never placed markup in our message bundles.

I'd suggest that a key good practice is to avoid placing markup (even "low-level" markup, as you put it) inside message properties files at all costs. The potential for trouble should not be underestimated: biting the bullet and breaking things up correctly is far less painful than managing many files with HTML markup scattered through them. It's important that you can see markup as whole chunks; scattering it everywhere would make everyday development a chore because:

  • You would lose IDE color highlighting and syntax validation
  • High possibility that one locale file or another can easily be missed when changes to designs / markup filter down

Breaking things down (to a realistic point, e.g. logical sentence structures but no finer) is somewhat hard work up front, but it is worth the effort.

Regarding string breakdown granularity, here's a sample of what we did:

    comment.atom-details=Subscribe To Comments
    comment.username-mandatory=You must supply your name
    comment.useremail-mandatory=You must supply your email address 
    comment.email.notification=Dear {0}, the comment thread you are watching has been updated.
    comment.feed.title=Comments on {0}
    comment.feed.title.default=Comments
    comment.feed.entry.title=Comment on {0} at {1,date,medium} {2,time,HH:mm} by {3}


    comment.atom-details=Suscribir a Comentarios
    comment.username-mandatory=Debes indicar tu nombre
    comment.useremail-mandatory=Debes indicar tu direcci\u00f3n de correo electr\u00f3nico
    comment.email.notification=La conversaci\u00f3n que estas viendo ha sido actualizada
    comment.feed.title=Comentarios sobre {0}
    comment.feed.title.default=Comentarios
    comment.feed.entry.title=Comentarios sobre {0} a {1,date,medium} {2,time,HH:mm} por {3}

So you can do interesting things with placeholder substitution in the message bundle. This helps each string keep its logical meaning as a whole sentence while still letting you insert values mid-sentence.
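As a minimal sketch of how such `{0}`-style patterns might be filled in, the following uses the JDK's `java.text.MessageFormat`. The class name `FeedTitles`, the helper `format`, and the two patterns in `main` are hypothetical and are not taken from the bundles above; they only illustrate that a translator can move a placeholder within the sentence without any code change.

```java
import java.text.MessageFormat;
import java.util.Locale;

// Hypothetical helper, not part of the original answer.
class FeedTitles {
    // Formats a bundle pattern for the given locale; MessageFormat puts each
    // argument wherever the translator placed the matching {n} placeholder.
    static String format(String pattern, Locale locale, Object... args) {
        return new MessageFormat(pattern, locale).format(args);
    }

    public static void main(String[] args) {
        // Illustrative patterns (assumed): the Spanish one swaps the argument order.
        String en = "Comment on {0} by {1}";
        String es = "Comentario de {1} sobre {0}";

        // prints "Comment on My Post by Ana"
        System.out.println(format(en, Locale.ENGLISH, "My Post", "Ana"));
        // prints "Comentario de Ana sobre My Post"
        System.out.println(format(es, Locale.forLanguageTag("es"), "My Post", "Ana"));
    }
}
```

Because the whole sentence stays in one bundle entry, the calling code passes the same arguments for every locale and the word order is decided entirely by the translated pattern.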

answered 2009-02-05T16:38:43.710
6

As others have said, please never split the strings into segments. You will cause translators grief as they have to coerce their language syntax to the ad-hoc rules you inadvertently create. Often it will not be possible to provide a grammatically correct translation, especially if you reuse certain segments in different contexts.

Do not remove the markup, either.

Please do not assume professional translators work in Notepad :) Computer-aided translation (CAT) tools, such as the Trados suite, know about markup perfectly well. If the tagging is HTML, rather than some custom XML format, no special preparation is required. Trados will protect the tags from accidental modification, while still allowing changes where necessary. Note that certain elements of tags often need to be localized, e.g. alt text or some query strings, so just stripping all the markup won't do.

Best of all, unless you're working on a zero-budget personal project, consider contacting a localization vendor. Localization is a service just like web design. A competent vendor will help you pick the optimal solution/format for your project and guide you through the preparation of the source material and incorporating the localized result. And of course they and their translators will have all the necessary tools. (Full disclosure: I am a translator / localization specialist. And don't split up strings :)

answered 2009-02-05T21:57:33.613
3

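First of all, don't split up your strings. That makes it much harder for localizers to translate the text, because they can't see the whole string they are translating.

I would probably try to use placeholders around the links: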

<a href="%link1%" class="%link1class%">Division X</a>

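That is what I did when I localized a website into 30 languages. It's not perfect, but it works.

I don't think it is possible (or easy) to remove all markup from your strings; you need to have a way to insert the URLs and any extra markup.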
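One way the `%link1%`-style placeholders could be filled in at render time is sketched below. This is only an illustration under stated assumptions: the class `LinkPlaceholders` and its `fill` method are hypothetical names, and the answer does not say how its own substitution was implemented.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper illustrating the %name% placeholder idea.
class LinkPlaceholders {
    // Replaces every %key% token in the translated string with its value.
    // Values are inserted as-is, so they are assumed to be trusted and
    // already safe to embed in an HTML attribute.
    static String fill(String template, Map<String, String> values) {
        String result = template;
        for (Map.Entry<String, String> entry : values.entrySet()) {
            result = result.replace("%" + entry.getKey() + "%", entry.getValue());
        }
        return result;
    }

    public static void main(String[] args) {
        // The translator sees the whole sentence, including the anchor text.
        String message =
            "Acquired by <a href=\"%link1%\" class=\"%link1class%\">Division X</a>";

        Map<String, String> values = new LinkedHashMap<>();
        values.put("link1", "http://www.companydivisionx.com");
        values.put("link1class", "boldLink");

        System.out.println(fill(message, values));
    }
}
```

The translated string keeps the tag structure and the link text together, while the URL and CSS class stay under the developers' control.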

answered 2009-02-05T16:09:51.790
2

You should avoid breaking up your strings. Not only does this become a nightmare to translate, but it also makes grammatical assumptions which may not be correct in the target language.

While placeholders can be helpful for many things, I would not recommend using placeholders for URLs. Keeping the URL inside the translatable string allows you to customize it for different locales. After all, there is no sense sending users to an English-language page when their locale is Argentine Spanish!
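To make the per-locale URL idea concrete, here is a small sketch that keeps the link inside each locale's message. The two strings stand in for `messages_en.properties` and `messages_es_AR.properties`; the key `company.link`, the `/en/` and `/es-ar/` paths, and the class `LocalizedLinks` are all assumptions made up for illustration.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

// Hypothetical example: each locale's bundle carries its own URL.
class LocalizedLinks {
    // Stand-ins for the contents of two locale-specific properties files.
    static final String EN =
        "company.link=<a href=\"http://www.company2.com/en/\">Company 2</a>\n";
    static final String ES_AR =
        "company.link=<a href=\"http://www.company2.com/es-ar/\">Company 2</a>\n";

    // Parses properties-file text and returns the message for the given key.
    static String lookup(String bundleText, String key) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(bundleText));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props.getProperty(key);
    }

    public static void main(String[] args) {
        System.out.println(lookup(EN, "company.link"));
        System.out.println(lookup(ES_AR, "company.link"));
    }
}
```

With this layout the translator, who knows which localized page exists, can point the link at it without any change to the template or the Java code.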

answered 2009-02-05T17:22:30.783