
Searching for "word to number" almost always returns results for converting numbers into words, which seems a much simpler task than the inverse. Some trivial cases could be handled with a basic lookup table ("first, 1st, one" -> 1, etc.), but I'm looking for something that handles the general case better. The app I'm building takes user input that may or may not include a number and compares it with a known result (itself stored as text). For even more complexity, it would be preferable if it could deal with misspellings as well (e.g. frist, sceond), although this could probably be accomplished by passing the input through a spell checker first.

So far I've found http://j.mearie.org/post/7462182919/spelt-number-to-decimal, which seems pretty cool because it appears to support some other languages (though I'm not sure), but I would prefer something more portable and less obfuscated.

The most sophisticated one I've found is https://github.com/ged/linguistics/blob/master/lib/linguistics/en/numbers.rb; http://www.perlmonks.org/?node_id=506028 also seems promising.

Is there a more complete library out there? I'd like it to handle English and Spanish numbers in different formats such as first, 1st, 1, and one, and even invalid ones like 1nd, as well as Roman numerals like MMXII.


2 Answers


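Given that you want to translate from human language to math, rather than the other way around, you are basically going to need giant sets of tables/enums. Math is logic-based, so when going in that direction a ruleset can be used to point to the words. Moving backwards from language, which is only a set of agreed-upon rules (see the English language for illogical exceptions to every rule), the only reliable way to get it done is to gather all of the possible ways of referring to a number and tie them to it in a translation map.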
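A minimal sketch of such a translation map, with fuzzy matching bolted on for the misspellings mentioned in the question. The table contents, the `lookup_number` name, and the 0.75 cutoff are all assumptions made for illustration; a real table would be far larger, and the standard library's `difflib` stands in here for a proper spell checker.

```python
import difflib

# Hypothetical, deliberately tiny translation map (English and Spanish forms).
NUMBER_FORMS = {
    "one": 1, "first": 1, "1st": 1, "uno": 1, "primero": 1,
    "two": 2, "second": 2, "2nd": 2, "dos": 2, "segundo": 2,
    "three": 3, "third": 3, "3rd": 3, "tres": 3, "tercero": 3,
}


def lookup_number(token, cutoff=0.75):
    """Return the integer a token refers to, or None if nothing is close enough."""
    key = token.strip().lower()
    if key.isdecimal():
        # Plain digits need no table.
        return int(key)
    if key in NUMBER_FORMS:
        return NUMBER_FORMS[key]
    # Fall back to the closest known spelling; the cutoff is an arbitrary choice.
    close = difflib.get_close_matches(key, list(NUMBER_FORMS), n=1, cutoff=cutoff)
    return NUMBER_FORMS[close[0]] if close else None


print(lookup_number("frist"), lookup_number("sceond"), lookup_number("Segundo"))
```

A lower cutoff accepts more typos but also raises the chance of mapping an unrelated word onto a number, so the threshold would need tuning against real input.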

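Any library you find will not only have to be updated over time to accept new ways of talking about numbers, but may also have to negate or change rules that were previously in place.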

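How do you even plan to handle invalid input like 1nd? Do they mean 1 or 2? This is a glimpse of why entire doctoral theses are devoted to natural language processing.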
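One way to make that ambiguity explicit instead of guessing is to parse the digits and the suffix separately and report whether they agree. This is only a sketch; the function names are hypothetical, and what to do with an inconsistent token (trust the digits, trust the suffix, or ask the user) is left to the caller.

```python
import re

# Digits followed by one of the English ordinal suffixes.
ORDINAL_RE = re.compile(r"^(\d+)(st|nd|rd|th)$", re.IGNORECASE)


def expected_suffix(n):
    """Return the grammatically correct English ordinal suffix for n."""
    if 11 <= n % 100 <= 13:
        return "th"
    return {1: "st", 2: "nd", 3: "rd"}.get(n % 10, "th")


def parse_digit_ordinal(token):
    """Return (value, suffix_is_consistent), or None if token is not digits+suffix."""
    match = ORDINAL_RE.match(token.strip())
    if match is None:
        return None
    value = int(match.group(1))
    return value, match.group(2).lower() == expected_suffix(value)


for token in ("1st", "1nd", "12th", "22nd"):
    print(token, parse_digit_ordinal(token))
```

For "1nd" the result carries the value 1 together with a flag saying the suffix does not match, so the caller can decide whether 1st or 2nd was more likely intended.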

Answered 2012-10-17T22:11:20.927

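You should look into Lex & Yacc for this kind of thing. I think some "human calculators" have already been written (even though I can't find one right now), so you could extract the number comprehension from one of them.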
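To give a rough idea of what the number-comprehension part involves, here is a hand-written tokenizer plus accumulator in Python. It is a much simpler stand-in for a Lex & Yacc grammar, not an example of those tools: it covers only English cardinals up to the millions, the `words_to_int` name is hypothetical, and it does not reject ill-formed sequences such as "one one" the way a real grammar would.

```python
import re

UNITS = {
    "zero": 0, "one": 1, "two": 2, "three": 3, "four": 4, "five": 5,
    "six": 6, "seven": 7, "eight": 8, "nine": 9, "ten": 10,
    "eleven": 11, "twelve": 12, "thirteen": 13, "fourteen": 14,
    "fifteen": 15, "sixteen": 16, "seventeen": 17, "eighteen": 18,
    "nineteen": 19,
}
TENS = {
    "twenty": 20, "thirty": 30, "forty": 40, "fifty": 50,
    "sixty": 60, "seventy": 70, "eighty": 80, "ninety": 90,
}
SCALES = {"thousand": 1_000, "million": 1_000_000}


def words_to_int(text):
    """Convert English cardinal words such as 'two thousand twelve' to an int."""
    # "Lexing": split into lowercase alphabetic tokens and drop the filler "and".
    tokens = [t for t in re.findall(r"[a-z]+", text.lower()) if t != "and"]
    if not tokens:
        raise ValueError("no number words found")
    total = 0  # value of the groups already closed by a scale word
    group = 0  # value of the current group, below one thousand
    for tok in tokens:
        if tok in UNITS:
            group += UNITS[tok]
        elif tok in TENS:
            group += TENS[tok]
        elif tok == "hundred":
            group = max(group, 1) * 100  # a bare "hundred" counts as 100
        elif tok in SCALES:
            total += max(group, 1) * SCALES[tok]
            group = 0
        else:
            raise ValueError(f"unknown number word: {tok!r}")
    return total + group


print(words_to_int("two thousand twelve"))
print(words_to_int("one hundred and twenty-three"))
```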

Answered 2012-10-17T22:15:23.613