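I'm currently looking for a way to replace words like first, second, third with their ordinal number representation (1st, 2nd, 3rd). I have been googling for the past week and haven't found any useful standard tool or any function in NLTK.
So is there one, or should I write some regular expressions by hand?
Thanks for any advice.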
Here's a terse solution taken from Gareth on codegolf:
ordinal = lambda n: "%d%s" % (n,"tsnrhtdd"[(n//10%10!=1)*(n%10<4)*n%10::4])
Works on any number:
print([ordinal(n) for n in range(1,32)])
['1st', '2nd', '3rd', '4th', '5th', '6th', '7th', '8th', '9th', '10th',
'11th', '12th', '13th', '14th', '15th', '16th', '17th', '18th', '19th',
'20th', '21st', '22nd', '23rd', '24th', '25th', '26th', '27th', '28th',
'29th', '30th', '31st']
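If you don't want to pull in an additional dependency on an external library (as suggested by luckydonald), but also don't want the future maintainer of the code to hunt you down and kill you (because you used golfed code in production), then here's a short but maintainable variant: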
def make_ordinal(n):
    '''
    Convert an integer into its ordinal representation::

        make_ordinal(0)   => '0th'
        make_ordinal(3)   => '3rd'
        make_ordinal(122) => '122nd'
        make_ordinal(213) => '213th'
    '''
    n = int(n)
    if 11 <= (n % 100) <= 13:
        suffix = 'th'
    else:
        suffix = ['th', 'st', 'nd', 'rd', 'th'][min(n % 10, 4)]
    return str(n) + suffix
How about this:
# Look at n % 100 rather than n so that 111, 112, 113 also get "th"
suf = lambda n: "%d%s" % (n, {1: "st", 2: "nd", 3: "rd"}.get(n % 100 if n % 100 < 20 else n % 10, "th"))
print([suf(n) for n in range(1, 32)])
['1st', '2nd', '3rd', '4th', '5th', '6th', '7th', '8th', '9th', '10th',
'11th', '12th', '13th', '14th', '15th', '16th', '17th', '18th', '19th',
'20th', '21st', '22nd', '23rd', '24th', '25th', '26th', '27th', '28th',
'29th', '30th', '31st']
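Another solution is the num2words library (pip | github). It notably supports different languages, so localization/internationalization (aka l10n/i18n) is a breeze.
Usage is easy once you have installed it with pip install num2words: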
from num2words import num2words
# english is default
num2words(4458, to="ordinal_num")
'4458th'
# examples for other languages
num2words(4458, lang="en", to="ordinal_num")
'4458th'
num2words(4458, lang="es", to="ordinal_num")
'4458º'
num2words(4458, lang="de", to="ordinal_num")
'4458.'
num2words(4458, lang="id", to="ordinal_num")
'ke-4458'
Bonus:
num2words(4458, lang="en", to="ordinal")
'four thousand, four hundred and fifty-eighth'
The accepted answer to a previous question has an algorithm for half of this: it turns "first" into 1. To get from there to "1st", do something like:
suffixes = ["th", "st", "nd", "rd", ] + ["th"] * 16
suffixed_num = str(num) + suffixes[num % 100]
This only works for the numbers 0-19.
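The 20-entry table stops at 19, so indexing it with a remainder of 20 or more raises an IndexError. A minimal sketch of one way to stretch the same table idea to any non-negative integer, assuming it is acceptable to fall back to the last digit for remainders of 20 and above (the name add_suffix is made up for illustration):

```python
# Suffixes for remainders 0-19: 0th, 1st, 2nd, 3rd, then "th" up to 19th.
SUFFIXES = ["th", "st", "nd", "rd"] + ["th"] * 16

def add_suffix(num):
    """Return num with an English ordinal suffix (hypothetical helper)."""
    rem = num % 100
    # Remainders 0-19 index the table directly, which covers 11th-13th;
    # larger remainders depend only on their last digit.
    index = rem if rem < 20 else rem % 10
    return str(num) + SUFFIXES[index]

print([add_suffix(n) for n in (1, 12, 19, 20, 21, 112, 123)])
```

Negative numbers are not considered here.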
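I wanted to use ordinals in a project of mine, and after a few prototypes I think this method, although not small, works for any positive integer. Yes, any integer.
It works by determining whether the number is above or below 20. If the number is below 20, it turns the int 1 into the string 1st, 2 into 2nd and 3 into 3rd; the rest get "th" appended.
For numbers of 20 and above, it takes the last and second-to-last digits, which I have called the unit and the tens respectively, and tests them to see what to append to the number.
By the way, this is in Python, so I'm not sure whether other languages can pick out the last or second-to-last digit of a string; if they can, it should be easy to translate.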
def o(numb):
    if numb < 20: #determining suffix for < 20
        if numb == 1:
            suffix = 'st'
        elif numb == 2:
            suffix = 'nd'
        elif numb == 3:
            suffix = 'rd'
        else:
            suffix = 'th'
    else: #determining suffix for >= 20
        tens = str(numb)
        tens = tens[-2]
        unit = str(numb)
        unit = unit[-1]
        if tens == "1":
            suffix = "th"
        else:
            if unit == "1":
                suffix = 'st'
            elif unit == "2":
                suffix = 'nd'
            elif unit == "3":
                suffix = 'rd'
            else:
                suffix = 'th'
    return str(numb)+ suffix
For ease of use I called the function "o". I saved it in a file named "ordinal", so it can be called with import ordinal followed by ordinal.o(number).
Let me know what you think :D
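On the question of porting this to languages where string indexing is less convenient: the same two digits can be obtained with integer arithmetic instead. A rough sketch, assuming non-negative integers only (ordinal_arithmetic is a made-up name, not part of the answer above):

```python
def ordinal_arithmetic(number):
    """Hypothetical variant of o() that uses arithmetic instead of strings."""
    unit = number % 10          # last digit
    tens = number // 10 % 10    # second-to-last digit (0 if there is none)
    if tens == 1:
        # 10-19, 110-119, ... always take "th"
        suffix = "th"
    else:
        suffix = {1: "st", 2: "nd", 3: "rd"}.get(unit, "th")
    return str(number) + suffix

print([ordinal_arithmetic(n) for n in (1, 2, 3, 11, 21, 103, 112)])
```

Because a missing tens digit comes out as 0, the separate branch for numbers below 20 is no longer needed.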
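I found myself doing something similar, needing to convert addresses with ordinal words ('Third St') into a format that a geocoder could understand ('3rd St'). While it isn't very elegant, one quick and dirty solution is to use inflect.py to generate a dictionary for translation.
inflect.py has a number_to_words() function that turns a number (e.g. 2) into its word form (e.g. 'two'). Additionally, there is an ordinal() function that takes any number (numeral or word form) and turns it into its ordinal form (e.g. 4 -> 4th, six -> sixth). Neither of them does what you're looking for on its own, but together you can use them to generate a dictionary that translates any supplied ordinal word (within a reasonable range) into its respective numeric ordinal. Take a look: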
>>> import inflect
>>> p = inflect.engine()
>>> word_to_number_mapping = {}
>>>
>>> for i in range(1, 100):
...     word_form = p.number_to_words(i)  # 1 -> 'one'
...     ordinal_word = p.ordinal(word_form)  # 'one' -> 'first'
...     ordinal_number = p.ordinal(i)  # 1 -> '1st'
...     word_to_number_mapping[ordinal_word] = ordinal_number  # 'first': '1st'
...
>>> print(word_to_number_mapping['sixth'])
6th
>>> print(word_to_number_mapping['eleventh'])
11th
>>> print(word_to_number_mapping['forty-third'])
43rd
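If you are willing to commit some time, it might be possible to examine the inner workings of both of those functions in inflect.py and build your own code to do this dynamically (I haven't tried this).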
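Once such a dictionary exists, applying it to free text is a small regular-expression job. A minimal sketch, assuming a tiny hand-written mapping in place of the generated one so the snippet runs without inflect (ORDINAL_WORDS and replace_ordinal_words are made-up names):

```python
import re

# Hand-written sample mapping; in practice it could be the generated dictionary.
ORDINAL_WORDS = {
    "first": "1st", "second": "2nd", "third": "3rd", "fourth": "4th",
    "fifth": "5th", "forty-third": "43rd",
}

# Longest keys first, so a longer key is preferred if one key is a prefix of another.
_keys = sorted(ORDINAL_WORDS, key=len, reverse=True)
_pattern = re.compile(
    r"\b(" + "|".join(re.escape(k) for k in _keys) + r")\b",
    re.IGNORECASE,
)

def replace_ordinal_words(text):
    """Replace every known ordinal word in text with its numeric form."""
    return _pattern.sub(lambda m: ORDINAL_WORDS[m.group(1).lower()], text)

print(replace_ordinal_words("12 Third St and Forty-Third Ave"))
```

The word boundaries keep the pattern from firing inside longer words, and matching is case-insensitive because street names are usually capitalized.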
If you're using Django, you could do:
from django.contrib.humanize.templatetags.humanize import ordinal
var = ordinal(number)
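(or use ordinal as a template filter in a Django template, which is what it was intended for, though calling it like this from Python code works as well)
If you're not using Django, you could borrow their implementation, which is very neat.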
There's an ordinal function in humanize:
pip install humanize
>>> import humanize
>>> [(x, humanize.ordinal(x)) for x in (1, 2, 3, 4, 20, 21, 22, 23, 24, 100, 101,
...                                     102, 103, 113, -1, 0, 1.2, 13.6)]
[(1, '1st'), (2, '2nd'), (3, '3rd'), (4, '4th'), (20, '20th'), (21, '21st'),
(22, '22nd'), (23, '23rd'), (24, '24th'), (100, '100th'), (101, '101st'),
(102, '102nd'), (103, '103rd'), (113, '113th'), (-1, '-1th'), (0, '0th'),
(1.2, '1st'), (13.6, '13th')]
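This function works for any number n. If n is negative, it is converted to a positive number. If n is not an integer, it is converted to an integer.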
def ordinal( n ):

    suffix = ['th', 'st', 'nd', 'rd', 'th', 'th', 'th', 'th', 'th', 'th']

    if n < 0:
        n *= -1

    n = int(n)

    if n % 100 in (11,12,13):
        s = 'th'
    else:
        s = suffix[n % 10]

    return str(n) + s
Here's an alternative option using the num2words package.
>>> from num2words import num2words
>>> num2words(42, to='ordinal_num')
'42nd'
Import the humanize module and use its ordinal function.
import humanize
humanize.ordinal(4)
Output
'4th'
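If you don't want to import an external module and prefer a one-line solution, then the following is probably (slightly) more readable than the accepted answer: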
def suffix(i):
    return {1:"st", 2:"nd", 3:"rd"}.get(i%10*(i%100 not in [11,12,13]), "th")
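It uses the dictionary's .get, as suggested by https://codereview.stackexchange.com/a/41300/90593 and https://stackoverflow.com/a/36977549/5069869.
I used multiplication with a boolean to handle the special cases (11, 12, 13) without having to start an if block. If the condition (i%100 not in [11,12,13]) evaluates to False, the whole product is 0 and we get the default "th" case.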
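To make the boolean trick concrete, here is a small sketch that prints the intermediate values for a few inputs. It relies only on Python treating True as 1 and False as 0 in arithmetic; the helper name suffix_key is made up for illustration:

```python
def suffix_key(i):
    """Return the dictionary key that the one-liner ends up looking up."""
    keep = i % 100 not in [11, 12, 13]   # False for 11, 12, 13, 111, ...
    return i % 10 * keep                 # last digit, or 0 when keep is False

for i in (1, 11, 22, 112, 123):
    key = suffix_key(i)
    print(i, key, {1: "st", 2: "nd", 3: "rd"}.get(key, "th"))
```

Any key other than 1, 2 or 3 (including the forced 0) falls through to the "th" default.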
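Here is a more complicated solution I just wrote that takes compound ordinals into account. So it works from first all the way up to nine hundred and ninety ninth. I needed it to convert street names given as strings into numeric ordinals: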
import re
from collections import OrderedDict
ONETHS = {
'first': '1ST', 'second': '2ND', 'third': '3RD', 'fourth': '4TH', 'fifth': '5TH', 'sixth': '6TH', 'seventh': '7TH',
'eighth': '8TH', 'ninth': '9TH'
}
TEENTHS = {
'tenth': '10TH', 'eleventh': '11TH', 'twelfth': '12TH', 'thirteenth': '13TH',
'fourteenth': '14TH', 'fifteenth': '15TH', 'sixteenth': '16TH', 'seventeenth': '17TH', 'eighteenth': '18TH',
'nineteenth': '19TH'
}
TENTHS = {
'twentieth': '20TH', 'thirtieth': '30TH', 'fortieth': '40TH', 'fiftieth': '50TH', 'sixtieth': '60TH',
'seventieth': '70TH', 'eightieth': '80TH', 'ninetieth': '90TH',
}
HUNDREDTH = {'hundredth': '100TH'} # HUNDREDTH not s
ONES = {'one': '1', 'two': '2', 'three': '3', 'four': '4', 'five': '5', 'six': '6', 'seven': '7', 'eight': '8',
'nine': '9'}
TENS = {'twenty': '20', 'thirty': '30', 'forty': '40', 'fifty': '50', 'sixty': '60', 'seventy': '70', 'eighty': '80',
'ninety': '90'}
HUNDRED = {'hundred': '100'}
# Used below for ALL_ORDINALS
ALL_THS = {}
ALL_THS.update(ONETHS)
ALL_THS.update(TEENTHS)
ALL_THS.update(TENTHS)
ALL_THS.update(HUNDREDTH)
ALL_ORDINALS = OrderedDict()
ALL_ORDINALS.update(ALL_THS)
ALL_ORDINALS.update(TENS)
ALL_ORDINALS.update(HUNDRED)
ALL_ORDINALS.update(ONES)
def split_ordinal_word(word):
    ordinals = []
    if not word:
        return ordinals

    for key, value in ALL_ORDINALS.items():
        if word.startswith(key):
            ordinals.append(key)
            ordinals += split_ordinal_word(word[len(key):])
            break
    return ordinals
def get_ordinals(s):
    ordinals, start, end = [], [], []
    s = s.strip().replace('-', ' ').replace('and', '').lower()
    s = re.sub(' +',' ', s)  # Replace multiple spaces with a single space
    s = s.split(' ')

    for word in s:
        found_ordinals = split_ordinal_word(word)
        if found_ordinals:
            ordinals += found_ordinals
        else:  # else if word, for covering blanks
            if ordinals:  # Already have some ordinals
                end.append(word)
            else:
                start.append(word)
    return start, ordinals, end
def detect_ordinal_pattern(ordinals):
    ordinal_length = len(ordinals)
    ordinal_string = ''  # ' '.join(ordinals)
    if ordinal_length == 1:
        ordinal_string = ALL_ORDINALS[ordinals[0]]
    elif ordinal_length == 2:
        if ordinals[0] in ONES.keys() and ordinals[1] in HUNDREDTH.keys():
            ordinal_string = ONES[ordinals[0]] + '00TH'
        elif ordinals[0] in HUNDRED.keys() and ordinals[1] in ONETHS.keys():
            ordinal_string = HUNDRED[ordinals[0]][:-1] + ONETHS[ordinals[1]]
        elif ordinals[0] in TENS.keys() and ordinals[1] in ONETHS.keys():
            ordinal_string = TENS[ordinals[0]][0] + ONETHS[ordinals[1]]
    elif ordinal_length == 3:
        if ordinals[0] in HUNDRED.keys() and ordinals[1] in TENS.keys() and ordinals[2] in ONETHS.keys():
            ordinal_string = HUNDRED[ordinals[0]][0] + TENS[ordinals[1]][0] + ONETHS[ordinals[2]]
        elif ordinals[0] in ONES.keys() and ordinals[1] in HUNDRED.keys() and ordinals[2] in ALL_THS.keys():
            ordinal_string = ONES[ordinals[0]] + ALL_THS[ordinals[2]]
    elif ordinal_length == 4:
        if ordinals[0] in ONES.keys() and ordinals[1] in HUNDRED.keys() and ordinals[2] in TENS.keys() and \
           ordinals[3] in ONETHS.keys():
            ordinal_string = ONES[ordinals[0]] + TENS[ordinals[2]][0] + ONETHS[ordinals[3]]
    return ordinal_string
And here's some sample usage:
# s = '32 one hundred and forty-third st toronto, on'
#s = '32 forty-third st toronto, on'
#s = '32 one-hundredth st toronto, on'
#s = '32 hundred and third st toronto, on'
#s = '32 hundred and thirty first st toronto, on'
# s = '32 nine hundred and twenty third st toronto, on'
#s = '32 nine hundred and ninety ninth st toronto, on'
s = '32 sixty sixth toronto, on'
st, ords, en = get_ordinals(s)
print(st, detect_ordinal_pattern(ords), en)
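This can handle numbers of any length, the exceptions for ...#11 to ...#13, and negative integers. Note that it returns only the suffix, not the number.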
def ith(i):return(('th'*(10<(abs(i)%100)<14))+['st','nd','rd',*['th']*7][(abs(i)-1)%10])[0:2]
I suggest using ith() as the name to avoid overriding the builtin ord().
# test routine
for i in range(-200,200):
    print(i,ith(i))
Note: tested with Python 3.6; the abs() function is available without explicitly importing the math module.
Try this:
import sys

a = int(sys.argv[1])

for i in range(1, a + 1):
    # 11, 12 and 13 (also 111, 112, ...) always take "th"
    if i % 100 in (11, 12, 13):
        print("%dth Hello" % i)
    elif i % 10 == 1:
        print("%dst Hello" % i)
    elif i % 10 == 2:
        print("%dnd Hello" % i)
    elif i % 10 == 3:
        print("%drd Hello" % i)
    else:
        print("%dth Hello" % i)
I salute Gareth's lambda code. So elegant. I only partly understand how it works, so I tried to deconstruct it and came up with this:
def ordinal(integer):

    int_to_string = str(integer)

    if int_to_string == '1' or int_to_string == '-1':
        print(int_to_string+'st')
        return int_to_string+'st'
    elif int_to_string == '2' or int_to_string == '-2':
        print(int_to_string+'nd')
        return int_to_string+'nd'
    elif int_to_string == '3' or int_to_string == '-3':
        print(int_to_string+'rd')
        return int_to_string+'rd'

    elif int_to_string[-1] == '1' and int_to_string[-2] != '1':
        print(int_to_string+'st')
        return int_to_string+'st'
    elif int_to_string[-1] == '2' and int_to_string[-2] != '1':
        print(int_to_string+'nd')
        return int_to_string+'nd'
    elif int_to_string[-1] == '3' and int_to_string[-2] != '1':
        print(int_to_string+'rd')
        return int_to_string+'rd'

    else:
        print(int_to_string+'th')
        return int_to_string+'th'

>>> print([ordinal(n) for n in range(1,25)])
1st
2nd
3rd
4th
5th
6th
7th
8th
9th
10th
11th
12th
13th
14th
15th
16th
17th
18th
19th
20th
21st
22nd
23rd
24th
['1st', '2nd', '3rd', '4th', '5th', '6th', '7th', '8th', '9th', '10th',
'11th', '12th', '13th', '14th', '15th', '16th', '17th', '18th', '19th',
'20th', '21st', '22nd', '23rd', '24th']
Gareth's code expressed using the modern .format() (note the integer division //, which Python 3 needs here):
ordinal = lambda n: "{}{}".format(n,"tsnrhtdd"[(n//10%10!=1)*(n%10<4)*n%10::4])
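For anyone else puzzling over the golfed one-liner, here is my reading of it spelled out step by step. This is a sketch of how the slice trick appears to work, assuming non-negative integers; the names ordinal_expanded and SUFFIX_LETTERS are made up for illustration:

```python
SUFFIX_LETTERS = "tsnrhtdd"

def ordinal_expanded(n):
    """Step-by-step version of the golfed lambda (non-negative n assumed)."""
    not_teen = n // 10 % 10 != 1      # False when the tens digit is 1 (10-19, 110-119, ...)
    small_last_digit = n % 10 < 4     # True when the last digit is 0-3
    # * and % bind equally and left to right, so this is (flag * flag * n) % 10:
    # the last digit when both flags are True, otherwise 0.
    start = not_teen * small_last_digit * n % 10
    # Taking every 4th character from start gives
    # 0 -> "th", 1 -> "st", 2 -> "nd", 3 -> "rd".
    return "%d%s" % (n, SUFFIX_LETTERS[start::4])

print([SUFFIX_LETTERS[k::4] for k in range(4)])
print([ordinal_expanded(n) for n in (1, 2, 3, 4, 11, 21, 112)])
```

So the eight-character string is just the four suffixes interleaved, and the arithmetic picks which one to read out.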