90

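Is there a standard way in Python to title-case a string (i.e. each word starts with an uppercase character and all remaining cased characters are lowercase) while leaving small words such as and, in, and of in lowercase?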

4

9 Answers

156

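There are a few problems with this. If you use split and join, some whitespace characters will be ignored. The built-in capitalize and title methods do not ignore whitespace.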

>>> 'There     is a way'.title()
'There     Is A Way'
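A quick sketch of the whitespace difference described above, using only built-in string methods. The sample string mirrors the one in the example; the variable names are just for illustration.

```python
s = 'there     is a way'

# split() with no argument collapses runs of whitespace, so join() cannot restore them
collapsed = ' '.join(word.capitalize() for word in s.split())

# split(' ') keeps an empty string for every extra space, so ' '.join() restores the spacing
preserved = ' '.join(word.capitalize() for word in s.split(' '))

print(collapsed)   # There Is A Way
print(preserved)   # There     Is A Way
```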

If a sentence starts with an article, you do not want the first word of the title to be lowercase.

Keeping these in mind:

import re

def title_except(s, exceptions):
    # Splitting on a single space keeps empty strings for repeated spaces,
    # so the original spacing survives the join below
    word_list = re.split(' ', s)
    # Always capitalize the first word, even if it is an exception
    final = [word_list[0].capitalize()]
    for word in word_list[1:]:
        final.append(word if word in exceptions else word.capitalize())
    return " ".join(final)

articles = ['a', 'an', 'of', 'the', 'is']
print(title_except('there is a    way', articles))
# There is a    Way
print(title_except('a whim   of an elephant', articles))
# A Whim   of an Elephant
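If tabs or other whitespace also need to survive, one possible variation is to split with a capturing group so the separators are kept in the result. This is only a sketch and not part of the answer above; `title_except_ws` is a hypothetical name.

```python
import re

def title_except_ws(s, exceptions):
    # A capturing group makes re.split return the whitespace runs as list items too
    parts = re.split(r'(\s+)', s)
    result = []
    seen_first_word = False
    for part in parts:
        if not part or part.isspace():
            result.append(part)              # keep separators untouched
        elif seen_first_word and part in exceptions:
            result.append(part)              # exception words stay lowercase
        else:
            result.append(part.capitalize())
            seen_first_word = True
    return ''.join(result)

articles = ['a', 'an', 'of', 'the', 'is']
print(title_except_ws('a whim \t of an   elephant', articles))
```

Because the separators are joined back exactly as they were found, leading whitespace and mixed tabs and spaces come through unchanged.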
answered 2010-09-16T19:16:11.883
56

Use the titlecase.py module! It works only for English.

>>> from titlecase import titlecase
>>> titlecase('i am a foobar bazbar')
'I Am a Foobar Bazbar'

GitHub: https://github.com/ppannuto/python-titlecase

answered 2010-09-16T17:18:05.397
23

There are these methods:

>>> mytext = 'i am a foobar bazbar'
>>> print(mytext.capitalize())
I am a foobar bazbar
>>> print(mytext.title())
I Am A Foobar Bazbar

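There is no option to keep articles lowercase. You would have to code that yourself, probably by using a list of the articles you want to lowercase.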
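A minimal sketch of that idea: run `title()` first, then lowercase the listed words wherever they are not the first word. The function name `title_with_small_words` and its default word list are placeholders, not a standard API, and the result inherits any quirks of `title()` itself.

```python
import re

def title_with_small_words(text, small_words=('a', 'an', 'the', 'and', 'in', 'of')):
    titled = text.title()
    # Match any listed word (in its title-cased form) that is preceded by whitespace,
    # which excludes the first word of the string
    pattern = r'(?<=\s)(' + '|'.join(w.title() for w in small_words) + r')\b'
    return re.sub(pattern, lambda m: m.group(0).lower(), titled)

print(title_with_small_words('the lord of the rings'))   # The Lord of the Rings
```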

answered 2010-09-16T16:34:53.973
14

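Stuart Colville has made a Python port of a Perl script written by John Gruber that converts strings to title case. It avoids capitalizing small words, following rules from the New York Times Manual of Style, and caters for several special cases.

Some of the cleverness of these scripts: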

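  • They do not capitalize small words such as if, in, of, and on, and they will un-capitalize them if they are erroneously capitalized in the input.

  • The scripts assume that words with capital letters other than the first character are already correctly capitalized. This means they will leave a word like "iTunes" alone, rather than mangling it into "ITunes" or, worse, "Itunes".

  • They skip over any words with inline dots; "example.com" and "del.icio.us" will remain lowercase.

  • They have hard-coded hacks specifically for odd cases such as "AT&T" and "Q&A", both of which contain small words (at and a) that should normally be lowercase.

  • The first and last words of the title are always capitalized, so input such as "Nothing to be afraid of" will be turned into "Nothing to Be Afraid Of".

  • A small word after a colon will be capitalized.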

You can download it here.
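The rules listed above can be roughly sketched in a few lines. This is a simplified illustration, not Stuart Colville's or John Gruber's actual code: `simple_titlecase` and `SMALL_WORDS` are made-up names, the small-word list is an assumption, and the "AT&T" case is covered here only by the internal-capitals rule rather than by a dedicated hack.

```python
import re

# Assumed list of small words; the real scripts use their own list
SMALL_WORDS = {'a', 'an', 'and', 'as', 'at', 'but', 'by', 'for', 'if', 'in',
               'of', 'on', 'or', 'the', 'to'}

def simple_titlecase(title):
    words = title.split()   # note: this collapses repeated whitespace
    result = []
    for i, word in enumerate(words):
        is_edge = i == 0 or i == len(words) - 1
        after_colon = i > 0 and words[i - 1].endswith(':')
        if re.search(r'\w\.\w', word):
            result.append(word)                    # inline dots: leave as is
        elif re.search(r'[A-Z]', word[1:]):
            result.append(word)                    # internal capitals: assume correct
        elif word.lower() in SMALL_WORDS and not (is_edge or after_colon):
            result.append(word.lower())            # small word: force lowercase
        else:
            result.append(word[0].upper() + word[1:])
    return ' '.join(result)

print(simple_titlecase('Nothing to be afraid of'))   # Nothing to Be Afraid Of
print(simple_titlecase('iTunes and example.com'))    # iTunes and example.com
```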

answered 2012-02-10T14:45:09.127
4
capitalize (word)

This should do it. I understood the question differently.

>>> mytext = 'i am a foobar bazbar'
>>> mytext.capitalize()
'I am a foobar bazbar'

OK, as the reply above says, you have to write a custom capitalize:

mytext = 'i am a foobar bazbar'

def xcaptilize(word):
    # Words in skipList are returned unchanged; everything else is capitalized
    skipList = ['a', 'an', 'the', 'am']
    if word not in skipList:
        return word.capitalize()
    return word

k = mytext.split(" ")
l = map(xcaptilize, k)
print(" ".join(l))

This outputs

I am a Foobar Bazbar
answered 2010-09-16T16:37:15.783
2

The title method in Python 2.7 has a flaw.

value.title()

will return Carpenter'S Assistant when value is Carpenter's Assistant
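One common workaround, along the lines of the regex suggested in the Python documentation for `str.title()`, is to treat an apostrophe suffix as part of the word. The name `title_keep_apostrophes` is made up for this sketch, and the pattern only covers ASCII letters.

```python
import re

def title_keep_apostrophes(s):
    # Treat letters plus an optional 'suffix as one word, then capitalize that word
    return re.sub(r"[A-Za-z]+('[A-Za-z]+)?",
                  lambda match: match.group(0).capitalize(),
                  s)

print("carpenter's assistant".title())                   # Carpenter'S Assistant
print(title_keep_apostrophes("carpenter's assistant"))   # Carpenter's Assistant
```

Note that `capitalize()` lowercases the rest of each word, so mixed-case words such as "iTunes" would still be flattened by this approach.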

The best solution is probably the one from @BioGeek, which uses titlecase from Stuart Colville. It is the same solution as the one proposed by @Etienne.

answered 2014-11-30T18:57:12.217
1
not_these = ['a', 'the', 'of']
thestring = 'the secret of a disappointed programmer'
print(' '.join(word
               if word in not_these
               else word.title()
               for word in thestring.capitalize().split(' ')))
"""Output:
The Secret of a Disappointed Programmer
"""

The title starts with a capitalized word, which therefore no longer matches the lowercase article in the list.

answered 2010-09-16T17:05:44.523
1

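A one-liner using a list comprehension and the ternary operator

reslt = " ".join([word.title() if word not in "the a on in of an".split() else word for word in "Wow, a python one liner for titles".split(" ")])
print(reslt)

Breakdown:

for word in "Wow, a python one liner for titles".split(" ") splits the string into a list and starts a for loop (inside the list comprehension)

word.title() if word not in "the a on in of an".split() else word applies the native title() method to the word unless it is one of the listed small words; the .split() turns the small-word string into a list, so the test compares whole words rather than substrings

" ".join joins the list elements with a single space as the separator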
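One detail worth checking with this kind of membership test: against a plain string, `in` matches substrings, whereas against a list it matches whole words. A small sketch of the difference (the sample sentence is made up for illustration):

```python
small_words = "the a on in of an"

# Against a string, `in` is a substring test, so fragments of the small words match too
print("he" in small_words)           # True  ("he" is inside "the")
print("he" in small_words.split())   # False (whole-word comparison against a list)

words = "he said no".split(" ")
print(" ".join(w if w in small_words else w.title() for w in words))           # he Said No
print(" ".join(w if w in small_words.split() else w.title() for w in words))   # He Said No
```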

answered 2017-07-07T20:00:49.877
0

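One important case that has not been considered is acronyms (the python-titlecase solution can handle them if you explicitly provide the acronyms as exceptions). I prefer instead to simply avoid down-casing. With this approach, acronyms that are already upper case remain upper case. The following code is a modification of the code originally provided by dheerosaur.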

# This is an attempt to provide an alternative to ''.title() that works with 
# acronyms.
# There are several tricky cases to worry about in typical order of importance:
# 0. Upper case first letter of each word that is not a 'minor' word.
# 1. Always upper case first word.
# 2. Do not down case acronyms
# 3. Quotes
# 4. Hyphenated words: drive-in
# 5. Titles within titles: 2001 A Space Odyssey
# 6. Maintain leading spacing
# 7. Maintain given spacing: This is a test.  This is only a test.

# The following code addresses 0-3 & 7.  It was felt that addressing the others 
# would add considerable complexity.


def titlecase(
    s,
    exceptions = (
        'and', 'or', 'nor', 'but', 'a', 'an', 'the', 'as', 'at', 'by',
        'for', 'in', 'of', 'on', 'per', 'to'
    )
):
    # remove leading and trailing spaces -- needed for first word casing
    # split on single space to maintain word spacing
    words = s.strip().split(' ')

    def upper(s):
        if s:
            if s[0] in '‘“"‛‟' + "'":
                return s[0] + upper(s[1:])
            return s[0].upper() + s[1:]
        return ''

    # always capitalize the first word
    first = upper(words[0])

    return ' '.join([first] + [
        word if word.lower() in exceptions else upper(word)
        for word in words[1:]
    ])


cases = '''
    CDC warns about "aggressive" rats as coronavirus shuts down restaurants
    L.A. County opens churches, stores, pools, drive-in theaters
    UConn senior accused of killing two men was looking for young woman
    Giant asteroid that killed the dinosaurs slammed into Earth at ‘deadliest possible angle,’ study reveals
    Maintain given spacing: This is a test.  This is only a test.
'''.strip().splitlines()

for case in cases:
    print(titlecase(case))

When run, it produces the following:

CDC Warns About "Aggressive" Rats as Coronavirus Shuts Down Restaurants
L.A. County Opens Churches, Stores, Pools, Drive-in Theaters
UConn Senior Accused of Killing Two Men Was Looking for Young Woman
Giant Asteroid That Killed the Dinosaurs Slammed Into Earth at ‘Deadliest Possible Angle,’ Study Reveals
Maintain Given Spacing: This Is a Test.  This Is Only a Test.
answered 2020-05-27T18:02:52.280