32

我需要找到一种在python中转换以下字符串的方法:

0.000       => 0
0           => 0
123.45000   => 123.45
0000        => 0
123.4506780 => 123.450678

等等。我尝试了 .rstrip('0').rstrip('.'),但如果输入为 0 或 00,这将不起作用。

有任何想法吗?谢谢!

4

7 回答 7

26

更新了 Generalized 以保持精度并处理看不见的值:

import decimal
import random

def format_number(num):
    try:
        dec = decimal.Decimal(num)
    except:
        return 'bad'
    tup = dec.as_tuple()
    delta = len(tup.digits) + tup.exponent
    digits = ''.join(str(d) for d in tup.digits)
    if delta <= 0:
        zeros = abs(tup.exponent) - len(tup.digits)
        val = '0.' + ('0'*zeros) + digits
    else:
        val = digits[:delta] + ('0'*tup.exponent) + '.' + digits[delta:]
    val = val.rstrip('0')
    if val[-1] == '.':
        val = val[:-1]
    if tup.sign:
        return '-' + val
    return val

# test data
NUMS = '''
    0.0000      0
    0           0
    123.45000   123.45
    0000        0
    123.4506780 123.450678
    0.1         0.1
    0.001       0.001
    0.005000    0.005
    .1234       0.1234
    1.23e1      12.3
    -123.456    -123.456
    4.98e10     49800000000
    4.9815135   4.9815135
    4e30        4000000000000000000000000000000
    -0.0000000000004 -0.0000000000004
    -.4e-12     -0.0000000000004
    -0.11112    -0.11112
    1.3.4.5     bad
    -1.2.3      bad
'''

for num, exp in [s.split() for s in NUMS.split('\n') if s]:
    res = format_number(num)
    print res
    assert exp == res

输出:

0
0
123.45
0
123.450678
0.1
0.001
0.005
0.1234
12.3
-123.456
49800000000
4.9815135
4000000000000000000000000000000
-0.0000000000004
-0.0000000000004
-0.11112
bad
bad
于 2011-04-27T17:20:48.997 回答
19

如果需要,您可以使用格式字符串,但请注意,您可能需要设置所需的精度,因为默认情况下,格式字符串有自己的逻辑。Janneb 在另一个答案中建议精度为 17 。

'{:g}'.format(float(your_string_goes_here))

不过,在考虑了更多之后,我认为最简单和最好的解决方案就是将字符串转换两次(正如jathanism 所建议的那样):

str(float(your_string_goes_here))

编辑:由于评论而添加了说明。

于 2011-04-27T17:20:57.707 回答
9

对于浮点数,您可以将字符串转换为 a float

>>> float('123.4506780')
123.450678

对于零值,您可以将它们转换为整数:

>>> int('0000')
0

打印时,数值会自动转换为字符串。如果您需要这些实际上是字符串,您可以简单地将它们转换回字符串str(),例如:

>>> str(float('123.4506780'))
'123.450678'
于 2011-04-27T17:19:17.813 回答
4
'%.17g' % float(mystr)

取决于你真正想要做什么..

于 2011-04-27T17:20:50.307 回答
2

第一个“解决方案”

import re
regx=re.compile('(?<![\d.])'
                '(?!\d*\.\d*\.)'  # excludes certain string as not being numbers
                '((\d|\.\d)([\d.])*?)'  # the only matching  group
                '([0\.]*)'
                '(?![\d.])')
regx.sub('\\1',ch)

.

编辑 1

John Machin 说 10000 和 10000.000 产生 1 而不是 10000

我在帮助下更正了替换功能(?!(?<=0)\.)

import re
regx = re.compile('(?<![\d.])'       '(?![1-9]\d*(?![\d.])|\d*\.\d*\.)'
                  '0*(?!(?<=0)\.)'
                  '([\d.]+?)'      # the only group , which is kept
                  '\.?0*'
                  '(?![\d.])')    
regx.sub('\\1',ch)               

.

编辑 2

纠正剩余的缺点 [ '.0000'产生'.' , 由 John Machin 指出,并且'000078000'产生'78' ] ,我重写了一个基于新想法的正则表达式。它更简单。正则表达式检测所有类型的数字。

该解决方案不仅可以消除尾随零,还可以消除航向零。这是此解决方案与 John Machin's tidy_float()、 samplebias's number_format()、 arussell84's 的比较'{:g}'.format()。我的函数的结果(这次都是正确的)与其他函数的结果存在一些差异:

import re
def number_shaver(ch,
                  regx = re.compile('(?<![\d.])0*(?:'
                                    '(\d+)\.?|\.(0)'
                                    '|(\.\d+?)|(\d+\.\d+?)'
                                    ')0*(?![\d.])')  ,
                  repl = lambda mat: mat.group(mat.lastindex)
                                     if mat.lastindex!=3
                                     else '0' + mat.group(3) ):
    return regx.sub(repl,ch)


def tidy_float(s):  # John Machin
    """Return tidied float representation.
    Remove superflous leading/trailing zero digits.
    Remove '.' if value is an integer.
    Return '****' if float(s) fails.
    """
    # float?
    try:
        f = float(s)
    except ValueError:
        return s
    # int?
    try:
        i = int(s)
        return str(i)
    except ValueError:
        pass
    # scientific notation?
    if 'e' in s or 'E' in s:
        t = s.lstrip('0')
        if t.startswith('.'): t = '0' + t
        return t
    # float with integral value (includes zero)?
    i = int(f)
    if i == f:
        return str(i)
    assert '.' in s
    t = s.strip('0')
    if t.startswith('.'): t = '0' + t
    if t.endswith('.'): t += '0'
    return t


def format_float(s):  # arrussell84
    return '{:g}'.format(float(s)) if s.count('.')<2 \
           else "Can't treat"


import decimal
def format_number(num):
    try:
        dec = decimal.Decimal(num)
    except:
        return 'bad'
    tup = dec.as_tuple()
    delta = len(tup.digits) + tup.exponent
    digits = ''.join(str(d) for d in tup.digits)
    if delta <= 0:
        zeros = abs(tup.exponent) - len(tup.digits)
        val = '0.' + ('0'*zeros) + digits
    else:
        val = digits[:delta] + ('0'*tup.exponent) + '.' + digits[delta:]
    val = val.rstrip('0')
    if val[-1] == '.':
        val = val[:-1]
    if tup.sign:
        return '-' + val
    return val


numbers = ['23456000', '23456000.', '23456000.000',
           '00023456000', '000023456000.', '000023456000.000',
           '10000', '10000.', '10000.000',
           '00010000', '00010000.', '00010000.000',
           '24', '24.', '24.000',
           '00024', '00024.', '00024.000',
           '8', '8.', '8.000',
           '0008', '0008.', '0008.000',
           '0', '00000', '0.', '000.',
           '\n',
           '0.0', '0.000', '000.0', '000.000', '.000000', '.0',
           '\n',
           '.00023456', '.00023456000', '.00503', '.00503000',
           '.068', '.0680000', '.8', '.8000',
           '.123456123456', '.123456123456000',
           '.657', '.657000', '.45', '.4500000', '.7', '.70000',
           '\n',
           '0.0000023230000', '000.0000023230000',
           '0.0081000', '0000.0081000',
           '0.059000', '0000.059000',
           '0.78987400000', '00000.78987400000',
           '0.4400000', '00000.4400000',
           '0.5000', '0000.5000',
           '0.90', '000.90', '0.7', '000.7',
           '\n',
           '2.6', '00002.6', '00002.60000',
           '4.71', '0004.71', '0004.7100',
           '23.49', '00023.49', '00023.490000',
           '103.45', '0000103.45', '0000103.45000',
           '10003.45067', '000010003.45067', '000010003.4506700',
           '15000.0012', '000015000.0012', '000015000.0012000',
           '78000.89', '000078000.89', '000078000.89000',
           '\n',
           '.0457e10', '.0457000e10','00000.0457000e10',
           '258e8', '2580000e4', '0000000002580000e4',
           # notice the difference of exponents
           '0.782e10', '0000.782e10', '0000.7820000e10',
           '1.23E2', '0001.23E2', '0001.2300000E2',
           '432e-102', '0000432e-102', '004320000e-106',
           # notice the difference of exponents
           '1.46e10', '0001.46e10', '0001.4600000e10',
           '1.077e-300', '0001.077e-300', '0001.077000e-300',
           '1.069e10', '0001.069e10', '0001.069000e10',
           '105040.03e10', '000105040.03e10', '105040.0300e10',
           '\n',
           '..18000', '25..00',  '36...77', '2..8',
           '3.8..9', '.12500.', '12.51.400' ]

pat = '%18s %-15s %-15s %-15s %s' li = [pat % ('tested number','float_shaver', 'tidy_float',"format_number()","'{:g}'. format()")] li.extend(pat % (n,number_shaver(n),tidy_float(n),format_number(n),format_float(n)) if n!='\n' else '\n' for n数)

打印 '\n'.join(li)

比较结果:

     tested number  float_shaver    tidy_float      format_number() '{:g}'.format()
          23456000  23456000        23456000        23456000        2.3456e+07
         23456000.  23456000        23456000        23456000        2.3456e+07
      23456000.000  23456000        23456000        23456000        2.3456e+07
       00023456000  23456000        23456000        23456000        2.3456e+07
     000023456000.  23456000        23456000        23456000        2.3456e+07
  000023456000.000  23456000        23456000        23456000        2.3456e+07
             10000  10000           10000           10000           10000
            10000.  10000           10000           10000           10000
         10000.000  10000           10000           10000           10000
          00010000  10000           10000           10000           10000
         00010000.  10000           10000           10000           10000
      00010000.000  10000           10000           10000           10000
                24  24              24              24              24
               24.  24              24              24              24
            24.000  24              24              24              24
             00024  24              24              24              24
            00024.  24              24              24              24
         00024.000  24              24              24              24
                 8  8               8               8               8
                8.  8               8               8               8
             8.000  8               8               8               8
              0008  8               8               8               8
             0008.  8               8               8               8
          0008.000  8               8               8               8
                 0  0               0               0               0
             00000  0               0               0               0
                0.  0               0               0               0
              000.  0               0               0               0


               0.0  0               0               0               0
             0.000  0               0               0               0
             000.0  0               0               0               0
           000.000  0               0               0               0
           .000000  0               0               0               0
                .0  0               0               0               0


         .00023456  0.00023456      0.00023456      0.00023456      0.00023456
      .00023456000  0.00023456      0.00023456      0.00023456      0.00023456
            .00503  0.00503         0.00503         0.00503         0.00503
         .00503000  0.00503         0.00503         0.00503         0.00503
              .068  0.068           0.068           0.068           0.068
          .0680000  0.068           0.068           0.068           0.068
                .8  0.8             0.8             0.8             0.8
             .8000  0.8             0.8             0.8             0.8
     .123456123456  0.123456123456  0.123456123456  0.123456123456  0.123456
  .123456123456000  0.123456123456  0.123456123456  0.123456123456  0.123456
              .657  0.657           0.657           0.657           0.657
           .657000  0.657           0.657           0.657           0.657
               .45  0.45            0.45            0.45            0.45
          .4500000  0.45            0.45            0.45            0.45
                .7  0.7             0.7             0.7             0.7
            .70000  0.7             0.7             0.7             0.7


   0.0000023230000  0.000002323     0.000002323     0.000002323     2.323e-06
 000.0000023230000  0.000002323     0.000002323     0.000002323     2.323e-06
         0.0081000  0.0081          0.0081          0.0081          0.0081
      0000.0081000  0.0081          0.0081          0.0081          0.0081
          0.059000  0.059           0.059           0.059           0.059
       0000.059000  0.059           0.059           0.059           0.059
     0.78987400000  0.789874        0.789874        0.789874        0.789874
 00000.78987400000  0.789874        0.789874        0.789874        0.789874
         0.4400000  0.44            0.44            0.44            0.44
     00000.4400000  0.44            0.44            0.44            0.44
            0.5000  0.5             0.5             0.5             0.5
         0000.5000  0.5             0.5             0.5             0.5
              0.90  0.9             0.9             0.9             0.9
            000.90  0.9             0.9             0.9             0.9
               0.7  0.7             0.7             0.7             0.7
             000.7  0.7             0.7             0.7             0.7


               2.6  2.6             2.6             2.6             2.6
           00002.6  2.6             2.6             2.6             2.6
       00002.60000  2.6             2.6             2.6             2.6
              4.71  4.71            4.71            4.71            4.71
           0004.71  4.71            4.71            4.71            4.71
         0004.7100  4.71            4.71            4.71            4.71
             23.49  23.49           23.49           23.49           23.49
          00023.49  23.49           23.49           23.49           23.49
      00023.490000  23.49           23.49           23.49           23.49
            103.45  103.45          103.45          103.45          103.45
        0000103.45  103.45          103.45          103.45          103.45
     0000103.45000  103.45          103.45          103.45          103.45
       10003.45067  10003.45067     10003.45067     10003.45067     10003.5
   000010003.45067  10003.45067     10003.45067     10003.45067     10003.5
 000010003.4506700  10003.45067     10003.45067     10003.45067     10003.5
        15000.0012  15000.0012      15000.0012      15000.0012      15000
    000015000.0012  15000.0012      15000.0012      15000.0012      15000
 000015000.0012000  15000.0012      15000.0012      15000.0012      15000
          78000.89  78000.89        78000.89        78000.89        78000.9
      000078000.89  78000.89        78000.89        78000.89        78000.9
   000078000.89000  78000.89        78000.89        78000.89        78000.9


          .0457e10  0.0457e10       0.0457e10       457000000       4.57e+08
       .0457000e10  0.0457e10       0.0457000e10    457000000       4.57e+08
  00000.0457000e10  0.0457e10       0.0457000e10    457000000       4.57e+08
             258e8  258e8           258e8           25800000000     2.58e+10
         2580000e4  2580000e4       2580000e4       25800000000     2.58e+10
0000000002580000e4  2580000e4       2580000e4       25800000000     2.58e+10
          0.782e10  0.782e10        0.782e10        7820000000      7.82e+09
       0000.782e10  0.782e10        0.782e10        7820000000      7.82e+09
   0000.7820000e10  0.782e10        0.7820000e10    7820000000      7.82e+09
            1.23E2  1.23E2          1.23E2          123             123
         0001.23E2  1.23E2          1.23E2          123             123
    0001.2300000E2  1.23E2          1.2300000E2     123             123
          432e-102  432e-102        432e-102        0.000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000432 4.32e-100
      0000432e-102  432e-102        432e-102        0.000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000432 4.32e-100
    004320000e-106  4320000e-106    4320000e-106    0.000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000432 4.32e-100
           1.46e10  1.46e10         1.46e10         14600000000     1.46e+10
        0001.46e10  1.46e10         1.46e10         14600000000     1.46e+10
   0001.4600000e10  1.46e10         1.4600000e10    14600000000     1.46e+10
        1.077e-300  1.077e-300      1.077e-300      0.000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000001077 1.077e-300
     0001.077e-300  1.077e-300      1.077e-300      0.000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000001077 1.077e-300
  0001.077000e-300  1.077e-300      1.077000e-300   0.000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000001077 1.077e-300
          1.069e10  1.069e10        1.069e10        10690000000     1.069e+10
       0001.069e10  1.069e10        1.069e10        10690000000     1.069e+10
    0001.069000e10  1.069e10        1.069000e10     10690000000     1.069e+10
      105040.03e10  105040.03e10    105040.03e10    1050400300000000 1.0504e+15
   000105040.03e10  105040.03e10    105040.03e10    1050400300000000 1.0504e+15
    105040.0300e10  105040.03e10    105040.0300e10  1050400300000000 1.0504e+15


           ..18000  ..18000         ..18000         bad             Can't treat
            25..00  25..00          25..00          bad             Can't treat
           36...77  36...77         36...77         bad             Can't treat
              2..8  2..8            2..8            bad             Can't treat
            3.8..9  3.8..9          3.8..9          bad             Can't treat
           .12500.  .12500.         .12500.         bad             Can't treat
         12.51.400  12.51.400       12.51.400       bad             Can't treat

.

我认为我的解决方案有两个优点:

  • 正则表达式和函数number_shave()很短

  • number_shave()不仅一次只处理一个数字,而且还检测并处理字符串中的所有数字。这是 John Machin 和 arrussel84 的解决方案无法解决的问题:

代码:

numbers = [['', '23456000', '23456000.', '23456000.000 \n',
            '00023456000', '000023456000.', '000023456000.000 \n',
            '10000', '10000.', '10000.000 \n',
            '00010000', '00010000.', '00010000.000 \n',
            '24', '24.', '24.000 \n',
            '00024', '00024.', '00024.000 \n',
            '8', '8.', '8.000 \n',
            '0008', '0008.', '0008.000 \n',
            '0', '00000', '0.', '000.' ],




            ['0.0', '0.000', '000.0', '000.000', '.000000', '.0'],

            ['.00023456', '.00023456000', '.00503', '.00503000 \n',
             '.068', '.0680000', '.8', '.8000 \n',
             '.123456123456', '.123456123456000 \n',
             '.657', '.657000', '.45', '.4500000', '.7', '.70000'],

            ['0.0000023230000', '000.0000023230000 \n',
             '0.0081000', '0000.0081000 \n',
             '0.059000', '0000.059000 \n',
             '0.78987400000', '00000.78987400000 \n',
             '0.4400000', '00000.4400000 \n',
             '0.5000', '0000.5000 \n',
             '0.90', '000.90', '0.7', '000.7 '],

            ['2.6', '00002.6', '00002.60000 \n',
             '4.71', '0004.71', '0004.7100 \n',
             '23.49', '00023.49', '00023.490000 \n',
             '103.45', '0000103.45', '0000103.45000 \n',
             '10003.45067', '000010003.45067', '000010003.4506700 \n',
             '15000.0012', '000015000.0012', '000015000.0012000 \n',
             '78000.89', '000078000.89', '000078000.89000'],

            ['.0457e10', '.0457000e10 \n',
             '0.782e10', '0000.782e10', '0000.7820000e10 \n',
             '1.23E2', '0001.23E2', '0001.2300000E2 \n',
             '1.46e10', '0001.46e10', '0001.4600000e10 \n',
             '1.077e-456', '0001.077e-456', '0001.077000e-456 \n',
             '1.069e10', '0001.069e10', '0001.069000e10 \n',
             '105040.03e10', '000105040.03e10', '105040.03e10'],

            ['..18000', '25..00',  '36...77', '2..8 \n',
             '3.8..9', '.12500.', '12.51.400' ]]


import re
def number_shaver(ch,
                 regx = re.compile('(?<![\d.])0*(?:'
                                   '(\d+)\.?|\.(0)'
                                   '|(\.\d+?)|(\d+\.\d+?)'
                                   ')0*(?![\d.])')  ,
                 repl = lambda mat: mat.group(mat.lastindex)
                                    if mat.lastindex!=3
                                    else '0' + mat.group(3) ):
    return regx.sub(repl,ch)




for li in numbers:
    one_string = ' --- '.join(li)
    print one_string + '\n\n' + number_shaver(one_string) + \
          '\n\n' + 3*'---------------------' + '\n'

包含多个数字的字符串的处理结果:

 --- 23456000 --- 23456000. --- 23456000.000 
 --- 00023456000 --- 000023456000. --- 000023456000.000 
 --- 10000 --- 10000. --- 10000.000 
 --- 00010000 --- 00010000. --- 00010000.000 
 --- 24 --- 24. --- 24.000 
 --- 00024 --- 00024. --- 00024.000 
 --- 8 --- 8. --- 8.000 
 --- 0008 --- 0008. --- 0008.000 
 --- 0 --- 00000 --- 0. --- 000.

 --- 23456000 --- 23456000 --- 23456000 
 --- 23456000 --- 23456000 --- 23456000 
 --- 10000 --- 10000 --- 10000 
 --- 10000 --- 10000 --- 10000 
 --- 24 --- 24 --- 24 
 --- 24 --- 24 --- 24 
 --- 8 --- 8 --- 8 
 --- 8 --- 8 --- 8 
 --- 0 --- 0 --- 0 --- 0

---------------------------------------------------------------

0.0 --- 0.000 --- 000.0 --- 000.000 --- .000000 --- .0

0 --- 0 --- 0 --- 0 --- 0 --- 0

---------------------------------------------------------------

.00023456 --- .00023456000 --- .00503 --- .00503000 
 --- .068 --- .0680000 --- .8 --- .8000 
 --- .123456123456 --- .123456123456000 
 --- .657 --- .657000 --- .45 --- .4500000 --- .7 --- .70000

0.00023456 --- 0.00023456 --- 0.00503 --- 0.00503 
 --- 0.068 --- 0.068 --- 0.8 --- 0.8 
 --- 0.123456123456 --- 0.123456123456 
 --- 0.657 --- 0.657 --- 0.45 --- 0.45 --- 0.7 --- 0.7

---------------------------------------------------------------

0.0000023230000 --- 000.0000023230000 
 --- 0.0081000 --- 0000.0081000 
 --- 0.059000 --- 0000.059000 
 --- 0.78987400000 --- 00000.78987400000 
 --- 0.4400000 --- 00000.4400000 
 --- 0.5000 --- 0000.5000 
 --- 0.90 --- 000.90 --- 0.7 --- 000.7 

0.000002323 --- 0.000002323 
 --- 0.0081 --- 0.0081 
 --- 0.059 --- 0.059 
 --- 0.789874 --- 0.789874 
 --- 0.44 --- 0.44 
 --- 0.5 --- 0.5 
 --- 0.9 --- 0.9 --- 0.7 --- 0.7 

---------------------------------------------------------------

2.6 --- 00002.6 --- 00002.60000 
 --- 4.71 --- 0004.71 --- 0004.7100 
 --- 23.49 --- 00023.49 --- 00023.490000 
 --- 103.45 --- 0000103.45 --- 0000103.45000 
 --- 10003.45067 --- 000010003.45067 --- 000010003.4506700 
 --- 15000.0012 --- 000015000.0012 --- 000015000.0012000 
 --- 78000.89 --- 000078000.89 --- 000078000.89000

2.6 --- 2.6 --- 2.6 
 --- 4.71 --- 4.71 --- 4.71 
 --- 23.49 --- 23.49 --- 23.49 
 --- 103.45 --- 103.45 --- 103.45 
 --- 10003.45067 --- 10003.45067 --- 10003.45067 
 --- 15000.0012 --- 15000.0012 --- 15000.0012 
 --- 78000.89 --- 78000.89 --- 78000.89

---------------------------------------------------------------

.0457e10 --- .0457000e10 
 --- 0.782e10 --- 0000.782e10 --- 0000.7820000e10 
 --- 1.23E2 --- 0001.23E2 --- 0001.2300000E2 
 --- 1.46e10 --- 0001.46e10 --- 0001.4600000e10 
 --- 1.077e-456 --- 0001.077e-456 --- 0001.077000e-456 
 --- 1.069e10 --- 0001.069e10 --- 0001.069000e10 
 --- 105040.03e10 --- 000105040.03e10 --- 105040.03e10

0.0457e10 --- 0.0457e10 
 --- 0.782e10 --- 0.782e10 --- 0.782e10 
 --- 1.23E2 --- 1.23E2 --- 1.23E2 
 --- 1.46e10 --- 1.46e10 --- 1.46e10 
 --- 1.077e-456 --- 1.077e-456 --- 1.077e-456 
 --- 1.069e10 --- 1.069e10 --- 1.069e10 
 --- 105040.03e10 --- 105040.03e10 --- 105040.03e10

---------------------------------------------------------------

..18000 --- 25..00 --- 36...77 --- 2..8 
 --- 3.8..9 --- .12500. --- 12.51.400

..18000 --- 25..00 --- 36...77 --- 2..8 
 --- 3.8..9 --- .12500. --- 12.51.400

---------------------------------------------------------------

.

因此,正则表达式也可用于仅查找字符串中的所有数字,而不需要删除零。

.

PS:在我解释正则表达式及其功能的其他答案中查看更多信息

于 2011-04-27T18:18:52.340 回答
1

脚本:

def tidy_float(s):
    """Return tidied float representation.
    Remove superflous leading/trailing zero digits.
    Remove '.' if value is an integer.
    Return '****' if float(s) fails.
    """
    # float?
    try:
        f = float(s)
    except ValueError:
        return '****'
    # int?
    try:
        i = int(s)
        return str(i)
    except ValueError:
        pass
    # scientific notation?
    if 'e' in s or 'E' in s:
        t = s.lstrip('0')
        if t.startswith('.'): t = '0' + t
        return t
    # float with integral value (includes zero)?
    i = int(f)
    if i == f:
        return str(i)
    assert '.' in s
    t = s.strip('0')
    if t.startswith('.'): t = '0' + t
    if t.endswith('.'): t += '0'
    return t

if __name__ == "__main__":

    # Each line has test string followed by expected output
    tests = """
    0.000 0
    0 0
    0000 0
    0.4000 0.4
    0.0081000 0.0081
    103.45 103.45
    103.4506700 103.45067
    14500.0012 14500.0012
    478000.89 478000.89
    993.59.18 ****
    12.5831.400 ****
    .458 0.458
    .48587000 0.48587
    .0000 0
    10000 10000
    10000.000 10000
    -10000 -10000
    -10000.000 -10000
    1.23e2 1.23e2
    1.23e10 1.23e10
    .123e10 0.123e10
     """.splitlines()

    for test in tests:
        x = test.split()
        if not x: continue
        data, expected = x
        actual = tidy_float(data)
        print "data=%r exp=%r act=%r %s" % (
            data, expected, actual, ["**FAIL**", ""][actual == expected])

输出(Python 2.7.1):

data='0.000' exp='0' act='0'
data='0' exp='0' act='0'
data='0000' exp='0' act='0'
data='0.4000' exp='0.4' act='0.4'
data='0.0081000' exp='0.0081' act='0.0081'
data='103.45' exp='103.45' act='103.45'
data='103.4506700' exp='103.45067' act='103.45067'
data='14500.0012' exp='14500.0012' act='14500.0012'
data='478000.89' exp='478000.89' act='478000.89'
data='993.59.18' exp='****' act='****'
data='12.5831.400' exp='****' act='****'
data='.458' exp='0.458' act='0.458'
data='.48587000' exp='0.48587' act='0.48587'
data='.0000' exp='0' act='0'
data='10000' exp='10000' act='10000'
data='10000.000' exp='10000' act='10000'
data='-10000' exp='-10000' act='-10000'
data='-10000.000' exp='-10000' act='-10000'
data='1.23e2' exp='1.23e2' act='1.23e2'
data='1.23e10' exp='1.23e10' act='1.23e10'
data='.123e10' exp='0.123e10' act='0.123e10'
于 2011-04-27T21:40:20.290 回答
1

我的其他答案的编辑 2 的补充

(所有人都渴望只在一个职位上)

正则表达式的模式定义了 4 个子模式,每个子模式都匹配某种类型的数字。每次正则表达式与字符串的一部分匹配时,只有一个子模式匹配,因此可以在替换函数中使用mat.lastindex 。以下代码显示了子模式与各种数字的匹配:

import re
def float_show(ch,
               regx = re.compile(
                   '(?<![\d.])'
                   '0*' # potentiel heading zeros
                   '(?:'
                   '(\d+)\.?' # INTEGERS :
                              # ~ pure integers non-0 or 0
                              #   000450 , 136000 , 87 , 000 , 0
                              # ~ integer part non-0 + '.'
                              #   0044. , 4100.
                              # ~ integer part 0 + '.'
                              #   000. , 0. 
                              # ~ integer part non-0 + '.' + fractional part 0:
                              #   000570.00 , 193.0 , 3.000

                   '|\.(0)' # SPECIAL CASE, 0 WITH FRACTIONAL PART :
                            # ~ integer part 0 + compulsory fractional part 0:
                            #   000.0, 0.000 , .0 , .00000

                   '|(\.\d+?)' # FLOATING POINT NUMBER
                               # ~ with integer part 0:
                               #   000.0890 , 0.52 , 0.1 , .077000 , .1400 , .0006010

                   '|(\d+\.\d+?)' # FLOATING POINT NUMBER
                                  # ~ with integer part non-0:
                                  #   0024000.013000 , 145.0235 , 3.00058
                   ')'
                   '0*' # potential tailing zeros
                   '(?![\d.])'),
               repl = lambda mat: mat.group(mat.lastindex)
                                  if mat.lastindex!=3
                                  else '0' + mat.group(3)  ):
    mat = regx.search(ch)
    if mat:
        return (ch,regx.sub(repl,ch),repr(mat.groups()))
    else:
        return (ch,'No match','No groups')


numbers = ['23456000', '23456000.', '23456000.000',
           '00023456000', '000023456000.', '000023456000.000',
           '10000', '10000.', '10000.000',
           '00010000', '00010000.', '00010000.000',
           '24', '24.', '24.000',
           '00024', '00024.', '00024.000',
           '8', '8.', '8.000',
           '0008', '0008.', '0008.000',
           '0', '00000', '0.', '000.',
           '\n',
           '0.0', '0.000', '000.0', '000.000', '.000000', '.0',
           '\n',
           '.00023456', '.00023456000', '.00503', '.00503000',
           '.068', '.0680000', '.8', '.8000',
           '.123456123456', '.123456123456000',
           '.657', '.657000', '.45', '.4500000', '.7', '.70000',
           '\n',
           '0.0000023230000', '000.0000023230000',
           '0.0081000', '0000.0081000',
           '0.059000', '0000.059000',
           '0.78987400000', '00000.78987400000',
           '0.4400000', '00000.4400000',
           '0.5000', '0000.5000',
           '0.90', '000.90', '0.7', '000.7',
           '\n',
           '2.6', '00002.6', '00002.60000',
           '4.71', '0004.71', '0004.7100',
           '23.49', '00023.49', '00023.490000',
           '103.45', '0000103.45', '0000103.45000',
           '10003.45067', '000010003.45067', '000010003.4506700',
           '15000.0012', '000015000.0012', '000015000.0012000',
           '78000.89', '000078000.89', '000078000.89000',
           '\n',
           '.0457e10', '.0457000e10',
           '0.782e10', '0000.782e10', '0000.7820000e10',
           '1.23E2', '0001.23E2', '0001.2300000E2',
           '1.46e10', '0001.46e10', '0001.4600000e10',
           '1.077e-456', '0001.077e-456', '0001.077000e-456',
           '1.069e10', '0001.069e10', '0001.069000e10',
           '105040.03e10', '000105040.03e10', '105040.0300e10',
           '\n',
           '..18000', '25..00',  '36...77', '2..8',
           '3.8..9', '.12500.', '12.51.400' ]

pat = '%20s  %-16s %s'
li = [pat % ('tested number ',' shaved float',' regx.search(number).groups()')]
li.extend(pat % float_show(ch) if ch!='\n' else '\n' for ch in numbers)
print '\n'.join(li)

演示

      tested number    shaved float     regx.search(number).groups()
            23456000  23456000         ('23456000', None, None, None)
           23456000.  23456000         ('23456000', None, None, None)
        23456000.000  23456000         ('23456000', None, None, None)
         00023456000  23456000         ('23456000', None, None, None)
       000023456000.  23456000         ('23456000', None, None, None)
    000023456000.000  23456000         ('23456000', None, None, None)
               10000  10000            ('10000', None, None, None)
              10000.  10000            ('10000', None, None, None)
           10000.000  10000            ('10000', None, None, None)
            00010000  10000            ('10000', None, None, None)
           00010000.  10000            ('10000', None, None, None)
        00010000.000  10000            ('10000', None, None, None)
                  24  24               ('24', None, None, None)
                 24.  24               ('24', None, None, None)
              24.000  24               ('24', None, None, None)
               00024  24               ('24', None, None, None)
              00024.  24               ('24', None, None, None)
           00024.000  24               ('24', None, None, None)
                   8  8                ('8', None, None, None)
                  8.  8                ('8', None, None, None)
               8.000  8                ('8', None, None, None)
                0008  8                ('8', None, None, None)
               0008.  8                ('8', None, None, None)
            0008.000  8                ('8', None, None, None)
                   0  0                ('0', None, None, None)
               00000  0                ('0', None, None, None)
                  0.  0                ('0', None, None, None)
                000.  0                ('0', None, None, None)


                 0.0  0                (None, '0', None, None)
               0.000  0                (None, '0', None, None)
               000.0  0                (None, '0', None, None)
             000.000  0                (None, '0', None, None)
             .000000  0                (None, '0', None, None)
                  .0  0                (None, '0', None, None)


           .00023456  0.00023456       (None, None, '.00023456', None)
        .00023456000  0.00023456       (None, None, '.00023456', None)
              .00503  0.00503          (None, None, '.00503', None)
           .00503000  0.00503          (None, None, '.00503', None)
                .068  0.068            (None, None, '.068', None)
            .0680000  0.068            (None, None, '.068', None)
                  .8  0.8              (None, None, '.8', None)
               .8000  0.8              (None, None, '.8', None)
       .123456123456  0.123456123456   (None, None, '.123456123456', None)
    .123456123456000  0.123456123456   (None, None, '.123456123456', None)
                .657  0.657            (None, None, '.657', None)
             .657000  0.657            (None, None, '.657', None)
                 .45  0.45             (None, None, '.45', None)
            .4500000  0.45             (None, None, '.45', None)
                  .7  0.7              (None, None, '.7', None)
              .70000  0.7              (None, None, '.7', None)


     0.0000023230000  0.000002323      (None, None, '.000002323', None)
   000.0000023230000  0.000002323      (None, None, '.000002323', None)
           0.0081000  0.0081           (None, None, '.0081', None)
        0000.0081000  0.0081           (None, None, '.0081', None)
            0.059000  0.059            (None, None, '.059', None)
         0000.059000  0.059            (None, None, '.059', None)
       0.78987400000  0.789874         (None, None, '.789874', None)
   00000.78987400000  0.789874         (None, None, '.789874', None)
           0.4400000  0.44             (None, None, '.44', None)
       00000.4400000  0.44             (None, None, '.44', None)
              0.5000  0.5              (None, None, '.5', None)
           0000.5000  0.5              (None, None, '.5', None)
                0.90  0.9              (None, None, '.9', None)
              000.90  0.9              (None, None, '.9', None)
                 0.7  0.7              (None, None, '.7', None)
               000.7  0.7              (None, None, '.7', None)


                 2.6  2.6              (None, None, None, '2.6')
             00002.6  2.6              (None, None, None, '2.6')
         00002.60000  2.6              (None, None, None, '2.6')
                4.71  4.71             (None, None, None, '4.71')
             0004.71  4.71             (None, None, None, '4.71')
           0004.7100  4.71             (None, None, None, '4.71')
               23.49  23.49            (None, None, None, '23.49')
            00023.49  23.49            (None, None, None, '23.49')
        00023.490000  23.49            (None, None, None, '23.49')
              103.45  103.45           (None, None, None, '103.45')
          0000103.45  103.45           (None, None, None, '103.45')
       0000103.45000  103.45           (None, None, None, '103.45')
         10003.45067  10003.45067      (None, None, None, '10003.45067')
     000010003.45067  10003.45067      (None, None, None, '10003.45067')
   000010003.4506700  10003.45067      (None, None, None, '10003.45067')
          15000.0012  15000.0012       (None, None, None, '15000.0012')
      000015000.0012  15000.0012       (None, None, None, '15000.0012')
   000015000.0012000  15000.0012       (None, None, None, '15000.0012')
            78000.89  78000.89         (None, None, None, '78000.89')
        000078000.89  78000.89         (None, None, None, '78000.89')
     000078000.89000  78000.89         (None, None, None, '78000.89')


            .0457e10  0.0457e10        (None, None, '.0457', None)
         .0457000e10  0.0457e10        (None, None, '.0457', None)
            0.782e10  0.782e10         (None, None, '.782', None)
         0000.782e10  0.782e10         (None, None, '.782', None)
     0000.7820000e10  0.782e10         (None, None, '.782', None)
              1.23E2  1.23E2           (None, None, None, '1.23')
           0001.23E2  1.23E2           (None, None, None, '1.23')
      0001.2300000E2  1.23E2           (None, None, None, '1.23')
             1.46e10  1.46e10          (None, None, None, '1.46')
          0001.46e10  1.46e10          (None, None, None, '1.46')
     0001.4600000e10  1.46e10          (None, None, None, '1.46')
          1.077e-456  1.077e-456       (None, None, None, '1.077')
       0001.077e-456  1.077e-456       (None, None, None, '1.077')
    0001.077000e-456  1.077e-456       (None, None, None, '1.077')
            1.069e10  1.069e10         (None, None, None, '1.069')
         0001.069e10  1.069e10         (None, None, None, '1.069')
      0001.069000e10  1.069e10         (None, None, None, '1.069')
        105040.03e10  105040.03e10     (None, None, None, '105040.03')
     000105040.03e10  105040.03e10     (None, None, None, '105040.03')
      105040.0300e10  105040.03e10     (None, None, None, '105040.03')


             ..18000  No match         No groups
              25..00  No match         No groups
             36...77  No match         No groups
                2..8  No match         No groups
              3.8..9  No match         No groups
             .12500.  No match         No groups
           12.51.400  No match         No groups
于 2011-05-01T18:14:41.190 回答