3

I have a list of numbers in a file, text.txt.
2.50
2.56
2.81
2.86
2.84
3.21
3.47
2.91
2.96
3.11
2.83
2.89
2.94
2.94
3.34
3.44
2.94
2.96
3.04
3.01
2.85
3.05
3.10

I want to sort each number into one of the ranges below, that is, count how many numbers fall within each range.
2.5-2.7
2.7-2.9
2.9-3.1
3.1-3.3
3.3-3.5

I tried this:

from __future__ import division
from math import *
from numpy import *
from string import *

infile = open('text1.txt', 'r')
text = infile.read().split('\n')
infile.close()
text.remove('')

numbers = []
for i in text:
    count = 0
    if (numbers[i] > 2.49) and (numbers[i] < 2.59):
        count += 1
        print("Number of elements", count)

It doesn't work.


6 Answers

6

You can use the bisect module:

>>> import bisect
>>> ranges = [2.5, 2.7, 2.9, 3.1, 3.3, 3.5]
>>> nums = [2.5, 2.56, 2.81, 2.86, 2.84, 3.21, 3.47, 2.91, 2.96, 3.11, 2.83, 2.89, 2.94, 2.94, 3.34, 3.44, 2.94, 2.96, 3.04, 3.01, 2.85, 3.05, 3.1]
>>> lis = [0]*len(ranges)
>>> for item in nums:
...     ind = bisect.bisect(ranges, item) - 1
...     lis[ind] += 1
...
>>> for x, y in zip(zip(ranges, ranges[1:]), lis):
...     print(x, y)
...
(2.5, 2.7) 2
(2.7, 2.9) 6
(2.9, 3.1) 9
(3.1, 3.3) 3
(3.3, 3.5) 3
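
To make the boundary behaviour of this approach explicit, here is a minimal sketch. The helper name bin_of and the choice to return None for out-of-range values are assumptions for illustration, not part of the answer. It relies on bisect.bisect being the same function as bisect.bisect_right, so a value equal to an edge goes into the bin that starts at that edge.

```python
import bisect

edges = [2.5, 2.7, 2.9, 3.1, 3.3, 3.5]

def bin_of(value, edges):
    """Return the (low, high) bin holding value, or None if it is out of range.

    Bins are half-open: low <= value < high.
    """
    i = bisect.bisect_right(edges, value) - 1
    if i < 0 or i >= len(edges) - 1:
        return None  # below the first edge, or at/above the last edge
    return (edges[i], edges[i + 1])

print(bin_of(2.56, edges))  # (2.5, 2.7)
print(bin_of(3.1, edges))   # (3.1, 3.3): an edge value belongs to the upper bin
print(bin_of(2.4, edges))   # None
```

Without the range check, a value below 2.5 would produce index -1 and be silently counted in the last slot of the list.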
answered 2013-09-05T06:13:05.157
6

How about making more use of numpy?

import numpy

numbers = numpy.loadtxt('test.txt')
bins = numpy.arange(2.5, 3.51, 0.2)  # stop is slightly above 3.5 so that the 3.5 edge is included (arange excludes its stop value)
counts, _ = numpy.histogram(numbers, bins)

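If you don't want to use numpy, you can take advantage of the bins all having the same width and compute directly which bin each number falls into: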

numbers = [float(n) for n in open('test.txt') if len(n.strip())]
start = 2.5
width = 0.2
end = 3.7

def position(n):
    return int((n - start)/width)

counts = [0 for i in range(position(end))]
for n in numbers:
    counts[position(n)] += 1
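
One caveat with position(n) is that truncating a floating-point quotient can put a value sitting exactly on a bin edge into the bin below it. The sketch below shows this for 2.9 and one possible workaround that does the arithmetic in whole hundredths. The names position_naive and position_scaled and the scale parameter are hypothetical, and the workaround assumes the data has at most two decimal places, as in the question.

```python
def position_naive(n, start=2.5, width=0.2):
    # Same formula as position() above: truncate the float quotient.
    return int((n - start) / width)

def position_scaled(n, start=2.5, width=0.2, scale=100):
    # Assumed variant: convert everything to whole hundredths first,
    # then use integer floor division, which has no rounding error.
    return (round(n * scale) - round(start * scale)) // round(width * scale)

print(position_naive(2.9))   # 1, because 2.9 - 2.5 is slightly less than 0.4
print(position_scaled(2.9))  # 2

numbers = [2.50, 2.56, 2.81, 2.86, 2.84, 3.21, 3.47, 2.91, 2.96, 3.11,
           2.83, 2.89, 2.94, 2.94, 3.34, 3.44, 2.94, 2.96, 3.04, 3.01,
           2.85, 3.05, 3.10]
counts = [0] * 5
for n in numbers:
    counts[position_scaled(n)] += 1
print(counts)  # [2, 6, 9, 3, 3]
```

For the data in the question both versions give the same counts, since 2.9 itself does not occur in the list.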
answered 2013-09-05T06:23:06.930
1

That doesn't work because you never store anything in numbers[].

numbers = []
for i in text:
    numbers.append(float(i))  # the values are decimals, so convert with float

count = 0
for n in numbers:  # iterate over the parsed values, not the raw strings
    if (n > 2.49) and (n < 2.59):
        count += 1
print("Number of elements", count)
answered 2013-09-05T06:08:31.980
1

First, you can improve the file reading by using readlines():

numbers = [float(i.strip()) for i in infile.readlines() if i.strip()]

Next, for the bin counts, assuming every bin has the same width, you can define variables for the starting value, the bin width, and the number of bins:

start = 2.5
delta = 0.2
nBins = 5

Then you can use filter to get the count for each range:

counts = [len(list(filter(lambda x: start+delta*i <= x < start+delta*(i+1), numbers))) for i in range(nBins)]

And print the results:

for i, count in enumerate(counts):
    print("Number of elements in the range %.1f-%.1f: %d" % (start+delta*i, start+delta*(i+1), count))

Full code:

infile = open('text1.txt', 'r')
# skip blank lines; strip() removes the trailing newline of each line
numbers = [float(i.strip()) for i in infile.readlines() if i.strip()]
infile.close()

start = 2.5
delta = 0.2
nBins = 5

# one pass over the list per bin, keeping values with low <= x < high
counts = [len(list(filter(lambda x: start+delta*i <= x < start+delta*(i+1), numbers))) for i in range(nBins)]

for i, count in enumerate(counts):
    print("Number of elements in the range %.1f-%.1f: %d" % (start+delta*i, start+delta*(i+1), count))
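
The reading step in this answer can also be wrapped in a small helper that accepts any iterable of lines. This is only a sketch: read_numbers is a hypothetical name, and io.StringIO stands in for the real file so the example runs without text1.txt on disk.

```python
import io

def read_numbers(lines):
    """Parse one float per line, skipping blank lines."""
    return [float(line) for line in lines if line.strip()]

# io.StringIO is a stand-in for the file (an assumption for this demo).
# With a real file you would write:
#     with open('text1.txt') as infile:
#         numbers = read_numbers(infile)
sample = io.StringIO("2.50\n2.56\n\n2.81\n")
numbers = read_numbers(sample)
print(numbers)  # [2.5, 2.56, 2.81]
```

Using a with block closes the file automatically, and iterating over the file object avoids building the intermediate list that readlines() creates.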
answered 2013-09-05T06:21:39.327
0

You can use the following code:

infile = open('text1.txt', 'r')
text = infile.read().split('\n')
infile.close()

# calculate the numbers of each range
numbers = [0, 0, 0, 0, 0]
for i in text:
    if not i.strip():
        continue  # skip blank lines, e.g. the one after a trailing newline
    temp = float(i)
    temp = round((temp - 2.5) * 100)  # offset from 2.5 in whole hundredths
    temp = temp // 20                 # each range is 20 hundredths wide
    numbers[temp] = numbers[temp] + 1

# display the numbers of each range
print('Number of elements')
print('2.5-2.7: ' + str(numbers[0]))
print('2.7-2.9: ' + str(numbers[1]))
print('2.9-3.1: ' + str(numbers[2]))
print('3.1-3.3: ' + str(numbers[3]))
print('3.3-3.5: ' + str(numbers[4]))
answered 2013-09-05T06:37:32.583
0

Try this:

ranges = [(2.5, 2.7), (2.7, 2.9), (2.9, 3.1), (3.1, 3.3), (3.3, 3.5)]
counts = {i:0 for i in ranges}

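def findBin(n, bins):
    mid = len(bins) // 2  # integer division so mid can be used as an index
    low, high = bins[mid]
    if low <= n <= high:
        return (low, high)
    elif n < low:
        return findBin(n, bins[:mid])      # n lies in one of the lower bins
    else:
        return findBin(n, bins[mid + 1:])  # n lies in one of the higher bins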

with open('path/to/file') as infile:
    for line in infile:
        n = float(line.strip())
        counts[findBin(n, ranges)] += 1

for low,high in sorted(counts):
    print("There are", counts[(low,high)], "many numbers between", low, "and", high)

Hope this helps.

answered 2013-09-05T06:14:03.623