Code:
import scipy as sp
import matplotlib.pyplot as plt
data=sp.genfromtxt("data/train.tsv", delimiter ="\t", dtype="string", comments=None, skip_header=1)
x = data[:,0]
y = data[:,1]
x = x[~sp.isnan(y)]
y = x[~sp.isnan(y)]
DataOfInterest=x["avglinksize"]
EphemeralOrEvergreen=x["label"]
plt.scatter(DataOfInterest,EphemeralOrEvergreen)
plt.title("Training data")
plt.xlabel("Single feature from training set")
plt.ylabel("Ephemeral or Evergreen")
plt.grid()
plt.show()
Output:
python GenGraphs.py
Traceback (most recent call last):
  File "GenGraphs.py", line 4, in <module>
    data=sp.genfromtxt("data/train.tsv", delimiter ="\t", dtype="string", comments=None, skip_header=1)
  File "/usr/lib/python2.7/dist-packages/numpy/lib/npyio.py", line 1746, in genfromtxt
    output = np.array(data, dtype)
MemoryError
I am trying to plot one column of a tsv file against another.
What am I misunderstanding here? What else could I do?
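
One possible direction, offered as a sketch rather than a confirmed fix: the MemoryError probably comes from loading every column as a fixed-width string array, where each cell is as wide as the longest field in the file. Reading the file row by row and keeping only the two columns being plotted avoids building that array at all. The version below assumes Python 3, assumes the file has a header row containing the names avglinksize and label used in the code above, and assumes that rows with a non-numeric value in either column can be skipped. The helper names read_two_columns and plot_columns and the sample rows are made up for illustration. Separately, note that the original line `y = x[~sp.isnan(y)]` assigns a slice of x to y, and that `sp.isnan` will not work on an array of strings.

```python
import csv
import io
import math


def read_two_columns(source, x_name, y_name, delimiter="\t"):
    """Stream a delimited file and keep only two numeric columns.

    `source` is any iterable of text lines, such as an open file.
    Rows where either value is missing, non-numeric, or NaN are skipped.
    (Hypothetical helper, not part of any library.)
    """
    xs, ys = [], []
    reader = csv.DictReader(source, delimiter=delimiter)
    for row in reader:
        try:
            x_val = float(row[x_name])
            y_val = float(row[y_name])
        except (TypeError, ValueError):
            # TypeError: short row (value is None); ValueError: not a number.
            continue
        if math.isnan(x_val) or math.isnan(y_val):
            continue
        xs.append(x_val)
        ys.append(y_val)
    return xs, ys


def plot_columns(xs, ys):
    """Scatter-plot the two columns with the labels from the question."""
    # Imported here so the reader above works without matplotlib installed.
    import matplotlib.pyplot as plt

    plt.scatter(xs, ys)
    plt.title("Training data")
    plt.xlabel("Single feature from training set")
    plt.ylabel("Ephemeral or Evergreen")
    plt.grid()
    plt.show()


# Tiny made-up sample standing in for data/train.tsv.
sample = io.StringIO(
    "url\tavglinksize\tlabel\n"
    "a\t2.5\t1\n"
    "b\t?\t0\n"
    "c\t1.0\t0\n"
)
xs, ys = read_two_columns(sample, "avglinksize", "label")

# With the real file, something like:
# with open("data/train.tsv", newline="") as f:
#     xs, ys = read_two_columns(f, "avglinksize", "label")
# plot_columns(xs, ys)
```

Only two Python lists of floats are held in memory, so the size of the other columns no longer matters. If NumPy arrays are needed afterwards, the lists can be converted once the numeric values have been extracted.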