
I'm new to Python, and to programming in general, and I'm trying to:

1) Read multiple (identically formatted) CSV files from a folder

2) Plot column X 'Time' vs column Y 'pH' from each of the CSV files on a single plot

3) Create a legend using the filename (without .csv) as the reference for each line of the plot.

I have been able to open a single CSV file and plot X vs Y, but have had no success iterating over the files and overlaying multiple lines on a single plot.

Any help would be greatly appreciated! I've tried a few different ways of reading the files in, and I'm showing just one of them below. I'd rather read the files in as individual pandas DataFrames so that I can manipulate them later. For now, I'm just hoping to get some basic code working.

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from pandas import Series, DataFrame
from numpy import nan as NA
import glob

ferms = glob.glob ('Python/CSV/*.csv')
print ferms

for ferm in ferms:
    fig = plt.figure()
    ax = fig.add_subplot(1,1,1)
    ax.plot(ferms['EFT(h)'], ferms['pH1.PV [pH]'], 'k--')
    plt.xlabel('EFT(h)')
    plt.ylabel('pH')
    plt.show()

Revised code based on @Paul H's suggestion:


import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from pandas import Series, DataFrame
from numpy import nan as NA
import glob

ferms = glob.glob ('Python/CSV/*.csv')
print ferms
fig = plt.figure()
ax = fig.add_subplot(1,1,1)

for ferm in ferms:
# define the dataframe
    data = pd.read_csv(ferm)    
    ax.plot(ferms[0], ferms[3], 'k--')

plt.xlabel('EFT(h)')
plt.ylabel('pH')
plt.show()

new error:

--> 235     return array(a, dtype, copy=False, order=order)
    236
    237 def asanyarray(a, dtype=None, order=None):

ValueError: could not convert string to float: Python/CSV\20135140.csv


Just to check, I went into my csv files and deleted the headers, thinking they could have been the cause of the 'string to float' error. However, even with only numbers in my csvs, it threw the same error.


2 Answers


It looks like it isn't working because you are creating a new figure on every pass through the loop.

Try this:

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from pandas import Series, DataFrame
from numpy import nan as NA
import glob

ferms = glob.glob('Python/CSV/*.csv')
print(ferms)  # parenthesized form works in both Python 2 and Python 3
fig = plt.figure()
ax = fig.add_subplot(1,1,1)

for ferm in ferms:
    # define the dataframe
    data = pd.read_csv(ferm)
    ax.plot(data['EFT(h)'], data['pH1.PV [pH]'], 'k--')

plt.xlabel('EFT(h)')
plt.ylabel('pH')
plt.show()
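
The code above draws every file with the same dashed black line and no legend. As a minimal sketch of how the third goal in the question (a legend built from the file names) might be added, the function below labels each line with the file name minus its directory and `.csv` extension, and keeps every DataFrame in a dictionary for later use. The function name `plot_ph_curves` and the `frames` dictionary are hypothetical, introduced only for illustration; the column names are the ones used in the question. Note that it plots columns of the DataFrame returned by `pd.read_csv`, not elements of the path list `ferms`, whose entries are file name strings.

```python
import glob
import os

import matplotlib.pyplot as plt
import pandas as pd


def plot_ph_curves(pattern, x_col="EFT(h)", y_col="pH1.PV [pH]"):
    """Overlay one line per CSV file matched by `pattern` on a single axes.

    Hypothetical helper for illustration; column names come from the question.
    """
    frames = {}  # label -> DataFrame, kept so the data can be reused later
    fig, ax = plt.subplots()  # one figure and one axes, created before the loop
    for path in sorted(glob.glob(pattern)):
        # File name without its directory or ".csv" extension is the legend label
        label = os.path.splitext(os.path.basename(path))[0]
        frames[label] = pd.read_csv(path)
        ax.plot(frames[label][x_col], frames[label][y_col], label=label)
    ax.set_xlabel(x_col)
    ax.set_ylabel("pH")
    if frames:
        ax.legend()  # only draw a legend when at least one file was plotted
    return fig, ax, frames
```

It could be called as `fig, ax, frames = plot_ph_curves('Python/CSV/*.csv')`, followed by a single `plt.show()`. Leaving out the `'k--'` format string lets matplotlib pick a different color for each line, which makes the legend easier to read.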
Answered 2013-06-20T17:52:43.453

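I ran into a similar problem when trying to plot data from an arbitrary number of files; see my post "Traceback lines on plot of multiple files". Basically, you want to plot the data from each file, but without calling plt.show() inside the loop that iterates over the files. plt.show() should be outside the loop.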
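
A small sketch of that structure, using made-up in-memory values in place of the CSV files (the names `runs` and `hours` and their numbers are hypothetical, not data from the question):

```python
import matplotlib.pyplot as plt

# Synthetic stand-in data (hypothetical values, not from the original CSV files)
runs = {"run_a": [7.0, 6.6, 6.1], "run_b": [7.1, 6.8, 6.4]}
hours = [0, 1, 2]

fig, ax = plt.subplots()  # create the figure once, before the loop
for name, ph in runs.items():
    ax.plot(hours, ph, label=name)  # each iteration adds a line to the same axes
ax.set_xlabel("EFT(h)")
ax.set_ylabel("pH")
ax.legend()
# Call plt.show() once here, after the loop has finished
```

Because the figure is created before the loop and shown only after it, both lines end up on the same plot instead of in separate windows.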

Answered 2017-12-11T15:38:52.263