I want to specify data types for pandas read_csv. Here's a minimal example that works when the types are inferred and then fails once I specify them. Why doesn't the second call work?
import io
import pandas as pd
csv = """foo,1234567,a,1
foo,2345678,b,3
bar,3456789,b,5
"""
# Works: column types are inferred.
df = pd.read_csv(io.StringIO(csv),
                 names=["fb", "num", "loc", "x"])
print(df)
# Does not work: types passed as a list of strings.
df = pd.read_csv(io.StringIO(csv),
                 names=["fb", "num", "loc", "x"],
                 dtype=["|S3", "np.int64", "|S1", "np.int8"])
print(df)
I've simplified the example on BrenBarn's suggestion so that it is, hopefully, clearer. My real dataset is much larger, but I'd like to use the same method to specify types for all of my columns on import.
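For comparison, one form that read_csv does accept for per-column types is a dict mapping column names to types, rather than a list. The sketch below is only an illustration under a few assumptions: the `types` list and the `column_types` name are hypothetical, the types are actual Python/NumPy type objects rather than strings such as "np.int64", and plain `str` stands in for the fixed-width "|S3" and "|S1" types.

```python
import io

import numpy as np
import pandas as pd

csv = """foo,1234567,a,1
foo,2345678,b,3
bar,3456789,b,5
"""

names = ["fb", "num", "loc", "x"]
# Assumed target types, given as type objects (not strings like "np.int64").
# str is used in place of the fixed-width "|S3" / "|S1" types.
types = [str, np.int64, str, np.int8]

# Pair each column name with its type: {"fb": str, "num": np.int64, ...}
column_types = dict(zip(names, types))

df = pd.read_csv(io.StringIO(csv), names=names, dtype=column_types)
print(df.dtypes)
```

Building the dict with zip keeps the names and types in two parallel lists, which may scale more comfortably to a larger dataset where the types are generated programmatically.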