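I came across some material online about how to use pandas. I have practiced quite a bit, but there are still things I cannot remember clearly, so I am writing them down.
Chapter 1 covers reading CSV files. The code is as follows: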
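#%%
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
# make the graphs a bit prettier
# (the original used pd.set_option('display.mpl_style', 'default'); that option
# was removed from newer pandas releases, so a matplotlib style is used instead)
plt.style.use('ggplot')
plt.rcParams['figure.figsize'] = (15, 5)

#%%
# raw string, so the backslashes in the Windows path are not treated as escapes
broken_df = pd.read_csv(r'C:\Users\rui\Desktop\pandas-cookbook-master\data\bikes.csv')
# look at the first 3 rows
broken_df[:3]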
I have not yet looked into read_csv in depth. The next post will cover it in detail.
fixed_df = pd.read_csv(r'C:\Users\rui\Desktop\pandas-cookbook-master\data\bikes.csv',
                       sep=';', encoding='latin1', parse_dates=['Date'],
                       dayfirst=True, index_col='Date')
Here sep is the field separator and encoding names the file's character encoding. If the file contains fields with non-ASCII characters, make sure it is read with the correct encoding. This is the main problem when reading a Latin-1 file on a system whose locale is UTF-8, and passing encoding explicitly handles it. The other parameters, as described in the pandas documentation:
parse_dates: if True, the index is parsed as dates (False by default). You can specify more complicated options to parse a subset of columns or a combination of columns into a single date column (list of ints or names, list of lists, or dict), for example [1, 2, 3].
dayfirst: boolean, default False. If True, dates are read in DD/MM format (international and European format).
index_col: int or sequence or False, default None. Column to use as the row labels of the DataFrame. If a sequence is given, a MultiIndex is used. If you have a malformed file with delimiters at the end of each line, you might consider index_col=False to force pandas to _not_ use the first column as the index (row names).
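To make these parameters easier to remember, here is a minimal sketch that applies them to a few lines of in-memory data instead of the real file. The sample rows, the second column name, and the variable names (raw, day_first, month_first, malformed, no_index) are made up for illustration and are not taken from bikes.csv.

```python
import io

import pandas as pd

# Hypothetical stand-in for bikes.csv: ';'-separated, DD/MM/YYYY dates,
# stored as Latin-1 bytes.
text = "Date;Berri 1;Côte-Sainte-Catherine\n01/02/2012;35;0\n02/02/2012;83;1\n"
raw = text.encode("latin1")

# The same bytes are not valid UTF-8, which is why the encoding must be stated.
try:
    raw.decode("utf-8")
    utf8_ok = True
except UnicodeDecodeError:
    utf8_ok = False

# Same keyword arguments as the fixed_df call, applied to the in-memory bytes.
day_first = pd.read_csv(io.BytesIO(raw), sep=";", encoding="latin1",
                        parse_dates=["Date"], dayfirst=True, index_col="Date")

# Without dayfirst, 01/02/2012 is read month-first (January 2).
month_first = pd.read_csv(io.BytesIO(raw), sep=";", encoding="latin1",
                          parse_dates=["Date"], index_col="Date")

# index_col=False: with a trailing delimiter on every data row, the first
# column is kept as data instead of being used as the index.
malformed = "a,b,c\n1,2,3,\n4,5,6,\n"
no_index = pd.read_csv(io.StringIO(malformed), index_col=False)

print(utf8_ok)
print(day_first.index[0], month_first.index[0])
print(no_index)
```

Comparing day_first.index with month_first.index shows what dayfirst changes: the same text is interpreted as February 1 in one case and January 2 in the other.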
fixed_df['Berri 1'].plot()
fixed_df.plot(figsize=(15, 5))
Here the plot call does not draw the Berri 1 data the way the original document shows. I am still looking for the cause.
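One possible explanation, which is only an assumption and not confirmed for this setup: the original document runs in a notebook with inline plotting, whereas a script or an editor cell may not display a figure until it is explicitly shown or saved. The sketch below uses a small made-up DataFrame in place of fixed_df, forces the non-interactive Agg backend so that it runs anywhere, and saves the figure to a hypothetical file name (berri_1.png).

```python
import matplotlib

matplotlib.use("Agg")  # assumption: no display available; remove for a window

import matplotlib.pyplot as plt
import pandas as pd

# Made-up data standing in for the real fixed_df.
idx = pd.date_range("2012-01-01", periods=5, freq="D")
fixed_df = pd.DataFrame({"Berri 1": [35, 83, 135, 144, 197]}, index=idx)

# Series.plot returns the matplotlib Axes it drew on.
ax = fixed_df["Berri 1"].plot(figsize=(15, 5))

# Write the figure to disk; with an interactive backend, plt.show() would
# open a window instead.
ax.figure.savefig("berri_1.png")
plt.close(ax.figure)
```

If the saved image contains the expected line, the data and the plot call are fine and the problem lies in how the figure is displayed.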
Original post: http://www.cnblogs.com/shr123/p/5754993.html