pandas cookbook【1】

Posted: 2016-08-09 23:25:05

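I came across a tutorial on pandas usage online. Although I have practiced quite a bit, there are still things I cannot remember clearly, so I am writing them down.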

Chapter 1 covers reading CSV files. The code is as follows:

#%%
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
# make the graphs a bit prettier
# ('display.mpl_style' is no longer a pandas option in recent releases, so a matplotlib style is used instead)
plt.style.use('ggplot')
plt.rcParams['figure.figsize'] = (15, 5)

#%%
# raw string so the backslashes in the Windows path are not treated as escapes
broken_df = pd.read_csv(r'C:\Users\rui\Desktop\pandas-cookbook-master\data\bikes.csv')
# look at the first 3 rows
broken_df[:3]

I have not yet looked into read_csv in depth; the next post will cover it specifically.
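As a rough illustration of why the first read is named broken_df, the sketch below reads a semicolon-separated sample with the default settings and then with sep=';'. The sample text and its values are invented for the example (only the column layout imitates bikes.csv), and it is read from memory rather than from the file on disk.

```python
import io
import pandas as pd

# Hypothetical sample shaped like bikes.csv: semicolon-separated, day-first dates.
sample = "Date;Berri 1;Rachel1\n01/01/2012;35;16\n02/01/2012;83;43\n"

# With the default separator ',' every row is read as one long string,
# so the whole table collapses into a single column.
broken = pd.read_csv(io.StringIO(sample))
print(broken.shape)
print(list(broken.columns))

# Telling read_csv about the real separator restores the three columns.
fixed = pd.read_csv(io.StringIO(sample), sep=";")
print(fixed.shape)
```

This only demonstrates the separator problem; the encoding and date handling are dealt with by the other arguments in the next call.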

fixed_df = pd.read_csv(r'C:\Users\rui\Desktop\pandas-cookbook-master\data\bikes.csv',
                       sep=';', encoding='latin1', parse_dates=['Date'], dayfirst=True, index_col='Date')

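Here, sep is the field separator and encoding names the file's character encoding. If the file contains fields with non-ASCII characters, make sure it is read with the correct encoding. This is the main pitfall when reading a Latin-1 file on a system whose default is UTF-8, and the call above handles it by passing encoding='latin1'.

parse_dates: if True then index will be parsed as dates (False by default). You can specify more complicated options to parse a subset of columns or a combination of columns into a single date column (list of ints or names, list of lists, or dict), for example [1, 2, 3];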

dayfirst: boolean, default False. DD/MM format dates, international and European format;

index_col: int or sequence or False, default None

Column to use as the row labels of the DataFrame. If a sequence is given, a MultiIndex is used. If you have a malformed file with delimiters at the end of each line, you might consider index_col=False to force pandas to _not_ use the first column as the index (row names)
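The parameters described above can be tried together on a small in-memory sample. This is only a sketch: the rows and values are made up, the bytes are encoded as Latin-1 on purpose to mimic the file, and the accented column name is there just to exercise the encoding argument.

```python
import io
import pandas as pd

# Hypothetical data, encoded as Latin-1 bytes so that "ô" is a single non-UTF-8 byte.
raw = (
    "Date;Berri 1;Côte-Sainte-Catherine\n"
    "01/02/2012;35;16\n"
    "13/02/2012;83;43\n"
).encode("latin1")

df = pd.read_csv(
    io.BytesIO(raw),
    sep=";",               # fields are separated by semicolons
    encoding="latin1",     # decode the bytes as Latin-1, not UTF-8
    parse_dates=["Date"],  # turn the Date column into datetimes
    dayfirst=True,         # 01/02/2012 means 1 February, not 2 January
    index_col="Date",      # use the parsed dates as the row labels
)

print(df.index[0])
print(df.columns.tolist())
```

With dayfirst=True the first row is dated 1 February 2012; leaving it out would let pandas read the same string month-first.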

fixed_df['Berri 1'].plot()
fixed_df.plot(figsize=(15,5))

Here, plot does not draw the Berri 1 data the way the original document shows. I am still looking for the cause;

Original post: http://www.cnblogs.com/shr123/p/5754993.html
