pandas Study Notes
Author: csj  Updated: 2018.04.02, Shenzhen
email:59888745@qq.com
home: http://www.cnblogs.com/csj007523/p/8149929.html
1.import
2.export
3.create object
4.viewing,inspecting data
5.select data
6.data cleaning
7.filter,sort,groupby
8.join:merge,concat
import:
pd.read_csv('path')
pd.read_excel('path')
pd.read_table('path')
pd.read_sql(query,connstr)
pd.read_html(url)
pd.read_json(jsonstr)
pd.DataFrame(dict)
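The import calls above can be tried without any files on disk. The following is a minimal sketch, assuming made-up sample data (the `name`/`score` columns and the `scores` table are hypothetical), with in-memory text and an in-memory SQLite database standing in for a real path and connection:

```python
import io
import sqlite3

import pandas as pd

# In-memory CSV text stands in for a file path (hypothetical sample data).
csv_text = "name,score\nann,90\nbob,85\n"
df_csv = pd.read_csv(io.StringIO(csv_text))

# read_json accepts a path or a file-like object, so the literal JSON
# string is wrapped in StringIO.
json_text = '[{"name": "ann", "score": 90}, {"name": "bob", "score": 85}]'
df_json = pd.read_json(io.StringIO(json_text))

# read_sql takes a query plus an open connection; an in-memory SQLite
# database is used here as an assumption for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE scores (name TEXT, score INTEGER)")
conn.executemany("INSERT INTO scores VALUES (?, ?)", [("ann", 90), ("bob", 85)])
df_sql = pd.read_sql("SELECT * FROM scores", conn)
conn.close()

# A dict of equal-length lists becomes a DataFrame with one column per key.
df_dict = pd.DataFrame({"name": ["ann", "bob"], "score": [90, 85]})

print(df_csv)
print(df_json)
print(df_sql)
```

pd.read_excel and pd.read_html are left out of the sketch because they depend on optional parser packages.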
exporting:
df.to_csv(filename)
df.to_excel(filename)
df.to_json(filename)
df.to_sql(tablename,connstr)
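A round trip is an easy way to check the export calls. This sketch assumes a temporary directory and an in-memory SQLite connection; the file names and the `scores` table name are hypothetical:

```python
import os
import sqlite3
import tempfile

import pandas as pd

df = pd.DataFrame({"name": ["ann", "bob"], "score": [90, 85]})

with tempfile.TemporaryDirectory() as tmp:
    csv_path = os.path.join(tmp, "scores.csv")
    json_path = os.path.join(tmp, "scores.json")

    # index=False leaves the row labels out of the CSV file.
    df.to_csv(csv_path, index=False)
    # orient="records" writes a JSON list with one object per row.
    df.to_json(json_path, orient="records")

    # Read the files back while the temporary directory still exists.
    csv_back = pd.read_csv(csv_path)
    json_back = pd.read_json(json_path)

# to_sql writes a table through an open connection (SQLite in memory here).
conn = sqlite3.connect(":memory:")
df.to_sql("scores", conn, index=False)
sql_back = pd.read_sql("SELECT * FROM scores", conn)
conn.close()

print(csv_back)
print(json_back)
print(sql_back)
```

df.to_excel is omitted because it needs an optional Excel writer package.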
create object:
pd.DataFrame(np.random.rand(20,4))
pd.Series(mylist)
df.index=pd.date_range('2018/01/01',periods=df.shape[0])
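Put together, the three creation calls might look like the sketch below. The column names and the list values are assumptions added for illustration:

```python
import numpy as np
import pandas as pd

# 20 rows x 4 columns of uniform random numbers in [0, 1).
df = pd.DataFrame(np.random.rand(20, 4), columns=["a", "b", "c", "d"])

# A Series built from a plain Python list (hypothetical values).
mylist = [1, 3, 5, 7]
s = pd.Series(mylist)

# Replace the default 0..19 index with one daily timestamp per row.
df.index = pd.date_range("2018/01/01", periods=df.shape[0])

print(df.head())
print(s)
```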
viewing/inspecting data:
df.head()
df.tail()
df.shape
df.info()
df.describe()
df.apply(func)
df.columns
df.index
s.value_counts()
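A short sketch of the inspection calls on a small frame. The `city` and `sales` columns are invented sample data:

```python
import pandas as pd

df = pd.DataFrame({"city": ["sz", "gz", "sz", "bj"], "sales": [10, 20, 30, 40]})

print(df.head(2))            # first two rows
print(df.tail(2))            # last two rows
print(df.shape)              # (rows, columns); an attribute, so no parentheses
df.info()                    # prints dtypes and non-null counts on its own
print(df.describe())         # summary statistics for the numeric columns
print(df.columns.tolist())   # column labels
print(df.index)              # row labels

# value_counts is a Series method: occurrences of each distinct value.
counts = df["city"].value_counts()
print(counts)
```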
select data:
df[col]
df[[col1,col2]]
df.col1
df.loc[indexname]
df.loc[:,col1]
df.iloc[0,:]
df.iloc[0,0]
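The difference between label-based and position-based selection is easier to see on a frame with string row labels. A minimal sketch with made-up data:

```python
import pandas as pd

df = pd.DataFrame(
    {"a": [1, 2, 3], "b": [4, 5, 6], "c": [7, 8, 9]},
    index=["x", "y", "z"],
)

col = df["a"]                     # one column, returned as a Series
sub = df[["a", "b"]]              # a list of columns returns a DataFrame
row_by_label = df.loc["y"]        # row selected by index label
cell_by_label = df.loc["y", "b"]  # single value by row and column label
first_row = df.iloc[0, :]         # row selected by integer position
first_cell = df.iloc[0, 0]        # single value by integer position

print(row_by_label)
print(cell_by_label, first_cell)
```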
data cleaning:
pd.isnull(df)
pd.notnull(df)
df.columns=['a','b','c','d']
df.dropna(how='any')
df.dropna(how='all')
df.dropna()
df.fillna(x)
df.fillna(s.mean())
s.astype(float)
s.replace(1,'one')
s.replace([1,3],['one','three'])
df.rename(columns=lambda x:x+1)
df.rename(columns={'oldcolname':'newcolumns'})
df.rename(index=lambda x:x+1)
df.set_index('colu1')
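The cleaning calls above all return new objects rather than modifying the original in place. A sketch on a small frame with missing values; the data and the new column name `alpha` are assumptions:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"a": [1.0, np.nan, 3.0], "b": [np.nan, np.nan, 6.0]})

mask = pd.isnull(df)                 # True wherever a value is missing
any_dropped = df.dropna(how="any")   # drop rows with at least one NaN
all_dropped = df.dropna(how="all")   # drop rows where every value is NaN
filled = df.fillna(df.mean())        # fill each column with its own mean

s = pd.Series([1, 3, 5])
as_float = s.astype(float)                      # convert the dtype
replaced = s.replace([1, 3], ["one", "three"])  # pairwise value replacement

renamed = df.rename(columns={"a": "alpha"})     # rename one column
indexed = filled.set_index("a")                 # use column "a" as the index

print(any_dropped)
print(all_dropped)
print(filled)
```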
filter,sort,groupby:
df[df[col]>10]
df[(df[col] > 5) & (df[col] < 10)] #parentheses required: & binds tighter than > and <
df.sort_values(col1)
df.sort_values(col1,ascending=False)
df.sort_values([col1,col2],ascending=[False,True])
df.groupby([col1,col2])
df.groupby(col).agg(np.mean)
df.apply(np.mean)
df.apply(np.max,axis=1) #across each row
df.pivot_table(index=col1,values=[col2,col3],aggfunc='mean')
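A compact sketch of filtering, sorting, grouping and pivoting on one frame. The `team`, `score` and `bonus` columns are hypothetical sample data, and the aggregations are passed as the string 'mean' or a lambda, one of several accepted forms:

```python
import pandas as pd

df = pd.DataFrame(
    {
        "team": ["x", "x", "y", "y"],
        "score": [4, 8, 12, 6],
        "bonus": [5, 1, 2, 9],
    }
)

# Each comparison is parenthesized because & binds tighter than > and <.
mid = df[(df["score"] > 5) & (df["score"] < 10)]

# Sort by team descending, then by score ascending within each team.
ordered = df.sort_values(["team", "score"], ascending=[False, True])

# Mean score per team.
team_mean = df.groupby("team")["score"].agg("mean")

# axis=1 applies the function across each row.
row_max = df[["score", "bonus"]].apply(lambda row: row.max(), axis=1)

# Pivot table with one row per team and the mean of each value column.
pivot = df.pivot_table(index="team", values=["score", "bonus"], aggfunc="mean")

print(mid)
print(ordered)
print(team_mean)
print(pivot)
```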
join/combine:
pd.merge(left,right,how='left/right/outer/inner',on=['key1','key2']) #horizontal join: combines multiple DataFrames into one on shared key column(s)
pd.concat([df1,df2],axis=1) #concatenates horizontally (axis=1) or vertically (axis=0)
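The effect of the how argument and of the concat axis shows up in the row counts. A minimal sketch with two made-up frames that share the key columns `key1` and `key2`; the value columns `lval` and `rval` are hypothetical:

```python
import pandas as pd

left = pd.DataFrame(
    {"key1": ["a", "b", "c"], "key2": [1, 2, 3], "lval": [10, 20, 30]}
)
right = pd.DataFrame(
    {"key1": ["a", "b", "d"], "key2": [1, 2, 4], "rval": [7, 8, 9]}
)

# inner keeps only key pairs present in both frames.
inner = pd.merge(left, right, how="inner", on=["key1", "key2"])
# left keeps every row of the left frame; unmatched rval becomes NaN.
left_join = pd.merge(left, right, how="left", on=["key1", "key2"])
# outer keeps key pairs from either frame.
outer = pd.merge(left, right, how="outer", on=["key1", "key2"])

# axis=1 places frames side by side, aligned on the row index.
side_by_side = pd.concat([left, right], axis=1)
# axis=0 stacks frames vertically; ignore_index renumbers the rows.
stacked = pd.concat([left, right], axis=0, ignore_index=True)

print(inner)
print(left_join)
print(outer)
```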
Statistics:
df.describe()
df.mean()
df.corr()
df.count()
df.max()
df.min()
df.median()
df.std()
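Each of these statistics works column by column and returns one value per column, while corr returns a square matrix. A sketch on a small numeric frame with made-up values:

```python
import pandas as pd

df = pd.DataFrame({"x": [1, 2, 3, 4], "y": [2, 4, 6, 8]})

means = df.mean()       # per-column mean
medians = df.median()   # per-column median
stds = df.std()         # per-column sample standard deviation (ddof=1)
counts = df.count()     # per-column count of non-null values
maxima = df.max()
minima = df.min()
corr = df.corr()        # pairwise correlation between columns

print(df.describe())
print(corr)
```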