码迷,mamicode.com
首页 > 编程语言 > 详细

[Python Cookbook] Pandas Groupby

时间:2019-02-11 10:57:14      阅读:270      评论:0      收藏:0      [点我收藏+]

标签:splay   res   man   mamicode   get   following   ret   series   OWIN   

Groupby Count

# Party’s Frequency of donations
nyc.groupby(’Party’)[’contb receipt amt’].count()

The command returns a series where the index is the name of a Party and the value is the count of that Party. Note that the series is ordered by the name of Party alphabetically.

技术图片

 

Multiple Variables

# Party’s Frequency of donations by Date
nyc.groupby([’Party’, ’Date’])[’contb receipt amt’].count()

技术图片

 

Groupby Sum

# Party’s Sum of donations
nyc.groupby(’Party’)[’contb receipt amt’].sum()

# Define the format of float
pd.options.display.float format = ’{:,.2f}’.format 
nyc.groupby(’Party’)[’contb receipt amt’].sum()

技术图片

 

Groupby Order

# Top 5 Donors, by Occupation
df7 = nyc.groupby(’contbr occupation’)[’contb receipt amt’]. sum(). reset  index ()
df7.sort_values(’contb receipt amt’, ascending=False, inplace =True)
df7.head(5)
#or
df7.nlargest(5,’contb receipt amt’)

技术图片

# Bottom 5 Donors, by Occupation
df8 = nyc.groupby(’contbr occupation’)[’contb receipt amt’]. sum() . reset   index ()
df8 . sort_values (by=’ contb receipt amt ’ , inplace=True) df8.head(5)
# OR
df7.tail(5)
#OR
df8.nsmallest(5,’contb receipt amt’)

技术图片

Get rid of negative values:

df8 [ df8 . contb receipt amt >0].head(5)

技术图片

The following commands give an example to find the Top 5 occupations that donated to each cadidate. Note that we need to sort the table based on two variables, firtly sorted by candidate name alphabetically and then sorted by contribution amount in a descending order. Finally, we hope to show the Top 5 occupations for each candidate. 

# Top 5 Occupations that donated to Each Candidate
df10 = nyc.groupby ([ ’cand_nm’ , ’contbr_occupation’ ]) [ ’contb_receipt_amt’ ].sum().reset_index ()
df10.sort_values ([ ’cand_nm’ , ’contb_receipt_amt’ ] , ascending =[True , False ], inplace=True)
df10.groupby(’cand_nm’).head(5)

 

Groupby Plot 

#Top 5 Fundraising Candidates Line Graph
df11 = nyc.groupby(’cand_nm’)[’contb_receipt_amt’].sum(). reset_index ()
df11_p = df11.nlargest(5,’contb_receipt_amt’)
df11_g = nyc[nyc.cand_nm.isin(df11_p.cand_nm)][[ ’cand_nm’,’Date’,’contb_receipt_amt’]]
dfpiv=pd.pivot table(df11_g , values=’contb_receipt_amt’, index=[’Date’],columns=[’cand_nm’], aggfunc=np.sum)
dfpiv.loc[‘2016-01-01‘:‘2016?01?30‘].plot.line()

技术图片

 

[Python Cookbook] Pandas Groupby

标签:splay   res   man   mamicode   get   following   ret   series   OWIN   

原文地址:https://www.cnblogs.com/sherrydatascience/p/10360750.html

(0)
(0)
   
举报
评论 一句话评论(0
登录后才能评论!
© 2014 mamicode.com 版权所有  联系我们:gaon5@hotmail.com
迷上了代码!