Tags: eem ike fixed exce comment chain single instr rup
1. Building the data
import pandas as pd
data = pd.DataFrame({'group': ['a', 'a', 'a', 'b', 'b', 'b', 'c', 'c', 'c'],
                     'data': [4, 1, 2, 2, 3, 5, 3, 5, 5]})
data
2. Sorting
# by lists the columns to sort on; ascending takes one boolean per column
# (here: group descending, data ascending); inplace=True modifies data itself
data.sort_values(by=['group', 'data'], ascending=[False, True], inplace=True)
data
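The same sort can also be written without inplace=True. The following is a minimal sketch, not part of the original notes; the names frame and sorted_frame are illustrative only.

```python
import pandas as pd

frame = pd.DataFrame({'group': ['a', 'a', 'a', 'b', 'b', 'b', 'c', 'c', 'c'],
                      'data': [4, 1, 2, 2, 3, 5, 3, 5, 5]})

# Sort group descending, then data ascending within each group.
# Without inplace=True, sort_values returns a new DataFrame and leaves frame untouched.
sorted_frame = frame.sort_values(by=['group', 'data'], ascending=[False, True])

# The original row labels travel with the rows; reset_index(drop=True) renumbers them.
sorted_frame = sorted_frame.reset_index(drop=True)
print(sorted_frame)
```

Returning a new object makes it easier to chain further calls and keeps the unsorted data available for comparison.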
3. Sorting by a specific key:
data = pd.DataFrame({'k1': ['one'] * 3 + ['two'] * 4, 'k2': [3, 2, 1, 3, 3, 3, 4]})
data
data.sort_values(by='k2')
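5. Removing duplicate rows
data.drop_duplicates()  # drop rows whose k1 and k2 values both repeat an earlier row
data.drop_duplicates('k1')  # drop rows whose k1 repeats an earlier row (the first of each is kept)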
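To see which rows are affected, a small sketch along these lines may help. It rebuilds the k1/k2 frame under the illustrative name frame and uses the subset and keep parameters of drop_duplicates together with duplicated().

```python
import pandas as pd

frame = pd.DataFrame({'k1': ['one'] * 3 + ['two'] * 4,
                      'k2': [3, 2, 1, 3, 3, 3, 4]})

# Rows count as duplicates only if every column matches,
# so the three ('two', 3) rows collapse into one.
unique_rows = frame.drop_duplicates()

# Compare on k1 only and keep the last occurrence instead of the first.
last_per_k1 = frame.drop_duplicates(subset='k1', keep='last')

# duplicated() marks the rows that drop_duplicates() would remove.
mask = frame.duplicated()
print(mask)
```

Neither call changes frame; both return new objects.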
6. Mapping values to new values
data1 = pd.DataFrame({'food': ['A1', 'A2', 'B1', 'B2', 'C1', 'C2', 'C3'], 'data': [1, 2, 3, 4, 5, 6, 7]})
data1
6-1 Mapping with apply
def food_map(series):
    if series['food'] == 'A1':
        return 'A'
    elif series['food'] == 'A2':
        return 'A'
    elif series['food'] == 'B1':
        return 'B'
    elif series['food'] == 'B2':
        return 'B'
    elif series['food'] == 'C1':
        return 'C'
    elif series['food'] == 'C2':
        return 'C'
    elif series['food'] == 'C3':
        return 'C'
data1['food_map'] = data1.apply(food_map, axis='columns')  # call food_map once per row
data1
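The row-wise version can be shortened when only one column is involved. The sketch below assumes that the group is always the first character of the food code, which holds for this sample data but is an assumption; food_group and foods are illustrative names.

```python
import pandas as pd

foods = pd.DataFrame({'food': ['A1', 'A2', 'B1', 'B2', 'C1', 'C2', 'C3'],
                      'data': [1, 2, 3, 4, 5, 6, 7]})

def food_group(name):
    # Assumption: the group is the leading letter of the code, e.g. 'B2' -> 'B'.
    return name[0]

# Series.apply calls the function once per value, so no axis argument is needed.
foods['food_map'] = foods['food'].apply(food_group)
print(foods)
```

Applying to a single column passes plain values to the function, while DataFrame.apply with axis='columns' passes a whole row each time.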
6-2 Mapping with map
food2Upper = {
    'A1': 'A',
    'A2': 'A',
    'B1': 'B',
    'B2': 'B',
    'C1': 'C',
    'C2': 'C',
    'C3': 'C'}  # mapping dictionary
data1['upper'] = data1['food'].map(food2Upper)  # look up each food value in the dictionary
data1
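One detail worth knowing about map with a dictionary: values missing from the dictionary become NaN. A minimal sketch, with a made-up code 'D1' and the illustrative names foods and food_to_group:

```python
import pandas as pd

# 'D1' is deliberately absent from the dictionary (hypothetical value).
foods = pd.Series(['A1', 'B2', 'C3', 'D1'])
food_to_group = {'A1': 'A', 'B2': 'B', 'C3': 'C'}

mapped = foods.map(food_to_group)         # 'D1' has no entry, so it becomes NaN
mapped_filled = mapped.fillna('unknown')  # replace the NaN with a placeholder label
print(mapped_filled)
```

Checking mapped.isna() after a map call is a quick way to spot keys the dictionary does not cover.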
7. Adding a new column with assign
import numpy as np
df = pd.DataFrame({'data1': np.random.random(5),
                   'data2': np.random.random(5)})
df2 = df.assign(ratio=df['data1'] / df['data2'])  # returns a new DataFrame with the extra column
df2
df2.drop('ratio', axis='columns', inplace=True)  # drop the specified column
df2
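assign also accepts callables, which is convenient in method chains. The sketch below uses fixed numbers instead of random ones so the result is predictable; frame, ratio and ratio_pct are illustrative names.

```python
import pandas as pd

frame = pd.DataFrame({'data1': [1.0, 2.0, 3.0], 'data2': [2.0, 4.0, 6.0]})

result = (
    frame
    .assign(ratio=lambda d: d['data1'] / d['data2'])  # computed from existing columns
    .assign(ratio_pct=lambda d: d['ratio'] * 100)     # can use the column added just above
)

# drop without inplace=True returns a copy, so result keeps both new columns.
without_ratio = result.drop(columns='ratio')
print(result)
```

The original frame is never modified here; each step returns a new DataFrame.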
8. Replacing values with replace
data=pd.Series([1,2,3,4,5,6,7,8,9])
data
data.replace(9,np.nan,inplace=True)
data
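replace is not limited to a single value. A short sketch with a list and a dictionary of replacements (values, masked and recoded are illustrative names):

```python
import numpy as np
import pandas as pd

values = pd.Series([1, 2, 3, 4, 5, 6, 7, 8, 9])

# A list of old values maps them all to one replacement.
masked = values.replace([8, 9], np.nan)

# A dictionary gives each old value its own replacement.
recoded = values.replace({1: 100, 2: 200})
print(masked)
print(recoded)
```

Without inplace=True both calls return new Series and values stays as it was.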
9. Discretization: binning data into ranges with pd.cut
ages=[15,20,18,25,46,89,66,80]
bins=[10,40,90]
bins_res = pd.cut(ages, bins)  # discretize into two bins: (10, 40] and (40, 90]
bins_res
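bins_res.codes  # integer bin index of each age (older pandas versions exposed this as .labels)
pd.Series(bins_res).value_counts()  # show each bin and the number of values in it
pd.cut(ages, [10, 30, 50, 90])  # pass the bin edges [10, 30, 50, 90] directly
group_names = ['Youth', 'Middle', 'Old']
pd.Series(pd.cut(ages, [10, 30, 50, 90], labels=group_names)).value_counts()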
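How the bin edges are treated is easy to get wrong, so here is a small sketch of the defaults. It reuses the ages list from above; the other names are illustrative, and the out-of-range value 5 is made up for the demonstration.

```python
import pandas as pd

ages = [15, 20, 18, 25, 46, 89, 66, 80]

# Default: bins are closed on the right, i.e. (10, 30], (30, 50], (50, 90].
right_closed = pd.cut(ages, [10, 30, 50, 90])

# right=False flips this to [10, 30), [30, 50), [50, 90).
left_closed = pd.cut(ages, [10, 30, 50, 90], right=False)

# Values outside every bin become NaN (code -1) rather than raising an error.
out_of_range = pd.cut([5, 15], [10, 30, 50, 90])

# Count per bin, listed in bin order rather than by frequency.
counts = pd.Series(right_closed).value_counts().sort_index()
print(counts)
```

With right-closed bins an age of exactly 30 would land in the first bin; with right=False it would land in the second.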
10. Checking for missing values
df = pd.DataFrame([range(3), [0, np.nan, 0], [0, 0, np.nan], range(3)])  # build a frame with some missing values
df
df.isnull()  # locate missing values: True marks a missing entry
df.isnull().any()  # checks column by column by default
df.isnull().any(axis=1)  # axis=1 checks row by row instead
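Beyond a yes/no answer, the boolean mask can be summed to count missing values. A minimal sketch on the same data, written out as plain lists (frame and the result names are illustrative):

```python
import numpy as np
import pandas as pd

frame = pd.DataFrame([[0, 1, 2], [0, np.nan, 0], [0, 0, np.nan], [0, 1, 2]])

# True counts as 1 when summed, so sum() gives the number of missing values.
missing_per_column = frame.isnull().sum()
missing_per_row = frame.isnull().sum(axis=1)
total_missing = int(frame.isnull().sum().sum())
print(missing_per_column)
```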
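11. Filling missing values
df.fillna(5)  # fill missing values with 5 (returns a new DataFrame)
df[df.isnull().any(axis=1)]  # select the rows that contain a missing value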
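A constant is not the only possible fill value. The sketch below fills each column with its own mean and, as an alternative, drops incomplete rows. Which strategy suits real data depends on the use case; the names are illustrative.

```python
import numpy as np
import pandas as pd

frame = pd.DataFrame([[0, 1, 2], [0, np.nan, 0], [0, 0, np.nan], [0, 1, 2]])

# Passing a Series to fillna fills each column with the value at its label;
# frame.mean() skips NaN when computing the column means.
filled_mean = frame.fillna(frame.mean())

# Alternatively, keep only the rows that have no missing values.
complete_rows = frame.dropna()
print(filled_mean)
print(complete_rows)
```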
Original post: https://www.cnblogs.com/AI-robort/p/11654976.html