Summary of Indexing Operations on a pandas DataFrame
For new pandas users, DataFrame indexing can be confusing. This post walks through each indexing form in detail and closes with a summary of what these experiments show.
import pandas as pd
import numpy as np
df=pd.DataFrame(np.arange(16).reshape(4,4),index=['Ohio','Colorado','Utah','New York'],columns=['one','two','three','four']);df
| | one | two | three | four |
|---|---|---|---|---|
| Ohio | 0 | 1 | 2 | 3 |
| Colorado | 4 | 5 | 6 | 7 |
| Utah | 8 | 9 | 10 | 11 |
| New York | 12 | 13 | 14 | 15 |
(1) df[val]
- When val is a single column label, df[val] selects that column from the DataFrame and returns a Series.
df['one']
Ohio 0
Colorado 4
Utah 8
New York 12
Name: one, dtype: int32
- When val is a list of column labels, df[val] selects those columns from the DataFrame and returns a DataFrame.
df[['one','two']]
| | one | two |
|---|---|---|
| Ohio | 0 | 1 |
| Colorado | 4 | 5 |
| Utah | 8 | 9 |
| New York | 12 | 13 |
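As a quick check of the two column-selection forms above, here is a minimal sketch. It rebuilds the same 4x4 frame so it runs on its own; the variable names and the missing label 'five' are only illustrative.

```python
import numpy as np
import pandas as pd

# Rebuild the 4x4 frame used in this post so the snippet runs on its own.
df = pd.DataFrame(
    np.arange(16).reshape(4, 4),
    index=["Ohio", "Colorado", "Utah", "New York"],
    columns=["one", "two", "three", "four"],
)

single = df["one"]              # one label -> Series
wrapped = df[["one"]]           # one-element list -> one-column DataFrame
reordered = df[["two", "one"]]  # columns come back in the requested order

print(type(single).__name__, type(wrapped).__name__)
print(list(reordered.columns))

# A label that is not a column raises KeyError.
missing_raised = False
try:
    df["five"]
except KeyError:
    missing_raised = True
print("KeyError for 'five':", missing_raised)
```

The point to notice is that the result type follows the type of val (label or list), not the number of columns selected.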
- When val is a slice such as :num, df[val] selects rows rather than columns. This is a convenience shorthand: it is equivalent to df.iloc[:num], the accessor dedicated to positional row selection.
df[:2]
| | one | two | three | four |
|---|---|---|---|---|
| Ohio | 0 | 1 | 2 | 3 |
| Colorado | 4 | 5 | 6 | 7 |
df.iloc[:2] # same as above
| | one | two | three | four |
|---|---|---|---|---|
| Ohio | 0 | 1 | 2 | 3 |
| Colorado | 4 | 5 | 6 | 7 |
df[1:3]
| | one | two | three | four |
|---|---|---|---|---|
| Colorado | 4 | 5 | 6 | 7 |
| Utah | 8 | 9 | 10 | 11 |
df.iloc[1:3]
| | one | two | three | four |
|---|---|---|---|---|
| Colorado | 4 | 5 | 6 | 7 |
| Utah | 8 | 9 | 10 | 11 |
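The slice examples above use integer positions only. The sketch below, on a rebuilt copy of the same frame, also shows a label slice, which pandas accepts in df[...] for this string index and which includes both endpoints. The variable names are illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    np.arange(16).reshape(4, 4),
    index=["Ohio", "Colorado", "Utah", "New York"],
    columns=["one", "two", "three", "four"],
)

first_two = df[:2]            # positional slice, end position excluded
same_rows = df.iloc[:2]       # the explicit positional form
by_label = df["Ohio":"Utah"]  # label slice, both end labels included

print(first_two.equals(same_rows))
print(list(by_label.index))
```

Because a slice inside df[...] always addresses rows, writing df.iloc or df.loc explicitly is often easier to read in longer code.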
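- When val is a boolean DataFrame, df[val] keeps the values where the mask is True and replaces the rest with NaN. On the left side of an assignment, it sets the values where the mask is True.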
df<5
| | one | two | three | four |
|---|---|---|---|---|
| Ohio | True | True | True | True |
| Colorado | True | False | False | False |
| Utah | False | False | False | False |
| New York | False | False | False | False |
df[df<5]
| | one | two | three | four |
|---|---|---|---|---|
| Ohio | 0.0 | 1.0 | 2.0 | 3.0 |
| Colorado | 4.0 | NaN | NaN | NaN |
| Utah | NaN | NaN | NaN | NaN |
| New York | NaN | NaN | NaN | NaN |
df[df<5]=0;df
| | one | two | three | four |
|---|---|---|---|---|
| Ohio | 0 | 0 | 0 | 0 |
| Colorado | 0 | 5 | 6 | 7 |
| Utah | 8 | 9 | 10 | 11 |
| New York | 12 | 13 | 14 | 15 |
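To make the boolean cases concrete, here is a small sketch on a fresh copy of the original 0 to 15 frame. It also covers a boolean Series, which the examples above do not show and which filters whole rows instead of masking cells. The variable names are illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    np.arange(16).reshape(4, 4),
    index=["Ohio", "Colorado", "Utah", "New York"],
    columns=["one", "two", "three", "four"],
)

mask_frame = df < 5
masked = df[mask_frame]      # same shape; NaN where the mask is False
rows = df[df["one"] > 4]     # boolean Series -> keeps whole rows

cleaned = df.copy()          # assign on a copy so df stays untouched
cleaned[cleaned < 5] = 0     # sets only the cells where the mask is True

print(masked.dtypes.tolist())  # integers are upcast to float to hold NaN
print(list(rows.index))
print(cleaned.loc["Colorado"].tolist())
```

Reading through a boolean DataFrame gives the same result as df.where(mask_frame), which may be the clearer spelling when no assignment is involved.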
(2) df.loc[val]
- When val is a single index label, df.loc[val] selects the corresponding row and returns a Series. When val is a list of index labels, it selects the corresponding rows and returns a DataFrame.
df.loc['Colorado']
one 0
two 5
three 6
four 7
Name: Colorado, dtype: int32
df.loc[['Colorado','New York']]
| | one | two | three | four |
|---|---|---|---|---|
| Colorado | 0 | 5 | 6 | 7 |
| New York | 12 | 13 | 14 | 15 |
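The following sketch checks a few details of row selection by label on a rebuilt copy of the original frame: the returned Series is indexed by the column labels, a list returns rows in the order requested, and an unknown label raises KeyError. The label 'Texas' is a made-up example.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    np.arange(16).reshape(4, 4),
    index=["Ohio", "Colorado", "Utah", "New York"],
    columns=["one", "two", "three", "four"],
)

row = df.loc["Colorado"]             # Series named after the row label
rows = df.loc[["New York", "Ohio"]]  # DataFrame, rows in the requested order

print(row.name, list(row.index))
print(list(rows.index))

unknown_raised = False
try:
    df.loc["Texas"]                  # not in the index
except KeyError:
    unknown_raised = True
print("KeyError for 'Texas':", unknown_raised)
```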
(3) df.loc[:,val]
- When val is a single column label, df.loc[:,val] selects the corresponding column and returns a Series. When val is a list of column labels, it selects the corresponding columns and returns a DataFrame.
df.loc[:,'two']
Ohio 0
Colorado 5
Utah 9
New York 13
Name: two, dtype: int32
df.loc[:,['two']] # Note: whenever val is a list, even a one-element list, the result is a DataFrame.
| | two |
|---|---|
| Ohio | 0 |
| Colorado | 5 |
| Utah | 9 |
| New York | 13 |
df.loc[:,['one','two']]
| | one | two |
|---|---|---|
| Ohio | 0 | 0 |
| Colorado | 0 | 5 |
| Utah | 8 | 9 |
| New York | 12 | 13 |
df[['one','two']] # same as df.loc[:,['one','two']] above
| | one | two |
|---|---|---|
| Ohio | 0 | 0 |
| Colorado | 0 | 5 |
| Utah | 8 | 9 |
| New York | 12 | 13 |
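The equivalences claimed above can be checked directly. This sketch, on a rebuilt copy of the original frame, also adds a column label slice, which is not covered in the examples above and which includes both end labels.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    np.arange(16).reshape(4, 4),
    index=["Ohio", "Colorado", "Utah", "New York"],
    columns=["one", "two", "three", "four"],
)

col = df.loc[:, "two"]           # single label -> Series
frame = df.loc[:, ["two"]]       # one-element list -> DataFrame
span = df.loc[:, "one":"three"]  # label slice, both ends included

print(col.equals(df["two"]))
print(df.loc[:, ["one", "two"]].equals(df[["one", "two"]]))
print(list(span.columns))
```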
(3)df.loc[val1,val2]
- val1 can be a single index label or a list of index labels, and val2 can be a single column label or a list of column labels; df.loc[val1,val2] selects the data at their intersection. Either val1 or val2 can also be : to select everything along that axis.
df.loc['Ohio','one']
0
df.loc[['Ohio','Utah'],'one']
Ohio 0
Utah 8
Name: one, dtype: int32
df.loc['Ohio',['one','two']]
one 0
two 0
Name: Ohio, dtype: int32
df.loc[['Ohio','Utah'],['one','two']]
| | one | two |
|---|---|---|
| Ohio | 0 | 0 |
| Utah | 8 | 9 |
df.loc[:,:]
| | one | two | three | four |
|---|---|---|---|---|
| Ohio | 0 | 0 | 0 | 0 |
| Colorado | 0 | 5 | 6 | 7 |
| Utah | 8 | 9 | 10 | 11 |
| New York | 12 | 13 | 14 | 15 |
df.loc['Ohio',:]
one 0
two 0
three 0
four 0
Name: Ohio, dtype: int32
df.loc[:,'two']
Ohio 0
Colorado 5
Utah 9
New York 13
Name: two, dtype: int32
df.loc[:,['one','two']]
| | one | two |
|---|---|---|
| Ohio | 0 | 0 |
| Colorado | 0 | 5 |
| Utah | 8 | 9 |
| New York | 12 | 13 |
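Besides labels and lists, df.loc also accepts a boolean Series as val1 and can be used on the left side of an assignment. Neither appears in the examples above, so here is a minimal sketch on a fresh copy of the original 0 to 15 frame; the names work and rows_over_5 are illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    np.arange(16).reshape(4, 4),
    index=["Ohio", "Colorado", "Utah", "New York"],
    columns=["one", "two", "three", "four"],
)
work = df.copy()  # keep df itself unchanged

# Boolean Series picks the rows, the list picks the columns.
rows_over_5 = work.loc[work["three"] > 5, ["one", "three"]]

# Assignment through .loc writes to the selected cells.
work.loc[["Ohio", "Utah"], "one"] = -1

print(list(rows_over_5.index))
print(work["one"].tolist())
```

Assigning through a single .loc call, rather than chaining two selections, is the form the pandas documentation recommends for setting values.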
(4) df.iloc[val]
- df.iloc[val] works like df.loc[val], except that val must be an integer position (a row number counted from 0) or a list of integer positions rather than index labels.
df.iloc[1]
one 0
two 5
three 6
four 7
Name: Colorado, dtype: int32
df.iloc[[1,3]]
| | one | two | three | four |
|---|---|---|---|---|
| Colorado | 0 | 5 | 6 | 7 |
| New York | 12 | 13 | 14 | 15 |
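The difference between labels and positions is easiest to see when the index labels are themselves integers. The frame below is a hypothetical example (it is not the frame used in this post) whose labels 10, 20, 30 do not coincide with the positions 0, 1, 2.

```python
import pandas as pd

# Hypothetical frame: integer index labels that differ from the positions.
scores = pd.DataFrame({"score": [1.5, 2.5, 3.5]}, index=[10, 20, 30])

by_label = scores.loc[10, "score"]  # label 10 -> first row
by_position = scores.iloc[0, 0]     # position 0 -> first row
last = scores.iloc[-1, 0]           # negative positions count from the end

errors = {}
try:
    scores.iloc[10]                 # there is no position 10
except IndexError as exc:
    errors["iloc[10]"] = type(exc).__name__
try:
    scores.loc[0]                   # there is no label 0
except KeyError as exc:
    errors["loc[0]"] = type(exc).__name__

print(by_label, by_position, last)
print(errors)
```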
(5) df.iloc[:,val]
- The same as df.loc[:,val], except that val must be an integer position or a list of integer positions.
df
| | one | two | three | four |
|---|---|---|---|---|
| Ohio | 0 | 0 | 0 | 0 |
| Colorado | 0 | 5 | 6 | 7 |
| Utah | 8 | 9 | 10 | 11 |
| New York | 12 | 13 | 14 | 15 |
df.iloc[:,1]
Ohio 0
Colorado 5
Utah 9
New York 13
Name: two, dtype: int32
df.iloc[:,[1,3]]
| | two | four |
|---|---|---|
| Ohio | 0 | 0 |
| Colorado | 5 | 7 |
| Utah | 9 | 11 |
| New York | 13 | 15 |
(6) df.iloc[val1,val2]
- The same as df.loc[val1,val2], except that val1 and val2 must be integer positions or lists of integer positions.
df.iloc[1,2]
6
df.iloc[1,[1,2,3]]
two 5
three 6
four 7
Name: Colorado, dtype: int32
df.iloc[[1,2],2]
Colorado 6
Utah 10
Name: three, dtype: int32
df.iloc[[1,2],[1,2]]
| | two | three |
|---|---|---|
| Colorado | 5 | 6 |
| Utah | 9 | 10 |
df.iloc[:,[1,2]]
| | two | three |
|---|---|---|
| Ohio | 0 | 0 |
| Colorado | 5 | 6 |
| Utah | 9 | 10 |
| New York | 13 | 14 |
df.iloc[[1,2],:]
| | one | two | three | four |
|---|---|---|---|---|
| Colorado | 0 | 5 | 6 | 7 |
| Utah | 8 | 9 | 10 | 11 |
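One practical difference between the two accessors shows up with slices: df.iloc slices exclude the end position, like ordinary Python slices, while df.loc slices include the end label. A small sketch on a rebuilt copy of the original frame, with illustrative variable names:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    np.arange(16).reshape(4, 4),
    index=["Ohio", "Colorado", "Utah", "New York"],
    columns=["one", "two", "three", "four"],
)

by_position = df.iloc[1:3, 0:2]                    # rows 1-2, columns 0-1
by_label = df.loc["Colorado":"Utah", "one":"two"]  # end labels included
every_other = df.iloc[::2, :]                      # step slices work as in Python
corner = df.iloc[-1, -1]                           # last row, last column

print(by_position.equals(by_label))
print(list(every_other.index))
print(corner)
```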
(7) df.at[val1,val2]
- val1 must be a single index label and val2 must be a single column label; df.at returns the single value at that row and column.
df.at['Utah','one']
8
df.loc['Utah','one'] # same as above
8
df.at[['Utah','Colorado'],'one'] # raises an exception: df.at does not accept a list
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
D:\Anaconda\lib\site-packages\pandas\core\frame.py in _get_value(self, index, col, takeable)
2538 try:
-> 2539 return engine.get_value(series._values, index)
2540 except (TypeError, ValueError):
pandas\_libs\index.pyx in pandas._libs.index.IndexEngine.get_value()
pandas\_libs\index.pyx in pandas._libs.index.IndexEngine.get_value()
pandas\_libs\index.pyx in pandas._libs.index.IndexEngine.get_loc()
TypeError: '['Utah', 'Colorado']' is an invalid key
During handling of the above exception, another exception occurred:
TypeError Traceback (most recent call last)
<ipython-input-77-c52a9db91739> in <module>()
----> 1 df.at[['Utah','Colorado'],'one']
D:\Anaconda\lib\site-packages\pandas\core\indexing.py in __getitem__(self, key)
2140
2141 key = self._convert_key(key)
-> 2142 return self.obj._get_value(*key, takeable=self._takeable)
2143
2144 def __setitem__(self, key, value):
D:\Anaconda\lib\site-packages\pandas\core\frame.py in _get_value(self, index, col, takeable)
2543 # use positional
2544 col = self.columns.get_loc(col)
-> 2545 index = self.index.get_loc(index)
2546 return self._get_value(index, col, takeable=True)
2547 _get_value.__doc__ = get_value.__doc__
D:\Anaconda\lib\site-packages\pandas\core\indexes\base.py in get_loc(self, key, method, tolerance)
3076 'backfill or nearest lookups')
3077 try:
-> 3078 return self._engine.get_loc(key)
3079 except KeyError:
3080 return self._engine.get_loc(self._maybe_cast_indexer(key))
pandas\_libs\index.pyx in pandas._libs.index.IndexEngine.get_loc()
pandas\_libs\index.pyx in pandas._libs.index.IndexEngine.get_loc()
TypeError: '['Utah', 'Colorado']' is an invalid key
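The traceback above comes from one particular pandas installation. As far as I know, the exception class raised for a list key is not the same in every pandas version, so the sketch below catches a generic Exception and only records its name. It also shows scalar assignment through df.at and the df.loc form to use when several labels are needed. It runs on a copy of the original 0 to 15 frame.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    np.arange(16).reshape(4, 4),
    index=["Ohio", "Colorado", "Utah", "New York"],
    columns=["one", "two", "three", "four"],
)
work = df.copy()

value = work.at["Utah", "one"]  # scalar read
work.at["Utah", "one"] = 80     # scalar write

error_name = None
try:
    work.at[["Utah", "Colorado"], "one"]
except Exception as exc:        # concrete class may differ between pandas versions
    error_name = type(exc).__name__

# For several labels, use .loc instead of .at.
several = work.loc[["Utah", "Colorado"], "one"]

print(value, error_name)
print(several.tolist())
```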
(8) df.iat[val1,val2]
- The same as df.at, except that val1 and val2 must both be integer positions.
df.iat[2,2]
10
df
| | one | two | three | four |
|---|---|---|---|---|
| Ohio | 0 | 0 | 0 | 0 |
| Colorado | 0 | 5 | 6 | 7 |
| Utah | 8 | 9 | 10 | 11 |
| New York | 12 | 13 | 14 | 15 |
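A short sketch tying df.iat back to the other scalar accessors: on a copy of the original 0 to 15 frame, the same cell is read by position with df.iat and df.iloc and by label with df.at, and df.iat is then used to write a single cell.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    np.arange(16).reshape(4, 4),
    index=["Ohio", "Colorado", "Utah", "New York"],
    columns=["one", "two", "three", "four"],
)
work = df.copy()

# Row position 2 is 'Utah'; column position 2 is 'three'.
same = work.iat[2, 2] == work.iloc[2, 2] == work.at["Utah", "three"]

work.iat[0, 0] = 99  # scalar write by position

print(work.iat[2, 2], bool(same))
print(work.at["Ohio", "one"])
```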
Conclusion
- In df[val], val can be a column label or a list of column labels, which selects whole columns. As a special case, val can be a slice such as :num, which selects the corresponding rows. val can also be a boolean DataFrame, which masks values on read and sets values on assignment.
- df.loc[val] is mainly used to select rows, or a combination of rows and columns. val can be a single row label, a list of row labels, or a pair val1,val2, where each of val1 and val2 is a single label, a list of labels, or :. In the pair form, it selects the cells where index labels val1 meet column labels val2.
- df.iloc[val] works the same way as df.loc[val], except that val must be an integer position or a list of integer positions.
- df.at[val1,val2] accepts only a single row label and a single column label and returns one value; df.iat[val1,val2] is the same with integer positions.
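The points above can be condensed into a small table of return kinds. The sketch below rebuilds the original frame and classifies the result of each accessor form; the results and kinds dictionaries are helpers made up for this illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    np.arange(16).reshape(4, 4),
    index=["Ohio", "Colorado", "Utah", "New York"],
    columns=["one", "two", "three", "four"],
)

results = {
    'df["one"]': df["one"],
    'df[["one"]]': df[["one"]],
    "df[:2]": df[:2],
    'df.loc["Ohio"]': df.loc["Ohio"],
    'df.loc[["Ohio"], ["one"]]': df.loc[["Ohio"], ["one"]],
    "df.iloc[1]": df.iloc[1],
    "df.iloc[[1], [1]]": df.iloc[[1], [1]],
    'df.at["Ohio", "one"]': df.at["Ohio", "one"],
    "df.iat[0, 0]": df.iat[0, 0],
}

kinds = {}
for expr, value in results.items():
    if isinstance(value, pd.DataFrame):
        kind = "DataFrame"
    elif isinstance(value, pd.Series):
        kind = "Series"
    else:
        kind = "scalar"
    kinds[expr] = kind
    print(f"{expr:28} -> {kind}")
```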
Original article: https://www.cnblogs.com/johnyang/p/12617102.html