Data Visualizations 3

Date: 2016-10-14 07:36:53


Data Cleaning and visualization:

  1. Before cleaning a data set, inspect it using the shape and dtypes attributes and the head() and describe() methods.

  2. First, deal with the missing data (by using dropna() or loc[]).

  3. Second, normalize/vectorize the data so that comparable columns share the same scale.

  4. Convert special data types, such as percentage strings, to float (using str.rstrip() to strip the trailing characters and astype() to cast).

  5. Change the index of each DataFrame by using the set_index() method.

  6. Create a new DataFrame that contains only the necessary data. When a new DataFrame is built from the columns of an original DataFrame, the index stays the same.

  #critics_reviews = pd.DataFrame({"RT Score": pixar_movies["RT Score"], "IMDB Score": pixar_movies["IMDB Score"], "Metacritic Score": pixar_movies["Metacritic Score"]})

  7. Plot the data set. Adjust the figure size with the figsize parameter. #critics_reviews.plot(figsize=(9, 5), kind='box')

  8. To compare two values that add up to the same total (such as percentages), we can use a stacked bar plot.
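The cleaning steps (1 to 5) above can be sketched roughly as follows. This is only an illustration under assumptions: the sample rows and the "Movie" and "Domestic %" columns are made up here, and rescaling "IMDB Score" from a 0-10 scale to 0-100 is one possible reading of the normalization step, not necessarily what the original data required.

```python
import pandas as pd

# Hypothetical sample data: the values, "Movie" and "Domestic %" are
# assumptions for illustration; only the score column names come from the notes.
pixar_movies = pd.DataFrame({
    "Movie": ["Film A", "Film B", "Film C", "Film D"],
    "RT Score": [100, 92, None, 74],
    "IMDB Score": [8.5, 7.0, 7.5, 6.5],
    "Domestic %": ["52.98%", "39.84%", "44.03%", "32.51%"],
})

# Step 1: inspect the data (shape and dtypes are attributes, not methods).
print(pixar_movies.shape)
print(pixar_movies.head())
print(pixar_movies.dtypes)
print(pixar_movies.describe())

# Step 2: drop the rows that contain missing values.
cleaned = pixar_movies.dropna().copy()

# Step 3: put the scores on the same 0-100 scale (assumed normalization).
cleaned["IMDB Score"] = cleaned["IMDB Score"] * 10

# Step 4: strip the trailing "%" and convert the strings to float.
cleaned["Domestic %"] = cleaned["Domestic %"].str.rstrip("%").astype("float")

# Step 5: use the movie title as the index.
cleaned = cleaned.set_index("Movie")
print(cleaned)
```

dropna() removes every row with at least one missing value; when only some rows should be kept, a boolean mask with loc[] is an alternative.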

Conclusion:

  Before analyzing the data, we first want a clean data set. It is best if each column holds a single type (float or string) and the numeric columns share the same range. Then we plot the data set to create a compelling chart.
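The plotting steps (6 to 8) could look something like the sketch below. It assumes an already cleaned pixar_movies DataFrame; the sample values and the "Domestic %" and "International %" column names are hypothetical and only serve to show a pair of values that add up to 100 per row.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so the sketch runs without a display
import pandas as pd

# Hypothetical, already cleaned data (values and the two "%" columns are assumptions).
pixar_movies = pd.DataFrame({
    "Movie": ["Film A", "Film B", "Film C"],
    "RT Score": [100.0, 92.0, 74.0],
    "IMDB Score": [85.0, 70.0, 65.0],
    "Metacritic Score": [92.0, 77.0, 59.0],
    "Domestic %": [50.0, 40.0, 30.0],
    "International %": [50.0, 60.0, 70.0],
}).set_index("Movie")

# Step 6: keep only the columns needed; the "Movie" index carries over.
critics_reviews = pixar_movies[["RT Score", "IMDB Score", "Metacritic Score"]]

# Step 7: box plot of the three score columns; figsize is in inches.
box_ax = critics_reviews.plot(kind="box", figsize=(9, 5))

# Step 8: stacked bars for two shares that sum to 100 in every row.
revenue_proportions = pixar_movies[["Domestic %", "International %"]]
bar_ax = revenue_proportions.plot(kind="bar", stacked=True, figsize=(9, 5))

# Save the charts to files (in a notebook they would be displayed inline instead).
box_ax.figure.savefig("critics_reviews_box.png")
bar_ax.figure.savefig("revenue_proportions_stacked.png")
```

Selecting a list of columns gives the same result as building a new DataFrame from the individual columns, and in both cases the original index is preserved.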


Original article: http://www.cnblogs.com/kingoscar/p/5958917.html
