码迷,mamicode.com
首页 > 编程语言 > 详细

machine learning in coding(python):根据关键字合并feature,删除无用feature,转化为numpy数组

时间:2015-08-04 21:12:47      阅读:266      评论:0      收藏:0      [点我收藏+]

标签:scikit-learn   机器学习   根据关键字合并feature   删除feature   

import pandas as pd
import numpy as np
from sklearn import preprocessing
from keras.models import Sequential
from keras.layers.core import Dense, Activation, Dropout

# load training and test datasets
train = pd.read_csv('../input/train_set.csv', parse_dates=[2,])
test = pd.read_csv('../input/test_set.csv', parse_dates=[3,])

tubes = pd.read_csv('../input/tube.csv')

# create some new features
train['year'] = train.quote_date.dt.year
train['month'] = train.quote_date.dt.month
train['dayofyear'] = train.quote_date.dt.dayofyear
train['dayofweek'] = train.quote_date.dt.dayofweek
train['day'] = train.quote_date.dt.day

test['year'] = test.quote_date.dt.year
test['month'] = test.quote_date.dt.month
test['dayofyear'] = test.quote_date.dt.dayofyear
test['dayofweek'] = test.quote_date.dt.dayofweek
test['day'] = test.quote_date.dt.day

train = pd.merge(train,tubes,on='tube_assembly_id',how='inner')
test = pd.merge(test,tubes,on='tube_assembly_id',how='inner')

train['material_id'].fillna('SP-9999',inplace=True)
test['material_id'].fillna('SP-9999',inplace=True)

# drop useless columns and create labels
idx = test.id.values.astype(int)
test = test.drop(['id', 'tube_assembly_id', 'quote_date'], axis = 1)
labels = train.cost.values
train = train.drop(['quote_date', 'cost', 'tube_assembly_id'], axis = 1)

# convert data to numpy array
train = np.array(train)
test = np.array(test)
from:kaggle


版权声明:本文为博主原创文章,未经博主允许不得转载。

machine learning in coding(python):根据关键字合并feature,删除无用feature,转化为numpy数组

标签:scikit-learn   机器学习   根据关键字合并feature   删除feature   

原文地址:http://blog.csdn.net/mmc2015/article/details/47281029

(0)
(0)
   
举报
评论 一句话评论(0
登录后才能评论!
© 2014 mamicode.com 版权所有  联系我们:gaon5@hotmail.com
迷上了代码!