machine learning in coding (python): building a prediction model with xgboost
Tags: scikit-learn, machine learning, xgboost, machine learning in coding
Continuing from the previous post: http://blog.csdn.net/mmc2015/article/details/47304591
import numpy as np
import xgboost as xgb

def xgboost_pred(train, labels, test):
    params = {}
    params["objective"] = "reg:linear"
    params["eta"] = 0.005
    params["min_child_weight"] = 6
    params["subsample"] = 0.7
    params["colsample_bytree"] = 0.7
    params["scale_pos_weight"] = 1
    params["silent"] = 1
    params["max_depth"] = 9

    plst = list(params.items())

    # hold out the first `offset` (4000) rows for early stopping
    offset = 4000
    num_rounds = 10000
    xgtest = xgb.DMatrix(test)

    # create train and validation DMatrices
    xgtrain = xgb.DMatrix(train[offset:, :], label=labels[offset:])
    xgval = xgb.DMatrix(train[:offset, :], label=labels[:offset])

    # train with early stopping and predict with the best iteration
    watchlist = [(xgtrain, 'train'), (xgval, 'val')]
    model = xgb.train(plst, xgtrain, num_rounds, watchlist, early_stopping_rounds=120)
    preds1 = model.predict(xgtest, ntree_limit=model.best_iteration)

    # reverse train and labels and use a different 4000 rows for early stopping;
    # this adds very little to the score, but it is an option if you are
    # concerned about using all the data
    train = train[::-1, :]
    labels = np.log(labels[::-1])

    xgtrain = xgb.DMatrix(train[offset:, :], label=labels[offset:])
    xgval = xgb.DMatrix(train[:offset, :], label=labels[:offset])

    watchlist = [(xgtrain, 'train'), (xgval, 'val')]
    model = xgb.train(plst, xgtrain, num_rounds, watchlist, early_stopping_rounds=120)
    preds2 = model.predict(xgtest, ntree_limit=model.best_iteration)

    # combine the two predictions; preds2 is on a log scale, but since the
    # evaluation metric only cares about relative rank there is no need to average
    preds = preds1 * 1.4 + preds2 * 8.6
    return preds
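To make the snippet self-contained, here is a minimal sketch of how the function might be called; the file names train.csv / test.csv and the column name 'target' are hypothetical placeholders, not from the original post. Note that the labels must be positive, because the second pass trains on np.log(labels).

import pandas as pd

# hypothetical input files and target column, for illustration only
train_df = pd.read_csv('train.csv')
test_df = pd.read_csv('test.csv')

labels = train_df['target'].values               # assumed positive (np.log is applied later)
train = train_df.drop('target', axis=1).values   # feature matrix as a NumPy array
test = test_df.values

preds = xgboost_pred(train, labels, test)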
A detailed walkthrough of the code will come when I have time; comments and criticism are welcome.
Original post: http://blog.csdn.net/mmc2015/article/details/47304779