Tags: scikit-learn, machine learning, data mining, model persistence
Reference: http://scikit-learn.org/stable/modules/model_persistence.html
After training a scikit-learn model, it is desirable to have a way to persist it, so that new samples can be handled by the saved model without retraining. This section gives an example of how to persist a model with pickle, and also reviews a few security and maintainability issues that arise when working with pickle serialization.
1. Persistence example
It is possible to save a model in scikit-learn by using Python's built-in persistence model, namely pickle.
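The pickle round trip can be sketched as follows. This is only a minimal sketch, assuming scikit-learn is installed; the SVC classifier and the iris dataset are illustrative choices and are not named in the text above.

```python
import pickle

from sklearn import datasets, svm

# Train a small classifier (illustrative choice: SVC on the iris dataset).
iris = datasets.load_iris()
X, y = iris.data, iris.target
clf = svm.SVC()
clf.fit(X, y)

# Serialize the fitted model to a bytes object, then restore it.
s = pickle.dumps(clf)
clf2 = pickle.loads(s)

# The restored model should give the same predictions as the original.
same = bool((clf.predict(X) == clf2.predict(X)).all())
print(same)
```

To keep the model between sessions, the same bytes can be written to a file with pickle.dump and read back with pickle.load.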
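In some cases it may be preferable to use joblib's replacement for pickle (joblib.dump and joblib.load), which is more efficient on objects that carry large numpy arrays internally. The saved model can later be loaded back, even in another Python program (the same is true of pickle):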
>>> import joblib  # old scikit-learn releases: from sklearn.externals import joblib
>>> joblib.dump(clf, 'filename.pkl')
>>> clf = joblib.load('filename.pkl')
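Note: joblib.dump returns a list of filenames. In the joblib versions this note was written for, each individual numpy array contained in the clf object is serialized as a separate file on the filesystem, and all of those files are required in the same folder when reloading the model with joblib.load. More recent joblib releases write a single file by default.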
2. Security and maintainability limitations
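pickle (and joblib by extension) has some issues regarding maintainability and security. The referenced documentation gives two reasons: (1) never unpickle untrusted data, as it could lead to malicious code being executed upon loading; (2) models saved with one version of scikit-learn might load in other versions, but this is unsupported and inadvisable, and operations performed on such data could give different and unexpected results.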
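The security concern comes from the fact that loading a pickle can call arbitrary callables. A harmless illustration, in which the class name Payload is hypothetical and the built-in print stands in for whatever an attacker might choose to run:

```python
import pickle


class Payload:
    """Hypothetical object whose pickle asks the loader to call a function."""

    def __reduce__(self):
        # On unpickling, pickle calls print("...") instead of rebuilding Payload.
        return (print, ("this ran during unpickling",))


data = pickle.dumps(Payload())

# Merely loading the bytes triggers the call; no Payload instance is returned.
result = pickle.loads(data)
```

Here the call is only a print, but the same mechanism can invoke any importable callable, which is why pickle files from untrusted sources should not be loaded.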
Copyright notice: this is an original article by the blogger and may not be reproduced without the blogger's permission.
scikit-learn:3.4. Model persistence
Original URL: http://blog.csdn.net/mmc2015/article/details/47143539