1. First, we need to set up the environment for deepctr_torch:
pip install -U deepctr_torch
2. The current version of deepctr_torch does not work with torch 1.5.0, so we need to downgrade torch to 1.4.0:
pip install --default-timeout=100 --no-cache-dir torch==1.4.0+cpu torchvision==0.5.0+cpu -f https://download.pytorch.org/whl/torch_stable.html
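To confirm the downgrade worked, a quick sanity check (a minimal sketch; the exact version strings depend on your install):

import torch
import torchvision

# Both should report the CPU builds pinned above.
print(torch.__version__)        # expected: 1.4.0+cpu
print(torchvision.__version__)  # expected: 0.5.0+cpu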
3. Then we can run the examples in the examples directory:
python run_deepfm.py
4. The DeepFM code below is copied from Zhihu:
import pandas as pd
from sklearn.metrics import log_loss, roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelEncoder, MinMaxScaler
from deepctr_torch.models import DeepFM
from deepctr_torch.inputs import SparseFeat, DenseFeat, get_feature_names

if __name__ == "__main__":
    data = pd.read_csv('criteo_sample.txt')

    sparse_features = ['C' + str(i) for i in range(1, 27)]
    dense_features = ['I' + str(i) for i in range(1, 14)]

    data[sparse_features] = data[sparse_features].fillna('-1')
    data[dense_features] = data[dense_features].fillna(0)
    target = ['label']

    # 1. Label-encode the sparse features, and do a simple transformation on the dense features
    for feat in sparse_features:
        lbe = LabelEncoder()
        data[feat] = lbe.fit_transform(data[feat])
    mms = MinMaxScaler(feature_range=(0, 1))
    data[dense_features] = mms.fit_transform(data[dense_features])

    # 2. Count the unique values of each sparse field, and record the dense field names
    fixlen_feature_columns = [SparseFeat(feat, vocabulary_size=data[feat].nunique(), embedding_dim=4)
                              for feat in sparse_features] + \
                             [DenseFeat(feat, 1) for feat in dense_features]

    dnn_feature_columns = fixlen_feature_columns
    linear_feature_columns = fixlen_feature_columns
    feature_names = get_feature_names(linear_feature_columns + dnn_feature_columns)

    # 3. Generate the input data for the model
    train, test = train_test_split(data, test_size=0.2)
    train_model_input = {name: train[name] for name in feature_names}
    test_model_input = {name: test[name] for name in feature_names}

    # 4. Define the model, then train, predict and evaluate
    model = DeepFM(linear_feature_columns, dnn_feature_columns, task='binary')
    model.compile("adam", "binary_crossentropy", metrics=['binary_crossentropy'])

    history = model.fit(train_model_input, train[target].values,
                        batch_size=256, epochs=10, verbose=2, validation_split=0.2)
    pred_ans = model.predict(test_model_input, batch_size=256)
    print("test LogLoss", round(log_loss(test[target].values, pred_ans), 4))
    print("test AUC", round(roc_auc_score(test[target].values, pred_ans), 4))
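As a side note, deepctr_torch models also accept a device argument, so if a GPU build of torch were installed instead of the CPU-only one above, training could be moved to the GPU with something like the sketch below (check the deepctr_torch docs for the exact parameter in your version):

import torch

# Pick a GPU if torch can see one, otherwise fall back to the CPU.
device = 'cuda:0' if torch.cuda.is_available() else 'cpu'

model = DeepFM(linear_feature_columns, dnn_feature_columns,
               task='binary', device=device)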
5. The results are shown below:
cpu
Train on 128 samples, validate on 32 samples, 1 steps per epoch
Epoch 1/10
0s - loss: 0.7001 - binary_crossentropy: 0.7001 - val_binary_crossentropy: 0.6912
Epoch 2/10
0s - loss: 0.6863 - binary_crossentropy: 0.6863 - val_binary_crossentropy: 0.6804
Epoch 3/10
0s - loss: 0.6727 - binary_crossentropy: 0.6727 - val_binary_crossentropy: 0.6702
Please check the latest version manually on https://pypi.org/project/deepctr-torch/#history
Epoch 4/10
0s - loss: 0.6596 - binary_crossentropy: 0.6596 - val_binary_crossentropy: 0.6602
Epoch 5/10
0s - loss: 0.6469 - binary_crossentropy: 0.6469 - val_binary_crossentropy: 0.6506
Epoch 6/10
0s - loss: 0.6343 - binary_crossentropy: 0.6343 - val_binary_crossentropy: 0.6414
Epoch 7/10
0s - loss: 0.6220 - binary_crossentropy: 0.6220 - val_binary_crossentropy: 0.6329
Epoch 8/10
0s - loss: 0.6101 - binary_crossentropy: 0.6101 - val_binary_crossentropy: 0.6251
Epoch 9/10
0s - loss: 0.5985 - binary_crossentropy: 0.5985 - val_binary_crossentropy: 0.6176
Epoch 10/10
0s - loss: 0.5870 - binary_crossentropy: 0.5870 - val_binary_crossentropy: 0.6102
test LogLoss 0.6146
test AUC 0.5273
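Since deepctr_torch models are built on torch.nn.Module, the trained weights can be saved and reloaded with the standard PyTorch API; a minimal sketch (the file name deepfm.pt is just an example):

import torch

# Persist the trained weights, then restore them into a freshly built model.
torch.save(model.state_dict(), 'deepfm.pt')
model.load_state_dict(torch.load('deepfm.pt'))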
6. With that, we are done running DeepFM in deepctr_torch.
Original post: https://www.cnblogs.com/huzdong/p/13333980.html