evaluate module
The evaluate module defines the evaluate() function and the GridSearch class.
class surprise.evaluate.GridSearch(algo_class, param_grid, measures=['rmse', 'mae'], verbose=1)
The GridSearch class is used to evaluate the performance of an algorithm on various combinations of parameters and to extract the best combination. It is analogous to GridSearchCV from scikit-learn. See the User Guide for usage.
Parameters:
- algo_class (AlgoBase) – The class of the algorithm to evaluate.
- param_grid (dict) – A dictionary with algo_class parameter names (strings) as keys and lists of the values to try as values. All combinations are evaluated with the given algorithm.
- measures (list of string) – The performance measures to compute. Allowed names are function names as defined in the accuracy module. Default is ['rmse', 'mae'].
- verbose (int) – Level of verbosity. If 0, nothing is printed. If 1, accuracy measures for each parameter combination are printed, along with the combination values. If 2, per-fold accuracy values are also printed. Default is 1.
Attributes:
- cv_results (dict of arrays) – A dict that contains all parameters and accuracy information for each combination. Can be imported into a pandas DataFrame.
- best_estimator (dict of AlgoBase) – Using an accuracy measure as key, get the estimator that gave the best accuracy results for that measure.
- best_score (dict of floats) – Using an accuracy measure as key, get the best score achieved for that measure.
- best_params (dict of dicts) – Using an accuracy measure as key, get the parameter combination that gave the best accuracy results for that measure.
- best_index (dict of ints) – Using an accuracy measure as key, get the index into cv_results of the combination that achieved the best accuracy for that measure.
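The way a param_grid expands into combinations, and how a "best" index relates to a list of per-combination scores, can be sketched in plain Python. This is only an illustration of the idea, not the library's implementation: the helper expand_grid, the grid values, and the RMSE numbers are all hypothetical placeholders, and the sketch assumes a lower-is-better measure such as RMSE.

```python
from itertools import product


def expand_grid(param_grid):
    """Return every combination of the values in param_grid as a list of dicts.

    Hypothetical helper for illustration; keys are sorted so the order of
    combinations is deterministic.
    """
    keys = sorted(param_grid)
    return [dict(zip(keys, values))
            for values in product(*(param_grid[k] for k in keys))]


# Illustrative grid: two candidate values for each of two parameters.
param_grid = {'n_epochs': [5, 10], 'lr_all': [0.002, 0.005]}
combos = expand_grid(param_grid)  # 2 x 2 = 4 combinations

# Placeholder RMSE values, one per combination (made up, not real results).
placeholder_rmse = [0.98, 0.96, 0.97, 0.95]

# For an error measure such as RMSE, the best combination has the lowest value.
best_index = min(range(len(combos)), key=placeholder_rmse.__getitem__)
best_params = combos[best_index]
best_score = placeholder_rmse[best_index]

print(best_index, best_params, best_score)
```

In the class described above, the corresponding attributes are dictionaries keyed by measure name, so each measure has its own best index, parameters, and score.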
surprise.evaluate.evaluate(algo, data, measures=['rmse', 'mae'], with_dump=False, dump_dir=None, verbose=1)
Evaluate the performance of the algorithm on given data.
Depending on the nature of the data parameter, it may or may not perform cross validation.
Parameters:
- algo (AlgoBase) – The algorithm to evaluate.
- data (Dataset) – The dataset on which to evaluate the algorithm.
- measures (list of string) – The performance measures to compute. Allowed names are function names as defined in the accuracy module. Default is ['rmse', 'mae'].
- with_dump (bool) – If True, the predictions and the algorithm are dumped at each fold for later analysis (see FAQ). The file names are set as '<date>-<algorithm name>-<fold number>'. Default is False.
- dump_dir (str) – The directory to dump the files to. Default is '~/.surprise_data/dumps/'.
- verbose (int) – Level of verbosity. If 0, nothing is printed. If 1 (default), accuracy measures for each fold are printed, with a final summary. If 2, every prediction is printed.
Returns: A dictionary containing measures as keys and lists as values. Each list contains one entry per fold.
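Because the return value maps each measure to a list with one entry per fold, a per-measure average is a one-line reduction. The sketch below assumes a dictionary of that documented shape; the helper name summarize and the numbers are hypothetical placeholders rather than real output of evaluate().

```python
def summarize(performances):
    """Average the per-fold values of each measure.

    `performances` is assumed to have the documented shape:
    measure name -> list with one value per fold.
    """
    return {measure: sum(values) / len(values)
            for measure, values in performances.items()}


# Placeholder values for three folds (made up for illustration).
perf = {'rmse': [1.0, 0.5, 0.75], 'mae': [0.5, 0.25, 0.75]}
means = summarize(perf)

for measure, mean in means.items():
    print(f'{measure}: {mean:.4f}')
```

With a real call, perf would instead be the value returned by evaluate(algo, data, measures=['rmse', 'mae']).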