experiments Module
Functions related to running experiments and parsing configuration files.
| author: | Dan Blanchard (dblanchard@ets.org) |
|---|---|
| author: | Michael Heilman (mheilman@ets.org) |
| author: | Nitin Madnani (nmadnani@ets.org) |
| author: | Chee Wee Leong (cleong@ets.org) |
class skll.experiments.NumpyTypeEncoder(*, skipkeys=False, ensure_ascii=True, check_circular=True, allow_nan=True, sort_keys=False, indent=None, separators=None, default=None)
Bases: json.encoder.JSONEncoder
This class is used when serializing results, particularly the input label values if the input has int-valued labels. NumPy int64 objects can’t be serialized by the json module, so we must convert them to int objects.
This approach was adapted from a related Stack Overflow question: http://stackoverflow.com/questions/11561932/why-does-json-dumpslistnp-arange5-fail-while-json-dumpsnp-arange5-tolis
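The conversion described above can be illustrated with a minimal sketch of a custom `json.JSONEncoder`. This is not SKLL's actual implementation: `ScalarUnwrappingEncoder` and `FakeInt64` are hypothetical names, and `FakeInt64` is only a stand-in for `np.int64` so that the example runs without NumPy installed. The sketch relies on the fact that NumPy scalars expose an `.item()` method that returns the equivalent built-in Python value.

```python
import json


class ScalarUnwrappingEncoder(json.JSONEncoder):
    """Hypothetical encoder for illustration; not SKLL's NumpyTypeEncoder."""

    def default(self, obj):
        # json calls default() only for objects it cannot serialize itself.
        # NumPy scalars such as np.int64 expose .item(), which returns the
        # equivalent built-in Python value (for example, an int).
        item = getattr(obj, "item", None)
        if callable(item):
            return item()
        # Anything else falls through to the base class, which raises TypeError.
        return super().default(obj)


class FakeInt64:
    """Stand-in for np.int64 (assumption: only .item() is needed here)."""

    def __init__(self, value):
        self._value = value

    def item(self):
        return self._value


# Labels that the plain json module would refuse to serialize.
labels = {"labels": [FakeInt64(0), FakeInt64(1)]}
encoded = json.dumps(labels, cls=ScalarUnwrappingEncoder)
print(encoded)
```

Passing the encoder through the `cls` argument of `json.dumps` leaves all natively serializable values untouched, since `default()` is consulted only as a fallback.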
skll.experiments.run_configuration(config_file, local=False, overwrite=True, queue='all.q', hosts=None, write_summary=True, quiet=False, ablation=0, resume=False, log_level=20)
Takes a configuration file and runs the specified jobs on the grid.
Parameters:
- config_file (str) – Path to the configuration file we would like to use.
- local (bool, optional) – Should this be run locally instead of on the cluster? Defaults to False.
- overwrite (bool, optional) – If the model files already exist, should we overwrite them instead of re-using them? Defaults to True.
- queue (str, optional) – The DRMAA queue to use if we’re running on the cluster. Defaults to 'all.q'.
- hosts (list of str, optional) – If running on the cluster, these are the machines we should use. Defaults to None.
- write_summary (bool, optional) – Write a TSV file with a summary of the results. Defaults to True.
- quiet (bool, optional) – Suppress printing of “Loading…” messages. Defaults to False.
- ablation (int, optional) – Number of features to remove when doing an ablation experiment. If positive, we will perform repeated ablation runs for all combinations of features, removing the specified number at a time. If None, we will use all combinations of all lengths. If 0, the default, no ablation is performed. If negative, a ValueError is raised. Defaults to 0.
- resume (bool, optional) – If result files already exist for an experiment, do not overwrite them. This is very useful when doing a large ablation experiment and part of it crashes. Defaults to False.
- log_level (str, optional) – The level for logging messages. Defaults to logging.INFO (the value 20 shown in the signature).
Returns: result_json_paths – A list of paths to .json results files for each variation in the experiment.
Return type: list of str
Raises:
- ValueError – If the value for "ablation" is not a positive int or None.
- OSError – If the length of the FeatureSet name is greater than 210.
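The `ablation` semantics described in the parameter list can be sketched with `itertools.combinations`. This is an illustrative sketch only, not SKLL's code: `ablated_feature_sets` is a hypothetical helper, and it assumes that at least one feature always remains when `ablation` is None. How SKLL itself orders the runs, or whether it also includes the full feature set as a baseline, is not stated in this section.

```python
from itertools import combinations


def ablated_feature_sets(features, ablation=0):
    """Hypothetical helper: list the feature subsets an ablation run would use."""
    if ablation is not None and ablation < 0:
        # Mirrors the documented behavior: negative values are rejected.
        raise ValueError("ablation must be a non-negative int or None")
    if ablation == 0:
        # The default: no ablation, so only the full feature list is used.
        return [list(features)]
    if ablation is None:
        # All combinations of all lengths (assumption: keep at least one feature).
        sizes = range(1, len(features))
    else:
        sizes = [ablation]
    subsets = []
    for size in sizes:
        for removed in combinations(features, size):
            # Keep every feature that is not in the removed combination.
            subsets.append([f for f in features if f not in removed])
    return subsets


features = ["a", "b", "c"]
print(ablated_feature_sets(features, ablation=1))
print(len(ablated_feature_sets(features, ablation=None)))
```

The number of subsets grows combinatorially with the number of features when `ablation` is None, which is one reason the `resume` option is useful for large ablation experiments.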