Metadata-Version: 2.1
Name: field_match
Version: 0.1.0
Summary: Identify corresponding fields between datasets
Project-URL: Homepage, https://github.com/cstirry/field-match
Author-email: Christine Stirrat <christine.stirrat@gmail.com>
License-File: LICENSE
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python :: 3
Requires-Python: >=3.8
Description-Content-Type: text/markdown

# field-match

## Overview

field-match is a Python library designed to analyze and compare fields between two datasets based on similarity scores,
and match up the field names.

## Installation

To install the package, run:

```bash
pip install field-match
```

## Usage

Import and use the package as follows:

```python
import pandas as pd
from field_match import field_similarity_report, generate_column_rename

# Load your datasets
df1 = pd.read_csv('dataset1.csv')
df2 = pd.read_csv('dataset2.csv')

# Use the field_similarity_report function to match up fields by similarity score
results = field_similarity_report(df1, df2)
                   
# Generate the column rename code snippet
rename_snippet = generate_column_rename(results)                      

```

For more detailed examples, please refer to the `examples` folder in this repository.

## Example Applications

1. **Clarifying Ambiguous Dataset Fields:** When merging or integrating a new or updated dataset (such as data from a
   another year or source) with an existing dataset or workflow, uncertainty in how fields correspond to each
   may exist because of a lack of headers or differing field names. field_match can help identify which fields in the
   new
   dataset correspond to expected fields.

2. **Integrating External Dataset with Existing Model:** When feeding an external dataset into a pre-existing model,
   it's crucial to ensure that the data aligns correctly with the model's expected input format. field_match can help
   you
   identify which fields in your new dataset correspond to the fields your model expects.

## License

field-match is released under the [MIT License](LICENSE).