Metadata-Version: 2.1
Name: faker-pyspark
Version: 0.2.0
Summary: faker-pyspark is a PySpark DataFrame and Schema provider for the Faker python package
Home-page: https://github.com/spsoni/faker-pyspark
License: MIT
Keywords: Faker, PySpark
Author: Sury Soni
Author-email: github@suryasoni.info
Requires-Python: >=3.8.1,<4.0.0
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Requires-Dist: Faker (>=18.11.1,<19.0.0)
Requires-Dist: pyspark (>=3.4.0,<4.0.0)
Project-URL: Repository, https://github.com/spsoni/faker-pyspark
Description-Content-Type: text/markdown


# PySpark provider for Faker

[![Python package](https://github.com/spsoni/faker_pyspark/actions/workflows/python-package.yml/badge.svg)](https://github.com/spsoni/faker_pyspark/actions/workflows/python-package.yml)

`faker-pyspark` is a PySpark DataFrame and Schema (StructType) provider for the `Faker` Python package.


## Description

`faker-pyspark` provides PySpark-based fake data for testing purposes. "Fake" in this context really means "random": the data may look real, but I make no claims about its accuracy, so do not use it as real data!


## Installation

Install with pip:

``` bash
pip install faker-pyspark
```

Add as a provider to your Faker instance:

``` python
from faker import Faker
from faker_pyspark import PySparkProvider

# Create the Faker instance first, then register the provider on it.
fake = Faker()
fake.add_provider(PySparkProvider)
```

If you already use Faker, you will recognise the conventional way to create the `fake` instance:

```python
from faker import Faker
fake = Faker()
```


### PySpark DataFrame and Schema (StructType)

``` python
>>> df = fake.pyspark_dataframe()
>>> schema = fake.pyspark_schema()
```
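To see what the two calls above return, a small inspection helper can be useful. The following is only a sketch: `make_fake` and `summarize` are hypothetical names, not part of `faker-pyspark`, and it assumes that `pyspark_schema()` returns a `StructType`, that `pyspark_dataframe()` returns a regular PySpark `DataFrame` when called with no arguments, and that a working Spark installation is available at call time.

``` python
def make_fake():
    """Build a Faker instance with the PySpark provider registered."""
    # Imported lazily so that summarize() below stays importable
    # even where Faker or PySpark is not installed.
    from faker import Faker
    from faker_pyspark import PySparkProvider

    fake = Faker()
    fake.add_provider(PySparkProvider)
    return fake


def summarize(fake):
    """Return a plain-dict summary of one generated schema and one DataFrame."""
    schema = fake.pyspark_schema()   # assumed: pyspark.sql.types.StructType
    df = fake.pyspark_dataframe()    # assumed: pyspark.sql.DataFrame
    return {
        "schema_fields": schema.fieldNames(),  # field names of the StructType
        "df_columns": df.columns,              # column names of the DataFrame
        "df_rows": df.count(),                 # row count (triggers a Spark job)
    }
```

Calling `summarize(make_fake())` should then give a quick overview of the generated data. The schema and the DataFrame come from two separate calls, so there is no guarantee here that their columns match.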

