Metadata-Version: 2.1
Name: neoscr
Version: 2.1.1
Summary: Wrapper to query the SCR api
Home-page: https://github.com/datarisk-io/neoscr
Author: João Nogueira
Author-email: joao.nogueira@datarisk.io
License: Apache Software License 2.0
Keywords: nbdev jupyter notebook python
Platform: UNKNOWN
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: Natural Language :: English
Classifier: Programming Language :: Python :: 3.7
Classifier: Programming Language :: Python :: 3.8
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: License :: OSI Approved :: Apache Software License
Requires-Python: >=3.7
Description-Content-Type: text/markdown
Provides-Extra: dev
License-File: LICENSE

# neoscr

<!-- WARNING: THIS FILE WAS AUTOGENERATED! DO NOT EDIT! -->

## Install

``` sh
pip install neoscr
```

## How to use

Import `ConsultaSCR` and instantiate it with your SCR API credentials:

``` python
from neoscr.core import ConsultaSCR
```

``` python
import os

scr = ConsultaSCR(
    user=os.environ["SCR_USER"],
    password=os.environ["SCR_PASSWORD"],
    code=os.environ["SCR_CODE"],
    api_key=os.environ["SCR_API_KEY"]
)
```

<div>

> **Warning**
>
> You can instantiate
> [`ConsultaSCR`](https://datarisk-io.github.io/neoscr/core.html#consultascr)
> without passing the API credentials, but in that case the credentials
> for the SCR API must be stored in your OS environment variables.

</div>
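
A quick pre-flight check can catch missing credentials before the first request. The sketch below assumes the client reads the same variable names used in the instantiation example above (`SCR_USER`, `SCR_PASSWORD`, `SCR_CODE`, `SCR_API_KEY`); that mapping is an assumption, and `missing_credentials` is a hypothetical helper, not part of `neoscr`.

```python
import os

# Assumed variable names, taken from the instantiation example above.
REQUIRED_VARS = ("SCR_USER", "SCR_PASSWORD", "SCR_CODE", "SCR_API_KEY")


def missing_credentials(env=None):
    """Return the names of required variables that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]


missing = missing_credentials()
if missing:
    print("Missing environment variables:", ", ".join(missing))
else:
    print("All SCR credentials are set.")
```

Passing a plain dictionary as `env` makes the helper easy to try out without touching the real environment.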

``` python
cpf = "867.168.046-09" # fake cpf
ano = 2022
mes = 12

# returns three dataframes
df_cpf_traduzido, df_cpf_modalidade, df_cpf_resumo_lista_das_operacoes = scr.get_cpf_data(cpf, ano, mes)
```

<div>

> **Note**
>
> `neoscr` saves the result of each request to the `.neoscr` folder in
> your home directory.
>
> For the example above, the saved file will be:
> `~/.neoscr/86716804609_2022_12.json`
>
> The next time you make the same request, the data is loaded from
> local storage.

</div>
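
To check whether a query is already cached, you can rebuild the expected file name. The sketch below infers the `<digits>_<year>_<month>.json` pattern from the single example in the note above, so treat it as an assumption (in particular, how single-digit months are written is not shown in the example). `expected_cache_path` is a hypothetical helper, not part of `neoscr`.

```python
import re
from pathlib import Path


def expected_cache_path(document, year, month, home=None):
    """Guess where the cached response for a CPF/CNPJ query is stored."""
    digits = re.sub(r"\D", "", document)  # keep only the digits
    home = Path.home() if home is None else Path(home)
    return home / ".neoscr" / f"{digits}_{year}_{month}.json"


path = expected_cache_path("867.168.046-09", 2022, 12)
print(path.name)      # 86716804609_2022_12.json
print(path.exists())  # True only if this query was already made
```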

``` python
cnpj = "79.322.561/0001-67" # fake cnpj
ano = 2022
mes = 12

# returns three dataframes
df_cnpj_traduzido, df_cnpj_modalidade, df_cnpj_resumo_lista_das_operacoes = scr.get_cnpj_data(cnpj, ano, mes)
```

## Batch Query

Run the code below to query a list of CPFs or CNPJs (under
modification) and download the data.

<div>

> **Caution**
>
> Please don’t just copy and execute the code below. Read it and adapt
> it to your needs.

</div>

``` python
import os
import logging
import pandas as pd
from tqdm import tqdm

from neoscr.core import ConsultaSCR
from neoscr.utils import let_only_digits

# load the list of CPFs (read as strings so leading zeros are kept)
df = pd.read_csv("dataset.csv", dtype={"cpf": str})
lista_de_cpfs = df['cpf'].tolist()

# make sure the output folder exists before writing to it
os.makedirs("data/scr/raw", exist_ok=True)

# instantiate the ConsultaSCR object (credentials come from environment variables)
scr = ConsultaSCR()

# instantiate the logger
logger = logging.getLogger('database_updater')
logger.setLevel(logging.DEBUG)

# create the file handler
file_handler = logging.FileHandler('querylog.log')
file_handler.setLevel(logging.DEBUG)

# attach the file handler to the logger
logger.addHandler(file_handler)

# iterate over the list of CPFs and save the three dataframes for each one
ano = 2022
mes = 12
for cpf in tqdm(lista_de_cpfs):
    try:
        df_traduzido, df_modalidade, df_resumo_lista_das_operacoes = scr.get_cpf_data(cpf, ano, mes)
        cpf_only_digits = let_only_digits(cpf)
        df_traduzido.to_csv(f"data/scr/raw/{cpf_only_digits}_traduzido.csv", index=False)
        df_modalidade.to_csv(f"data/scr/raw/{cpf_only_digits}_modalidade.csv", index=False)
        df_resumo_lista_das_operacoes.to_csv(f"data/scr/raw/{cpf_only_digits}_resumo_lista_das_operacoes.csv", index=False)
    except Exception:
        # log the failure with its traceback and move on to the next CPF
        logger.exception(f"Error on CPF {cpf}")
```

After downloading the data, you may want to combine all the raw files
into one big table:

``` python
# load the data from all the saved files
df_traduzido_full = pd.DataFrame()
for file in os.listdir("data/scr/raw/"):
    if file.endswith("_traduzido.csv"):
        df_traduzido = pd.read_csv(f"data/scr/raw/{file}")
        df_traduzido_full = pd.concat([df_traduzido_full, df_traduzido])

df_modalidade_full = pd.DataFrame()
for file in os.listdir("data/scr/raw/"):
    if file.endswith("_modalidade.csv"):
        df_modalidade = pd.read_csv(f"data/scr/raw/{file}")
        df_modalidade_full = pd.concat([df_modalidade_full, df_modalidade])

df_cnpj_resumo_lista_das_operacoes_full = pd.DataFrame()
for file in os.listdir("data/scr/raw/"):
    if file.endswith("_resumo_lista_das_operacoes.csv"):
        df_cnpj_resumo_lista_das_operacoes = pd.read_csv(f"data/scr/raw/{file}")
        df_cnpj_resumo_lista_das_operacoes_full = pd.concat([df_cnpj_resumo_lista_das_operacoes_full, df_cnpj_resumo_lista_das_operacoes])
```
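
Since the three loops differ only in the file suffix, they can be folded into one helper. This is a sketch rather than part of `neoscr`: `combine_csvs` is a hypothetical name, and it assumes the files follow the naming used in the batch script above. Unlike the loops above, it resets the row index of the combined table.

```python
from pathlib import Path

import pandas as pd


def combine_csvs(folder, suffix):
    """Concatenate every CSV in `folder` whose name ends with `suffix`."""
    files = sorted(Path(folder).glob(f"*{suffix}"))
    if not files:
        return pd.DataFrame()  # nothing downloaded yet
    # A single concat avoids copying the growing table on every iteration.
    return pd.concat([pd.read_csv(f) for f in files], ignore_index=True)
```

For example, `combine_csvs("data/scr/raw", "_traduzido.csv")` would build the equivalent of `df_traduzido_full`.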


