Metadata-Version: 2.1
Name: sec-downloader
Version: 0.6.1
Summary: Useful extensions for sec-edgar-downloader.
Home-page: https://github.com/Elijas/sec-downloader
Author: Elijas
Author-email: 4084885+Elijas@users.noreply.github.com
License: MIT License
Keywords: nbdev jupyter notebook python
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: Natural Language :: English
Classifier: Programming Language :: Python :: 3.7
Classifier: Programming Language :: Python :: 3.8
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: License :: OSI Approved :: MIT License
Requires-Python: >=3.7
Description-Content-Type: text/markdown
Provides-Extra: dev
License-File: LICENSE

# sec-downloader

<!-- WARNING: THIS FILE WAS AUTOGENERATED! DO NOT EDIT! -->

<a href="https://github.com/elijas/sec-downloader/actions/workflows/test.yaml"><img alt="GitHub Workflow Status" src="https://img.shields.io/github/actions/workflow/status/elijas/sec-downloader/test.yaml?label=build"></a>
<a href="https://pypi.org/project/sec-downloader/"><img alt="PyPI - Python Version" src="https://img.shields.io/pypi/pyversions/sec-downloader"></a>
<a href="https://badge.fury.io/py/sec-downloader"><img src="https://badge.fury.io/py/sec-downloader.svg" alt="PyPI version" /></a>
<a href="LICENSE"><img src="https://img.shields.io/github/license/elijas/sec-downloader.svg" alt="Licence"></a>

Useful extensions for sec-edgar-downloader. Built with
[nbdev](https://nbdev.fast.ai/).

# Install

``` sh
pip install sec_downloader
```

# Features

- Files are downloaded to a temporary folder, immediately read into
  memory, and then deleted.
- Use a “glob” pattern to select which files are read into memory.
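
The two features above can be pictured with a small standard-library sketch. This is not the library's implementation: `read_matching_files` and `fake_download` are hypothetical names, and the "download" is simulated by writing two local files. It only illustrates the pattern of writing to a temporary folder, reading the files that match a glob pattern, and letting the folder be deleted.

```python
import tempfile
from pathlib import Path


def read_matching_files(populate, pattern="**/*.htm*"):
    """Hypothetical helper: let `populate` write files into a temporary
    folder, read those matching `pattern` into memory, then delete the folder."""
    contents = {}
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        populate(root)  # stand-in for the actual download step
        for file in sorted(root.glob(pattern)):
            if file.is_file():
                contents[file.relative_to(root).as_posix()] = file.read_text()
    # The temporary folder no longer exists here; only `contents` remains.
    return contents


def fake_download(root):
    # Stand-in for a real download: create one HTML file and one text file.
    filing = root / "AAPL" / "10-Q"
    filing.mkdir(parents=True)
    (filing / "primary-document.html").write_text("<html>demo</html>")
    (filing / "full-submission.txt").write_text("raw text")


files = read_matching_files(fake_download)
print(list(files))  # only the file matching the glob pattern is kept
```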

# How to use

## Download the metadata

Find a filing by its Accession Number:

``` python
from sec_downloader import Downloader

dl = Downloader("MyCompanyName", "email@example.com")
metadata = dl.get_filing_metadatas("AAPL/0000320193-23-000077")
print(metadata[0])
```

    FilingMetadata(accession_number='0000320193-23-000077',
                   form_type='10-Q',
                   primary_doc_url='https://www.sec.gov/Archives/edgar/data/320193/000032019323000077/aapl-20230701.htm',
                   items='',
                   primary_doc_description='10-Q',
                   filing_date='2023-08-04',
                   report_date='2023-07-01',
                   cik='0000320193',
                   company_name='Apple Inc.',
                   tickers=[Ticker(symbol='AAPL', exchange='Nasdaq')])

Any of the following calls returns the same result:

    metadata = dl.get_filing_metadatas("aapl/000032019323000077")
    metadata = dl.get_filing_metadatas("320193/000032019323000077")
    metadata = dl.get_filing_metadatas("320193/0000320193-23-000077")
    metadata = dl.get_filing_metadatas("0000320193/0000320193-23-000077")
    metadata = dl.get_filing_metadatas(CompanyAndAccessionNumber(ticker_or_cik="320193", accession_number="0000320193-23-000077"))
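
The spellings above are equivalent because an accession number is 18 digits, conventionally written in 10-2-6 groups, and a CIK is a number that is often zero-padded to 10 digits. The sketch below shows one way such inputs could be normalized. `normalize_accession_number` and `normalize_cik` are hypothetical helpers written for illustration, not functions exported by `sec_downloader`.

```python
import re


def normalize_accession_number(value):
    """Return the dashed form (10-2-6 digits) of an accession number.

    Hypothetical helper; accepts the dashed or the undashed form.
    """
    digits = value.replace("-", "")
    if not re.fullmatch(r"\d{18}", digits):
        raise ValueError(f"Not an accession number: {value!r}")
    return f"{digits[:10]}-{digits[10:12]}-{digits[12:]}"


def normalize_cik(value):
    """Zero-pad a numeric CIK to 10 digits, as shown in the metadata output."""
    if not value.isdigit():
        raise ValueError(f"Not a CIK: {value!r}")
    return value.zfill(10)


print(normalize_accession_number("000032019323000077"))
print(normalize_cik("320193"))
```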

Find the filing matching an SEC EDGAR Filing URL. Only the CIK and
Accession Number are read from the URL:

``` python
metadatas = dl.get_filing_metadatas(
    "https://www.sec.gov/ix?doc=/Archives/edgar/data/0001067983/000119312523272204/d564412d8k.htm"
)
print(metadatas[0])
```

    FilingMetadata(accession_number='0001193125-23-272204',
                   form_type='8-K',
                   primary_doc_url='https://www.sec.gov/Archives/edgar/data/1067983/000119312523272204/d564412d8k.htm',
                   items='2.02,9.01',
                   primary_doc_description='8-K',
                   filing_date='2023-11-07',
                   report_date='2023-11-04',
                   cik='0001067983',
                   company_name='BERKSHIRE HATHAWAY INC',
                   tickers=[Ticker(symbol='BRK-B', exchange='NYSE'),
                            Ticker(symbol='BRK-A', exchange='NYSE')])

URLs in other formats also work and return the same result:

    metadata = dl.get_filing_metadatas("https://www.sec.gov/Archives/edgar/data/1067983/000119312523272204/d564412d8k.htm")
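
Since only the CIK and Accession Number are taken from the URL, both URL formats shown above resolve to the same filing. A minimal sketch of that extraction follows. It assumes the URL contains a path of the form `/Archives/edgar/data/<CIK>/<18-digit accession number>/`. `parse_filing_url` is a hypothetical helper, and the library's own parsing may differ.

```python
import re

# Assumed URL shape: .../Archives/edgar/data/<CIK>/<18-digit accession>/...
_EDGAR_PATH = re.compile(r"/Archives/edgar/data/(\d{1,10})/(\d{18})(?:/|$)")


def parse_filing_url(url):
    """Hypothetical helper: extract (cik, accession_number) from an EDGAR URL."""
    match = _EDGAR_PATH.search(url)
    if match is None:
        raise ValueError(f"Unrecognized EDGAR filing URL: {url!r}")
    cik, acc = match.groups()
    # Zero-pad the CIK and restore the dashed 10-2-6 accession number form.
    return cik.zfill(10), f"{acc[:10]}-{acc[10:12]}-{acc[12:]}"


viewer_url = (
    "https://www.sec.gov/ix?doc=/Archives/edgar/data/0001067983/"
    "000119312523272204/d564412d8k.htm"
)
archive_url = (
    "https://www.sec.gov/Archives/edgar/data/1067983/"
    "000119312523272204/d564412d8k.htm"
)
print(parse_filing_url(viewer_url))
print(parse_filing_url(archive_url))
```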

Find latest filings by company ticker or CIK:

``` python
metadatas = dl.get_filing_metadatas("2/MSFT/10-K")
print(metadatas)
```

    [FilingMetadata(accession_number='0000950170-23-035122',
                    form_type='10-K',
                    primary_doc_url='https://www.sec.gov/Archives/edgar/data/789019/000095017023035122/msft-20230630.htm',
                    items='',
                    primary_doc_description='10-K',
                    filing_date='2023-07-27',
                    report_date='2023-06-30',
                    cik='0000789019',
                    company_name='MICROSOFT CORP',
                    tickers=[Ticker(symbol='MSFT', exchange='Nasdaq')]),
     FilingMetadata(accession_number='0001564590-22-026876',
                    form_type='10-K',
                    primary_doc_url='https://www.sec.gov/Archives/edgar/data/789019/000156459022026876/msft-10k_20220630.htm',
                    items='',
                    primary_doc_description='10-K',
                    filing_date='2022-07-28',
                    report_date='2022-06-30',
                    cik='0000789019',
                    company_name='MICROSOFT CORP',
                    tickers=[Ticker(symbol='MSFT', exchange='Nasdaq')])]

Any of the following calls returns the same result:

    metadata = dl.get_filing_metadatas("2/msft/10-K")
    metadata = dl.get_filing_metadatas("2/789019/10-K")
    metadata = dl.get_filing_metadatas("2/0000789019/10-K")
    metadata = dl.get_filing_metadatas(RequestedFilings(limit=2, ticker_or_cik="MSFT", form_type="10-K"))

The parameters `limit` and `form_type` are optional. If omitted, `limit`
defaults to `1`, and `form_type` defaults to `"10-Q"`.

``` python
metadatas = dl.get_filing_metadatas("NFLX")
print(metadatas)
```

    [FilingMetadata(accession_number='0001065280-23-000273',
                    form_type='10-Q',
                    primary_doc_url='https://www.sec.gov/Archives/edgar/data/1065280/000106528023000273/nflx-20230930.htm',
                    items='',
                    primary_doc_description='10-Q',
                    filing_date='2023-10-20',
                    report_date='2023-09-30',
                    cik='0001065280',
                    company_name='NETFLIX INC',
                    tickers=[Ticker(symbol='NFLX', exchange='Nasdaq')])]

Any of the following calls returns the same result:

    metadata = dl.get_filing_metadatas("nflx")
    metadata = dl.get_filing_metadatas("1/NFLX")
    metadata = dl.get_filing_metadatas("NFLX/10-Q")
    metadata = dl.get_filing_metadatas("1/NFLX/10-Q")
    metadata = dl.get_filing_metadatas(RequestedFilings(ticker_or_cik="NFLX"))
    metadata = dl.get_filing_metadatas(RequestedFilings(limit=1, ticker_or_cik="NFLX", form_type="10-Q"))
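
The string shorthand and its defaults can be illustrated with a small parser. This is a sketch under stated assumptions, not the library's code: `FilingQuery` is a hypothetical stand-in for `RequestedFilings`, and the two-part form is disambiguated by treating a purely numeric first part as the limit. That heuristic would misread a query such as `"789019/10-K"`, and the sketch does not handle the CIK/accession-number strings shown earlier.

```python
from dataclasses import dataclass


@dataclass
class FilingQuery:
    """Hypothetical stand-in for the library's `RequestedFilings`."""
    ticker_or_cik: str
    limit: int = 1
    form_type: str = "10-Q"


def parse_query(text):
    """Parse '[limit/]ticker_or_cik[/form_type]' into a FilingQuery."""
    parts = text.split("/")
    if len(parts) == 1:
        return FilingQuery(parts[0].upper())
    if len(parts) == 2:
        first, second = parts
        # Assumption: a purely numeric first part is a limit, not a CIK.
        if first.isdigit():
            return FilingQuery(second.upper(), limit=int(first))
        return FilingQuery(first.upper(), form_type=second)
    if len(parts) == 3:
        limit, company, form_type = parts
        return FilingQuery(company.upper(), int(limit), form_type)
    raise ValueError(f"Unrecognized query: {text!r}")


print(parse_query("2/msft/10-K"))
print(parse_query("nflx"))
```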

## Download the HTML files

Once you have the Primary Document URL, for example from the metadata,
use it to download the HTML.

``` python
for metadata in metadatas:
    html = dl.download_filing(url=metadata.primary_doc_url).decode()
    print(html[:50])
    break  # same for all filings, let's just print the first one
```

    '<?xml version="1.0" ?><!--XBRL Document Created wi'

# Advanced usage: Wrapper

If, instead of using the forked/modified `sec-edgar-downloader`, you
want to wrap its output, you can use the wrapper class
`SecDownloaderWrapper`.

Let’s demonstrate how to download a single file (latest 10-Q filing
details in HTML format) to memory.

``` python
dl = Downloader("MyCompanyName", "email@example.com")
html = dl.get_latest_html("10-Q", "AAPL")
# Use dl.get_latest_n_html("10-Q", "AAPL", n=5) to get the latest 5 10-Qs
print(f"{html[:50]}...")
```

    '<?xml version="1.0" ?><!--XBRL Document Created wi...'

> **Note** The company name and email address are used to form a
> user-agent string that complies with SEC EDGAR’s fair access policy
> for programmatic downloading.
> [Source](https://www.sec.gov/os/webmaster-faq#code-support)

This is implemented approximately as follows:

``` python
from sec_edgar_downloader import Downloader as SecEdgarDownloader
from sec_downloader import DownloadStorage

ONLY_HTML = "**/*.htm*"

storage = DownloadStorage(filter_pattern=ONLY_HTML)
with storage as path:
    dl = SecEdgarDownloader("MyCompanyName", "email@example.com", path)
    dl.get("10-Q", "AAPL", limit=1, download_details=True)
# all files are now deleted and only stored in memory

content = storage.get_file_contents()[0].content
print(f"{content[:50]}...")
```

    '<?xml version="1.0" ?><!--XBRL Document Created wi...'

Downloading multiple documents:

``` python
storage = DownloadStorage()
with storage as path:
    dl = SecEdgarDownloader("MyCompanyName", "email@example.com", path)
    dl.get("10-K", "GOOG", limit=2)
# all files are now deleted and only stored in memory

for path, content in storage.get_file_contents():
    print(f"Path: {path}\nContent [len={len(content)}]: {content[:30]}...\n")
```

    ('Path: sec-edgar-filings/GOOG/10-K/0001652044-22-000019/full-submission.txt\n'
     'Content [len=15044932]: <SEC-DOCUMENT>0001652044-22-00...\n')
    ('Path: sec-edgar-filings/GOOG/10-K/0001652044-23-000016/full-submission.txt\n'
     'Content [len=15264470]: <SEC-DOCUMENT>0001652044-23-00...\n')

# Contributing

Follow these steps to install the project locally for development:

1.  Install the project with the command `pip install -e ".[dev]"`.

> **Note** We highly recommend using virtual environments for Python
> development. If you’d like to use virtual environments, follow these
> steps instead:
>
> - Create a virtual environment `python3 -m venv .venv`
> - Activate the virtual environment `source .venv/bin/activate`
> - Install the project with the command `pip install -e ".[dev]"`
