Metadata-Version: 2.1
Name: sec-downloader
Version: 0.9.2
Summary: Useful extensions for sec-edgar-downloader.
Home-page: https://github.com/Elijas/sec-downloader
Author: Elijas
Author-email: 4084885+Elijas@users.noreply.github.com
License: MIT License
Keywords: nbdev jupyter notebook python
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: Natural Language :: English
Classifier: Programming Language :: Python :: 3.7
Classifier: Programming Language :: Python :: 3.8
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: License :: OSI Approved :: MIT License
Requires-Python: >=3.7
Description-Content-Type: text/markdown
Provides-Extra: dev
License-File: LICENSE

# sec-downloader

<!-- WARNING: THIS FILE WAS AUTOGENERATED! DO NOT EDIT! -->

<a href="https://github.com/elijas/sec-downloader/actions/workflows/test.yaml"><img alt="GitHub Workflow Status" src="https://img.shields.io/github/actions/workflow/status/elijas/sec-downloader/test.yaml?label=build"></a>
<a href="https://pypi.org/project/sec-downloader/"><img alt="PyPI - Python Version" src="https://img.shields.io/pypi/pyversions/sec-downloader"></a>
<a href="https://badge.fury.io/py/sec-downloader"><img src="https://badge.fury.io/py/sec-downloader.svg" alt="PyPI version" /></a>
<a href="LICENSE"><img src="https://img.shields.io/github/license/elijas/sec-downloader.svg" alt="Licence"></a>

A better version of `sec-edgar-downloader`. Includes an alternative
implementation (a wrapper instead of a fork), to keep compatibility with
new `sec-edgar-downloader` releases. This library partially uses
[nbdev](https://nbdev.fast.ai/).

# Features

Advantages over `sec-edgar-downloader`:

**Flexibility in Download Process**

- Tailored for choosing *what*, *where*, and *how* to download.
- Files stored in memory for faster operations and no unnecessary disk
  clutter.

**Separate Metadata and File Downloads**

- Easily skip unneeded files.
- Download metadata first, then selectively download files.
- Option to save metadata for better organization.

**More Input Options**

- Ticker or CIK (e.g., `AAPL`, `0000320193`) for latest filings.
- Accession Number (e.g., `0000320193-23-000077`). Not supported in
  `sec-edgar-downloader`.
- SEC EDGAR URL (e.g.,
  `https://www.sec.gov/ix?doc=/Archives/edgar/data/0001067983/000119312523272204/d564412d8k.htm`).
  Not supported in `sec-edgar-downloader`.

# Install

``` sh
pip install sec_downloader
```

# How to use

## Download the metadata

> **Note** The company name and email address are used to form a
> user-agent string that complies with SEC EDGAR's fair access policy
> for programmatic downloading.
> [Source](https://www.sec.gov/os/webmaster-faq#code-support)

``` python
from sec_downloader import Downloader

dl = Downloader("MyCompanyName", "email@example.com")
```

Find a filing with an Accession Number:

``` python
metadatas = dl.get_filing_metadatas("AAPL/0000320193-23-000077")
print(metadatas)
```

    FilingMetadata(accession_number='0000320193-23-000077',
                   form_type='10-Q',
                   primary_doc_url='https://www.sec.gov/Archives/edgar/data/320193/000032019323000077/aapl-20230701.htm',
                   items='',
                   primary_doc_description='10-Q',
                   filing_date='2023-08-04',
                   report_date='2023-07-01',
                   cik='0000320193',
                   company_name='Apple Inc.',
                   tickers=[Ticker(symbol='AAPL', exchange='Nasdaq')])

Alternatively, you can also use any of these to get the same answer:

    metadatas = dl.get_filing_metadatas("aapl/000032019323000077")
    metadatas = dl.get_filing_metadatas("320193/000032019323000077")
    metadatas = dl.get_filing_metadatas("320193/0000320193-23-000077")
    metadatas = dl.get_filing_metadatas("0000320193/0000320193-23-000077")
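    # Assumption: CompanyAndAccessionNumber is importable from sec_downloader.types, like RequestedFilings
    from sec_downloader.types import CompanyAndAccessionNumber
    metadatas = dl.get_filing_metadatas(CompanyAndAccessionNumber(ticker_or_cik="320193", accession_number="0000320193-23-000077"))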
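
The equivalent inputs above differ only in letter case, in whether the CIK is zero-padded, and in whether the accession number contains dashes. As a rough illustration of why they can refer to the same filing, here is a minimal sketch of normalizing both identifiers. The helper names `normalize_cik` and `normalize_accession_number` are hypothetical and are not part of `sec-downloader`; the library may handle this differently.

``` python
import re


def normalize_accession_number(raw: str) -> str:
    """Return the dashed 10-2-6 form of an 18-digit accession number."""
    digits = raw.replace("-", "")
    if not re.fullmatch(r"\d{18}", digits):
        raise ValueError(f"not an accession number: {raw!r}")
    return f"{digits[:10]}-{digits[10:12]}-{digits[12:]}"


def normalize_cik(raw: str) -> str:
    """Zero-pad a numeric CIK to 10 digits."""
    if not raw.isdigit() or len(raw) > 10:
        raise ValueError(f"not a CIK: {raw!r}")
    return raw.zfill(10)


# Both spellings from the examples above map to the same canonical values.
print(normalize_accession_number("000032019323000077"))
print(normalize_accession_number("0000320193-23-000077"))
print(normalize_cik("320193"))
```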

Find the filing matching a SEC EDGAR Filing URL. Only CIK and Accession
Number are used from the URL:

``` python
metadatas = dl.get_filing_metadatas(
    "https://www.sec.gov/ix?doc=/Archives/edgar/data/0001067983/000119312523272204/d564412d8k.htm"
)
print(metadatas)
```

    FilingMetadata(accession_number='0001193125-23-272204',
                   form_type='8-K',
                   primary_doc_url='https://www.sec.gov/Archives/edgar/data/1067983/000119312523272204/d564412d8k.htm',
                   items='2.02,9.01',
                   primary_doc_description='8-K',
                   filing_date='2023-11-07',
                   report_date='2023-11-04',
                   cik='0001067983',
                   company_name='BERKSHIRE HATHAWAY INC',
                   tickers=[Ticker(symbol='BRK-B', exchange='NYSE'),
                            Ticker(symbol='BRK-A', exchange='NYSE')])

Alternatively, you can also use URLs in other formats and get the same
answer:

    metadatas = dl.get_filing_metadatas("https://www.sec.gov/Archives/edgar/data/1067983/000119312523272204/d564412d8k.htm")
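
Since only the CIK and Accession Number are taken from the URL, both URL formats resolve to the same filing. The sketch below shows one way such an extraction could work with a regular expression. `parse_edgar_url` and `CikAndAccession` are hypothetical names used for illustration only; this is not the library's actual parsing code.

``` python
import re
from typing import NamedTuple


class CikAndAccession(NamedTuple):
    """Hypothetical container for the two values taken from the URL."""
    cik: str
    accession_number: str


# Matches ".../Archives/edgar/data/<CIK>/<18-digit accession number>/..."
_EDGAR_PATH = re.compile(r"/Archives/edgar/data/(\d{1,10})/(\d{18})/")


def parse_edgar_url(url: str) -> CikAndAccession:
    match = _EDGAR_PATH.search(url)
    if match is None:
        raise ValueError(f"unrecognized SEC EDGAR URL: {url!r}")
    cik, acc = match.groups()
    return CikAndAccession(
        cik=cik.zfill(10),  # pad the CIK to 10 digits
        accession_number=f"{acc[:10]}-{acc[10:12]}-{acc[12:]}",  # add dashes
    )


viewer_url = "https://www.sec.gov/ix?doc=/Archives/edgar/data/0001067983/000119312523272204/d564412d8k.htm"
archive_url = "https://www.sec.gov/Archives/edgar/data/1067983/000119312523272204/d564412d8k.htm"
print(parse_edgar_url(viewer_url))
print(parse_edgar_url(viewer_url) == parse_edgar_url(archive_url))
```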

Find latest filings by company ticker or CIK:

``` python
from sec_downloader.types import RequestedFilings

metadatas = dl.get_filing_metadatas(
    RequestedFilings(ticker_or_cik="MSFT", form_type="10-K", limit=2)
)
print(metadatas)
```

This returns a list of up to two `FilingMetadata` objects, one per
10-K filing, in the same format as the examples above.

Alternatively, you can also use any of these to get the same answer:

    metadatas = dl.get_filing_metadatas("2/msft/10-K")
    metadatas = dl.get_filing_metadatas("2/789019/10-K")
    metadatas = dl.get_filing_metadatas("2/0000789019/10-K")

The parameters `limit` and `form_type` are optional. If omitted, `limit`
defaults to `1`, and `form_type` defaults to `10-Q`.

``` python
metadatas = dl.get_filing_metadatas("NFLX")
print(metadatas)
```

    [FilingMetadata(accession_number='0001065280-23-000273',
                    form_type='10-Q',
                    primary_doc_url='https://www.sec.gov/Archives/edgar/data/1065280/000106528023000273/nflx-20230930.htm',
                    items='',
                    primary_doc_description='10-Q',
                    filing_date='2023-10-20',
                    report_date='2023-09-30',
                    cik='0001065280',
                    company_name='NETFLIX INC',
                    tickers=[Ticker(symbol='NFLX', exchange='Nasdaq')])]

Alternatively, you can also use any of these to get the same answer:

    metadatas = dl.get_filing_metadatas("nflx")
    metadatas = dl.get_filing_metadatas("1/NFLX")
    metadatas = dl.get_filing_metadatas("NFLX/10-Q")
    metadatas = dl.get_filing_metadatas("1/NFLX/10-Q")
    metadatas = dl.get_filing_metadatas(RequestedFilings(ticker_or_cik="NFLX"))
    metadatas = dl.get_filing_metadatas(RequestedFilings(limit=1, ticker_or_cik="NFLX", form_type="10-Q"))
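
To make the defaults and the `limit/ticker_or_cik/form_type` shorthand concrete, here is a simplified sketch of how such strings could be expanded. `FilingQuery` and `parse_query` are hypothetical stand-ins, not the library's `RequestedFilings` or its real parser. The rule for two-part strings is an assumed heuristic, and the sketch ignores the accession-number forms shown earlier.

``` python
from dataclasses import dataclass


@dataclass(frozen=True)
class FilingQuery:
    """Hypothetical stand-in mirroring the documented defaults."""
    ticker_or_cik: str
    form_type: str = "10-Q"
    limit: int = 1


def _looks_like_form_type(part: str) -> bool:
    # Assumed heuristic: form types such as "10-Q" or "8-K" mix digits
    # with other characters, unlike a plain ticker or a numeric CIK.
    return any(ch.isdigit() for ch in part) and not part.isdigit()


def parse_query(text: str) -> FilingQuery:
    parts = text.split("/")
    if len(parts) == 1:
        return FilingQuery(ticker_or_cik=parts[0].upper())
    if len(parts) == 3:
        limit, company, form_type = parts
        return FilingQuery(company.upper(), form_type, int(limit))
    if len(parts) == 2:
        first, second = parts
        if first.isdigit() and not _looks_like_form_type(second):
            return FilingQuery(second.upper(), limit=int(first))  # "1/NFLX"
        return FilingQuery(first.upper(), form_type=second)  # "NFLX/10-Q"
    raise ValueError(f"unrecognized query: {text!r}")


for text in ["nflx", "1/NFLX", "NFLX/10-Q", "1/NFLX/10-Q", "2/msft/10-K"]:
    print(text, "->", parse_query(text))
```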

## Download the HTML files

Once you have a primary document URL (for example, `primary_doc_url`
from the metadata), use it to download the HTML.

``` python
for metadata in metadatas:
    html = dl.download_filing(url=metadata.primary_doc_url).decode()
    print(html[:50])
    break  # same for all filings, let's just print the first one
```

    '<?xml version="1.0" ?><!--XBRL Document Created wi'
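
Because metadata and files are fetched separately, you can filter the metadata first and download only what you need. The sketch below is one possible pattern. `download_selected` is a hypothetical helper, and the metadata objects and the download function are offline stand-ins with placeholder values so that the example runs without network access.

``` python
from types import SimpleNamespace
from typing import Callable, Dict, Iterable


def download_selected(
    metadatas: Iterable,
    download: Callable[..., bytes],
    form_type: str,
) -> Dict[str, str]:
    """Download only filings of the given form type, keyed by accession number."""
    return {
        m.accession_number: download(url=m.primary_doc_url).decode()
        for m in metadatas
        if m.form_type == form_type
    }


# Offline stand-ins with placeholder values (not real filings).
fake_metadatas = [
    SimpleNamespace(
        accession_number="0000000000-00-000001",
        form_type="10-Q",
        primary_doc_url="https://example.com/a.htm",
    ),
    SimpleNamespace(
        accession_number="0000000000-00-000002",
        form_type="8-K",
        primary_doc_url="https://example.com/b.htm",
    ),
]


def fake_download(url: str) -> bytes:
    return f"<html>{url}</html>".encode()


pages = download_selected(fake_metadatas, fake_download, form_type="10-Q")
print(list(pages))
```

With real objects, the call would look like `download_selected(metadatas, dl.download_filing, form_type="10-Q")`.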

# Alternative implementation: Wrapper

This approach wraps `sec-edgar-downloader`: files are downloaded to a
temporary folder, immediately read into memory, and then deleted. The
example below downloads a single file (the latest 10-Q filing details in
HTML format) to memory. The glob pattern selects which files are read
into memory.

``` python
from sec_edgar_downloader import Downloader as SecEdgarDownloader
from sec_downloader.download_storage import DownloadStorage

ONLY_HTML = "**/*.htm*"

storage = DownloadStorage(filter_pattern=ONLY_HTML)
with storage as path:
    dl = SecEdgarDownloader("MyCompanyName", "email@example.com", path)
    dl.get("10-Q", "AAPL", limit=1, download_details=True)
# all files are now deleted and only stored in memory

content = storage.get_file_contents()[0].content
print(f"{content[:50]}...")
```

    '<?xml version="1.0" ?><!--XBRL Document Created wi...'
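
The "temporary folder, read into memory, delete" idea and the effect of the glob pattern can be sketched with the standard library alone. This is an illustration of the general technique, not the implementation of `DownloadStorage`. The helper `read_matching_files`, the directory layout, and the file contents are placeholders.

``` python
import tempfile
from pathlib import Path
from typing import Dict


def read_matching_files(root: Path, pattern: str) -> Dict[str, str]:
    """Read every file under root that matches the glob pattern into memory."""
    return {
        p.relative_to(root).as_posix(): p.read_text()
        for p in sorted(root.glob(pattern))
        if p.is_file()
    }


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    # Placeholder files standing in for a downloaded filing.
    filing_dir = root / "sec-edgar-filings" / "AAPL" / "10-Q" / "0000000000-00-000000"
    filing_dir.mkdir(parents=True)
    (filing_dir / "primary-document.html").write_text("<html>placeholder</html>")
    (filing_dir / "full-submission.txt").write_text("<SEC-DOCUMENT>placeholder")
    in_memory = read_matching_files(root, "**/*.htm*")
# The temporary folder is gone; only the matching file remains in memory.

print(in_memory)
```

Only the `.html` file is kept, because `full-submission.txt` does not match `**/*.htm*`.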

Downloading multiple documents:

``` python
storage = DownloadStorage()
with storage as path:
    dl = SecEdgarDownloader("MyCompanyName", "email@example.com", path)
    dl.get("10-K", "GOOG", limit=2)
# all files are now deleted and only stored in memory

for path, content in storage.get_file_contents():
    print(f"Path: {path}\nContent [len={len(content)}]: {content[:30]}...\n")
```

    ('Path: sec-edgar-filings/GOOG/10-K/0001652044-22-000019/full-submission.txt\n'
     'Content [len=15044932]: <SEC-DOCUMENT>0001652044-22-00...\n')
    ('Path: sec-edgar-filings/GOOG/10-K/0001652044-23-000016/full-submission.txt\n'
     'Content [len=15264470]: <SEC-DOCUMENT>0001652044-23-00...\n')

# Contributing

Follow these steps to install the project locally for development:

1.  Install the project with the command `pip install -e ".[dev]"`.

> **Note** We highly recommend using virtual environments for Python
> development. If you’d like to use virtual environments, follow these
> steps instead:
>
> - Create a virtual environment: `python3 -m venv .venv`
> - Activate the virtual environment: `source .venv/bin/activate`
> - Install the project: `pip install -e ".[dev]"`
