Metadata-Version: 1.1
Name: congress-crawler
Version: 0.0.2
Summary: Gathers data on the U.S. Congress.
Home-page: https://github.com/Hear-Ye/congress
Author: Andrew Chen Wang
Author-email: acwangpython@gmail.com
License: CC0
Description: # Hear-Ye/congress
        
        This repository is a port from 
        [unitedstates/congress](https://github.com/unitedstates/congress). We required a 
        versioning policy which the `unitedstates` organization does not provide, so we made 
        this repository. Most changes here are still coming from the main repository,
        and any changes we make here will most likely end up in the main repository
        as well via PRs.
        
        Additionally, we needed to store older data in bulk for our open source developers,
        so we decided to utilize GitHub Actions and its cron job. We update our releases
        of bulk data on a daily cycle. This can be found at:
        [Hear-Ye/congress-data](https://github.com/Hear-Ye/congress-data)
        
        ## New Installation
        
        Run `pip install congress-crawler`
        
        ### Notes!
        
        After reviewing much of this repository, we found that a lot of code is missing or out of date.
        Be careful before using either this port or the main repository. Additionally, we want
        to maintain the same license as the main repository in the spirit of open source and
        being public domain. We firmly believe tools are the main gears of our society, and thus
        this tool in particular should remain free.
        
        The following is the documentation (mostly intact with more features and docs from our 
        repository) from [unitedstates/congress](https://github.com/unitedstates/congress).
        
        ---
        
        ## unitedstates/congress
        
        Public domain code that collects data about the bills, amendments, roll call votes, and other core data about the U.S. Congress.
        
        Includes:
        
        * A data importing script for the [official bulk bill status data](https://github.com/usgpo/bill-status) from Congress, the official source of information on the life and times of legislation.
        
        * Scrapers for House and Senate roll call votes.
        
        * A document fetcher for GovInfo.gov, which holds bill text, bill status, and other official documents.
        
        * A defunct THOMAS scraper for presidential nominations in Congress.
        
        Read about the contents and schema in the [documentation](https://github.com/unitedstates/congress/wiki) in the GitHub project wiki.
        
        For background on how this repository came to be, see [Eric's blog post](https://sunlightfoundation.com/blog/2013/08/20/a-modern-approach-to-open-data/).
        
        ### Setting Up
        
        This project supports Python 3.6+.
        
        **System dependencies**
        
        On Ubuntu, you'll need `wget`, `pip`, and some support packages:
        
        ```bash
        sudo apt-get install git python3-dev libxml2-dev libxslt1-dev libz-dev python3-pip python3-venv
        ```
        
        On OS X, you'll need developer tools installed ([Xcode](https://developer.apple.com/xcode/)), and `wget`.
        
        ```bash
        brew install wget
        ```
        
        **Python dependencies**
        
        It's recommended you use a `virtualenv` (virtual environment) for development. Create a virtualenv for this project:
        
        ```bash
        python3 -m venv congress
        source congress/bin/activate
        ```
        
        Finally, with your virtual environment activated, install Python packages:
        
        ```bash
        pip3 install -r requirements.txt
        ```
        
        ### Collecting the data
        
        The general form to start the scraping process is:
        
            ./run <data-type> [--force] [other options]
        
        where data-type is one of:
        
        * `bills` (see [Bills](https://github.com/unitedstates/congress/wiki/bills) and [Amendments](https://github.com/unitedstates/congress/wiki/amendments))
        * `votes` (see [Votes](https://github.com/unitedstates/congress/wiki/votes))
        * `nominations` (see [Nominations](https://github.com/unitedstates/congress/wiki/nominations))
        * `committee_meetings` (see [Committee Meetings](https://github.com/unitedstates/congress/wiki/committee-meetings))
        * `govinfo` (see [Bill Text](https://github.com/unitedstates/congress/wiki/bill-text))
        * `statutes` (see [Bills](https://github.com/unitedstates/congress/wiki/bills) and [Bill Text](https://github.com/unitedstates/congress/wiki/bill-text))
        
        To get data for bills, resolutions, and amendments, run:
        
        ```bash
        ./run govinfo --bulkdata=BILLSTATUS
        ./run bills
        ```
        
        The bills script will output bulk data into a top-level `data` directory, organized by Congress number, bill type, and bill number. Two data output files will be generated for each bill: a JSON version (`data.json`) and an XML version (`data.xml`).
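        
        As a rough illustration of consuming this output, the sketch below walks a `data` directory and loads every `data.json` it finds. It searches recursively rather than assuming an exact directory layout, and the `iter_bill_records` name is a hypothetical helper that is not part of this project.
        
        ```python
        import json
        from pathlib import Path
        
        
        def iter_bill_records(data_dir="data"):
            """Yield (path, parsed JSON) for every data.json under data_dir.
        
            Hypothetical helper: it searches recursively instead of assuming
            the exact directory layout produced by the scraper.
            """
            for path in sorted(Path(data_dir).rglob("data.json")):
                with path.open(encoding="utf-8") as handle:
                    yield path, json.load(handle)
        ```
        
        Calling `list(iter_bill_records())` from the repository root after a scrape would then give one entry per generated JSON file. No schema is assumed here; see the wiki for the actual fields.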
        
        ### Common options
        
        Debugging messages are hidden by default. To include them, run with `--log=info` or `--debug`. To hide even warnings, run with `--log=error`.
        
        To receive errors by email, copy `config.yml.example` to `config.yml` and fill in the SMTP options. The script will automatically use these details when a parsing or execution error occurs.
        
        The `--force` flag applies to all data types and suppresses use of the cache for network-retrieved resources.
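        
        The options above can be combined on one command line. The sketch below assembles such an invocation from Python; `build_run_command` and `run_scraper` are hypothetical helpers, and only the flags described in this section (`--force`, `--log`, `--debug`) are used.
        
        ```python
        import subprocess
        
        
        def build_run_command(data_type, force=False, log_level=None, debug=False):
            """Return the argument list for ./run (hypothetical helper)."""
            command = ["./run", data_type]
            if force:
                command.append("--force")  # skip the cache for network resources
            if log_level is not None:
                command.append(f"--log={log_level}")
            if debug:
                command.append("--debug")
            return command
        
        
        def run_scraper(data_type, **options):
            """Run ./run from the repository root; raise on a non-zero exit status."""
            return subprocess.run(build_run_command(data_type, **options), check=True)
        ```
        
        For instance, `build_run_command("bills", force=True, log_level="info")` produces the arguments for `./run bills --force --log=info`.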
        
        ### Data Output
        
        The script will cache downloaded pages in a top-level `cache` directory, and output bulk data in a top-level `data` directory.
        
        Two bulk data output files will be generated for each object: a JSON version (`data.json`) and an XML version (`data.xml`). The XML version attempts to maintain backwards compatibility with the XML bulk data that [GovTrack.us](https://www.govtrack.us) has provided for years. Add the `--govtrack` flag to get fully backward-compatible output using GovTrack IDs (otherwise the source IDs for legislators are used).
        
        See the [project wiki](https://github.com/unitedstates/congress/wiki) for documentation on the output format.
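        
        As a minimal sketch of reading both output files for a single object, the snippet below loads `data.json` and parses `data.xml` from one output directory using only the standard library. The `load_object_outputs` name is hypothetical, and nothing is assumed about the contents of either file.
        
        ```python
        import json
        import xml.etree.ElementTree as ET
        from pathlib import Path
        
        
        def load_object_outputs(object_dir):
            """Load data.json and data.xml from one output directory.
        
            Hypothetical helper; returns (parsed JSON or None, XML root or None),
            using None for a file that is not present.
            """
            object_dir = Path(object_dir)
            json_path = object_dir / "data.json"
            xml_path = object_dir / "data.xml"
        
            record = None
            if json_path.is_file():
                with json_path.open(encoding="utf-8") as handle:
                    record = json.load(handle)
        
            root = None
            if xml_path.is_file():
                root = ET.parse(str(xml_path)).getroot()
        
            return record, root
        ```
        
        Returning `None` for a missing file keeps the helper usable on directories where only one of the two formats has been generated.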
        
        ### Contributing
        
        Pull requests with patches are awesome. Unit tests are strongly encouraged ([example tests](https://github.com/unitedstates/congress/blob/master/test/test_bill_actions.py)).
        
        The best way to file a bug is to [open a ticket](https://github.com/unitedstates/congress/issues).
        
        
        ### Running tests
        
        To run this project's unit tests:
        
        ```bash
        ./test/run
        ```
        
        ### Who's Using This Data
        
        The [Sunlight Foundation](https://sunlightfoundation.com) and [GovTrack.us](https://www.govtrack.us) are the two principal maintainers of this project.
        
        Both Sunlight and GovTrack operate APIs where you can get much of this data delivered over HTTP:
        
        * [GovTrack.us API](https://www.govtrack.us/developers/api)
        * [Sunlight Congress API](https://sunlightlabs.github.io/congress/)
        
        ## Public domain
        
        This project is [dedicated to the public domain](LICENSE). As spelled out in [CONTRIBUTING](CONTRIBUTING.md):
        
        > The project is in the public domain within the United States, and copyright and related rights in the work worldwide are waived through the [CC0 1.0 Universal public domain dedication](https://creativecommons.org/publicdomain/zero/1.0/).
        
        > All contributions to this project will be released under the CC0 dedication. By submitting a pull request, you are agreeing to comply with this waiver of copyright interest.
        
        [![Build Status](https://travis-ci.org/unitedstates/congress.svg?branch=master)](https://travis-ci.org/unitedstates/congress)
        
Platform: UNKNOWN
Classifier: Development Status :: 5 - Production/Stable
Classifier: Framework :: Django
Classifier: Intended Audience :: Developers
Classifier: License :: CC0 1.0 Universal (CC0 1.0) Public Domain Dedication
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.6
Classifier: Programming Language :: Python :: 3.7
Classifier: Programming Language :: Python :: 3.8
Classifier: Programming Language :: Python :: 3.9
Classifier: Topic :: Internet :: WWW/HTTP
