Metadata-Version: 2.4
Name: geopops
Version: 0.1.2
Summary: GeoPops
Requires-Python: >=3.11
Description-Content-Type: text/markdown
Requires-Dist: curl-cffi>=0.13.0
Requires-Dist: geopandas
Requires-Dist: numpy<2.3
Requires-Dist: pandas>=2.3.2
Requires-Dist: requests>=2.32.5
Requires-Dist: shapely
Requires-Dist: starsim>=3.0.3
Requires-Dist: urllib3>=2.5.0

# GeoPops
**Full documentation and tutorials coming soon!**
GeoPops is in development, and we welcome feedback. Please log any issues.

**GeoPops** is a package for generating geographically and demographically realistic synthetic populations for any US Census location using publicly available data. Population generation includes three steps:
1. Generate individuals within households using combinatorial optimization (CO)
2. Assign individuals to schools and workplace locations using enrollment data and commute flows
3. Connect individuals within locations using graph algorithms

Resulting files include a list of agents with attributes (e.g., age, gender, income) and networks detailing their connections within home, school, workplace, and group quarters (e.g., correctional facilities, nursing homes) locations. GeoPops is meant to produce reasonable approximations of state and county population characteristics with granularity down to the Census Block Group (CBG).

GeoPops builds on a previous package, [GREASYPOP-CO](https://github.com/CDDEP-DC/GREASYPOP-CO/tree/main), and incorporates the following changes:
- All code is wrapped in a convenient Python package that can be installed with pip
- Compatibility with Census data beyond 2019 (still in development)
- Automated data downloading
- Users can adjust all config parameters from the front-end
- Class for exporting files compatible with the agent-based modeling software [Starsim](https://starsim.org/)

## How to use
First, create a Julia environment with the dependencies listed below. It may be easiest to store the environment in the same folder you will use for output files. Although you call them from Python, the combinatorial optimization, school and workplace assignment, and network generation steps run in Julia to reduce run time. Try running the following in the terminal.
```
cd "YOUR_PATH"
curl -fsSL https://install.julialang.org | sh
juliaup add 1.9.0        # Install Julia 1.9.0
juliaup default 1.9.0    # Make 1.9.0 the default (optional)
julia +1.9.0 --version   # Run that version once
juliaup update           # Update installed versions
julia                    # Launch the Julia REPL (shows the version); enter the remaining lines at its prompt
Base.active_project()    # Get the path of the active environment. Copy this; you will need it later
]                        # Enter package mode. Prompt changes to "(@v1.9) pkg>"
add CSV@0.10.10          # Add required package versions
add DataFrames@1.5.0
add Graphs@1.8.0 
add InlineStrings@1.4.0 
add JSON@0.21.4
add StatsBase@0.33.21
add Distributions@0.25.112
add MatrixMarket@0.4.0
add ProportionalFitting@0.3.0
status                   # View list of packages
```
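
Before moving on, it can help to confirm from Python that the `julia` launcher is reachable, since GeoPops calls Julia from Python. The following is a minimal sketch using only the standard library; the helper name `julia_version` is hypothetical and not part of GeoPops.

```python
import shutil
import subprocess


def julia_version(executable="julia"):
    """Return the output of `<executable> --version`, or None if it is not on PATH.

    Hypothetical helper for illustration; not part of the GeoPops API.
    """
    path = shutil.which(executable)
    if path is None:
        return None
    # Ask the launcher for its version string
    result = subprocess.run([path, "--version"], capture_output=True, text=True)
    return result.stdout.strip()


if __name__ == "__main__":
    version = julia_version()
    print(version or "julia not found on PATH; check the juliaup installation")
```

If this reports that Julia is missing, opening a new terminal after the juliaup install is often enough for the PATH change to take effect.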


You'll also need a Python environment with the dependencies listed in the GeoPops `pyproject.toml`. Install GeoPops from [PyPI](https://pypi.org/project/geopops/).
```
pip install geopops
```

Next, obtain a Census API key [here](https://api.census.gov/data/key_signup.html), which will be used for pulling Census data. 
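
To keep the key out of scripts and notebooks, one option is to read it from an environment variable. This is a minimal sketch, assuming you have exported the key under the name `CENSUS_API_KEY` (an arbitrary choice for this example, not something GeoPops requires); `load_census_key` is a hypothetical helper.

```python
import os


def load_census_key(env_var="CENSUS_API_KEY"):
    """Read the Census API key from an environment variable.

    The variable name is an assumption made for this example.
    """
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"Set the {env_var} environment variable to your Census API key")
    return key
```

The returned string can then be supplied as the `census_api_key` parameter.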

Now, in a Python script or notebook, create a dictionary of parameters. Default parameters are stored in a package file called `config.json`. Pass your dictionary into `WriteConfig()` to overwrite `config.json` with the parameters for your population of interest. Here's an example for Howard County, MD.
```
import geopops

pars_geopops = {'path': 'YOUR_OUTPUT_DIR', # designate folder for output files
                'census_api_key': "YOUR_CENSUS_API_KEY", 
                'julia_env_path': "YOUR_JULIA_ENV_PATH",
                'main_year': 2019, # year of data
                'geos': ["24027"], # state or county fips code of main geographical area
                'commute_states': ["24"], # fips of commute states to use
                'use_pums': ["24"]} # Same as commute_states

c = geopops.WriteConfig(**pars_geopops) # Overwrite config.json with your parameters
c.get_pars() # View config.json as dictionary
```
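
FIPS codes are easy to get subtly wrong, for example by passing an integer and losing a leading zero. The sketch below is a small format check you could run on the dictionary before writing the config. It assumes, based on the example above, that `geos` holds 2-digit state or 5-digit county codes and that `commute_states` and `use_pums` hold 2-digit state codes, all as strings; `check_fips` is a hypothetical helper, not part of GeoPops.

```python
def check_fips(pars):
    """Check FIPS formatting in a GeoPops-style parameter dictionary.

    Hypothetical helper: returns a list of problems (empty if none found).
    """
    problems = []
    for code in pars.get("geos", []):
        # State FIPS codes have 2 digits; county codes have 5 (state + 3-digit county)
        if not (isinstance(code, str) and code.isdigit() and len(code) in (2, 5)):
            problems.append(f"geos: {code!r} is not a 2- or 5-digit FIPS string")
    for name in ("commute_states", "use_pums"):
        for code in pars.get(name, []):
            if not (isinstance(code, str) and code.isdigit() and len(code) == 2):
                problems.append(f"{name}: {code!r} is not a 2-digit state FIPS string")
    return problems


example = {"geos": ["24027"], "commute_states": ["24"], "use_pums": ["24"]}
print(check_fips(example))  # prints []
```

This only checks formatting; it does not verify that a code corresponds to a real state or county.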
The commands below will create your population and store files in the output directory defined above. Downloaded raw data files are stored in the subfolders `census`, `geo`, `pums`, `school`, and `work`. Files created in the preprocessing step are stored in the subfolder `processed`. The population in jlse format is stored in the subfolder `jlse`. `Export()` outputs CSV versions into the subfolder `pop_export`. `ForStarsim()` outputs files formatted for use with Starsim into the subfolder `pop_export/starsim`.
```
geopops.DownloadData()          # Download all Census and other data sources
geopops.ProcessData()           # Preprocessing for next steps
j = geopops.RunJulia()
j.run_all()                     # Run Julia scripts (much faster than Python). Can also run separately
# j.CO()                        # Combinatorial optimization. Output in jlse folder                    
# j.SynthPop()                  # School/workplace assignment and network generation
# j.Export()                    # Export to csv format
```
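
After the pipeline finishes, a quick way to see which stages produced output is to check for the subfolders described above. This is a minimal sketch, assuming every step (including the export) has been run and that the subfolders sit directly under your output directory; `missing_subfolders` is a hypothetical helper, not part of GeoPops.

```python
from pathlib import Path

# Subfolder names taken from the description above
EXPECTED_SUBFOLDERS = ["census", "geo", "pums", "school", "work",
                       "processed", "jlse", "pop_export"]


def missing_subfolders(output_dir):
    """Return the expected subfolders that do not exist under output_dir."""
    root = Path(output_dir)
    return [name for name in EXPECTED_SUBFOLDERS if not (root / name).is_dir()]
```

Calling `missing_subfolders('YOUR_OUTPUT_DIR')` returns an empty list when all expected subfolders are present; any names it returns point to a step that may not have completed.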
The `ForStarsim` class has nested classes that can be passed into a Starsim simulation to run a model on your GeoPops population.
```
geopops.ForStarsim.People()             # Creates a Starsim People object
geopops.ForStarsim.GPNetwork()          # Creates a Starsim Network object
geopops.ForStarsim.SubgroupTracking()   # Creates a Starsim Analyzer object for demographic or geographic subgroup tracking
```
## Tutorials
See tutorials/MIDAS for more detailed usage as well as a Notebook tutorial.

