Metadata-Version: 2.4
Name: fetch-houston2013
Version: 0.5.1
Summary: Download and load Houston 2013 Dataset
Project-URL: Documentation, https://github.com/songyz2023/fetch-houston2013#readme
Project-URL: Issues, https://github.com/songyz2023/fetch-houston2013/issues
Project-URL: Source, https://github.com/songyz2023/fetch-houston2013
Author-email: songyz2023 <songyz2023dlut@outlook.com>
License-Expression: Apache-2.0
License-File: LICENSE
Classifier: Development Status :: 4 - Beta
Classifier: Programming Language :: Python
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Programming Language :: Python :: Implementation :: CPython
Classifier: Programming Language :: Python :: Implementation :: PyPy
Requires-Python: >=3.10
Requires-Dist: jaxtyping>=0.3.0
Requires-Dist: numpy>=1.25.0
Requires-Dist: rasterio>=1.2.0
Requires-Dist: scipy>=1.12.0
Description-Content-Type: text/markdown

# fetch houston2013 muufl and trento

[![PyPI - Version](https://img.shields.io/pypi/v/fetch-houston2013.svg)](https://pypi.org/project/fetch-houston2013)
[![PyPI - Python Version](https://img.shields.io/pypi/pyversions/fetch-houston2013.svg)](https://pypi.org/project/fetch-houston2013)

Download and load Houston 2013 Dataset, Trento dataset and Muufl dataset easily and swiftly.

- Automatically download and cache all needed files
- Verify checksums to avoid data poisoning
- Use sparse matrix to representing ground truth, less memory usage and easier iteration

![screenshot](screenshot.jpg)

## Usage
1. install this package
```bash
pip install fetch-houston2013
```
2. import and get the dataset
```python
from fetch_houston2013 import fetch_houston2013, fetch_muufl, fetch_trento, split_spmatrix
# For Houston 2013
hsi, dsm, train_label, test_label, info = fetch_houston2013()
# For Muufl
casi, lidar, truth, info = fetch_muufl()
train_label, test_label = split_spmatrix(truth, 20)
# For Trento
casi, lidar, truth, info = fetch_trento()
train_label, test_label = split_spmatrix(truth, 20)
```
3. tips: train_label and test_label are [sparse matrix](https://docs.scipy.org/doc/scipy/reference/generated/scipy.sparse.coo_array.html), you can either convert them to np.array easily by
```python
train_label=train_label.todense()
test_label =test_label.todense()
```
or directly use them for getting the value in a very fast way:
```python
    def __getitem__(self, index):
      i = self.truth.row[index]
      j = self.truth.col[index]
      label = self.truth.data[index].item()
      x_hsi = self.hsi[:, i, j]
      x_dsm = self.dsm[:, i, j]
      return x_hsi, x_dsm, label
```

## Help
1. Remove `~/scikit_learn_data` to clean cache and try again.  
2. We download dataset from official website and pastbin.com. Make sure you can access these websites.

## Build and Publish
```bash
# install uv
uv build
bash load_secrets.bash
uv publish
```

## Contribution
We welcome all contributions, including issues, pull requests, feature requests and discussions.

## Credits
```text
Houston2013 dataset: https://machinelearning.ee.uh.edu/?page_id=459
paperswithcode: https://paperswithcode.com/dataset/houston
Muufl dataset: https://github.com/GatorSense/MUUFLGulfport
Dafault url of Trento dataset is https://github.com/tyust-dayu/Trento/tree/b4afc449ce5d6936ddc04fe267d86f9f35536afd
The 2013_IEEE_GRSS_DF_Contest_Samples_VA.txt in this repo is exported from original 2013_IEEE_GRSS_DF_Contest_Samples_VA.roi.
Note: If this data is used in any publication or presentation the following reference must be cited:
P. Gader, A. Zare, R. Close, J. Aitken, G. Tuell, “MUUFL Gulfport Hyperspectral and LiDAR Airborne Data Set,” University of Florida, Gainesville, FL, Tech. Rep. REP-2013-570, Oct. 2013.
If the scene labels are used in any publication or presentation, the following reference must be cited:
X. Du and A. Zare, “Technical Report: Scene Label Ground Truth Map for MUUFL Gulfport Data Set,” University of Florida, Gainesville, FL, Tech. Rep. 20170417, Apr. 2017. Available: http://ufdc.ufl.edu/IR00009711/00001.
If any of this scoring or detection code is used in any publication or presentation, the following reference must be cited:
T. Glenn, A. Zare, P. Gader, D. Dranishnikov. (2016). Bullwinkle: Scoring Code for Sub-pixel Targets (Version 1.0) [Software]. Available from https://github.com/GatorSense/MUUFLGulfport/.
```

## License
```text
Copyright 2025 songyz2023

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
```