Metadata-Version: 2.4
Name: nextcloud-esamaap
Version: 0.1.0
Summary: Read and upload files to/from Nextcloud directly in Jupyter notebooks without downloading them locally
Author-email: Joseph Melizza <joseph.melizza@serco.com>
License: MIT
Keywords: nextcloud,jupyter,webdav,cloud storage,remote files,geospatial
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: Intended Audience :: Science/Research
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.8
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Classifier: Topic :: Scientific/Engineering
Classifier: Topic :: System :: Filesystems
Requires-Python: >=3.8
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: requests>=2.28.0
Requires-Dist: pandas>=1.5.0
Requires-Dist: numpy>=1.23.0
Requires-Dist: openpyxl>=3.0.0
Requires-Dist: pillow>=9.0.0
Provides-Extra: lidar
Requires-Dist: laspy[laszip]>=2.4.0; extra == "lidar"
Provides-Extra: geo
Requires-Dist: geopandas>=0.12.0; extra == "geo"
Requires-Dist: fiona>=1.9.0; extra == "geo"
Requires-Dist: shapely>=2.0.0; extra == "geo"
Provides-Extra: shapefile
Requires-Dist: simpledbf>=0.2.6; extra == "shapefile"
Provides-Extra: hdf5
Requires-Dist: h5py>=3.7.0; extra == "hdf5"
Provides-Extra: advanced
Requires-Dist: pyarrow>=10.0.0; extra == "advanced"
Requires-Dist: tables>=3.7.0; extra == "advanced"
Requires-Dist: netcdf4>=1.6.0; extra == "advanced"
Requires-Dist: xarray>=2022.12.0; extra == "advanced"
Requires-Dist: zarr>=2.13.0; extra == "advanced"
Requires-Dist: pyyaml>=6.0; extra == "advanced"
Provides-Extra: jupyter
Requires-Dist: jupyter>=1.0.0; extra == "jupyter"
Requires-Dist: matplotlib>=3.5.0; extra == "jupyter"
Provides-Extra: all
Requires-Dist: laspy[laszip]>=2.4.0; extra == "all"
Requires-Dist: geopandas>=0.12.0; extra == "all"
Requires-Dist: fiona>=1.9.0; extra == "all"
Requires-Dist: shapely>=2.0.0; extra == "all"
Requires-Dist: simpledbf>=0.2.6; extra == "all"
Requires-Dist: h5py>=3.7.0; extra == "all"
Requires-Dist: pyarrow>=10.0.0; extra == "all"
Requires-Dist: tables>=3.7.0; extra == "all"
Requires-Dist: netcdf4>=1.6.0; extra == "all"
Requires-Dist: xarray>=2022.12.0; extra == "all"
Requires-Dist: zarr>=2.13.0; extra == "all"
Requires-Dist: pyyaml>=6.0; extra == "all"
Requires-Dist: jupyter>=1.0.0; extra == "all"
Requires-Dist: matplotlib>=3.5.0; extra == "all"
Dynamic: license-file

**Read and upload files to/from Nextcloud directly in Jupyter notebooks without downloading them locally.**

This package connects Nextcloud to Jupyter notebooks so that you can work with cloud-stored files as if they were local. It supports 25+ file formats, including LiDAR point clouds, geospatial data, and scientific datasets.

## 🌟 Features

- ✅ **Read files remotely** - No local downloads needed
- ✅ **Download files/folders** - Save to local disk when needed
- ✅ **Upload files/folders** - Via WebDAV protocol
- ✅ **Move & rename** - Organize files on Nextcloud
- ✅ **25+ file formats** - From CSV to LiDAR point clouds
- ✅ **Auto-detection** - Automatically detect and read any supported format
- ✅ **Memory efficient** - Stream files directly into memory
- ✅ **Jupyter optimized** - Perfect for notebooks and data science workflows

## 📦 Installation

### Basic installation
```bash
pip install nextcloud-esamaap
```

### With optional dependencies

Quote the requirement so that shells such as zsh do not interpret the square brackets.

For **LiDAR/Point Cloud** support (.las, .laz):
```bash
pip install "nextcloud-esamaap[lidar]"
```

For **Geospatial** support (.shp, .geojson, .kml, .kmz):
```bash
pip install "nextcloud-esamaap[geo]"
```

For **HDF5** support (.hdf5, .h5, .hd5):
```bash
pip install "nextcloud-esamaap[hdf5]"
```

For **Advanced formats** (Parquet, NetCDF, Zarr, etc.):
```bash
pip install "nextcloud-esamaap[advanced]"
```

For **Jupyter** with visualization:
```bash
pip install "nextcloud-esamaap[jupyter]"
```

For **everything**:
```bash
pip install "nextcloud-esamaap[all]"
```

For **DBF attribute table** support (.dbf), install the `shapefile` extra in the same way.

## 🚀 Quick Start

### 1. Generate Nextcloud App Password

1. Log in to your Nextcloud web interface
2. Go to **Settings** → **Security**
3. Under "Devices & sessions", create a new app password
4. Copy the generated password

### 2. Read Files from Nextcloud

```python
from nextcloud_jupyter import NextcloudReader

# Initialize reader
nc = NextcloudReader(
    nextcloud_url="https://your-nextcloud-server.com",
    username="your_username",
    app_password="your_app_password"
)

# Read a CSV file directly from Nextcloud
df = nc.read_csv('/inputs/data.csv')
print(df.head())
```
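Reading without a local download comes down to fetching a file's bytes over WebDAV and parsing them from an in-memory buffer. The sketch below illustrates that idea only. It assumes Nextcloud's standard `/remote.php/dav/files/<username>/` WebDAV endpoint, and the helper names `build_webdav_url` and `parse_csv_bytes` are hypothetical, not part of this package's API. No network request is made; the sample bytes stand in for a response body.

```python
import csv
import io
from urllib.parse import quote


def build_webdav_url(nextcloud_url: str, username: str, remote_path: str) -> str:
    """Build the WebDAV URL for a file in a user's Nextcloud storage."""
    base = nextcloud_url.rstrip("/")
    # Percent-encode user and path so spaces and non-ASCII characters survive.
    user = quote(username, safe="")
    path = quote(remote_path.lstrip("/"), safe="/")
    return f"{base}/remote.php/dav/files/{user}/{path}"


def parse_csv_bytes(payload: bytes) -> list:
    """Parse CSV bytes held in memory, without writing to the local disk."""
    text = io.StringIO(payload.decode("utf-8"), newline="")
    return list(csv.DictReader(text))


url = build_webdav_url(
    "https://your-nextcloud-server.com/", "your_username", "/inputs/my data.csv"
)
rows = parse_csv_bytes(b"id,value\n1,3.5\n2,4.0\n")
```

In a real client the bytes would come from an authenticated HTTP `GET` on `url`, using the username and app password as credentials.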

### 3. Upload Files to Nextcloud

```python
from nextcloud_jupyter import NextcloudUploader

# Initialize uploader
uploader = NextcloudUploader(
    nextcloud_url="https://your-nextcloud-server.com",
    username="your_username",
    app_password="your_app_password"
)

# Upload a file
uploader.upload_file("local_data.csv", "/inputs/data.csv")

# Upload entire folder
uploader.upload_folder("./my_folder", "/inputs")
```
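WebDAV itself does not create missing parent folders on upload: under RFC 4918, a `PUT` into a collection that does not exist fails with `409 Conflict`, so a client typically issues `MKCOL` for each missing ancestor first. The sketch below computes that list of ancestors. `parent_collections` is a hypothetical helper, and whether `upload_file` performs this step for you is not stated in this README.

```python
from pathlib import PurePosixPath


def parent_collections(remote_path: str) -> list:
    """Return the ancestor folders of remote_path, outermost first."""
    parents = list(PurePosixPath("/" + remote_path.lstrip("/")).parents)
    # .parents runs innermost-first and ends with the root "/", which always exists.
    return [str(p) for p in reversed(parents) if str(p) != "/"]


print(parent_collections("/inputs/2024/raw/data.csv"))
# ['/inputs', '/inputs/2024', '/inputs/2024/raw']
```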

### 4. Download Files from Nextcloud

```python
# Download a single file
nc.download_file('/inputs/data.csv', './downloads/data.csv')

# Download entire folder
nc.download_folder('/inputs/project_data', './downloads')

# Download multiple files
files = ['/data/file1.csv', '/data/file2.csv', '/reports/summary.pdf']
results = nc.download_files(files, './downloads')
print(f"Downloaded: {len(results['successful'])} files")
```

## 📚 Supported File Types

### Data Formats
- **CSV** (`.csv`) - `read_csv()`
- **Excel** (`.xlsx`, `.xls`) - `read_excel()`
- **JSON** (`.json`) - `read_json()`
- **Parquet** (`.parquet`) - `read_parquet()`
- **Feather** (`.feather`) - `read_feather()`
- **HDF5** (`.hdf5`, `.h5`, `.hd5`) - `read_hdf5()`

### Arrays
- **NumPy** (`.npy`) - `read_numpy()`
- **Compressed NumPy** (`.npz`) - `read_npz()`

### LiDAR / Point Clouds
- **LAS** (`.las`) - `read_las()`
- **LAZ** (`.laz`) - `read_laz()` (compressed)

### Geospatial
- **Shapefile** (`.shp`) - `read_shapefile()`
- **DBF** (`.dbf`) - `read_dbf()` (shapefile attributes)
- **PRJ** (`.prj`) - `read_prj()` (projection info)
- **SHX** (`.shx`) - `read_shx()` (shapefile index)
- **GeoJSON** (`.geojson`) - `read_geojson()`
- **KML** (`.kml`) - `read_kml()` (Google Earth)
- **KMZ** (`.kmz`) - `read_kmz()` (compressed KML)
- **NetCDF** (`.nc`) - `read_netcdf()`

### Archives
- **ZIP** (`.zip`) - `read_zip()`
- **TAR** (`.tar`, `.tar.gz`, `.tgz`) - `read_tar()`

### Configuration
- **YAML** (`.yaml`, `.yml`) - `read_yaml()`
- **XML** (`.xml`) - `read_xml()`

### Other
- **Python Pickle** (`.pkl`, `.pickle`) - `read_pickle()`
- **Images** (`.png`, `.jpg`, `.jpeg`, `.tiff`, etc.) - `read_image()`
- **Text** (`.txt`) - `read_text_file()`

### Auto-Detection
```python
# Automatically detect file type and read
data = nc.read_auto('/inputs/unknown_file.csv')
```
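Auto-detection of this kind is usually a lookup from file extension to reader. The sketch below shows one way such a dispatch could work. It is an assumption about the approach, not the package's actual implementation: `READERS` is a hypothetical subset of the table, filled with method names from the lists above, and `detect_reader` is a hypothetical helper.

```python
from pathlib import PurePosixPath

# Hypothetical subset of an extension-to-reader table.
READERS = {
    ".csv": "read_csv",
    ".xlsx": "read_excel",
    ".xls": "read_excel",
    ".json": "read_json",
    ".parquet": "read_parquet",
    ".las": "read_las",
    ".laz": "read_laz",
    ".geojson": "read_geojson",
    ".tar": "read_tar",
    ".tar.gz": "read_tar",
    ".tgz": "read_tar",
    ".yaml": "read_yaml",
    ".yml": "read_yaml",
}


def detect_reader(remote_path: str) -> str:
    """Return the reader method name for a path, based on its extension(s)."""
    suffixes = [s.lower() for s in PurePosixPath(remote_path).suffixes]
    # Try the longest compound suffix first so ".tar.gz" matches as a whole.
    for start in range(len(suffixes)):
        candidate = "".join(suffixes[start:])
        if candidate in READERS:
            return READERS[candidate]
    raise ValueError(f"Unsupported file type: {remote_path}")
```

With a name in hand, the call can be dispatched with `getattr(nc, detect_reader(path))(path)`. Extension-based detection cannot identify files whose extension is missing or wrong, which is why the sketch raises instead of guessing.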

## 💡 Usage Examples

### Working with LiDAR Point Clouds

```python
import numpy as np

# Read LAS file
las_data = nc.read_las('/inputs/lidar/scan.las')
x, y, z = las_data.x, las_data.y, las_data.z
print(f"Points: {len(las_data.points):,}")

# Read LAZ (compressed) and stack the coordinates into an (N, 3) array
laz_data = nc.read_laz('/inputs/lidar/scan.laz')
points = np.vstack([laz_data.x, laz_data.y, laz_data.z]).T
```

### Working with Compressed NumPy Arrays

```python
# Read NPZ with multiple arrays
npz_data = nc.read_npz('/inputs/arrays/data.npz')
array1 = npz_data['array1']
array2 = npz_data['array2']
npz_data.close()
```

### Working with Geospatial Data

```python
# Read GeoJSON
gdf = nc.read_geojson('/inputs/maps/boundaries.geojson')
gdf.plot()

# Read KML
kml_data = nc.read_kml('/inputs/maps/locations.kml')

# Read shapefile components
df_attributes = nc.read_dbf('/inputs/gis/data.dbf')
projection = nc.read_prj('/inputs/gis/data.prj')
```

### Working with HDF5 Files

```python
# Read HDF5
with nc.read_hdf5('/inputs/data/measurements.hdf5') as h5f:
    dataset = h5f['temperature'][:]
    print(dataset.shape)
```

### Processing Multiple Files

```python
# List files in a folder
files = nc.list_folder('/inputs/data')

# Process all CSV files
for file in files:
    if file['name'].endswith('.csv') and not file['is_folder']:
        df = nc.read_csv(f"/inputs/data/{file['name']}")
        # Process dataframe...
```
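The filtering step in the loop above can be pulled into a small function that is easy to test without a server. This is a sketch: `csv_paths` is a hypothetical helper, and the sample entries assume only the `name` and `is_folder` keys used in the example above.

```python
import posixpath


def csv_paths(entries: list, folder: str) -> list:
    """Return full remote paths of the CSV files among folder listing entries."""
    return [
        posixpath.join(folder, entry["name"])
        for entry in entries
        # Skip folders, and match the extension case-insensitively.
        if not entry["is_folder"] and entry["name"].lower().endswith(".csv")
    ]


# Sample entries shaped like the listing used above (an assumption about its structure).
sample = [
    {"name": "a.csv", "is_folder": False},
    {"name": "B.CSV", "is_folder": False},
    {"name": "old.csv", "is_folder": True},
    {"name": "notes.txt", "is_folder": False},
]
print(csv_paths(sample, "/inputs/data"))
# ['/inputs/data/a.csv', '/inputs/data/B.CSV']
```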

### Check File Existence

```python
if nc.file_exists('/inputs/data.csv'):
    df = nc.read_csv('/inputs/data.csv')
else:
    print("File not found!")
```

### Download Files and Folders

```python
# Download single file
nc.download_file(
    remote_path='/inputs/data.csv',
    local_path='./downloads/data.csv'
)

# Download entire folder with all contents
nc.download_folder(
    remote_path='/inputs/project_data',
    local_path='./downloads'
)

# Download multiple files at once
files_to_download = [
    '/inputs/data1.csv',
    '/inputs/data2.csv',
    '/reports/summary.pdf'
]
results = nc.download_files(files_to_download, './downloads')

# Download with preserved folder structure
results = nc.download_files(
    remote_files=['/inputs/folder1/file.csv', '/reports/2024/data.pdf'],
    local_dir='./downloads',
    preserve_structure=True  # Keeps Nextcloud folder structure
)

# Download only specific file types (skipping folders)
files = nc.list_folder('/inputs/data')
csv_files = [
    f"/inputs/data/{f['name']}"
    for f in files
    if f['name'].endswith('.csv') and not f['is_folder']
]
nc.download_files(csv_files, './downloads/csv_only')
```
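To make the effect of `preserve_structure` concrete, the sketch below maps a remote path to a local target in both modes. The mapping is an assumption inferred from the comment "Keeps Nextcloud folder structure", not the package's documented behavior, and `local_target` is a hypothetical helper. It uses `PurePosixPath` so the result is the same on every operating system.

```python
from pathlib import PurePosixPath


def local_target(remote_path: str, local_dir: str, preserve_structure: bool = False) -> str:
    """Return where a remote file would be saved under local_dir."""
    remote = PurePosixPath(remote_path)
    if preserve_structure:
        # Drop the leading "/" so the remote folders nest under local_dir.
        relative = remote.relative_to("/") if remote.is_absolute() else remote
        return str(PurePosixPath(local_dir) / relative)
    # Flat mode keeps only the file name.
    return str(PurePosixPath(local_dir) / remote.name)


flat = local_target("/reports/2024/data.pdf", "downloads")
nested = local_target("/reports/2024/data.pdf", "downloads", preserve_structure=True)
```

Under this mapping, flat mode sends two remote files with the same name in different folders to the same local path, so preserving the structure is the safer choice when names may repeat.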

### Move and Rename Files/Folders

```python
# Move a file
nc.move_file('/inputs/data.csv', '/archive/data.csv')

# Move with overwrite
nc.move_file('/inputs/report.pdf', '/archive/report.pdf', overwrite=True)

# Rename a file
nc.rename_file('/inputs/old_name.csv', 'new_name.csv')

# Move a folder
nc.move_folder('/inputs/project1', '/archive/project1')

# Move multiple files
moves = [
    ('/inputs/old1.csv', '/archive/old1.csv'),
    ('/inputs/old2.csv', '/archive/old2.csv'),
]
results = nc.move_files(moves)
```
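At the protocol level, moving and renaming are the same WebDAV operation: a `MOVE` request whose `Destination` header holds the target URL and whose `Overwrite` header is `T` or `F` (RFC 4918). The sketch below builds those pieces. It assumes Nextcloud's standard `/remote.php/dav/files/<username>/` endpoint, and `rename_destination` and `move_headers` are hypothetical helpers, not this package's API.

```python
import posixpath
from urllib.parse import quote


def rename_destination(remote_path: str, new_name: str) -> str:
    """Destination path when only the file name changes."""
    return posixpath.join(posixpath.dirname(remote_path), new_name)


def move_headers(nextcloud_url: str, username: str, destination: str,
                 overwrite: bool = False) -> dict:
    """Headers for a WebDAV MOVE request (RFC 4918)."""
    base = nextcloud_url.rstrip("/")
    user = quote(username, safe="")
    path = quote(destination.lstrip("/"), safe="/")
    return {
        "Destination": f"{base}/remote.php/dav/files/{user}/{path}",
        # "F" makes the server reject the move if the destination already exists.
        "Overwrite": "T" if overwrite else "F",
    }


target = rename_destination("/inputs/old_name.csv", "new_name.csv")
headers = move_headers("https://your-nextcloud-server.com", "your_username", target)
```

With `Overwrite: F`, a server that follows RFC 4918 answers `412 Precondition Failed` when the destination already exists, instead of replacing it.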

## 🛡️ Security

- Never commit your app password to version control
- Use environment variables for credentials:

```python
import os
from nextcloud_jupyter import NextcloudReader

nc = NextcloudReader(
    nextcloud_url=os.getenv('NEXTCLOUD_URL'),
    username=os.getenv('NEXTCLOUD_USERNAME'),
    app_password=os.getenv('NEXTCLOUD_APP_PASSWORD')
)
```
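`os.getenv` returns `None` for an unset variable, so a missing setting may only surface later as a connection or authentication error. A small loader that fails fast makes the problem visible at startup. This is a sketch: `load_credentials` is a hypothetical helper, and the keys it returns mirror the constructor arguments used above.

```python
import os

REQUIRED_VARS = ("NEXTCLOUD_URL", "NEXTCLOUD_USERNAME", "NEXTCLOUD_APP_PASSWORD")


def load_credentials(env=None) -> dict:
    """Read connection settings from the environment, failing fast if any is missing."""
    env = os.environ if env is None else env
    # Treat unset and empty variables the same way.
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {
        "nextcloud_url": env["NEXTCLOUD_URL"],
        "username": env["NEXTCLOUD_USERNAME"],
        "app_password": env["NEXTCLOUD_APP_PASSWORD"],
    }
```

The result can be unpacked straight into the constructor: `nc = NextcloudReader(**load_credentials())`.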
