Metadata-Version: 2.0
Name: gridfs-fuse
Version: 0.1.2.dev1
Summary: A FUSE wrapper around MongoDB gridfs using python and llfuse.
Home-page: https://github.com/axiros/py_gridfs_fuse
Author: UNKNOWN
Author-email: UNKNOWN
License: BSD License
Description-Content-Type: UNKNOWN
Platform: UNKNOWN
Requires-Dist: llfuse
Requires-Dist: pymongo

# python gridfs fuse
A FUSE wrapper around MongoDB gridfs using python and llfuse.

## Usage

```bash
gridfs_fuse --mongodb-uri="mongodb://127.0.0.1:27017" --database="gridfs_fuse" --mount-point="/mnt/gridfs_fuse" # --options=allow_other
```

### fstab example
```
mongodb://127.0.0.1:27017/gridfs_fuse.fs  /mnt/gridfs_fuse  gridfs  defaults,allow_other  0  0 
```
Note: this assumes that the `mount.gridfs` program (or `mount_gridfs` on Mac OS X) is symlinked
into `/sbin/`, e.g.
```bash
sudo ln -s "$(which mount.gridfs)" /sbin/
```

## Requirements
 * pymongo
 * llfuse

## Install
Ubuntu 16.04:
```bash
sudo apt-get install libfuse python-llfuse
sudo -H pip install gridfs-fuse
```

MacOSX:
```bash
brew install osxfuse
sudo -H pip install gridfs-fuse
```


## Operations supported
 * create/list/delete directories => folder support.
 * read files.
 * delete files.
 * open and write once (like HDFS).
 * rename


## Operations not supported
 * modify an existing file.
 * resize an existing file.
 * hardlink
 * symlink
 * statfs
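
Because an existing file cannot be modified or resized, a client has to write each file in a single pass and should avoid reopening an existing path for writing. The following is a minimal Python sketch of that pattern. `write_once` is a hypothetical helper and not part of gridfs-fuse, and it assumes the mounted filesystem honors exclusive creation (`O_EXCL`), which has not been verified here.

```python
import os


def write_once(path, data):
    """Create `path` and write `data` in one pass; fail if it already exists."""
    # O_EXCL makes the open fail with FileExistsError instead of
    # reopening (and thereby modifying) an existing file.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o644)
    with os.fdopen(fd, "wb") as handle:
        handle.write(data)
```

Calling `write_once` a second time with the same path raises `FileExistsError`, so an accidental overwrite shows up in the client rather than as an unsupported operation on the mount.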


## Performance
### Setup
* AWS d2.xlarge machine.
  * 4 vCPUs @ 2.40 GHz (E5-2676)
  * 30 gigabytes of RAM
* filesystem: ext4
* block device: three instance storage disks combined with lvm.
```
lvcreate -L 3T -n mongo -i 3 -I 4096 ax /dev/xvdb /dev/xvdc /dev/xvdd
```
* mongodb 3.0.1
* mongodb storage engine WiredTiger
* mongodb compression: snappy
* mongodb cache size: 10 gigabyte

### Results
* sequential write performance: ~46 MB/s
* sequential read performance: ~90 MB/s

Write performance was tested by copying 124 files, each having a size of 9 gigabytes and different content.
The compression factor was about three.
Files were copied one by one => no parallel execution.
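
To put the write figures in perspective, the sketch below estimates the total data volume, the duration of the write run, and the space used after compression. It assumes decimal units (1 GB = 1000 MB), a constant ~46 MB/s throughput, and that the factor of three applies to the whole data set. `estimate_write_run` is a hypothetical helper, not part of the original test script.

```python
def estimate_write_run(file_count, file_size_gb, write_mb_s, compression_factor):
    """Return (total GB written, duration in hours, GB stored after compression)."""
    total_gb = file_count * file_size_gb
    # Assumption: 1 GB = 1000 MB and the throughput stays constant.
    hours = total_gb * 1000 / write_mb_s / 3600
    stored_gb = total_gb / compression_factor
    return total_gb, hours, stored_gb


total_gb, hours, stored_gb = estimate_write_run(124, 9, 46, 3)
print(f"{total_gb} GB written in about {hours:.1f} h, about {stored_gb:.0f} GB stored")
```

Under these assumptions the run covers 1116 GB, takes roughly 6.7 hours, and occupies roughly 372 GB on disk.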

Read performance was tested by randomly picking 10 files out of the 124.
Files were read one by one => no parallel execution.

```bash
# Simple illustration of the commands used (not the full script).

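# Write: stream the source file into a new file on the mount.
pv -pr /tmp/big_file${file_number} > /mnt/gridfs_fuse/big_file${file_number}

# Read: stream the file from the mount and discard the data.
pv -pr /mnt/gridfs_fuse/big_file${file_number} > /dev/null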
```
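
Where `pv` is not available, a comparable sequential read measurement can be sketched in Python. This is not the script used for the results above. `read_throughput_mb_s` is a hypothetical helper, and the 1 MiB chunk size is an arbitrary choice.

```python
import time


def read_throughput_mb_s(path, chunk_size=1024 * 1024):
    """Read `path` sequentially and return the throughput in MB/s (10**6 bytes)."""
    total = 0
    start = time.perf_counter()
    with open(path, "rb") as handle:
        while True:
            chunk = handle.read(chunk_size)
            if not chunk:
                break
            total += len(chunk)
    elapsed = time.perf_counter() - start
    # Guard against a zero interval on very small files.
    return total / 1e6 / max(elapsed, 1e-9)
```

A call such as `read_throughput_mb_s("/mnt/gridfs_fuse/big_file1")` (path illustrative) reads one file. Repeated reads of the same file may be served from the page cache and report inflated numbers.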


