Metadata-Version: 2.1
Name: jumpDB
Version: 0.0.4
Summary: A simple kv store 
Home-page: https://github.com/NavyaZaveri/jumpDB
Author: Navya Zaveri
Author-email: author@example.com
License: UNKNOWN
Platform: UNKNOWN
Classifier: Programming Language :: Python :: 3
Classifier: Operating System :: OS Independent
Requires-Python: >=3.6
Description-Content-Type: text/markdown
Requires-Dist: attrs (==19.3.0)
Requires-Dist: Babel (==2.8.0)
Requires-Dist: bitarray (==1.2.1)
Requires-Dist: certifi (==2019.11.28)
Requires-Dist: chardet (==3.0.4)
Requires-Dist: docutils (==0.16)
Requires-Dist: idna (==2.8)
Requires-Dist: imagesize (==1.2.0)
Requires-Dist: importlib-metadata (==1.4.0)
Requires-Dist: Jinja2 (==2.10.3)
Requires-Dist: MarkupSafe (==1.1.1)
Requires-Dist: more-itertools (==8.1.0)
Requires-Dist: packaging (==20.0)
Requires-Dist: pluggy (==0.13.1)
Requires-Dist: py (==1.8.1)
Requires-Dist: pybloom-live (==3.0.0)
Requires-Dist: Pygments (==2.5.2)
Requires-Dist: pyparsing (==2.4.6)
Requires-Dist: pytest (==5.3.3)
Requires-Dist: pytz (==2019.3)
Requires-Dist: recommonmark (==0.6.0)
Requires-Dist: requests (==2.22.0)
Requires-Dist: six (==1.14.0)
Requires-Dist: snowballstemmer (==2.0.0)
Requires-Dist: sortedcontainers (==2.1.0)
Requires-Dist: urllib3 (==1.25.8)
Requires-Dist: wcwidth (==0.1.8)
Requires-Dist: zipp (==1.0.0)

## JumpDB

JumpDB is a simple key-value store that exploits Sorted String Tables.

Here's a [tutorial](https://navyazaveri.github.io/algorithms/2020/01/12/write-a-kv-store-from-scratch.html)  which goes a little more in-depth into how it works.



### Install  

```
pip3 install jumpDB
```



### Usage 

```
from jumpDB import DB

db = DB(max_inmemory_size=2, persist_segments=False)
db["k1"] = "v1"
db["k2"] = "v2"
del db["k2"]
assert db["k1"] == "v1"
assert "k2" not in db
```


### API

* `get(key)` => Finds the corresponding value to the given key 

* `set(key, value)` => Insert the entry into the db 

* `delete(key)` => Deletes the key from the db 

* `contains(key)` => Checks if the given key is present in the db 



### Design & Implementation 

Th design is essentially a simplified version of [levelDB](https://en.wikipedia.org/wiki/LevelDB). 

Every write is initially inserted into an in-memory data structure (typically called "memtable")
 -- in this case,  a red-black tree. 

When the memtable's size exceeds a certain threshold, all entries are written out into a segment file. 
Exploiting the properties of a red-black BST, we can ensure all entries will be efficiently written in sorted order.
The resulting file is immutable and called a sorted-string table (SST).

Whilst performing the above write, we also maintain a sparse index table, keeping track of the 
file offset of every in 1 in x entries. 

When a read comes in, we first look into the memtable for the corresponding k-v pair; if it doesn't exist, 
we look at the *closest* entry (by key) in the sparse table. We *jump* to the file offset of that entry and then linearly scan forwards 
 until we find the desired key-value pair. This is only possible because the SST is sorted by key.


Periodically, the segments are merged (also called "compaction"); this ensures a reduction 
in memory footprint as the resulting merged segments(s) would only hold the most recent entries. 

An addition optimisation includes the use of bloom-filters to check if a key is not present in 
the DB. This saves us from performing heavy disk reads for keys that haven't been inserted into the db. 



### Tests 
Run `pytest -v`


### License 
BSD 2-Clause License


