Metadata-Version: 2.4
Name: primesgen
Version: 1.0.5
Summary: finds primes using Eratosthenes sieving algorithm
Author-email: Robert Koegler <robert.koegler@gmail.com>
Maintainer-email: Robert Koegler <robert.koegler@gmail.com>
License-Expression: MIT AND Apache-2.0
Description-Content-Type: text/x-rst
Requires-Dist: numpy
Requires-Dist: psutil
Requires-Dist: scipy

#########
primesgen
#########

primesgen is a collection of implementations of the Eratosthenes Sieve algorithm, leveraging numpy

all timing results displayed are from a 2.3 GHz 8-Core Intel Core i9/16 GB 2400 MHz DDR4 machine (Darwin)

config
======
config houses some platform dependent constant used internally (primarily for the factory methods and segmentation), to
use:

.. code-block:: python

    from primesgen.config import multiples_cutoff, multiples_block_size, primes_block_size, primes_cutoff, segment_size

the L1D cache size determination OS dependent and also is in config.py:

.. code-block:: python

    def get() -> int:
        os = platform.system()
        if os == 'Darwin':
            return _CacheSize._get_darwin()
        elif os == 'Linux':
            return _CacheSize._untested_get_linux()
        raise NotImplementedError(f'not implemented for your operating system: {os}')

implementations
===============
the 2 implementations can either return a list of primes (all in an array), or a generator

primesgen.Erastothenes
----------------------

fastest, and it's quite simple. Here is a quick example with timing:

.. code-block:: python

    from primesgen.sieve import Eratosthenes

    max_val = 1_000_000_000
    as_list = Eratosthenes().primes(max_val)

finds all 50,847,533 primes < 1,000,000,000 in 8.9 seconds

primesgen.SegmentedEratosthenes
-------------------------------
segments the sieve which allows processing larger numbers than the other implementations

multiples and primes
^^^^^^^^^^^^^^^^^^^^
the segmented algorithm accumulates prime values and segment multiples (1 for each prime < sqrt(max value)) for each
segment processed

these will ultimately run out of memory

if you don't care to manage the details, just use the factory methods

.. code-block:: python

    from primesgen.multiples import Multiples
    from primesgen.primes import Primes

    primes = Primes.factory()
    multiples = Multiples.factory()

you can also specify in memory types

.. code-block:: python

    from primesgen.multiples import MultiplesInMem
    from primesgen.primes import PrimesInMem

    max_val = 1_000_000_000
    primes = PrimesInMem()
    multiples = MultiplesInMem(max_val)

you can also specify backed by disk types, which only keep a portion of data on the data in memory with the rest backed
by files (files are in /var/tmp/primesgen*, with previous run(s) deleted each time)

.. code-block:: python

    from primesgen.config import multiples_block_size, primes_block_size
    from primesgen.multiples import MultiplesOnDisk
    from primesgen.primes import PrimesOnDisk

    max_val = 1_000_000_000
    primes = PrimesOnDisk(primes_block_size)
    multiples = MultiplesOnDisk(max_val, multiples_block_size)

finally, you can use a type for Primes that broadcasts the primes to a queue

you still need to decide what kind of multiples to use (here I chose OnDisk)

.. code-block:: python

    import queue
    from primesgen.config import multiples_block_size
    from primesgen.multiples import MultiplesOnDisk
    from primesgen.primes import PrimesToQueue

    max_val = 1_000_000_000
    primes_queue = queue.Queue(maxsize=0)
    primes = PrimesToQueue(primes_queue)
    multiples = MultiplesOnDisk(max_val, multiples_block_size)

full example
^^^^^^^^^^^^

using the PrimesToQueue and MultiplesOnDisk types (with a custom queue consumer that prints to screen)

.. code-block:: python

    import queue
    import time
    from threading import Thread

    from primesgen.config import segment_size, multiples_block_size
    from primesgen.multiples import Multiples, MultiplesOnDisk
    from primesgen.primes import Primes
    from primesgen.sieve import SegmentedEratosthenes


    def example_queue_consumer(in_queue: queue.Queue[int]) -> None:
        # this is an example of someone consuming the primes if a queue is provided
        print_count = 10_000_000
        count = 0
        partial_count = 0
        while True:
            prime = in_queue.get()
            in_queue.task_done()
            if prime is not None:
                count += 1
                partial_count += 1
            if partial_count >= print_count or prime is None:
                print(f'\tfound {count:,} primes thus far')
                partial_count = 0
            if prime is None:
                break

    max_val = 1_000_000_000
    primes_queue = queue.Queue(maxsize=0)
    primes = Primes.factory(in_queue=primes_queue)
    multiples = MultiplesOnDisk(max_val, multiples_block_size)
    sieve = SegmentedEratosthenes(segment_size, primes, multiples)
    print_thread = Thread(target=example_queue_consumer, args=(primes_queue,))
    sieve_thread = Thread(target=sieve.primes, args=(max_val,))
    start = time.time()
    sieve_thread.start()
    print_thread.start()
    sieve_thread.join()
    duration = time.time() - start
    print_thread.join()

.. code-block::

    <consumer> received 10,000,000 primes thus far (34.957 seconds)
    <consumer> received 20,000,000 primes thus far (75.803 seconds)
    <consumer> received 30,000,000 primes thus far (118.922 seconds)
    <consumer> received 40,000,000 primes thus far (155.940 seconds)
    <consumer> received 50,000,000 primes thus far (185.011 seconds)
    <consumer> received 50,847,533 primes thus far (185.791 seconds)

finds/broadcasts all 50,847,533 primes <= 1,000,000,000 in 184.0 seconds

example: full control
^^^^^^^^^^^^^^^^^^^^^

using the PrimesOnDisk and MultiplesOnDisk types, with all the block and segment sizes as vars you can tweak

.. code-block:: python

    from primesgen.multiples import MultiplesOnDisk
    from primesgen.primes import PrimesOnDisk
    from primesgen.sieve import SegmentedEratosthenes

    max_val = 100_000_000

    primes_block_size = 100_000
    mulitples_block_size = 50_000
    segment_max_size = 32_000

    primes = PrimesOnDisk(primes_block_size)
    multiples = MultiplesOnDisk(max_val, mulitples_block_size)
    sieve = SegmentedEratosthenes(segment_max_size, primes, multiples)
    primes_list = sieve.primes(max_val)

Tests
=====
pytest results should look like

.. code-block::

    === test session starts ====
    configfile: pyproject.toml
    collected 11 items

    tests/test_mulitples.py ..
    tests/test_primes.py ...
    tests/test_sieve.py ......

    === 11 passed in 21.05s ====
