Metadata-Version: 2.4
Name: primesgen
Version: 1.0.2
Summary: finds primes using Eratosthenes sieving algorithm
Author-email: Robert Koegler <robert.koegler@gmail.com>
Maintainer-email: Robert Koegler <robert.koegler@gmail.com>
License-Expression: MIT AND Apache-2.0
Description-Content-Type: text/x-rst
Requires-Dist: numpy
Requires-Dist: sympy

#########
primesgen
#########

primesgen is a collection of implementations of the Eratosthenes Sieve algorithm

all timing results displayed are from a 2.3 GHz 8-Core Intel Core i9/16 GB 2400 MHz DDR4 machine (Darwin).

Basic Usage
===========

get_primes_less_than
--------------------

.. code-block:: python

   import time
   from src.primesgen import get_primes_less_than

   max_val = 1_000_000_000
   start = time.time()

   primes = get_primes_less_than(max_val)

   duration = time.time() - start
   print(f'\tfound all {len(primes):,} primes <= {max_val:,} in {duration:.2f} seconds')

.. code-block::

   > found all 50,847,533 primes <= 1,000,000,000 in 15.64 seconds

Globals and L1D Cache Size
==========================

There are 3 global values, and a function to get the L1D cache size in __init__.py.

.. code-block:: python

    '''non-segmented only, boundary between using Eratosthenes and IndexEratosthenes'''
    sieve_factory_erat_indexed_cutoff = 800_000_000
    '''segmented only, number of multiples allowed in mem before the OnDiskMultiples writes to disk'''
    on_disk_multiples_block_size = 25_000
    ''' segmented only, boundary between using InMemMultiples and OnDiskMultiples'''
    multiples_factory_ondisk_inmem_cutoff = 1_000_000

    ''' 'Darwin' works
        'Linux' is untested
        'Windows' (or others) is not implemented'''
    def get_cache_size() -> int:

Implementations
===============

primesgen.Erastothenes
----------------------

Faster for numbers < 1,000,000,000

.. code-block:: python

    def _core(in_max_val: int) -> list[int]:
        sieve = np.ones(in_max_val + 1, dtype=bool)
        sieve[0] = sieve[1] = sieve[2::2] = False
        max_sqrt_value = math.ceil(math.sqrt(in_max_val + 1)) + 1
        for prime in (value for value in range(3, max_sqrt_value, 2) if sieve[value]):
            sieve[prime * prime::prime] = False
        return list(np.where(sieve)[0].astype(int))

primesgen.IndexedEratosthenes
-----------------------------
Uses a 1/2 size sieve but requires extra calculations (see _to_val and _to_index). Faster for numbers > 1,000,0000. Here
is a snippet:

.. code-block:: python

    def _to_val(index: int) -> int: return 2 * index + 3
    def _to_index(value: float | int) -> int: return round(value * .5 - 1.5)

    def _core(in_max_val: int) -> list[int]:
        max_val = in_max_val - 1 if in_max_val % 2 == 0 else in_max_val
        sieve = np.ones(IndexedEratosthenes._to_index(max_val) + 1, dtype=bool)
        max_sqrt_value_index = IndexedEratosthenes._to_index(math.ceil(math.sqrt(max_val + 1)) + 1)
        for p_val in (IndexedEratosthenes._to_val(p_index) for p_index in range(0, max_sqrt_value_index + 1) if
                      sieve[p_index]):
            sieve[IndexedEratosthenes._to_index(p_val * p_val)::p_val] = False
        return [IndexedEratosthenes._to_val(p_index) for p_index in np.where(sieve)[0]]


primesgen.SegmentedEratosthenes
-------------------------------
Segments the sieve which allows processing larger numbers than the other implementations.


Using Defaults for the multiples and primes
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

.. code-block:: python

    import time

    from src import get_cache_size
    from src.multiples import factory as multiples_factory
    from src.primes import factory as primes_factory
    from src.primesgenseg import SegmentedEratosthenes

    max_val = 1_000_000_000
    primes = primes_factory()
    multiples = multiples_factory(max_val)
    segment_max_size = get_cache_size()
    start = time.time()

    prime_values = SegmentedEratosthenes(segment_max_size, primes, multiples).primes(max_val)

    duration = time.time() - start
    print(f'\tfound all {len(prime_values):,} primes <= {max_val:,} in {duration:.2f} seconds')

.. code-block::

    found all 50,847,533 primes <= 1,000,000,000 in 117.92 seconds

The defaults used are

.. code-block:: python

    primes = primes_factory()
    multiples = multiples_factory(max_val)

**primes_factory** (default) creates primes that will reside in memory until return.

If a number >= **multiples_factory_ondisk_inmem_cutoff**, then the multiples in memory will be backed up disk
(**/var/tmp/**) as needed, only having at most **on_disk_multiples_block_size** in memory.


Using a queue for primes
^^^^^^^^^^^^^^^^^^^^^^^^

Give the primes_factory a queue.Queue, prime values will be broadcast immediately and not stored in memory. Here is an
example with a custom queue consumer:

.. code-block:: python

    def example_queue_consumer(in_queue: queue.Queue[int]) -> None:
        # this is an example of someone consuming the primes if a queue is provided
        print_count = 1_000_000
        count = 0
        partial_count = 0
        while True:
            prime = in_queue.get()
            in_queue.task_done()
            if prime is not None:
                count += 1
                partial_count += 1
            if partial_count >= print_count or prime is None:
                print(f'\tfound {count:,} primes thus far')
                partial_count = 0
            if prime is None:
                break

    max_value = 1_000_000_000
    primes = primes_factory(primes_queue)
    multiples = multiples_factory(max_val)
    sieve = SegmentedEratosthenes(get_cache_size(), primes, multiples)

    sieve_thread = Thread(target=sieve.primes, args=(max_value,))
    print_thread = Thread(target=Queue.example_queue_consumer, args=(primes_queue,))
    sieve_thread.start()
    print_thread.start()
    sieve_thread.join()
    print_thread.join()

.. code-block::

    found 1,000,000 primes thus far
    found 2,000,000 primes thus far
    found 3,000,000 primes thus far
    ...
    found 49,000,000 primes thus far
    found 50,000,000 primes thus far
    found 50,847,533 primes thus far

Tests
=====
pytest results should look like



.. code-block::

    ============================= test session starts ==============================
    collecting ... collected 10 items

    test_primesgen.py::test_100_000
    test_primesgen.py::test_1_000_000
    test_primesgen.py::test_10_000_000
    test_primesgen.py::test_100_000_000
    test_primesgen.py::test_35_123_231
    test_primesgenseg.py::test_84_max_val_9_segment_size PASSED
    test_primesgenseg.py::test_130_max_val_9_segment_size
    test_primesgenseg.py::test_segment_boundaries
    test_primesgenseg.py::test_100_000
    test_primesgenseg.py::test_120_001

    ============================== 10 passed in 4.66s ==============================
    PASSED
