Metadata-Version: 2.4
Name: tracemem
Version: 0.5.0
Summary: A lightweight tool to measure and trace the full memory of a Python session
Author-email: Nyggus <nyggus@gmail.com>
License-Expression: MIT
Project-URL: Homepage, https://github.com/nyggus/tracemem/
Classifier: Intended Audience :: Developers
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Operating System :: OS Independent
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Requires-Python: >=3.9
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: psutil
Requires-Dist: rounder
Provides-Extra: dev
Requires-Dist: wheel; extra == "dev"
Requires-Dist: black; extra == "dev"
Requires-Dist: pytest; extra == "dev"
Requires-Dist: mypy; extra == "dev"
Requires-Dist: setuptools; extra == "dev"
Requires-Dist: build; extra == "dev"
Dynamic: license-file

# `tracemem`: Memory tracker for Python sessions

`tracemem` lets you check the full memory used by a Python session. It also offers simple tools for recording that memory at successive moments, so you can track how the session's memory use changes over time.

`tracemem` is a very lightweight package for profiling memory use: a thin wrapper around `psutil.Process().memory_info()`. To measure the memory used by the session, it reads the `rss` (Resident Set Size) field of the result, which is the physical memory (RAM) currently held by the process and so represents the real memory footprint of the Python session.
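
As a rough illustration of the measurement described above, here is a minimal sketch of reading the RSS of the current process with `psutil`. The function name `session_rss_bytes` is hypothetical and is not part of `tracemem`; the `try`/`except` guard only keeps the snippet importable in environments without `psutil`.

```python
try:
    import psutil  # installed as a dependency of tracemem
except ImportError:  # keeps the sketch importable where psutil is absent
    psutil = None


def session_rss_bytes():
    """Return the resident set size (RSS) of the current process, in bytes.

    Hypothetical helper for illustration; not tracemem's own code.
    """
    if psutil is None:
        raise RuntimeError("psutil is required for this sketch")
    # Process() called without a PID refers to the current process.
    return psutil.Process().memory_info().rss


if psutil is not None:
    print(f"{session_rss_bytes() / 1024 ** 2:.1f} MB")
```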

`tracemem`'s only purpose is to measure the memory used by a whole Python session, so you cannot use it to measure, for instance, the memory used by a particular function or object. For this, you can use other tools, such as

* [`pympler`](https://pypi.org/project/Pympler/)
* [`psutil`](https://pypi.org/project/psutil/)
* [`memory_profiler`](https://pypi.org/project/memory-profiler/)
* [`perftester`](https://pypi.org/project/perftester/)

and others.

## Usage

Since this is a profiling tool, `tracemem` code can be, but seldom is, part of an application; typically, it's used for profiling purposes. To make the tool easier to use, its objects are available as `builtins` global variables, that is, as variables accessible from any module used in the session. Hence, you do not have to import them in every module in which you use them. It's enough to import `tracemem` in any one module of your application; after this import, all `tracemem` functions and objects are available throughout the Python session, and so in every module of your application.

Nevertheless, `tracemem` also offers the traditional structure of a Python package's API. Thus, after importing `tracemem`, you can call its functions as, for example, `tracemem.MEMPRINT()`, but only in the module or script in which you imported it. If you want to use `tracemem` as a global module, do it as described above. Hereafter, we will use `tracemem` as a global module, but do remember the traditional API the package offers.

Here's a list of all `tracemem` functions:

* `MEMPOINT()`, which creates a memory point in your session (see below)
* `MEMORY()`, which returns the current memory usage, without creating a memory point
* `MEMPRINT()`, which prints `MEMLOGS` (see below)
* `MEMTRACE`, a decorator that creates a memory point before and after calling the decorated function

In addition, `tracemem` offers one more object:

* `MEMLOGS`, an instance of the `MemLogsList` class, a list-like container that keeps all memory points created during a session

To use `tracemem`, you only need to import it:

```python-repl
>>> import tracemem

```

### `MEMPOINT()`: Creating memory points

The main function is `MEMPOINT()`, which creates a memory point — a measurement point of the memory used by a Python session — and adds it to `MEMLOGS`, a list-like container of memory points.

> **A memory point**: A measurement point of the memory used by a Python session.

The first memory point is created when `tracemem` is imported. We can see this by checking the `MEMLOGS` object, which can be accessed from the `builtins` global space:

```python-repl
>>> MEMLOGS
[MemLog(ID='tracemem import', memory=...)]
>>> MEMPOINT()
>>> len(MEMLOGS)
2
>>> MEMPOINT("The second MEMPOINT")
>>> len(MEMLOGS)
3
>>> MEMLOGS
[MemLog(ID='tracemem import', memory=...),
 MemLog(ID='None', memory=...),
 MemLog(ID='The second MEMPOINT', memory=...)]

```

(The measured memory values are replaced with `...` in the doctests, since they change from run to run and the tests would otherwise fail.)

A memory point has an ID, which by default is `None`; `MEMPOINT()` adds such a memory point to `MEMLOGS`. When you create two points with the same ID, say "my id", the second one gets the ID "my id-2", and so on. Note that while you can use any object as an ID, its string representation will be used instead:

```python-repl
>>> MEMPOINT()
>>> MEMLOGS[-1].ID
'None-2'

```
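
The naming rule described above can be sketched as follows. This is only an illustration of the observable behavior, not `tracemem`'s actual implementation; `unique_id` is a hypothetical helper.

```python
def unique_id(requested, existing_ids):
    """Return the string form of `requested`, suffixed if it is already taken.

    Hypothetical helper that mimics the documented ID behavior.
    """
    base = str(requested)  # any object is accepted; its str() is what gets stored
    if base not in existing_ids:
        return base
    n = 2  # the first duplicate becomes "<base>-2"
    while f"{base}-{n}" in existing_ids:
        n += 1
    return f"{base}-{n}"


ids = []
for requested in (None, "my id", None, "my id", "my id"):
    ids.append(unique_id(requested, ids))

print(ids)  # ['None', 'my id', 'None-2', 'my id-2', 'my id-3']
```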

In addition to IDs, memory points contain their essence: the memory used by the current session, in bytes. Let's see what happens when we add a big list to the scope and then remove it:

```python-repl
>>> li = [i for i in range(10_000_000)]
>>> MEMPOINT("After adding a list with 10 mln elements")
>>> del li
>>> MEMPOINT("After removing this list")
>>> MEMLOGS[-2].memory > MEMLOGS[-1].memory
True

```

This shows that keeping such a big list in scope makes the session use considerably more memory than it does once the list is deleted.

#### Returning memory from `MEMPOINT()`

If you wish, `MEMPOINT()` can both log the memory and return it: just set the `return_memory` argument, which must be passed by keyword, to `True`. Note, however, that the value is returned in bytes:

```python-repl
>>> MEMPOINT("With a return value", return_memory=True)
2...

```

### `MEMLOGS`: A container of memory points

`MEMLOGS` is actually not a list but an instance of the `tracemem.MemLogsList` class:

```python-repl
>>> type(MEMLOGS).__name__
'MemLogsList'

```

This class inherits from `collections.UserList`, but it works quite differently from a regular list. First of all, it's a singleton class, so `MEMLOGS` is its only instance. The only way to update it is to create a memory point, for example with the `MEMPOINT()` function. You cannot append anything to it, and item assignment does not work either; nor do multiplication and addition.
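
To give an idea of how a `UserList` subclass can refuse direct modification, here is a minimal sketch. It is not `tracemem`'s implementation: the class name `GuardedLog`, the private `_record()` method and the choice of `TypeError` are all assumptions, the singleton aspect is left out, and only the operations named above are blocked.

```python
from collections import UserList


class GuardedLog(UserList):
    """List-like container that grows only through its own _record() method."""

    def _blocked(self, *args, **kwargs):
        # Assumption: TypeError is used here; tracemem may raise something else.
        raise TypeError("this container is read-only; create a memory point instead")

    # Disable the public ways of changing or combining the container.
    append = extend = insert = __setitem__ = _blocked
    __add__ = __radd__ = __iadd__ = __mul__ = __rmul__ = __imul__ = _blocked

    def _record(self, item):
        # Write to the underlying list directly, bypassing the blocked methods.
        self.data.append(item)


log = GuardedLog()
log._record("first point")

try:
    log.append("second point")
except TypeError as error:
    message = str(error)

print(list(log))  # ['first point']
print(message)
```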

Note that the elements of `MEMLOGS` are instances of the `MemLog` named tuple (`collections.namedtuple`, to be precise). So, you can access the two items of each element by index, as with a regular tuple, or by the names of its two attributes, `ID` and `memory`:

```python-repl
>>> MEMPOINT("Just checking")
>>> m = MEMLOGS[-1]
>>> type(m).__name__
'MemLog'
>>> m.ID
'Just checking'
>>> m[0]
'Just checking'
>>> type(m.memory).__name__
'int'
>>> type(m[1]).__name__
'int'

```

You can use several additional methods and properties of the `MEMLOGS` object:

* `.memories`, a property that returns all the memory values recorded so far
* `.IDs`, like above but for IDs
* `.filter()`, a method for filtering `MEMLOGS`
* `.map()`, a method for applying a function to all elements of `MEMLOGS`

Let's see how this works:

```python-repl
>>> type(MEMLOGS.memories), len(MEMLOGS.memories)
(<class 'list'>, 8)
>>> MEMLOGS.IDs
['tracemem import',
 'None',
 'The second MEMPOINT',
 'None-2',
 'After adding a list with 10 mln elements',
 'After removing this list',
 'With a return value',
 'Just checking']

```

The `.filter()` method accepts one argument, that is, a predicate to be used for filtering, just like you'd use with the built-in `filter()` function. For the `.filter()` method, however, you need to create a predicate that works with `MemLog` elements. Unlike the built-in `filter()` function, it returns a list rather than a lazy iterator. This is because `MEMLOGS` is not expected to be a large object.

```python-repl
>>> def memory_over(memlog: tracemem.MemLog) -> bool:
...     return memlog.memory > 100 * (1024 ** 2)
>>> MEMLOGS.filter(memory_over)
[MemLog(ID='After adding a list with 10 mln elements', memory=...)]

```

We can of course use a `lambda` function instead:

```python-repl
>>> MEMLOGS.filter(lambda m: m.memory > 100 * (1024 ** 2))
[MemLog(ID='After adding a list with 10 mln elements', memory=...)]
>>> MEMLOGS.filter(lambda m: m.memory < 1_000_000)
[]
>>> MEMLOGS.filter(lambda m: "after" in m.ID.lower() or "before" in m.ID.lower())
[MemLog(ID='After adding a list with 10 mln elements', memory=...),
 MemLog(ID='After removing this list', memory=...)]

```

And here's the `.map()` method in action. Like the `.filter()` method, it returns a list:

```python-repl
>>> as_MB = MEMLOGS.map(lambda m: m.memory / 1024 / 1024)
>>> all(m < 500 for m in as_MB)
True
>>> MEMLOGS.map(lambda m: m.ID.lower())
['tracemem import',
 'none',
 'the second mempoint',
 'none-2',
 'after adding a list with 10 mln elements',
 'after removing this list',
 'with a return value',
 'just checking']
>>> memlogs = MEMLOGS.map(lambda m: (m.ID.lower(), round(m.memory / 1024 / 1024)))
>>> memlogs[:2]
[('tracemem import', ...), ('none', ...)]

```
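
For readers curious how list-returning `.filter()` and `.map()` methods can be added to a `UserList` subclass, here is a minimal sketch. It is an assumption about the general approach, not `tracemem`'s source; `Points` is a hypothetical class, `MemLog` is a local stand-in for `tracemem.MemLog`, and the memory values are made up for illustration.

```python
from collections import UserList, namedtuple

# Local stand-in for tracemem.MemLog (a named tuple with ID and memory).
MemLog = namedtuple("MemLog", ["ID", "memory"])


class Points(UserList):
    """Hypothetical container with eager filter() and map() methods."""

    def filter(self, predicate):
        # A list comprehension evaluates immediately and returns a plain list.
        return [point for point in self.data if predicate(point)]

    def map(self, func):
        return [func(point) for point in self.data]


MiB = 1024 ** 2
# Illustrative values only, not real measurements.
points = Points([
    MemLog("start", 20 * MiB),
    MemLog("big list", 400 * MiB),
    MemLog("cleanup", 25 * MiB),
])

big = points.filter(lambda p: p.memory > 100 * MiB)
in_mb = points.map(lambda p: p.memory / MiB)

print(big)    # [MemLog(ID='big list', memory=419430400)]
print(in_mb)  # [20.0, 400.0, 25.0]
```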

## `MEMPRINT()`: Printing `MEMLOGS`

To print `MEMLOGS`, you can use a dedicated function `MEMPRINT()`, which converts memories to MB and pretty-prints the memory points collected in `MEMLOGS`:

```python-repl
>>> MEMPRINT()
 0   ...    → tracemem import
 1   ...    → None
 2   ...    → The second MEMPOINT
 3   ...    → None-2
 4   ...    → After adding a list with 10 mln elements
 5   ...    → After removing this list
 6   ...    → With a return value
 7   ...    → Just checking

```

## `@MEMTRACE`: Creating memory points by decorating a function

If you want to log the full-memory usage of a particular function, you can use the `@MEMTRACE` decorator. It creates two memory points: right before and right after calling the function. As with the other `tracemem` tools, you do not need to import the decorator:

```python-repl
>>> @MEMTRACE
... def create_huge_list(n):
...     return [i for i in range(n)]
>>> li = create_huge_list(10_000_000)
>>> del li
>>> MEMPOINT()
>>> MEMLOGS[-3:]
[MemLog(ID='Before create_huge_list()', memory=...),
 MemLog(ID='After create_huge_list()', memory=...),
 MemLog(ID='None-3', memory=...)]
>>> MEMLOGS[-2].memory > MEMLOGS[-1].memory
True

```
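
The idea behind such a decorator can be sketched with the standard library alone. This is not `tracemem`'s implementation: `trace_memory`, `read_memory` and `log` are hypothetical names, and `tracemalloc` stands in for the RSS reading, so the numbers reflect memory allocated by Python objects rather than the full memory of the session.

```python
import functools
import tracemalloc

log = []  # hypothetical stand-in for MEMLOGS: a list of (ID, bytes) tuples


def read_memory():
    # Assumption: tracemalloc replaces psutil's RSS to keep the sketch stdlib-only.
    current, _peak = tracemalloc.get_traced_memory()
    return current


def trace_memory(func):
    """Record one point right before and one right after each call of func."""

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        log.append((f"Before {func.__name__}()", read_memory()))
        try:
            return func(*args, **kwargs)
        finally:
            # Runs while the return value is still alive, even if func raises.
            log.append((f"After {func.__name__}()", read_memory()))

    return wrapper


@trace_memory
def create_list(n):
    return [i for i in range(n)]


tracemalloc.start()
numbers = create_list(100_000)
tracemalloc.stop()

for point_id, memory in log:
    print(point_id, memory)
```

Using `try`/`finally` is a design choice of this sketch: it guarantees the "After" point is recorded even when the decorated function raises.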

## `MEMORY()`: Getting current memory usage without creating a memory point

Above, we've seen the most common use of `tracemem`'s full-memory tracer. There's one additional function, `MEMORY()`, which returns the current full memory of the session:

```python-repl
>>> mem = MEMORY()
>>> type(mem)
<class 'int'>

```

The function does not create a memory point, so it does not log the memory usage to `MEMLOGS`:

```python-repl
>>> len(MEMLOGS)
11
>>> _ = MEMORY()
>>> len(MEMLOGS)
11
>>> MEMPOINT("Just once more")
>>> len(MEMLOGS)
12
>>> _ = MEMORY()
>>> len(MEMLOGS)
12

```

## `MB()`: Calculate MB from B

`psutil` works with memory in bytes. You can use the `MB()` function to convert it to megabytes:

```python-repl
>>> memory = 19853463
>>> tracemem.MB(memory)
18.933737754821777

```

As you see, the result is not rounded, but you can round it by passing a rounding function as the second argument of `MB()`:

```python-repl
>>> memory = 19853463
>>> tracemem.MB(memory, round)
19
>>> tracemem.MB(memory, round, ndigits=2) # ndigits is the argument of round()
18.93
>>> tracemem.MB(memory, round, 3) # ndigits passed positionally
18.934

```
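
The conversion itself is simple arithmetic: divide the number of bytes by 1024 twice. Here is a minimal sketch of a function that behaves like the examples above; `to_mb` is a hypothetical name, and the way extra arguments are forwarded to the rounding function is an assumption based on those examples.

```python
def to_mb(n_bytes, rounding=None, *args, **kwargs):
    """Convert bytes to megabytes, optionally applying a rounding function.

    Hypothetical re-implementation for illustration; not tracemem's code.
    """
    mb = n_bytes / 1024 ** 2
    if rounding is None:
        return mb
    # Extra arguments (e.g. ndigits) are forwarded to the rounding function.
    return rounding(mb, *args, **kwargs)


memory = 19853463
print(to_mb(memory))                    # 18.933737754821777
print(to_mb(memory, round))             # 19
print(to_mb(memory, round, ndigits=2))  # 18.93
```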

`tracemem` uses [`rounder`](https://github.com/nyggus/rounder), which enables you to round complex Python objects with just one operation. Its `signif()` function, which rounds a number to significant digits, is imported along with `tracemem`:

```python-repl
>>> memory = 19853463
>>> tracemem.MB(memory, tracemem.signif, 3)
18.9
>>> tracemem.MB(memory, tracemem.signif, 5)
18.934

```

Since `rounder` will be installed in your virtual environment along with `tracemem`, you can import its functions directly from it:

```python-repl
>>> memory = 19853463
>>> import rounder
>>> tracemem.MB(memory, rounder.signif, 3)
18.9
>>> tracemem.MB(memory, rounder.signif, 5)
18.934

```

Look up [`rounder`](https://github.com/nyggus/rounder) for its other rounding functions.


## Why the `builtins` global scope?

Since `tracemem` is meant for debugging memory use across various modules, it would be inconvenient to import the required objects in each of them. That's why the required objects are kept in the global scope, although this may change in future versions.
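
The general mechanism can be sketched as follows: any name attached to the standard `builtins` module becomes resolvable from every module in the session, just like `print()` or `len()`. This shows the technique in general, not `tracemem`'s source; `SHOUT` is a hypothetical function used only for the demonstration.

```python
import builtins


def SHOUT(text):
    """Hypothetical helper, used only to demonstrate the mechanism."""
    return text.upper() + "!"


# Attach the function to the builtins module.
builtins.SHOUT = SHOUT

# Simulate "another module": a fresh namespace that never imported
# or defined SHOUT. The bare name is still found through builtins.
other_module_namespace = {}
exec("result = SHOUT('found via builtins')", other_module_namespace)

print(other_module_namespace["result"])  # FOUND VIA BUILTINS!
```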

### Linters

Since `tracemem` works in the global scope, linters and type checkers will scream at you about undefined names. If this bothers you, you can add `# type: ignore` (which silences type checkers such as `mypy`) to each line where you use a `tracemem` command, like here:

```python
import tracemem

# the code...

MEMORY() # type: ignore

# the code...

```

## Unit testing

The package is covered with documentation tests and unit tests, located in this README and in the main module, [tracemem.py](tracemem.py). To run them, you need to use three `doctest` flags: `doctest.ELLIPSIS`, `doctest.NORMALIZE_WHITESPACE` and `doctest.IGNORE_EXCEPTION_DETAIL`:

```bash
(venv-tracemem) $ python -m doctest README.md -o ELLIPSIS -o NORMALIZE_WHITESPACE -o IGNORE_EXCEPTION_DETAIL
(venv-tracemem) $ python tracemem/tracemem.py

```

Remember to run the commands in the virtual environment, here called `venv-tracemem`.
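
The same three flags can also be passed from Python through `doctest.testfile()`. The sketch below runs a small sample file written to a temporary folder, so it does not depend on the repository layout; the sample content is hypothetical and only shows why `ELLIPSIS` matters when part of the output is not pinned.

```python
import doctest
import os
import tempfile

FLAGS = (
    doctest.ELLIPSIS
    | doctest.NORMALIZE_WHITESPACE
    | doctest.IGNORE_EXCEPTION_DETAIL
)

# Hypothetical sample: only the first digits of the output are pinned,
# the rest is matched by the ellipsis.
SAMPLE = "\n".join([
    "Only the leading digits are checked:",
    "",
    ">>> 2 ** 100",
    "1267...",
    "",
])

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "SAMPLE.md")
    with open(path, "w", encoding="utf-8") as sample_file:
        sample_file.write(SAMPLE)
    # module_relative=False makes doctest treat `path` as a filesystem path.
    results = doctest.testfile(path, module_relative=False, optionflags=FLAGS)

print(results)  # TestResults(failed=0, attempted=1)
```

To check this README the same way, you would pass its path instead of the sample file's.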

For the moment, `doctest` is the only testing framework used in `tracemem`, but if it proves insufficient, `pytest` will be added.

## Operating systems

The package is developed on Linux (actually, under WSL) and checked on Windows 11, so it works in both these environments.

## Contribution

Any contribution is welcome. You can submit an issue in the [repository](https://github.com/nyggus/tracemem/). You can also create your own pull requests.
