Metadata-Version: 2.4
Name: org_analyze
Version: 0.1.5
Summary: Collect data from org-mode/org-roam pages and run some simple analyses on it.
Home-page: https://github.com/ojari/OrgAnalyze
Author: Jari Ojanen
Classifier: Programming Language :: Python :: 3
Classifier: Intended Audience :: Developers
Requires-Python: >=3.9
Description-Content-Type: text/markdown
Dynamic: author
Dynamic: classifier
Dynamic: description
Dynamic: description-content-type
Dynamic: home-page
Dynamic: requires-python
Dynamic: summary

# OrgAnalyze

Collect data from org-mode/org-roam pages and run some simple analyses on it.

Items parsed:
 - Lines starting with "CLOCK:" or "#+CLK:" as OrgClock
 - Headers starting with "*", "**", etc. as OrgHeader
 - Tables starting with "|" as OrgTable
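
To illustrate the prefix rules listed above, here is a minimal sketch of how a single line could be classified. It is not the library's own parser: the regular expressions and the helper names `classify_line` and `header_level` are hypothetical and exist only for this example.

```python
import re

# Hypothetical patterns for illustration; the library's own parser may differ.
CLOCK_RE = re.compile(r"^\s*(CLOCK:|#\+CLK:)")
HEADER_RE = re.compile(r"^(\*+)\s+(.*)$")
TABLE_RE = re.compile(r"^\s*\|")


def classify_line(line):
    """Return a label naming which kind of org item the line looks like."""
    if CLOCK_RE.match(line):
        return "clock"
    if HEADER_RE.match(line):
        return "header"
    if TABLE_RE.match(line):
        return "table"
    return "other"


def header_level(line):
    """Return the number of leading stars, or 0 if the line is not a header."""
    match = HEADER_RE.match(line)
    return len(match.group(1)) if match else 0
```

Requiring whitespace after the stars keeps inline emphasis such as `*bold*` at the start of a line from being mistaken for a header.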

## read_org_clocks_2

This function parses all `*.org` files in a given directory. It extracts every clock entry and associates it with its top-level header (the feature) and its sub-header (the task).

The function returns a tuple containing a list of column names and a list of rows. This structure can be passed directly to the pandas `DataFrame` constructor.
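
The same (columns, rows) shape can also be consumed without pandas. The sketch below uses a hand-written stand-in for the return value rather than calling the library. Only `head1` and `duration` appear elsewhere in this README; the `head2` column name and the exact row layout are assumptions made for illustration.

```python
from collections import defaultdict

# Hand-written stand-in for the (columns, rows) return value.
# 'head2' and the row layout are assumed, not taken from the library.
columns = ["head1", "head2", "duration"]
rows = [
    ("Feature A", "Task 1", 1.5),
    ("Feature A", "Task 2", 1.0),
    ("Feature B", "Task 3", 0.5),
]

# Pair each row with the column names to get one dict per clock entry.
records = [dict(zip(columns, row)) for row in rows]

# Sum the durations per top-level header.
totals = defaultdict(float)
for record in records:
    totals[record["head1"]] += record["duration"]
```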

### Example Usage

Let's say you have an org file `tasks.org` in a directory called `my_orgs` with the following content:

```org
* Feature A
** Task 1
CLOCK: [2025-10-25 Sat 10:00]--[2025-10-25 Sat 11:30] =>  1:30
** Task 2
CLOCK: [2025-10-25 Sat 12:00]--[2025-10-25 Sat 13:00] =>  1:00

* Feature B
** Task 3
CLOCK: [2025-10-25 Sat 14:00]--[2025-10-25 Sat 14:30] =>  0:30
```

You can parse this file and analyze the data with pandas like this:

```python
import pandas as pd
from org_analyze.clocks import read_org_clocks_2

# 1. Parse the org files in the directory
columns, rows = read_org_clocks_2('my_orgs')

# 2. Create a pandas DataFrame
df = pd.DataFrame(rows, columns=columns)

# 3. Analyze the data: group by feature (head1) and sum the durations (in hours)
feature_hours = df.groupby('head1')['duration'].sum()

print("Total hours per feature:")
print(feature_hours)
```

Output:

```
Total hours per feature:
head1
Feature A    2.5
Feature B    0.5
Name: duration, dtype: float64
```
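
The durations in the output are fractional hours. As a rough illustration of how a `CLOCK:` line maps to such a value, the sketch below computes the difference between the two timestamps. The helper `clock_hours` and its regular expression are hypothetical; the library may derive the duration differently, for example from the `=>` field.

```python
import re
from datetime import datetime

# Matches org timestamps such as [2025-10-25 Sat 10:00] (illustrative pattern).
STAMP_RE = re.compile(r"\[(\d{4}-\d{2}-\d{2}) \w+ (\d{2}:\d{2})\]")


def clock_hours(line):
    """Return the hours between the two timestamps, or None if not a closed clock."""
    stamps = STAMP_RE.findall(line)
    if len(stamps) != 2:
        return None
    start, end = (
        datetime.strptime(f"{day} {time}", "%Y-%m-%d %H:%M") for day, time in stamps
    )
    return (end - start).total_seconds() / 3600.0


feature_a = [
    "CLOCK: [2025-10-25 Sat 10:00]--[2025-10-25 Sat 11:30] =>  1:30",
    "CLOCK: [2025-10-25 Sat 12:00]--[2025-10-25 Sat 13:00] =>  1:00",
]
feature_a_total = sum(clock_hours(line) for line in feature_a)
```

For the two Feature A entries from `tasks.org`, this gives 1.5 and 1.0 hours, matching the 2.5 shown in the output above.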
