Metadata-Version: 2.4
Name: discrete-fourier
Version: 0.3.5
Summary: Calculate discrete Fourier coefficients, detect periods, and validate periodic patterns
License: MIT
License-File: LICENSE
Author: Ricardo Marcelo Alvarez
Author-email: rmalvarezkai@gmail.com
Requires-Python: >=3.13,<4.0
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.13
Classifier: Programming Language :: Python :: 3.14
Requires-Dist: numpy (>=1.20.0)
Description-Content-Type: text/markdown

# Discrete Fourier

A Python library for computing Discrete Fourier Series coefficients and reconstructing signals from discrete data. This implementation provides efficient methods for Fourier analysis, signal reconstruction, and derivative calculations.

## Features

- **Fourier Coefficient Calculation**: Compute real Fourier series coefficients (a_k, b_k) from discrete data
- **Signal Reconstruction**: Evaluate the Fourier series at any position (interpolation and extrapolation)
- **Magnitude Filtering**: Filter reconstruction by top N components or by percentage of maximum magnitude
- **First Derivative**: Calculate the rate of change of the reconstructed signal
- **Second Derivative**: Compute the curvature/acceleration of the signal
- **Period Detection**: Find dominant periods and cyclical patterns in data
- **Period Validation**: Verify if detected periods are real patterns with confidence scores
- **NumPy-based**: Efficient vectorized computations for fast performance

## Why Not Just Use NumPy FFT?

While NumPy and SciPy provide excellent FFT implementations, this library offers distinct advantages for working with **real Fourier series**:

### **Real Coefficients vs Complex Numbers**

**NumPy FFT:**
```python
import numpy as np

fft_result = np.fft.fft(data)
# Returns: [c0, c1, c2, ...] - complex numbers
# Example: [(2.5+0j), (1.2+0.8j), (-0.5-1.2j), ...]
# Requires understanding of complex arithmetic
```

**This Library:**
```python
from discrete_fourier import DiscreteFourier

coefs = DiscreteFourier.calculate_fourier_coefs(data)
# Returns: (a_k, b_k) - separate real arrays
# a_k: [2.5, 1.2, -0.5, ...]  (cosine coefficients)
# b_k: [0.0, 0.8, -1.2, ...]  (sine coefficients)
# More interpretable, no complex numbers needed
```

### **Point Evaluation**

**NumPy FFT:**
```python
# To evaluate at a single point, you must:
# 1. Compute full inverse FFT
reconstructed = np.fft.ifft(fft_result).real
value_at_50 = reconstructed[50]  # Only works for original indices

# 2. For arbitrary positions (like t=50.5), need custom interpolation
```

**This Library:**
```python
# Evaluate at any position directly
value = DiscreteFourier.calculate_fourier_value(coefs, 50.5)
# Works for any t, including beyond original data range
```

### **Analytical Derivatives**

**NumPy FFT:**
```python
# No built-in derivative calculation
# Must write custom code to differentiate Fourier series
# Requires understanding of complex derivative formulas
```

**This Library:**
```python
# Built-in analytical derivatives
first_deriv = DiscreteFourier.calculate_fourier_derivative_value(coefs, t)
second_deriv = DiscreteFourier.calculate_fourier_double_derivative_value(coefs, t)
# Ready to use for trend analysis, extrema detection, etc.
```

### **Comparison Table**

| Feature | NumPy/SciPy FFT | discrete_fourier |
|---------|----------------|------------------|
| Coefficient format | Complex numbers | Real (a_k, b_k) |
| Learning curve | Requires complex math | Simple real arithmetic |
| Point evaluation | Reconstruct full signal | Direct evaluation at any t |
| Derivatives | Manual implementation | Built-in 1st & 2nd derivatives |
| Extrapolation | Not straightforward | Natural (periodic) |
| Use case | General FFT operations | Real Fourier series focus |

### **When to Use This Library**

✅ **Use discrete_fourier when:**
- Working with real-valued signals (not complex)
- Need to evaluate series at arbitrary positions
- Require derivative calculations
- Want interpretable cosine/sine coefficients
- Teaching or learning Fourier series concepts
- Prefer simple API over FFT theory

❌ **Use NumPy/SciPy FFT when:**
- Need full FFT/IFFT transformations
- Working with complex signals
- Require 2D/3D transforms
- Need specialized FFT algorithms (Bluestein, Rader, etc.)
- Performance critical applications with large datasets

## Installation

```bash
pip install discrete_fourier
```

## Quick Start

```python
from discrete_fourier import DiscreteFourier

# Sample data
data = [1.0, 2.5, 4.0, 3.5, 2.0, 1.5]

# Calculate Fourier coefficients
coefs = DiscreteFourier.calculate_fourier_coefs(data)

# Reconstruct values at original positions
for i in range(len(data)):
    value = DiscreteFourier.calculate_fourier_value(coefs, i + 1)
    print(f"Position {i+1}: Original={data[i]:.2f}, Reconstructed={value:.2f}")

# Reconstruct with filtering (keeps only top 3 frequency components)
for i in range(len(data)):
    filtered = DiscreteFourier.calculate_fourier_value(coefs, i + 1, filter_top_n=3)
    print(f"Position {i+1}: Filtered={filtered:.2f}")

# Reconstruct with percentage filtering (keeps coefficients >= 5% of max magnitude)
for i in range(len(data)):
    filtered_porc = DiscreteFourier.calculate_fourier_value(coefs, i + 1, filter_porc=5.0)
    print(f"Position {i+1}: Filtered (5%)={filtered_porc:.2f}")

# Predict future values (extrapolation)
future_value = DiscreteFourier.calculate_fourier_value(coefs, len(data) + 1)
print(f"Next value: {future_value:.2f}")

# Calculate derivatives
derivative = DiscreteFourier.calculate_fourier_derivative_value(coefs, 3)
second_deriv = DiscreteFourier.calculate_fourier_double_derivative_value(coefs, 3)
print(f"At position 3: f'={derivative:.2f}, f''={second_deriv:.2f}")

# Find dominant period
dominant = DiscreteFourier.find_dominant_period(data)
print(f"Dominant period: {dominant['period']:.2f} points")

# Validate the period
validation = DiscreteFourier.validate_period(data, dominant['period'])
print(f"Period is valid: {validation['valid']}")
print(f"Confidence: {validation['confidence']:.2f}")

# Find top 3 periods
top_periods = DiscreteFourier.find_top_periods(data, n_periods=3)
for i, p in enumerate(top_periods, 1):
    print(f"{i}. Period: {p['period']:.1f}, k={p['k']}, {p['percent']:.1f}% of signal")
```

## API Reference

### `DiscreteFourier.calculate_fourier_coefs(data_in)`

Calculate Fourier series coefficients from discrete data.

**Parameters:**
- `data_in` (list or array-like): Input data sequence

**Returns:**
- `tuple`: (a_k, b_k) where:
  - `a_k`: Cosine coefficients (a_k[0] is the mean)
  - `b_k`: Sine coefficients (b_k[0] is always 0)

**Notes:**
- If input has odd length, the first element is removed to ensure even length
- Uses real Fourier series representation: `f(t) = a_0 + Σ[a_k*cos(2πkt/N) + b_k*sin(2πkt/N)]`

### `DiscreteFourier.calculate_fourier_value(fourier_coefs, t, filter_top_n=None, filter_porc=None)`

Reconstruct the signal value at position t, optionally with magnitude filtering.

**Parameters:**
- `fourier_coefs` (tuple): (a_k, b_k) from `calculate_fourier_coefs()`
- `t` (int or float): Position to evaluate (can be beyond original data range)
- `filter_top_n` (int, optional): If provided, filters coefficients to keep only those with the highest magnitudes. Finds the top N dominant periods and sets all coefficients with magnitude smaller than the minimum magnitude among the top N to zero.
- `filter_porc` (float, optional): If provided, filters coefficients by percentage of maximum magnitude. Calculates the maximum magnitude and sets a threshold at filter_porc percent of that maximum. All coefficients with magnitude below this threshold are set to zero. For example, filter_porc=2.5 keeps only coefficients with magnitude >= 2.5% of the maximum magnitude.

**Returns:**
- `float`: Reconstructed value at position t

**Notes:**
- Due to periodicity: f(t) = f(t + N) where N is the original data length
- Can be used for interpolation (within data range) or extrapolation (beyond data range)
- When `filter_top_n` or `filter_porc` is used, weak frequency components are eliminated, reducing noise while preserving the most significant periodic patterns
- Only one filter should be used at a time (filter_top_n or filter_porc)

**Filtering process (when filter_top_n is provided):**
1. Finds the top N periods using `find_top_periods()`
2. Determines the minimum magnitude among those top N periods
3. Calculates magnitude for each coefficient: sqrt(a_k² + b_k²)
4. Sets to zero all coefficients where magnitude < minimum magnitude
5. Computes the series with the filtered coefficients

**Filtering process (when filter_porc is provided):**
1. Calculates magnitude for each coefficient: sqrt(a_k² + b_k²)
2. Finds the maximum magnitude across all coefficients
3. Sets minimum threshold: min_magnitude = (max_magnitude * filter_porc) / 100
4. Sets to zero all coefficients where magnitude < min_magnitude
5. Computes the series with the filtered coefficients

**Example:**
```python
# Normal reconstruction
value = DiscreteFourier.calculate_fourier_value(coefs, 10)

# Filtered reconstruction (keeps only top 5 frequency components)
filtered_value = DiscreteFourier.calculate_fourier_value(coefs, 10, filter_top_n=5)

# Percentage-based filtering (keeps coefficients >= 10% of max magnitude)
filtered_by_percent = DiscreteFourier.calculate_fourier_value(coefs, 10, filter_porc=10.0)

# Compare original vs filtered reconstruction
import matplotlib.pyplot as plt
data = [1.0, 2.5, 4.0, 3.5, 2.0, 1.5, 2.0, 3.0, 4.0, 3.0, 2.0, 1.5]
coefs = DiscreteFourier.calculate_fourier_coefs(data)

t_vals = numpy.linspace(1, len(data), 100)
original_curve = [DiscreteFourier.calculate_fourier_value(coefs, t) for t in t_vals]
filtered_curve = [DiscreteFourier.calculate_fourier_value(coefs, t, filter_top_n=3) for t in t_vals]
filtered_by_pct = [DiscreteFourier.calculate_fourier_value(coefs, t, filter_porc=5.0) for t in t_vals]

plt.plot(t_vals, original_curve, label='Original')
plt.plot(t_vals, filtered_curve, label='Filtered (top 3)', linestyle='--')
plt.plot(t_vals, filtered_by_pct, label='Filtered (5%)', linestyle=':')
plt.scatter(range(1, len(data)+1), data, color='red', label='Data points')
plt.legend()
plt.show()
```

### `DiscreteFourier.calculate_fourier_derivative_value(fourier_coefs, t, filter_top_n=None, filter_porc=None)`

Calculate the first derivative (slope) at position t, optionally with magnitude filtering.

**Parameters:**
- `fourier_coefs` (tuple): (a_k, b_k) from `calculate_fourier_coefs()`
- `t` (int or float): Position to evaluate
- `filter_top_n` (int, optional): If provided, filters coefficients to keep only those with the highest magnitudes before computing the derivative. Uses the same filtering logic as `calculate_fourier_value()`.
- `filter_porc` (float, optional): If provided, filters coefficients by percentage of maximum magnitude before computing the derivative. Uses the same filtering logic as `calculate_fourier_value()`.

**Returns:**
- `float`: First derivative df/dt at position t

**Use cases:**
- Trend detection (positive = increasing, negative = decreasing)
- Finding local maxima/minima (where f'(t) = 0)
- Velocity calculation from position data

**Notes:**
- When `filter_top_n` or `filter_porc` is used, weak frequency components are filtered out before computing the derivative, providing a smoother trend indication
- Only one filter should be used at a time (filter_top_n or filter_porc)

**Example:**
```python
# Normal derivative
deriv = DiscreteFourier.calculate_fourier_derivative_value(coefs, 10)

# Filtered derivative (smoother, using only top 3 frequencies)
deriv_filtered = DiscreteFourier.calculate_fourier_derivative_value(coefs, 10, filter_top_n=3)

# Filtered derivative by percentage (keeps components >= 5% of max magnitude)
deriv_by_pct = DiscreteFourier.calculate_fourier_derivative_value(coefs, 10, filter_porc=5.0)
```

### `DiscreteFourier.calculate_fourier_double_derivative_value(fourier_coefs, t, filter_top_n=None, filter_porc=None)`

Calculate the second derivative (curvature) at position t, optionally with magnitude filtering.

**Parameters:**
- `fourier_coefs` (tuple): (a_k, b_k) from `calculate_fourier_coefs()`
- `t` (int or float): Position to evaluate
- `filter_top_n` (int, optional): If provided, filters coefficients to keep only those with the highest magnitudes before computing the second derivative. Uses the same filtering logic as `calculate_fourier_value()`.
- `filter_porc` (float, optional): If provided, filters coefficients by percentage of maximum magnitude before computing the second derivative. Uses the same filtering logic as `calculate_fourier_value()`.

**Returns:**
- `float`: Second derivative d²f/dt² at position t

**Use cases:**
- Concavity detection (positive = concave up, negative = concave down)
- Finding inflection points (where f''(t) = 0)
- Acceleration calculation from position data

**Notes:**
- When `filter_top_n` or `filter_porc` is used, weak frequency components are filtered out before computing the second derivative, providing smoother curvature analysis
- Only one filter should be used at a time (filter_top_n or filter_porc)

**Example:**
```python
# Normal second derivative
second_deriv = DiscreteFourier.calculate_fourier_double_derivative_value(coefs, 10)

# Filtered second derivative (smoother, using only top 3 frequencies)
second_deriv_filtered = DiscreteFourier.calculate_fourier_double_derivative_value(coefs, 10, filter_top_n=3)

# Filtered second derivative by percentage (keeps components >= 5% of max magnitude)
second_deriv_by_pct = DiscreteFourier.calculate_fourier_double_derivative_value(coefs, 10, filter_porc=5.0)
```

### `DiscreteFourier.find_dominant_period(data_in=None, fourier_coefs=None)`

Find the dominant (most prominent) period in the data.

**Parameters:**
- `data_in` (list or array-like, optional): Input data sequence
- `fourier_coefs` (tuple, optional): Pre-calculated (a_k, b_k) coefficients
  - Either `data_in` or `fourier_coefs` must be provided

**Returns:**
- `dict`: Dictionary containing:
  - `'period'` (float): Dominant period in data points
  - `'k'` (int): Frequency index with highest magnitude
  - `'magnitude'` (float): Magnitude of the dominant component
  - `'data_length'` (int): Original data length N

**Use cases:**
- Detecting the main cyclical pattern in time series
- Identifying the most important frequency component
- Understanding periodicity in seasonal data

**Important notes:**
- Maximum detectable period is N (the data length)
- To detect periods longer than N, you need more data
- The DC component (mean) is excluded from analysis

**Example:**
```python
data = [1, 2, 3, 4, 3, 2, 1, 2, 3, 4, 3, 2]  # Period of 6
result = DiscreteFourier.find_dominant_period(data)
print(f"Period: {result['period']:.1f} points")  # Should be close to 6
```

### `DiscreteFourier.find_top_periods(data_in=None, fourier_coefs=None, n_periods=3)`

Find the top N dominant periods ranked by magnitude.

**Parameters:**
- `data_in` (list or array-like, optional): Input data sequence
- `fourier_coefs` (tuple, optional): Pre-calculated (a_k, b_k) coefficients
  - Either `data_in` or `fourier_coefs` must be provided
- `n_periods` (int, default=3): Number of top periods to return

**Returns:**
- `list of dict`: List sorted by magnitude (highest first), each containing:
  - `'period'` (float): Period in data points (N/k)
  - `'k'` (int): Frequency index
  - `'magnitude'` (float): Magnitude of this component
  - `'percent'` (float): Percentage of total magnitude

**Use cases:**
- Identifying multiple cyclical patterns
- Understanding complex periodic behavior
- Comparing relative importance of different periods

**Example:**
```python
import numpy as np

# Signal with multiple periods
t = np.linspace(0, 100, 200)
signal = np.sin(2*np.pi*t/20) + 0.5*np.sin(2*np.pi*t/10)

top = DiscreteFourier.find_top_periods(signal.tolist(), n_periods=5)
for i, p in enumerate(top, 1):
    print(f"{i}. Period: {p['period']:.1f}, {p['percent']:.1f}%")
```

### `DiscreteFourier.validate_period(data_in, period, window_size=None, method='all')`

Validate if a detected period actually repeats in the data.

**Parameters:**
- `data_in` (list or array-like): Input data sequence
- `period` (float): Period to validate (in data points)
- `window_size` (int, optional): Size of comparison window. If None, uses min(period/2, 50)
- `method` (str, default='all'): Validation method
  - `'correlation'`: Returns only correlation coefficient
  - `'rmse'`: Returns only normalized RMSE
  - `'cosine'`: Returns only cosine similarity
  - `'all'`: Returns full validation metrics (recommended)

**Returns:**
- `dict` (when method='all'): Dictionary containing:
  - `'valid'` (bool): True if period appears valid
  - `'confidence'` (float): Overall confidence score (0-1)
  - `'correlation'` (float): Pearson correlation (-1 to 1)
  - `'rmse'` (float): Normalized RMSE (0-1, lower is better)
  - `'cosine_similarity'` (float): Cosine similarity (0-1)
  - `'period'` (float): The period being validated
  - `'window_size'` (int): Size of comparison window used
- `float` (when method is specific metric): The requested metric value

**Use cases:**
- Verifying that detected periods are real patterns, not artifacts
- Filtering false positives in noisy data
- Assessing confidence before using period for predictions

**Interpretation:**
- **Correlation > 0.7**: Strong periodic pattern (high confidence)
- **Correlation 0.5-0.7**: Moderate periodicity
- **Correlation < 0.3**: Weak or no periodicity (likely false positive)
- **Confidence > 0.8**: Very reliable period
- **Confidence < 0.4**: Period questionable, use with caution

**Example:**
```python
# Find and validate dominant period
data = numpy.sin(numpy.linspace(0, 4*numpy.pi, 100))
dominant = DiscreteFourier.find_dominant_period(data)
validation = DiscreteFourier.validate_period(data, dominant['period'])

if validation['valid']:
    print(f"Period {validation['period']:.1f} is valid!")
    print(f"Confidence: {validation['confidence']:.2f}")
    print(f"Correlation: {validation['correlation']:.2f}")
else:
    print("Period may not be reliable")

# Validate all top periods
top_periods = DiscreteFourier.find_top_periods(data, n_periods=5)
for p in top_periods:
    val = DiscreteFourier.validate_period(data, p['period'])
    print(f"Period {p['period']:.1f}: confidence={val['confidence']:.2f}")
```

## Mathematical Background

The Discrete Fourier Series represents a periodic signal as a sum of sinusoids:

```
f(t) = a₀ + Σ[aₖ·cos(2πkt/N) + bₖ·sin(2πkt/N)]
```

Where:
- N is the number of data points
- k ranges from 1 to N/2
- a₀ is the mean (DC component)

The coefficients are computed using:

```
a₀ = mean(data)
aₖ = (2/N)·Σ[data[i]·cos(2πki/N)]  for k = 1..N/2
bₖ = (2/N)·Σ[data[i]·sin(2πki/N)]  for k = 1..N/2-1
```

## Use Cases

- **Signal Processing**: Analyze periodic signals and extract frequency components
- **Time Series Analysis**: Smooth noisy data and identify cyclical patterns
- **Data Interpolation**: Fill missing values in periodic sequences
- **Trend Analysis**: Calculate derivatives to detect trends and turning points
- **Financial Analysis**: Model cyclical patterns in market data
- **Scientific Computing**: Represent periodic phenomena mathematically
- **Education**: Teaching Fourier series with clear, interpretable real coefficients
- **Physics**: Modeling oscillatory systems (springs, pendulums, waves)
- **Economics**: Analyzing seasonal patterns and business cycles

## Limitations

- **Periodicity Assumption**: Fourier series assumes the signal is periodic. Extrapolation beyond the original data will repeat the pattern.
- **Even Length**: Input data is adjusted to even length (first element removed if odd).
- **Discontinuities**: Sharp jumps in data may cause Gibbs phenomenon (ringing artifacts).

## Example: Complete Workflow

```python
import numpy as np
from discrete_fourier import DiscreteFourier

# Create sample data (sine wave with noise)
N = 100
t = np.linspace(0, 4*np.pi, N)
data = np.sin(t) + 0.1 * np.random.randn(N)

# Step 1: Calculate Fourier coefficients
coefs = DiscreteFourier.calculate_fourier_coefs(data.tolist())

# Step 2: Reconstruct the smoothed signal
reconstructed = [
    DiscreteFourier.calculate_fourier_value(coefs, i+1) 
    for i in range(N)
]

# Step 3: Find local maxima (where derivative changes from + to -)
derivatives = [
    DiscreteFourier.calculate_fourier_derivative_value(coefs, i+1)
    for i in range(N)
]

# Step 4: Predict next 10 values
predictions = [
    DiscreteFourier.calculate_fourier_value(coefs, N + i + 1)
    for i in range(10)
]

print(f"Predicted next values: {predictions}")
```

## Requirements

- Python 3.8+
- NumPy

## License

See LICENSE file for details.

## Author

Ricardo Marcelo Alvarez  
Date: 2025-12-19

## Contributing

Contributions are welcome! Please feel free to submit issues or pull requests.

