upoqa.utils package

upoqa.utils.interp_set module

Interpolation Set

Maintain two classes which represent the elemental and overall interpolation sets respectively.

class upoqa.utils.interp_set.InterpSet(n, npt=None)[source]

Bases: object

The interpolation set for a quadratic surrogate model.

Parameters:
  • n (int) – Problem dimension

  • npt (int, optional) – Number of interpolation points. By default, it is set to 2 * n + 1. If the user provides a custom value, it should fall within the range [2 * n + 1, (n + 1) * (n + 2) / 2].

n

Problem dimension

Type:

int

npt

Number of interpolation points. By default, it is set to 2 * n + 1.

Type:

int

interp_set_Y

Interpolation points

Type:

ndarray, shape (npt, n)

interp_set_fval

Objective function values at the interpolation points

Type:

ndarray, shape (npt,)

x_opt_idx

Index of the interpolation point with the lowest objective function value.

Type:

int

x_anchor_idx

Index of the interpolation point used as the center for solving the trust region subproblem.

Type:

int

x_opt

See x_opt property

x_anchor

See x_anchor property

f_opt

See f_opt property

f_anchor

See f_anchor property

Attributes:
f_anchor

value of objective function at the anchor point (self.x_anchor)

f_opt

The lowest objective function value in the interpolation set

x_anchor

The interpolation point serving as the center for the trust region subproblem

x_opt

The interpolation point with the lowest objective function value

Parameters:
  • n (int)

  • npt (int | None)

Methods

get_anchor()

Retrieve the interpolation point used as the center for solving the trust region subproblem.

get_interp_set([idx])

Retrieve the interpolation points or a single point at a specified index.

get_opt()

Retrieve the interpolation point with the lowest objective function value from the interpolation set.

init_interp_set(fun, center[, step_size])

Initialize the interpolation set and computes the corresponding objective function values.

init_interp_set_Y(center[, step_size])

Initialize the interpolation set.

set_interp_set_fval(fvals)

Set objective function values of the interpolation set to the provided values fvals.

update_point_on_idx(x, idx, fval[, ...])

Replace the interpolation point at index idx with a new point x and update its corresponding function value fval.

update_x_opt()

Update the optimal interpolation point (self.x_opt) based on the current objective function values in the interpolation set.

property f_anchor: float

value of objective function at the anchor point (self.x_anchor)

property f_opt: float

The lowest objective function value in the interpolation set

get_anchor()[source]

Retrieve the interpolation point used as the center for solving the trust region subproblem.

Returns:

  • ndarray, shape (n,) – Interpolation point serving as the center for the trust region subproblem

  • float – Value of objective function at the anchor point

Return type:

Tuple[ndarray, float]

get_interp_set(idx=None)[source]

Retrieve the interpolation points or a single point at a specified index.

Parameters:

idx (int, optional) – Index of the interpolation point to retrieve. If not provided, return the entire interpolation set.

Returns:

  • ndarray, shape (npt, n) or ndarray, shape (n,) – The interpolation set if idx is None, or the interpolation point at the specified index idx.

  • ndarray, shape (npt,) or float – Values of objective function for the interpolation set if idx is None, or objective function value at the interpolation point with index idx.

Return type:

Tuple[ndarray, ndarray | float]

get_opt()[source]

Retrieve the interpolation point with the lowest objective function value from the interpolation set.

Returns:

  • ndarray, shape (n,) – Interpolation point with the lowest objective function value.

  • float – The lowest objective function value found in the interpolation set.

Return type:

Tuple[ndarray, float]

init_interp_set(fun, center, step_size=1.0)[source]

Initialize the interpolation set and computes the corresponding objective function values.

Parameters:
  • fun (callable) –

    A callable that takes a point in the interpolation set and returns its objective function value.

    fun(x: np.ndarray) -> (ndarray | float)

    It should internally track its evaluation count and raise MaxEvalNumReached if the maximum number of evaluations is exceeded, or ValueError if the function returns an invalid value.

  • center (ndarray, shape (n,)) – Center point around which the interpolation set is generated.

  • step_size (float, default=1.0) – The step size used to generate interpolation points around the center point.

init_interp_set_Y(center, step_size=1.0)[source]

Initialize the interpolation set.

The center point is set to center, the next 2n points are constructed as a CFD (central finite difference) stencil, and the remaining points are generated by combining previous points.

Parameters:
  • center (ndarray of shape (n,)) – Center point of the interpolation set. Must have size equal to the problem dimension.

  • step_size (float, default=1.0) – Step size used to generate interpolation points around the center.

Raises:

AssertionError – If center size does not match problem dimension

Return type:

None

References

[1]

Tom M. Ragonneau. Model-Based Derivative-Free Optimization Methods and Software. PhD thesis, Department of Applied Mathematics, The Hong Kong Polytechnic University, Hong Kong, China, 2022. URL: https://theses.lib.polyu.edu.hk/handle/200/12294.

Notes

The interpolation set is constructed as follows:

  • Point 0: center

  • Points 1 to n: center + step_size * e_i

  • Points n+1 to 2*n: center - step_size * e_i

  • Points > 2*n: combinations of previous points

set_interp_set_fval(fvals)[source]

Set objective function values of the interpolation set to the provided values fvals.

Parameters:
  • fval (nd.array, shape (npt,)) – The user-defined objective function values for the interpolation points.

  • fvals (ndarray)

Return type:

None

update_point_on_idx(x, idx, fval, update_x_anchor=True)[source]

Replace the interpolation point at index idx with a new point x and update its corresponding function value fval.

Parameters:
  • x (ndarray, shape (n,)) – The new point to be added to the interpolation set.

  • idx (int) – The index of the point to be updated.

  • fval (float) – The function value at x.

  • update_x_anchor (bool)

Return type:

None

update_x_opt()[source]

Update the optimal interpolation point (self.x_opt) based on the current objective function values in the interpolation set.

If self.x_anchor is None, it will be updated to self.x_opt.

Return type:

None

property x_anchor: ndarray

The interpolation point serving as the center for the trust region subproblem

property x_opt: ndarray

The interpolation point with the lowest objective function value

class upoqa.utils.interp_set.OverallInterpSet(n, set_size, max_size, ele_interp_sets, proj_onto_ele, ele_names)[source]

Bases: object

The interpolation set for an overall surrogate model of the partially separable form \(m(x) = \sum_i m_i(x)\), where each \(m_i\) is an element model.

Note that this interpolation set does not directly use the stored points to construct the interpolation system; its primary functionality is to manage the elemental interpolation sets.

Parameters:
  • n (int) – Problem dimension

  • set_size (int) – The initial number of points in the interpolation set

  • max_size (int) – Maximum number of points that the interpolation set can hold

  • ele_interp_sets (dict) – A dictionary of interpolation sets, one for each elemental surrogate model.

  • proj_onto_ele (callable) –

    Projection function mapping full-space points to element subspaces:

    (full_point: ndarray, element_name: Any) -> element_point: ndarray
    

    For example, proj_onto_ele([1.0, 2.0], "f_1") yields [1.0,], where "f_1" denotes a function that depends only on the first variable of x.

  • ele_names (List[Any])

n

Problem dimension

Type:

int

ele_interp_sets

A dictionary of interpolation sets, one for each element model.

Type:

dict

proj_onto_ele

Projection function mapping full-space points to element subspaces:

(full_point: ndarray, element_name: Any) -> element_point: ndarray

For example, proj_onto_ele([1.0, 2.0], "f_1") yields [1.0,], where "f_1" denotes a function that depends only on the first variable of x.

Type:

callable

ele_names

List of names for each element function

Type:

list

ele_idxs

List of indices corresponding to the element functions

Type:

list of int

npts

Number of points in each elemental interpolation set

Type:

np.ndarray, shape (ele_num,)

interp_set_Y

Interpolation points for the overall surrogate model

Type:

ndarray, shape (set_size, n)

interp_set_fval

Objective function values at the interpolation points for the overall surrogate model

Type:

ndarray, shape (set_size,)

interp_set_fval_eles

Values of element functions at the interpolation points for the overall surrogate model

Type:

list of list of float

interp_set_extra_fval

Values of the extra objective function at the interpolation points for the overall surrogate model. The “extra function” represents the differentiable white-box component of the objective.

Type:

ndarray

x_opt_idx

Index of the point with the lowest objective function value in the overall interpolation set

Type:

int

enable

Boolean array indicating whether each point in the interpolation set is enabled (True) or disabled (False).

Type:

np.ndarray, shape (set_size,)

ele_num

See ele_num property

set_size

See set_size property

disable

See disable property

capacity

See capacity property

unallocated_idx

See unallocated_idx property

allocated_idx

See allocated_idx property

x_opt

See x_opt property

f_opt

See f_opt property

Attributes:
allocated_idx

Indices of allocated (enabled) points in the interpolation set

capacity

Total capacity of the interpolation set, including enabled and disabled points.

disable

Boolean array indicating which points are disabled in the interpolation set

ele_num

Number of elements

f_opt

Lowest objective function value in the interpolation set

set_size

Number of currently enabled points in the interpolation set

unallocated_idx

Indices of unallocated (disabled) points in the interpolation set

x_opt

Interpolation point with the lowest objective function value

Parameters:
  • n (int)

  • set_size (int)

  • max_size (int)

  • ele_interp_sets (List[InterpSet])

  • proj_onto_ele (Callable[[ndarray, Any], ndarray])

  • ele_names (List[Any])

Methods

append(x, fval, fval_eles[, extra_fval])

Add a new point and its associated function values to the interpolation set.

clear()

Clear the entire interpolation set, resetting all points and associated data to their default values.

clear_invalid_point()

Delete the worst points from the interpolation set to ensure the number of maintained points does not exceed self.max_size.

clear_with_only_one_left()

Delete all but the optimal point from the interpolation set, leaving only one point.

delete_points(delete_idx)

Delete specified points from the interpolation set.

get_ele_fvals(idx)

Retrieve the element function values and the extra function value at a specific index in the interpolation set.

get_interp_set([idx])

Retrieve the interpolation points or a single point at a specific index.

get_opt([verbose])

Retrieve the interpolation point with the lowest objective function value and its associated data.

update_fval_with_new_weights_and_xforms(...)

Update all objective function values in the interpolation set based on new weights and transformations.

update_point_on_idx(x, idx_eles, fval_eles, fval)

Update the interpolation set with a new point, including its element function values and overall objective function value.

update_x_opt()

Update the optimal interpolation point (self.x_opt) based on the current objective function values in the interpolation set.

property allocated_idx: ndarray

Indices of allocated (enabled) points in the interpolation set

append(x, fval, fval_eles, extra_fval=0.0)[source]

Add a new point and its associated function values to the interpolation set.

Parameters:
  • x (ndarray, shape (n,)) – The point to be added to the interpolation set

  • fval (float) – Value of objective function at x

  • fval_eles (list of float) – Values of element functions at x

  • extra_fval (float, default=0.0) – Value of extra objective function at x. The “extra function” represents the differentiable white-box component of the objective.

Returns:

The index (or indices) where the point and its values were stored in the interpolation set.

Return type:

ndarray

property capacity: int

Total capacity of the interpolation set, including enabled and disabled points.

clear()[source]

Clear the entire interpolation set, resetting all points and associated data to their default values.

Return type:

None

clear_invalid_point()[source]

Delete the worst points from the interpolation set to ensure the number of maintained points does not exceed self.max_size.

Notes

Points are deleted based on their objective function values, with the worst (highest) values being removed first.

Return type:

None

clear_with_only_one_left()[source]

Delete all but the optimal point from the interpolation set, leaving only one point.

Notes

The optimal point is determined by the lowest objective function value.

Return type:

None

delete_points(delete_idx)[source]

Delete specified points from the interpolation set.

Parameters:

delete_idx (array_like or int) – Indices (or a single index) of the points to delete from the interpolation set.

Raises:

RuntimeWarning – If attempting to delete a point that does not exist in the interpolation set.

Return type:

None

property disable: ndarray

Boolean array indicating which points are disabled in the interpolation set

property ele_num: int

Number of elements

property f_opt: float

Lowest objective function value in the interpolation set

get_ele_fvals(idx)[source]

Retrieve the element function values and the extra function value at a specific index in the interpolation set.

Parameters:

idx (int) – Index of the point in the interpolation set

Returns:

  • list of float or None – Values of element functions at the interpolation point of the specified index. If the point does not exist, return None.

  • float or None – Value of the extra function at the interpolation point of the specified index. If the point does not exist, return None.

Return type:

Tuple[List[float] | List[None] | None, float | None]

get_interp_set(idx=None)[source]

Retrieve the interpolation points or a single point at a specific index.

Parameters:

idx (int, optional) – Index of the interpolation point to retrieve. If None, return the entire interpolation set.

Returns:

  • ndarray, shape (set_size, n) or ndarray, shape (n,) – The interpolation point set if idx is None, or the interpolation point at the specified index idx.

  • ndarray, shape (set_size,) or float – Objective function values at the interpolation points if idx is None, or objective function value at the interpolation point with index idx.

Return type:

Tuple[ndarray, ndarray | float]

get_opt(verbose=False)[source]

Retrieve the interpolation point with the lowest objective function value and its associated data.

Parameters:

verbose (bool, default=False) – If True, return additional information including the element function values and the extra function value.

Returns:

  • x_opt (ndarray, shape (n,)) – Interpolation point with the lowest objective function value

  • f_opt (float) – The lowest objective function value found in the interpolation set

  • fval_eles (list of float, optional) – Values of element functions at the optimal point, if verbose=True.

  • extra_fval (float, optional) – Value of the extra function at the optimal point, if verbose=True.

Return type:

Tuple[None, None] | Tuple[None, None, None, None] | Tuple[ndarray, float] | Tuple[ndarray, float, List[float], float]

property set_size: int

Number of currently enabled points in the interpolation set

property unallocated_idx: ndarray

Indices of unallocated (disabled) points in the interpolation set

update_fval_with_new_weights_and_xforms(weights, xforms, extra_fun_eval)[source]

Update all objective function values in the interpolation set based on new weights and transformations.

Parameters:
  • weights (ndarray, shape (ele_num,)) – Weights applied to the element function values

  • xforms (list of list of callables) – A list of transformations applied to the element function values. Each transformation is a callable function or None if no transformation is applied.

  • extra_fun_eval (callable) –

    The “extra function” included in the objective function, representing the differentiable white-box component of the objective:

    extra_fun_eval(x: ndarray) -> float

    where x is a 1-D array with shape (n,).

Returns:

True if the optimal point (x_opt) changes after the update, otherwise False.

Return type:

bool

update_point_on_idx(x, idx_eles, fval_eles, fval, extra_fval=0.0, need_update=None)[source]

Update the interpolation set with a new point, including its element function values and overall objective function value.

Parameters:
  • x (ndarray) – New point to be added to the interpolation set

  • idx_eles (list of int) – Point indices in the elemental interpolation sets corresponding to the new point.

  • fval_eles (list of float) – Element function values at the new point

  • fval (float) – Overall objective function value at the new point

  • extra_fval (float, default=0.0) – Extra objective function value at the new point. The “extra function” represents the differentiable white-box component of the objective.

  • need_update (list of bool, optional) – List of boolean flags indicating which elemental interpolation sets should be updated. If None, all elemental sets are updated. Default is None.

Returns:

The index of the updated point in the overall interpolation set.

Return type:

int

update_x_opt()[source]

Update the optimal interpolation point (self.x_opt) based on the current objective function values in the interpolation set.

Return type:

None

property x_opt: ndarray

Interpolation point with the lowest objective function value

upoqa.utils.manager module

Algorithm Manager

Maintain two classes that command the behaviors of the UPOQA algorithm.

class upoqa.utils.manager.ExitInfo(flag, msg, exception=None, traceback=None)[source]

Bases: object

Class that contains the exit information.

Parameters:
  • flag (int)

  • msg (str)

  • exception (Exception | None)

  • traceback (str | None)

flag

Flag that represents the type of the cause of an exit, and should be one of the following flags:

ExitStatus.AUTO_DETECT_RESTART_WARNING = 2  # warning, auto-detected restart criteria
ExitStatus.SLOW_WARNING = 1                 # warning, maximum number of slow (successful) iterations reached
ExitStatus.SUCCESS = 0                      # successful finish (rho = resolution_final, or reach maxiter or maxfev)
ExitStatus.INPUT_ERROR = -1                 # error, bad inputs
ExitStatus.TR_INCREASE_ERROR = -2           # error, trust region step increased model value
ExitStatus.LINALG_ERROR = -3                # error, linalg error (singular matrix encountered)
ExitStatus.INVALID_EVAL_ERROR = -4          # error, funtion values with invalid shape or type or NaN encountered
ExitStatus.UNKNOWN_ERROR = -5               # error, raised by unknown error source
Type:

int

msg

Message that describes the detailed reason for an exit.

Type:

str

exception

The exception that caused the exit.

Type:

Exception, optional

traceback

The traceback of the exception that caused the exit.

Type:

str, optional

Methods

message([with_stem])

Return the message that describes the detailed reason for an exit.

message(with_stem=True)[source]

Return the message that describes the detailed reason for an exit.

Returns:

The message that describes the detailed reason for an exit. if with_stem is True, the returned message will be prefixed with “Error” or “Warning”.

Return type:

str

Parameters:

with_stem (bool)

class upoqa.utils.manager.ExitStatus[source]

Bases: object

AUTO_DETECT_RESTART_WARNING = 2
INPUT_ERROR = -1
INVALID_EVAL_ERROR = -4
LINALG_ERROR = -3
SLOW_WARNING = 1
SUCCESS = 0
TR_INCREASE_ERROR = -2
UNKNOWN_ERROR = -5
exception upoqa.utils.manager.MaxEvalNumReached(ele_idx=None, ele_name='')[source]

Bases: Exception

Exception raised when the maximum number of evaluations of the objective function has been reached.

Parameters:
  • ele_idx (int | None)

  • ele_name (Any)

Return type:

None

class upoqa.utils.manager.UPOQAManager(resolution_init, resolution_final, maxiter, maxfev, proj_onto_ele, coords, xform_bounds, params, ele_names, disp=0)[source]

Bases: object

Manage the execution and state of the UPOQA algorithm.

Parameters:
  • resolution_init (float) – The initial value of trust region resolution and elemental radii.

  • resolution_final (float) – The final value of trust region resolution and elemental radii.

  • maxiter (int) – Maximum number of iterations to go before termination.

  • maxfev (list of int) – Maximum function evaluations per element (list parallel to ele_names).

  • proj_onto_ele (callable) –

    Projection function mapping full-space points to element subspaces:

    (full_point: ndarray, element_name: Any) -> element_point: ndarray
    

    For example, proj_onto_ele([1.0, 2.0], "f_1") yields [1.0,], where "f_1" denotes a function that depends only on the first variable of x.

  • coords (list of 1D arrays) – Variable indices for each element’s subspace (parallel to ele_names).

  • xform_bounds (list of tuples) – List in which each member is a tuple of the form (l_val, h_val), representing the interval type domain of each transformation function in xforms.

  • params (UPOQAParameterList) – Parameter list used for the algorithm.

  • ele_names (list) – List containing the names of all elements in order

  • disp (int, optional) –

    Verbosity level:

    • disp=0: silent

    • disp=1: brief

    • disp=2: verbose

    • disp=3: verbose with debug information

n

Problem dimension

Type:

int

npts

Number of interpolation points for each element

Type:

ndarray, shape (ele_num,)

resolution

Current trust region resolution

Type:

float

radii

Trust-region radii for each element.

Type:

ndarray

ele_names

List containing the names of all elements in order

Type:

list

ele_num

Number of elements

Type:

int

proj_onto_ele

Projection function mapping full-space points to element subspaces:

(full_point: ndarray, element_name: Any) -> element_point: ndarray

For example, proj_onto_ele([1.0, 2.0], "f_1") yields [1.0,], where "f_1" denotes a function that depends only on the first variable of x.

Type:

callable

interp_set

The interpolation set used by the algorithm

Type:

OverallInterpSet

model

The surrogate model used by the algorithm

Type:

OverallSurrogate

ele_models

See ele_models property

xforms

See xforms property

weights

See weights property

extra_fun

See extra_fun property

params

Parameter list used for the algorithm

Type:

UPOQAParameterList

maxiter

Maximum number of iterations to go before termination

Type:

int

maxfev

Maximum function evaluations per element (list parallel to ele_names).

Type:

list of int

nfev

Numbers of element function evaluations performed so far

Type:

list of int

nrun

Number of runs (restarts) performed so far

Type:

int

disp

Verbosity level:

  • disp=0: silent

  • disp=1: brief

  • disp=2: verbose

  • disp=3: verbose with debug information

Type:

int

x_opt

See x_opt property

f_opt

See f_opt property

Attributes:
average_radius

The average of elemental trust-region radii

ele_interp_sets

List of elemental interpolation sets

ele_models

List of elemental surrogate models

extra_fun

The white-box component \(f_0\) of the objective function, which is a list of length 3 containing callables to evaluate the function values, gradients, and Hessians.

f_opt

Value of objective function at the point with the lowest objective function value among the currently tracked iterates in the overall interpolation point set.

max_radius

The maximum of elemental trust-region radii

n

Problem dimension

npts

Number of interpolation points for each element

weights

Weights \(w_1, \ldots, w_q\) for each element in the model

x_opt

The point with the lowest objective function value among the currently tracked iterates in the overall interpolation point set.

xforms

Transformations \(h_1, \ldots, h_q\) applied to element function outputs

Parameters:
  • resolution_init (float)

  • resolution_final (float)

  • maxiter (int)

  • maxfev (List[int])

  • proj_onto_ele (Callable[[ndarray, Any], ndarray])

  • coords (List[List | ndarray])

  • xform_bounds (List[Tuple[float]])

  • params (UPOQAParameterList)

  • ele_names (List[Any])

  • disp (int)

Methods

build_fval(fval_eles, extra_fval)

Construct overall function value from element contributions.

calc_detailed_decrease(xk, x_hold, ...)

Obtain extra information about the decrease of values of surrogate model and objective function for each element.

check_slow_iteration()

Monitor optimization progress and detect slow convergence patterns.

couple_with_utils(interp_set, model)

Connect the manager with interpolation set and surrogate model utilities.

decide_to_restart_due_to_exploding_coeff()

Decide whether to restart using the auto-detection strategy.

find_best_start_point()

Refine the user-specified starting point by exploring multiple solutions to a Constraint Satisfaction Problem (CSP) using the Minimum Remaining Values (MRV) heuristic.

get_geometry_step(idx_eles, want_GI)

Compute and return the geometry-improving steps for the elemental interpolation sets.

get_index_to_remove([x, center])

Select the interpolation points to be deleted for each element based on possibly provided new point x.

get_reduction_ratio(fval, old_fval, ...[, rreg])

Compute the trust-region reduction ratios between actual and predicted decreases, both for the overall objective function and for each element function.

get_update_requests(idx_to_replaced_eles, ...)

Determine whether each elemental model and interpolation set need to be updated based on the trust region step sizes in each elemental subspace and the given intermediate quantities about the determinant when updating the KKT matrix for each elemental interpolation system.

has_reached_maxfev()

Check if any element has exceeded its maximum function evaluation limit.

init_coord_overlap_relation()

Initialize adjacency relationships between elements based on their coordinate overlaps.

prepare_for_a_restart([force])

Prepare for a restart.

reduce_resolution()

Reduce the resolution of the trust-region.

reset_restart_track_info()

Reset tracking information used for automatic restart detection.

set_radius(radius[, ele_idx])

Set the trust region radius to radius.

set_resolution(resolution)

Set the trust region resolution to resolution.

set_weights(weights)

Update the weights of all elements in the model.

set_xforms(xforms)

Update the transformation functions for element outputs.

shift_center_to(x)

Shift the centers of overall and element models to x if movement exceeds threshold, then update the coefficients of the models and the interpolation systems.

soft_restart(funs)

Perform a soft restart of the optimization process by resetting key parameters and improving the geometry of the interpolation set.

update_radius(tr_vios, ratio, ratios, ...[, ...])

Update elemental trust region radii based on the combined separation criterion suggested by [1].

update_restart_track_info(xk, old_grad, old_hess)

Track gradient and Hessian changes to detect when a restart is needed.

update_weights(weights)

Update model weights, handling both list and dictionary inputs.

update_xforms(xforms)

Update transformation functions with input validation.

wrap_xforms_with_bounds(xforms)

Wrap transformation functions with bounds checking for each element.

property average_radius: float

The average of elemental trust-region radii

build_fval(fval_eles, extra_fval)[source]

Construct overall function value from element contributions.

Combines element values (optionally transformed) with weights and extra value. Performs detailed NaN checking when debug mode is enabled.

Parameters:
  • fval_eles (list of float) – Function values from each element.

  • extra_fval (float) – Function value of the white-box component to include in the total.

Returns:

The calculated overall function value.

Return type:

float

Raises:

ValueError – If NaN values are detected in element contributions or final result.

calc_detailed_decrease(xk, x_hold, old_fval_eles, fval_eles)[source]

Obtain extra information about the decrease of values of surrogate model and objective function for each element. The returned values are then used to determine whether to expand or shrink the trust-region radius for each element.

Parameters:
  • xk (ndarray) – The current iterate of the optimization process

  • x_hold (ndarray) – The trial point xk + sk, where sk is the trust-region step.

  • old_fval_eles (list of float) – Objective function values of each element at the iterate xk.

  • fval_eles (list of float) – Objective function values of each element at the trial point xk + sk.

Returns:

  • decrease_eles (list of float) – Predicted decrease of each element model.

  • decrease_wx_m_eles (list of float) – Decrease of the weighted and transformed value of each element model. Note that for each element, the default weight is 1.0, and the default transformation (xform) is the identity.

  • decrease_wx_f_eles (list of float) – Decrease of the weighted and transformed value of each element function.

  • fval_wx_eles (list of float) – Weighted and transformed values of each element function at the trial point xk + sk.

Return type:

Tuple[List[float], List[float], List[float], List[float]]

check_slow_iteration()[source]

Monitor optimization progress and detect slow convergence patterns.

Returns:

  • bool – True if current iteration shows insufficient progress (slow iteration), False otherwise. Progress is measured by comparing the current function value with the value from history_for_slow iterations ago.

  • bool – True if maximum allowed consecutive slow iterations (max_slow_iters) has been reached, indicating potential need for restart/termination, False otherwise.

Return type:

Tuple[bool, bool]

couple_with_utils(interp_set, model)[source]

Connect the manager with interpolation set and surrogate model utilities.

Parameters:
  • interp_set (OverallInterpSet) – The overall interpolation set.

  • model (OverallSurrogate) – The overall surrogate model.

Return type:

None

decide_to_restart_due_to_exploding_coeff()[source]

Decide whether to restart using the auto-detection strategy.

Returns:

True if restart conditions are met, False otherwise.

Return type:

bool

Notes

The decision is based on three criteria:

  1. Trust region radius history analysis (flat/down/up trends)

  2. Gradient coefficient growth rate (log-linear regression slope)

  3. Hessian coefficient growth rate (log-linear regression slope)

A restart is triggered when:

  • No radius increases observed (only flat/decreasing)

  • Both gradient and Hessian coefficients show significant increasing trends

Correlation coefficients exceed minimum thresholds

property ele_interp_sets: List[InterpSet]

List of elemental interpolation sets

property ele_models: List[QuadSurrogate] | None

List of elemental surrogate models

property extra_fun: List[Callable[[ndarray], ndarray | list | float]] | None

The white-box component \(f_0\) of the objective function, which is a list of length 3 containing callables to evaluate the function values, gradients, and Hessians.

property f_opt: float | None

Value of objective function at the point with the lowest objective function value among the currently tracked iterates in the overall interpolation point set.

find_best_start_point()[source]

Refine the user-specified starting point by exploring multiple solutions to a Constraint Satisfaction Problem (CSP) using the Minimum Remaining Values (MRV) heuristic. In this CSP:

A state is a complete point x.

The constraints require that for any element index ele_idx, the

projection self.proj_onto_ele(x, ele_idx) must lie within the ele_idx-th interpolation set.

By default, we search for at most self.params("init.search_opt_x0_max_trials") points.

Return type:

Tuple[ndarray, float, List[float], float, ExitInfo | None]

get_geometry_step(idx_eles, want_GI)[source]

Compute and return the geometry-improving steps for the elemental interpolation sets.

Parameters:
  • idx_eles (list of int) – List of indices indicating target positions in each elemental interpolation set where new points will be placed.

  • want_GI (list of bool) – List of flags indicating whether a geometry improvement step is needed for each elemental interpolation set.

Returns:

  • list – List of computed geometry-improving step vectors for each elemental interpolation set. Return None for elements where geometry improvement is not needed.

  • list – List of cached outputs of ele_model.get_determinant_ratio(x_GI) for each element model ele_model, where x_GI denotes the geometry-improving point for the current element. Return None for elements where geometry improvement is not needed.

Return type:

Tuple[List[ndarray | None], List[Tuple[ndarray, float, ndarray, ndarray, ndarray] | None]]

Notes

The method computes two potential steps for each element:

  1. A constrained Cauchy step

  2. A step along lines connecting interpolation points

The step with better geometry improvement (larger absolute sigma value) is selected. If both steps result in zero determinant ratio, raises a SurrogateLinAlgError.

get_index_to_remove(x=None, center=None)[source]

Select the interpolation points to be deleted for each element based on possibly provided new point x.

Parameters:
  • x (ndarray, shape (n,), optional) –

    The new candidate point to be added to the elemental interpolation sets.

    If None, for each ele_idx-th element, selects the farthest point in the elemental interpolation set from self.proj_onto_ele(center, ele_idx).

    If provided, for each ele_idx-th element, calculates sigma values (the denominator of the update formula of the interpolation system) by replacing each existing interpolation point with self.proj_onto_ele(x, ele_idx) and selects the one with the maximum (sigma * distance_to_center ** 4) metric.

  • center (ndarray, shape (n,), optional) – Center point used for distance calculations. If None, uses the optimal point from the overall interpolation set as the center.

Returns:

  • list of int – List of indices of points to be removed from each elemental interpolation set.

  • list of float – List of distances between the center point (self.proj_onto_ele(center, ele_idx) for ele_idx-th element) and the points to be removed, for each elemental interpolation set.

  • list

    List of cached outputs of

    ele_model.get_determinant_ratio(self.proj_onto_ele(x, ele_idx))

    for each element model ele_model if x is provided, else None for each element.

Return type:

Tuple[List[int], List[float], List[Tuple[ndarray, float, ndarray, ndarray, ndarray] | None]]

get_reduction_ratio(fval, old_fval, decrease, fval_eles, old_fval_eles, decrease_eles, rreg=None)[source]

Compute the trust-region reduction ratios between actual and predicted decreases, both for the overall objective function and for each element function.

Parameters:
  • fval (float) – Objective function value at the trial point xk + sk, where sk is the trust-region step.

  • old_fval (float) – Objective function value at the iterate xk.

  • decrease (float) – Decrease of the overall model.

  • fval_eles (list of float) – Objective function values of each element at the trial point xk + sk.

  • old_fval_eles (list of float) – Objective function values of each element at the iterate xk.

  • decrease_eles (list of float) – Decrease of each element model.

  • rreg (float, optional) – Regularization term to stabilize ratio calculation. Defaults to machine epsilon scaled by the function values.

Returns:

  • ratio (float) – The reduction ratio for the objective function. Return -1 if the denominator (predicted decrease) is invalid (positive or too close to zero).

  • ratios (list of float) – The reduction ratios for each element function. Return -1 if the denominator (predicted decrease) is invalid (positive or too close to zero).

  • exit_info (ExitInfo or None) – An ExitInfo object with an ExitStatus.TR_INCREASE_ERROR flag if the denominator is invalid, None otherwise.

Return type:

Tuple[float, List[float], ExitInfo | None]

get_update_requests(idx_to_replaced_eles, tr_vios, cached_kkt_info_eles)[source]

Determine whether each elemental model and interpolation set need to be updated based on the trust region step sizes in each elemental subspace and the given intermediate quantities about the determinant when updating the KKT matrix for each elemental interpolation system.

The decision is based on a scoring strategy, which is largely empirical. Updates that significantly degrade the numerical stability and the poisedness of the interpolation set are rejected, while others are accepted. The threshold is controlled by the parameter general.filter_element_point_thresh.

Parameters:
  • idx_to_replaced_eles (list of int) – A list where each value represents the index of an interpolation point in the interpolation set of the corresponding element. The algorithm needs to decide whether to replace the marked interpolation point with a new one for each element.

  • tr_vios (ndarray, shape (ele_num,)) – Trust region violation ratios for each element. Trust region violation ratio denotes the ratio of the norm of the trust region step to the radius.

  • cached_kkt_info_eles (list of tuples) –

    List of cached outputs of

    ele_model.get_determinant_ratio(self.proj_onto_ele(xk + sk, ele_idx))

    for each element model ele_model, where xk denotes the current iterate, sk denotes the trust-region step.

Returns:

  • flags (list of bool) – A list indicating whether each elemental model and interpolation set need to be updated.

  • exit_info (ExitInfo or None) – Return None if any update is accepted. Otherwise, return an ExitInfo object with an ExitStatus.LINALG_ERROR flag and terminate the algorithm.

Return type:

Tuple[List[bool], ExitInfo | None]

has_reached_maxfev()[source]

Check if any element has exceeded its maximum function evaluation limit.

Return type:

bool

init_coord_overlap_relation()[source]

Initialize adjacency relationships between elements based on their coordinate overlaps.

property max_radius: float

The maximum of elemental trust-region radii

property n: int | None

Problem dimension

property npts: ndarray | None

Number of interpolation points for each element

prepare_for_a_restart(force=False)[source]

Prepare for a restart. This function should be called before launching a restart, and the return value is to be checked to determine whether a restart is allowed.

Parameters:

force (bool, default=False) – If True, bypasses all restart checks (for debugging purposes only).

Returns:

exit_info – None if restart is permitted, otherwise an ExitInfo object containing the reason for preventing restart.

Return type:

ExitInfo or None

reduce_resolution()[source]

Reduce the resolution of the trust-region.

Returns:

True if resolution has already reached its minimum value (radius_final), False otherwise.

Return type:

bool

reset_restart_track_info()[source]

Reset tracking information used for automatic restart detection.

Return type:

None

set_radius(radius, ele_idx=None)[source]

Set the trust region radius to radius.

If ele_idx is specified, only updates the radius for that element. Otherwise, updates all elements’ trust region radii. The radius is constrained to be: At least the resolution (if below θ₂ * resolution) At most θ₆ * resolution

Parameters:
  • radius (float) – The new trust region radius.

  • ele_idx (int, optional) – Index of the element whose radius is to be updated. (None means update all elements.)

Return type:

None

set_resolution(resolution)[source]

Set the trust region resolution to resolution.

Parameters:
  • radius (float) – The new trust region resolution.

  • resolution (float)

Return type:

None

set_weights(weights)[source]

Update the weights of all elements in the model.

Parameters:

weights (ndarray)

Return type:

None

set_xforms(xforms)[source]

Update the transformation functions for element outputs.

Parameters:

xforms (List[List[Callable[[ndarray], ndarray | float]] | None])

Return type:

None

shift_center_to(x)[source]

Shift the centers of overall and element models to x if movement exceeds threshold, then update the coefficients of the models and the interpolation systems.

Parameters:

x (ndarray, shape (n,)) – The new model center.

Return type:

None

soft_restart(funs)[source]

Perform a soft restart of the optimization process by resetting key parameters and improving the geometry of the interpolation set.

Parameters:

funs (callable) –

The element functions and the extra function (white-box component) to evaluate:

funs(x: ndarray) -> tuple[list[float], float]

Where x is the input vector, and the function returns a tuple containing:

  1. List of element function evaluations (list of floats)

  2. Extra function evaluation (float)

The element functions should be properly wrapped to be capable of raising an exception MaxEvalNumReached if the maximum number of evaluations has been reached.

Returns:

exit_info – None if restart succeeds, otherwise an ExitInfo object containing the reason for the failure.

Return type:

ExitInfo or None

update_radius(tr_vios, ratio, ratios, decrease_mval, decrease_extra_fval, decrease_wx_m_eles, decrease_wx_f_eles, spherical_step_radius=1.0, use_spherical_tr=False)[source]

Update elemental trust region radii based on the combined separation criterion suggested by [1].

Parameters:
  • tr_vios (ndarray, shape (ele_num,)) – Trust region violation ratios for each element. Trust region violation ratio denotes the ratio of the norm of the trust region step to the radius.

  • ratio (float) – Actual-to-predicted reduction ratio of the objective function value.

  • ratios (list of float) – Actual-to-predicted reduction ratios of the function value of each element function.

  • decrease_mval (float) – Decrease of the overall model value.

  • decrease_extra_fval (float) – Decrease of the extra function value. The extra function is the white-box component to include in the total objective function.

  • decrease_wx_m_eles (list of float) – Decrease of the weighted and transformed value of each element model. Note that for each element, the default weight is 1.0, and the default transformation (xform) is the identity.

  • decrease_wx_f_eles (list of float) – Decrease of the weighted and transformed value of each element function. Note that for each element, the default weight is 1.0, and the default transformation (xform) is the identity.

  • spherical_step_radius (float, default=1.0) – The ratio of the norm of the trust region step to the radius. This is only used when use_spherical_tr=True.

  • use_spherical_tr (bool, default=False) – Whether the spherical trust region is enabled. If enabled, the truncated conjugate gradient method is used to solve the trust region subproblem.

Return type:

None

References

[1] (1,2)

Johara S. Shahabuddin. 1996. Structured Trust Region Algorithms for the Minimization of Nonlinear Functions. Ph. D. Dissertation. Cornell University, USA.

update_restart_track_info(xk, old_grad, old_hess)[source]

Track gradient and Hessian changes to detect when a restart is needed.

Parameters:
  • xk (ndarray) – The current iterate of the optimization process.

  • old_grad (ndarray, shape (n,)) – Gradient from previous iteration (self.model.model_grad).

  • old_hess (ndarray, shape (n, n)) – Hessian from previous iteration (self.model.model_hess).

Return type:

None

Notes

Maintains three circular buffers tracking:

  1. Trust region radius history (restart_auto_detect_delta)

  2. Gradient change norms (restart_auto_detect_chg_grad)

  3. Hessian change norms (restart_auto_detect_chg_hess)

update_weights(weights)[source]

Update model weights, handling both list and dictionary inputs.

If weights is a dictionary, it maps element names to weight values. Missing elements retain their current weights.

Parameters:

weights (list or dict or ndarray) – New weights for model elements. Can be a 1D array, or a list of weights for all elements or a dictionary mapping element names to weights.

Return type:

None

update_xforms(xforms)[source]

Update transformation functions with input validation.

Handle both list and dictionary inputs. For dictionary inputs: Validate each value is a list of 3 callables, which are function, gradient and hessian. Missing elements retain their current transformations

Apply bounds wrapping if xform_bounds is enabled.

Parameters:

xforms (list or dict) –

New transformation functions. Can be either:

  1. A dictionary mapping element names to transformation lists

  2. A list of transformation lists for all elements

each transformation list should contain exactly 3 callables: [function, gradient, hessian].

Return type:

None

property weights: ndarray | None

Weights \(w_1, \ldots, w_q\) for each element in the model

wrap_xforms_with_bounds(xforms)[source]

Wrap transformation functions with bounds checking for each element.

Parameters:

xforms (list) – List of lists containing transformation functions for each element. Each inner list contains one element’s transformation’s function, gradient and hessian.

Returns:

List of lists containing wrapped transformation functions with bounds checking. Return None for elements with no transformations.

Return type:

list

property x_opt: ndarray | None

The point with the lowest objective function value among the currently tracked iterates in the overall interpolation point set.

property xforms: List[List[Callable[[ndarray], ndarray | float]] | None] | None

Transformations \(h_1, \ldots, h_q\) applied to element function outputs

upoqa.utils.model module

Interpolation Surrogate Model

Maintain two core classes:

  1. QuadSurrogate: Quadratic interpolation surrogate model using the derivative-free symmetric Broyden update [1]. Serves as the element model in UPOQA.

  2. OverallSurrogate: Overall surrogate model formed by assembling multiple element models.

References

[1]

Michael J. D. Powell. 2004. Least Frobenius norm updating of quadratic models that satisfy interpolation conditions. Math. Program. 100, 1 (May 2004), 183-215.

class upoqa.utils.model.OverallSurrogate(n, interp_set, proj_onto_ele, coords, ele_models, extra_fun, ele_names, params)[source]

Bases: object

Overall surrogate model formed by assembling multiple elemental surrogate models.

Represent a full-system surrogate as a sum of component models, each defined on a subspace of the full design space.

Parameters:
  • n (int)

  • interp_set (OverallInterpSet)

  • proj_onto_ele (Callable[[ndarray, Any], ndarray])

  • coords (List[List | ndarray])

  • ele_models (List[QuadSurrogate])

  • extra_fun (List[Callable[[ndarray], float]])

  • ele_names (List[Any])

  • params (UPOQAParameterList)

n

Problem dimension

Type:

int

interp_set

Interpolation set for the overall surrogate model

Type:

OverallInterpSet

proj_onto_ele

Projection function mapping full-space points to element subspaces:

(full_point: ndarray, element_name: Any) -> element_point: ndarray

For example, proj_onto_ele([1.0, 2.0], "f_1") yields [1.0,], where "f_1" denotes a function that depends only on the first variable of x.

Type:

callable

coords

Variable indices for each element’s subspace (parallel to ele_names).

Type:

list of 1D arrays

ele_models

List of elemental surrogate models

Type:

list

xforms

Transformations \(h_1, \ldots, h_q\) applied to element function outputs

Type:

list

weights

Weights \(w_1, \ldots, w_q\) for each element in the model

Type:

list

extra_fun

The white-box component \(f_0\) of the objective function, which is a list of length 3 containing callables to evaluate the function value, gradient, and Hessian.

Type:

list

params

Parameter list used for the algorithm

Type:

UPOQAParameterList

ele_names

List containing the names of all elements in order

Type:

list

model_center

Center of the surrogate model

Type:

ndarray, shape (n,)

Methods

fun_eval(x)

Evaluate the overall surrogate model at given point.

grad_eval(x)

Evaluate the gradient vector of the overall surrogate model at given point.

hess_eval(x)

Evaluate the full Hessian matrix of the overall surrogate model at given point.

hess_operator(x, v)

Compute Hessian-vector product for the overall surrogate model at given point.

set_weights(weights)

Set the weights for the elements.

set_xforms(xforms)

Set transformation functions for the elements.

shift_center_to(x)

Shift the model center to a new point x.

update(want_update[, cached_KKT_info_eles])

Update the element models based on the provided update flags and cached KKT information.

update_weights(weights)

Update the weights for the elements, handling both list and dictionary inputs.

update_xforms(xforms)

Update transformation functions with input validation.

fun_eval(x)[source]

Evaluate the overall surrogate model at given point.

Parameters:

x (ndarray, shape (n,)) – Evaluation point.

Returns:

fval – The model value of the overall surrogate model at the point x.

Return type:

float

grad_eval(x)[source]

Evaluate the gradient vector of the overall surrogate model at given point.

Parameters:

x (ndarray, shape (n,)) – Evaluation point.

Returns:

g – The gradient vector of the overall surrogate model at the point x.

Return type:

ndarray, shape (n, )

hess_eval(x)[source]

Evaluate the full Hessian matrix of the overall surrogate model at given point.

Parameters:

x (ndarray, shape (n,)) – Evaluation point.

Returns:

H – The Hessian matrix of the overall surrogate model at the point x.

Return type:

ndarray, shape (n, n)

hess_operator(x, v)[source]

Compute Hessian-vector product for the overall surrogate model at given point.

Parameters:
  • x (ndarray, shape (n,)) – The point at which to evaluate the Hessian-vector product.

  • v (ndarray, shape (n, )) – The vector to multiply with the Hessian matrix.

Returns:

Hv – The result of the Hessian-vector product.

Return type:

ndarray, shape (n, )

set_weights(weights)[source]

Set the weights for the elements.

Parameters:

weights (-) – The weights to assign to each element.

Return type:

None

set_xforms(xforms)[source]

Set transformation functions for the elements.

Parameters:

xforms (list) – Transformation functions to apply to each element.

Return type:

None

shift_center_to(x)[source]

Shift the model center to a new point x. Note that this does not change the centers of the element models.

Parameters:

x (ndarray) – The new center point for the overall surrogate model.

Return type:

None

update(want_update, cached_KKT_info_eles=None)[source]

Update the element models based on the provided update flags and cached KKT information.

Parameters:
  • want_update (list of bool) – A list of flags indicating whether each element model should be updated.

  • cached_KKT_info_eles (list of tuples, optional) – List of cached outputs of ele_surrogate.get_determinant_ratio(x_ele_new) for each element model ele_surrogate, where x_ele_new denotes the new point just added into the corresponding elemental interpolation set.

Return type:

None

update_weights(weights)[source]

Update the weights for the elements, handling both list and dictionary inputs.

If weights is a dictionary, it maps element names to weight values. Missing elements retain their current weights.

This method should only be called by update_weights().

Parameters:

weights (list or dict or ndarray) – New weights for model elements. Can be a 1D array, or a list of weights for all elements or a dictionary mapping element names to weights.

Return type:

None

update_xforms(xforms)[source]

Update transformation functions with input validation. This method should only be called by update_xforms().

Parameters:

xforms (list) – List of updated transformation lists for all elements. each transformation list should contain exactly 3 callables: [function, gradient, hessian].

Return type:

None

class upoqa.utils.model.QuadSurrogate(interp_set, center=None, n=None, ref_surrogate=None)[source]

Bases: object

Quadratic interpolation surrogate model using the derivative-free symmetric Broyden update [1]. Serves as the elemental surrogate model in UPOQA.

Parameters:
  • interp_set (InterpSet) – The interpolation set. It must be provided in order for the interpolation surrogate model to be built based on the interpolation points.

  • center (ndarray, optional) – The initial center of the surrogate model. If provided, it overrides the n argument. Otherwise, the center will be set as a zero vector.

  • n (int, optional) – Problem dimension. Required if center is not provided. Note that either n or center must be provided.

  • ref_surrogate (QuadSurrogate, optional) – A reference surrogate model to help initialize the KKT coefficients and other cached information.

n

Problem dimension

Type:

int

interp_set

The interpolation set used to build the surrogate model

Type:

InterpSet

npt

Number of interpolation points in the interpolation set

Type:

int

model_center

Center of the surrogate model

Type:

ndarray, shape (n,)

model_cons

Constant term of the model at the center

Type:

float

model_grad

Gradient of the model at the center

Type:

ndarray, shape (n,)

model_hess_explicit

Explicit Hessian of the model

Type:

ndarray, shape (n, n)

model_hess_implicit

Implicit Hessian of the model

Type:

ndarray, shape (npt,)

References

[1]

Michael J. D. Powell. 2004. Least Frobenius norm updating of quadratic models that satisfy interpolation conditions. Math. Program. 100, 1 (May 2004), 183-215.

Attributes:
model_hess

Full Hessian matrix of the surrogate model

Parameters:

Methods

alt_grad_eval(x)

Evaluate gradient of a newly-instantiated surrogate model at point x.

fun_eval(X)

Evaluate the surrogate model at given point(s).

get_determinant_ratio(x_new[, idx])

Calculate key intermediate quantities (alpha, beta, tau, sigma) for updating the inverse KKT matrix in the derivative-free symmetric Broyden update [1].

grad_eval(X)

Evaluate the gradient vector(s) of the surrogate model at given point(s).

hess_eval(*arg, **kwargs)

Return the full Hessian matrix of the surrogate model.

hess_operator(V)

Compute Hessian-vector product(s) for the surrogate model.

inv_kkt_matrix_dot(x[, w_star])

Compute inv(W) @ x, where W is the coefficient matrix of the KKT equation generated by the derivative-free symmetric Broyden update [1], and return the vector formed by the first npt and last n dimensions of the result.

inv_kkt_matrix_partial_dot(x)

Compute inv(W) @ x, where W is the coefficient matrix of the KKT equation generated by the derivative-free symmetric Broyden update [1], and return the vector formed by the first npt and last n dimensions of the result.

reinit()

Reinitialize the surrogate model by first resetting and then updating the coefficients.

reset(interp_set, center[, ref_surrogate])

Reset the model with the given interpolation set, center point and dimension, and optionally a reference model.

reset_surrogate_coeff()

Reset all surrogate model coefficients to zero values.

shift_center_to(x)

Shift the model center to x, then update the KKT and model coefficients.

solve_interpolation_KKT([f_Y_shift])

Given the interpolation set, solves the first-order KKT optimality conditions for the derivative-free symmetric Broyden update subproblem and returns the solution: the Lagrange multipliers and the gradient update term.

update([cached_kkt_info])

Update the KKT and model coefficients using the method detailed in [1], [2] and [3].

alt_grad_eval(x)[source]

Evaluate gradient of a newly-instantiated surrogate model at point x.

Conceptually equivalent to:

  1. Creating a new surrogate model with current interpolation set

  2. Updating its coefficients from zero

  3. Returning its gradient at x

Avoids actual model reconstruction by leveraging KKT solution.

Parameters:

x (ndarray, shape (n,)) – Evaluation point for gradient calculation.

Returns:

g – Gradient vector of the conceptual surrogate model at x.

Return type:

ndarray

fun_eval(X)[source]

Evaluate the surrogate model at given point(s).

Parameters:
  • x (ndarray, shape (n,) or ndarray, shape (n_samples, n)) – Evaluation point(s). Can be a single point (1D) or batch of points (2D).

  • X (ndarray)

Returns:

fval – Model value(s). Scalar for single point input, array of shape (n_samples,) for multiple points.

Return type:

float or ndarray, shape (n_samples,)

get_determinant_ratio(x_new, idx=None)[source]

Calculate key intermediate quantities (alpha, beta, tau, sigma) for updating the inverse KKT matrix in the derivative-free symmetric Broyden update [1]. These quantities are defined in [2], where sigma (the denominator in the update equation) represents the ratio of the determinant of the inverse KKT matrix before and after update.

Parameters:
  • x_new (ndarray, shape (n,)) – New point to be added to the interpolation set (not yet added).

  • idx (int, optional) – Index of the interpolation point to be replaced by x_new. If provided, returns scalar values at this index; otherwise returns full arrays.

Returns:

  • alpha (ndarray or float) – Quantity alpha. Returns alpha[idx] if idx is provided.

  • beta (float) – Quantity beta.

  • tau (ndarray or float) – Quantity tau. Returns tau[idx] if idx is provided.

  • sigma (ndarray or float) – Quantity sigma. Returns sigma[idx] if idx is provided.

  • ndarray – Result of self.inv_kkt_matrix_dot(w), where: w = [0.5 * ((x_new - center) @ (Y - center).T) ** 2, x_new - center], where Y is of shape (npt, n) which represents the interpolation points.

Return type:

Tuple[ndarray | float, float, ndarray | float, ndarray | float, ndarray]

References

[1] (1,2)

Michael J. D. Powell. 2004. Least Frobenius norm updating of quadratic models that satisfy interpolation conditions. Math. Program. 100, 1 (May 2004), 183-215.

[2]

Michael J. D. Powell. 2004. On updating the inverse of a KKT matrix. Numerical Linear Algebra and Optimization, ed. Ya-xiang Yuan, Science Press (Beijing) (2004): 56-78.

grad_eval(X)[source]

Evaluate the gradient vector(s) of the surrogate model at given point(s).

Parameters:
  • x (ndarray, shape (n,) or ndarray, shape (n_samples, n)) – Evaluation point(s). Can be a single point (1D) or batch of points (2D).

  • X (ndarray)

Returns:

g – Gradient vector(s). Shape (n,) for single point, (n_samples, n) for multiple points.

Return type:

ndarray, shape (n, ) or ndarray, shape (n_samples, n)

hess_eval(*arg, **kwargs)[source]

Return the full Hessian matrix of the surrogate model.

Returns:

H – Hessian matrix (same as model_hess property).

Return type:

ndarray, shape (n, n)

hess_operator(V)[source]

Compute Hessian-vector product(s) for the surrogate model.

Parameters:

V (ndarray, shape (n, ) or ndarray, shape (n_samples, n)) – Input vector(s). Single vector (1D) or batch of vectors (2D).

Returns:

HV – Hessian-vector product(s). Shape (n,) for single input vector, (n_samples, n) for multiple vectors.

Return type:

ndarray, shape (n, ) or ndarray, shape (n_samples, n)

inv_kkt_matrix_dot(x, w_star=None)[source]

Compute inv(W) @ x, where W is the coefficient matrix of the KKT equation generated by the derivative-free symmetric Broyden update [1], and return the vector formed by the first npt and last n dimensions of the result.

In the multiplication, we can safely ignore the (npt + 1)-th dimension of x. Therefore, the input vector x only needs to include the first npt and the last n elements, instead of the full (npt + 1 + n)-dimensional vector.

Parameters:
  • x (ndarray, shape (npt + n,)) – The vector to be dotted with the inverse of the KKT matrix.

  • w_star (ndarray, shape (npt + n,), optional) –

    The auxiliary vector to assist in the calculation. It is formed by

    [0.5 * ((yi - center) @ (Y - center).T) ** 2, yi - center],

    where Y is of shape (npt, n) which represents the interpolation points. yi can be any point in the interpolation set, and is x_anchor by default.

Returns:

The vector formed by the first npt and last n dimensions of the result.

Return type:

ndarray, shape (npt + n,)

References

[1] (1,2)

Michael J. D. Powell. 2004. Least Frobenius norm updating of quadratic models that satisfy interpolation conditions. Math. Program. 100, 1 (May 2004), 183-215.

inv_kkt_matrix_partial_dot(x)[source]

Compute inv(W) @ x, where W is the coefficient matrix of the KKT equation generated by the derivative-free symmetric Broyden update [1], and return the vector formed by the first npt and last n dimensions of the result.

Unlike self.inv_kkt_matrix_dot(x), this method assumes that x is zero for indices [npt:], allowing for a simplified calculation.

Parameters:

x (ndarray, shape (npt + n,)) – The vector to be dotted with the inverse of the KKT matrix.

Returns:

The vector formed by the first npt and last n dimensions of the result.

Return type:

ndarray, shape (npt + n,)

References

[1] (1,2)

Michael J. D. Powell. 2004. Least Frobenius norm updating of quadratic models that satisfy interpolation conditions. Math. Program. 100, 1 (May 2004), 183-215.

property model_hess: ndarray

Full Hessian matrix of the surrogate model

reinit()[source]

Reinitialize the surrogate model by first resetting and then updating the coefficients.

Return type:

None

reset(interp_set, center, ref_surrogate=None)[source]

Reset the model with the given interpolation set, center point and dimension, and optionally a reference model.

Parameters:
  • interp_set (InterpSet) – The interpolation set. It must be provided in order for the interpolation surrogate model to be built based on the interpolation points.

  • center (ndarray) – The initial center of the surrogate model.

  • ref_surrogate (QuadSurrogate, optional) – A reference surrogate model to help initialize the KKT coefficients and other cached information.

Return type:

None

reset_surrogate_coeff()[source]

Reset all surrogate model coefficients to zero values.

Return type:

None

shift_center_to(x)[source]

Shift the model center to x, then update the KKT and model coefficients.

Parameters:

x (ndarray, shape (n,)) – The new model center.

Return type:

None

Notes

The computational cost is O(npt^3) with BLAS-3 operations.

solve_interpolation_KKT(f_Y_shift=<class 'numpy.ndarray'>)[source]

Given the interpolation set, solves the first-order KKT optimality conditions for the derivative-free symmetric Broyden update subproblem and returns the solution: the Lagrange multipliers and the gradient update term.

Parameters:

f_Y_shift (ndarray, shape (npt,)) – Represents fun(Y) - surrogate(Y), where Y denotes the interpolation points.

Returns:

  • lambda (ndarray, shape (npt,)) – The Lagrange multipliers.

  • g (ndarray, shape (n,)) – The gradient update term.

Return type:

Tuple[ndarray, ndarray]

update(cached_kkt_info=None)[source]

Update the KKT and model coefficients using the method detailed in [1], [2] and [3].

Parameters:

cached_kkt_info (tuple, optional) – If not None, it should be the cached output of self.get_determinant_ratio(x_new), where x_new denotes the new point just added into the interpolation set.

Return type:

None

References

[1] (1,2)

Michael J. D. Powell. 2004. Least Frobenius norm updating of quadratic models that satisfy interpolation conditions. Math. Program. 100, 1 (May 2004), 183-215.

[2] (1,2)

Michael J. D. Powell. 2004. On updating the inverse of a KKT matrix. Numerical Linear Algebra and Optimization, ed. Ya-xiang Yuan, Science Press (Beijing) (2004): 56-78.

[3] (1,2)

Michael J. D. Powell. 2006. The NEWUOA Software for Unconstrained Optimization without Derivatives. Springer US, Boston, MA, 255-297.

exception upoqa.utils.model.SurrogateLinAlgError(message, ele_idx=None, ele_name=None)[source]

Bases: Exception

Exception raised for linear algebra errors in the surrogate model

Parameters:
  • message (str)

  • ele_idx (int | None)

  • ele_name (Any | None)

Return type:

None

ele_idx

Index of the element function that caused the error. If the error is not related to a specific element, this is None.

Type:

int or None

ele_name

Name of the element function that caused the error. If the error is not related to a specific element, this is None.

Type:

Any or None

upoqa.utils.params module

param A container for all parameter values.:

class upoqa.utils.params.UPOQAParameterList(ele_names, ele_dims, maxfev, noise_level=0, seek_global_minimum=False, debug=False)[source]

Bases: ParameterList

Parameter list for UPOQA.

Parameters:
  • ele_names (list) – List containing the names of all elements in order.

  • ele_dims (list of int) – Elemental dimensions of the problem.

  • maxfev (list) – Maximum number of function evaluations to go before termination for each element.

  • noise_level (int, default=0) – Indicates how noisy the objective function is. Should be 0, 1, or 2. Default number of interpolation points (npt) increases as the noise_level increases, and when noise_level > 0, restart mechanism is enabled.

  • seek_global_minimum (bool, default=False) – Whether to seek a global minimum. Defaults to False. If True, restart mechanism is enabled, but the parameters for restart are different.

  • debug (bool, default=False) –

    Whether to enable debug mode. Defaults to False. If True, the algorithm will check for NaN values in the objective function and the surrogate model.

    Note that NaN check will incur additional runtime costs.

Methods

__call__(key[, new_value])

Access or update a parameter value.

check_all_params()

Check the types and boundary conditions of all parameters.

params_start_with([prefix])

Return a dictionary of all parameters that start with the given prefix.

check_param

param_type

check_all_params()[source]

Check the types and boundary conditions of all parameters.

Return type:

Tuple[bool, list]

param_type(key=<class 'str'>)[source]
Return type:

Tuple[str, bool, Any, Any]

upoqa.utils.projection module

Projection Operator

Implement several projection operators for the use of the projected gradient method.

upoqa.utils.projection.average_proj(s, coords, radii, tol=1e-08, maxit=None)[source]

Averaged projection onto the intersection of cylinders, see [1] for more details about its convergence.

Parameters:
  • s (ndarray, shape (n,)) – Initial point.

  • coords (ndarray or list of 1D array) – Boolean masks or index arrays for each element’s variables.

  • radii (ndarray, shape (ele_num,)) – Trust-region radii.

  • tol (float, default=1e-8) – Tolerance for convergence.

  • maxit (int, optional) – Maximum iterations (default: 10*n).

Returns:

  • ndarray, shape (n,) – Feasible point.

  • bool – True if projection actually moved the point.

References

[1]

Lewis, A.S., Luke, D.R. & Malick, J. Local Linear Convergence for Alternating and Averaged Nonconvex Projections. Found Comput Math 9, 485-513 (2009). https://doi.org/10.1007/s10208-008-9036-y

upoqa.utils.projection.dykstra_proj(s, coords_mask, radii, tol=1e-06, maxit=None)[source]

Dykstra’s projection method onto the intersection of cylinders, see [1].

Parameters:
  • s (ndarray, shape (n,)) – Initial point.

  • coords_mask (ndarray, shape (ele_num, n)) – Boolean mask; coords_mask[i] indicates the variables belonging to element i.

  • radii (ndarray, shape (ele_num,)) – Trust-region radii.

  • tol (float, default=1e-6) – Tolerance for convergence.

  • maxit (int, optional) – Maximum iterations (default: 10*n).

Returns:

  • ndarray, shape (n,) – Feasible point.

  • bool – True if projection actually moved the point.

References

[1]

Boyle, J. P.; Dykstr, R. L. (1986). “A Method for Finding Projections onto the Intersection of Convex Sets in Hilbert Spaces”. Advances in Order Restricted Statistical Inference. Lecture Notes in Statistics. Vol. 37. pp. 28-47. doi:10.1007/978-1-4613-9940-7_3. ISBN 978-0-387-96419-5.

upoqa.utils.projection.get_violation_ratio(s, coords, radii)[source]

Compute the ratio ‖s[coord]‖₂ / radius for each element.

Parameters:
  • s (ndarray, shape (n,)) – Trial point.

  • coords (ndarray, shape (ele_num, n) or list of ndarray) – List of index arrays, one for each element, or a mask array of which each line is a mask vector indicating the indices.

  • radii (ndarray, shape (ele_num,)) – Trust-region radii for each element.

Returns:

Violation ratios (>1 means infeasible).

Return type:

ndarray, shape (ele_num,)

upoqa.utils.projection.steinmetz_comb_proj(s, coords_mask, radii, pretol=1e-08, preit=4)[source]

Combined averaged and Steinmetz projection onto the intersection of cylinders.

First applies a small number of averaged-projection iterations, then switches to the Steinmetz shrinking procedure.

Parameters:
  • s (ndarray, shape (n,)) – Point to project.

  • coords_mask (ndarray, shape (ele_num, n)) – Boolean mask; coords_mask[i] indicates the variables belonging to element i.

  • radii (ndarray, shape (ele_num,)) – Trust-region radii.

  • pretol (float) – Tolerance for the initial averaged projection.

  • preit (int) – Maximum iterations for the initial averaged projection.

Returns:

  • ndarray, shape (n,) – Feasible point.

  • bool – True if projection actually moved the point.

upoqa.utils.projection.steinmetz_proj(s, coords_mask, radii)[source]

Approximate projection onto the intersection of cylinders (which forms a Steinmetz solid in the special case of two cylinders) using a shrinking strategy.

Parameters:
  • s (ndarray, shape (n,)) – Point to project.

  • coords_mask (ndarray, shape (ele_num, n)) – Boolean masks for each element’s variables.

  • radii (ndarray, shape (ele_num,)) – Trust-region radii.

Returns:

  • ndarray, shape (n,) – Feasible point.

  • bool – True if projection actually moved the point.

upoqa.utils.result module

Optimization Result

Adapted from scipy.optimize.OptimizeResult with minor modifications.

class upoqa.utils.result.OptimizeResult[source]

Bases: dict

Represent the optimization result.

message

Description of the cause of the termination.

Type:

str

success

Whether the optimization procedure terminated successfully.

Type:

bool

exception

Exception raised during the optimization that caused the termination, if any.

Type:

Exception, optional

traceback

Traceback of the exception, if any.

Type:

str, optional

fun

Value of objective function evaluated at the solution.

Type:

float

funs

Values of the element functions evaluated at the solution.

The type matches the input fun:

  • If fun was a dictionary, return a dictionary mapping element names to their function values.

  • If fun was a list, return a list of function values in the same order as the input list.

Type:

dict or list

extra_fun

Value of the extra function at the solution.

The “extra function” represents the differentiable white-box component of the objective.

Type:

float

x

The solution of the optimization.

Type:

ndarray

jac, hess

Values of objective function’s Jacobian and its Hessian at x. The Hessian is an approximation.

Type:

ndarray

nit

Number of iterations performed by the optimizer

Type:

int

nfev

Number of evaluations of element functions.

The type matches the input fun:

  • If fun was a dictionary, return a dictionary mapping element names to their numbers of function evaluations.

  • If fun was a list, return a list of evaluation numbers in the same order as the input list.

Type:

dict or list

max_nfev

Maximum number of evaluations of element functions.

Type:

int

avg_nfev

Average number of evaluations of element functions.

Type:

float

nrun

Number of runs (when enabling restarts).

Type:

int

manager

Algorithm manager (debugging only). Included only when debug=True or return_internals=True.

Type:

UPOQAManager, optional

interp_set

Interpolation point set (debugging only). Included only when debug=True or return_internals=True.

Type:

OverallInterpSet, optional

model

Surrogate model (debugging only). Included only when debug=True or return_internals=True.

Type:

OverallSurrogate, optional

Methods

clear()

copy()

fromkeys(iterable[, value])

Create a new dictionary with keys from iterable and values set to value.

get(key[, default])

Return the value for key if key is in the dictionary, else default.

items()

keys()

pop(key[, default])

If the key is not found, return the default if given; otherwise, raise a KeyError.

popitem(/)

Remove and return a (key, value) pair as a 2-tuple.

setdefault(key[, default])

Insert key with a value of default if key is not in the dictionary.

update([E, ]**F)

If E is present and has a .keys() method, then does: for k in E: D[k] = E[k] If E is present and lacks a .keys() method, then does: for k, v in E: D[k] = v In either case, this is followed by: for k in F: D[k] = F[k]

values()

upoqa.utils.sub_solver module

Trust Region Subproblem Solver

Implement several sub-routines for approximately solving trust-region sub-problems that arise in derivative-free optimization with partially separable structures:

  • cauchy_geometry / spider_geometry : geometry-improving steps along Cauchy / straight-line directions.

  • tangential_byrd_omojokun : truncated conjugate-gradient step on the tangential space.

  • conjugate_gradient_proj_steinmetz : modified projected CG for structured trust-regions using steinmetz projections.

The cauchy_geometry, spider_geometry, and tangential_byrd_omojokun solvers are drawn from the COBYQA library; see the original implementation at here.

upoqa.utils.sub_solver.cauchy_geometry(const, grad, curv, delta)[source]

Maximize approximately the absolute value of a quadratic function subject to bound constraints in a trust region.

This function solves approximately

\[\begin{split}\max_{s \in \mathbb{R}^n} \quad \bigg\lvert c + g^{\mathsf{T}} s + \frac{1}{2} s^{\mathsf{T}} H s \bigg\rvert \quad \text{s.t.} \quad \left\{ \begin{array}{l} l \le s \le u,\\ \lVert s \rVert \le \Delta, \end{array} \right.\end{split}\]

by maximizing the objective function along the constrained Cauchy direction.

Parameters:
  • const (float) – Constant \(c\) as shown above.

  • grad (ndarray, shape (n,)) – Gradient \(g\) as shown above.

  • curv (callable) –

    Curvature of \(H\) along any vector.

    curv(s) -> float

    returns \(s^{\mathsf{T}} H s\).

  • delta (float) – Trust-region radius \(\Delta\) as shown above.

Returns:

Approximate solution \(s\).

Return type:

ndarray, shape (n,)

Notes

This function is described as the first alternative in Section 6.5 of [1]. It is assumed that the origin is feasible with respect to the bound constraints and that delta is finite and positive.

References

[1]

Tom M. Ragonneau. Model-Based Derivative-Free Optimization Methods and Software. PhD thesis, Department of Applied Mathematics, The Hong Kong Polytechnic University, Hong Kong, China, 2022. URL: https://theses.lib.polyu.edu.hk/handle/200/12294.

upoqa.utils.sub_solver.conjugate_gradient_proj_steinmetz(fun, grad, hess_prod, coords_mask, n, deltas, envelope_delta, maxiter=None, **kwargs)[source]

Modified projected conjugate gradient method for a quadratic trust-region sub-problem with a structured trust-region.

The feasible region is the intersection of \(q\) cylinders

\[\mathcal{S} = \bigcap_{i=1}^{q} \left\{ s\in\mathbb{R}^{n} : \lVert s^{\mathcal{I}_i} \rVert_{2} \le \Delta_i \right\}.\]

This function solves approximately

\[\min_{s\in\mathcal{S}}\quad f(s).\]
Parameters:
  • fun (callable) – Overall surrogate model function \(f\) as shown above.

  • grad (callable) –

    Gradient \(\nabla f\) of the model function:

    grad(x) -> ndarray, shape (n,)

    returns the gradient at x.

  • hess_prod (callable) –

    Product of the Hessian matrix \(\nabla^2 f(x)\) with any vector \(v\):

    hess_prod(x, v) -> ndarray, shape (n,)

    returns the product \(\nabla^2 f(x) v\).

  • coords_mask (ndarray, shape (q, n)) – Bool mask indicating variable indices \(\mathcal{I}_i\) for each element.

  • n (int) – Problem dimension

  • deltas (ndarray) – Elemental trust-region radii \(\{\Delta_i\}_{i=1}^q\) as shown above.

  • envelope_delta (float) – A sufficiently large radius ensuring that the structured trust-region lies entirely within the ball centered at the origin with this radius.

  • maxiter (int, optional) – Maximum number of iterations. Default is None, which sets it to \(\max(20, 3n)\) in which \(2n\) steps are reserved for projection steps.

Returns:

Approximate solution to the trust-region subproblem.

Return type:

ndarray

Notes

This method uses Steinmetz projection (steinmetz_proj()) and combined projection (steinmetz_comb_proj()) to maintain feasibility with respect to the cylindrical constraints.

upoqa.utils.sub_solver.spider_geometry(const, grad, curv, xpt, delta)[source]

Maximize approximately the absolute value of a quadratic function subject to bound constraints in a trust region.

This function solves approximately

\[\begin{split}\max_{s \in \mathbb{R}^n} \quad \bigg\lvert c + g^{\mathsf{T}} s + \frac{1}{2} s^{\mathsf{T}} H s \bigg\rvert \quad \text{s.t.} \quad \left\{ \begin{array}{l} l \le s \le u,\\ \lVert s \rVert \le \Delta, \end{array} \right.\end{split}\]

by maximizing the objective function along given straight lines.

Parameters:
  • const (float) – Constant \(c\) as shown above.

  • grad (ndarray, shape (n,)) – Gradient \(g\) as shown above.

  • curv (callable) –

    Curvature of \(H\) along any vector.

    curv(s) -> float

    returns \(s^{\mathsf{T}} H s\).

  • xpt (ndarray, shape (npt, n)) – Points defining the straight lines. The straight lines considered are the ones passing through the origin and the points in xpt.

  • delta (float) – Trust-region radius \(\Delta\) as shown above.

Returns:

Approximate solution \(s\).

Return type:

ndarray, shape (n,)

Notes

This function is described as the second alternative in Section 6.5 of [1]. It is assumed that the origin is feasible with respect to the bound constraints and that delta is finite and positive.

References

[1]

Tom M. Ragonneau. Model-Based Derivative-Free Optimization Methods and Software. PhD thesis, Department of Applied Mathematics, The Hong Kong Polytechnic University, Hong Kong, China, 2022. URL: https://theses.lib.polyu.edu.hk/handle/200/12294.

upoqa.utils.sub_solver.tangential_byrd_omojokun(fun, grad, hess_prod, delta, n, **kwargs)[source]

Minimize approximately a quadratic function subject to bound constraints in a trust region.

This function solves approximately

\[\min_{s\in\mathbb{R}^n}\quad f(s) \quad \text{s.t.} \quad \lVert s \rVert \le \Delta.\]

using a variation of the truncated conjugate gradient method.

Parameters:
  • fun (callable) – Overall surrogate model function \(f\) as shown above.

  • grad (callable) –

    Gradient \(\nabla f\) of the model function:

    grad(x) -> ndarray, shape (n,)

    returns the gradient at x.

  • hess_prod (callable) –

    Product of the Hessian matrix \(\nabla^2 f(x)\) with any vector \(v\):

    hess_prod(x, v) -> ndarray, shape (n,)

    returns the product \(\nabla^2 f(x) v\).

  • delta (float) – Trust-region radius \(\Delta\) as shown above.

  • n (int) – Dimension of the function.

  • improve_tcg (bool, optional) – If True, a solution generated by the truncated conjugate gradient method that is on the boundary of the trust region is improved by moving around the trust-region boundary on the two-dimensional space spanned by the solution and the gradient of the quadratic function at the solution (default is True).

Returns:

Approximate solution \(s\).

Return type:

ndarray, shape (n,)

Notes

This function originally implements Algorithm 6.2 of [1] and is modified to adapt to the UPOQA solver. It is assumed that the origin is feasible with respect to the bound constraints and that delta is finite and positive.

References

[1]

Tom M. Ragonneau. Model-Based Derivative-Free Optimization Methods and Software. PhD thesis, Department of Applied Mathematics, The Hong Kong Polytechnic University, Hong Kong, China, 2022. URL: https://theses.lib.polyu.edu.hk/handle/200/12294.