upoqa.utils package
upoqa.utils.interp_set module
Interpolation Set
Maintain two classes which represent the elemental and overall interpolation sets respectively.
- class upoqa.utils.interp_set.InterpSet(n, npt=None)[source]
Bases:
objectThe interpolation set for a quadratic surrogate model.
- Parameters:
n (int) – Problem dimension
npt (int, optional) – Number of interpolation points. By default, it is set to
2 * n + 1. If the user provides a custom value, it should fall within the range[2 * n + 1, (n + 1) * (n + 2) / 2].
- n
Problem dimension
- Type:
int
- npt
Number of interpolation points. By default, it is set to
2 * n + 1.- Type:
int
- interp_set_Y
Interpolation points
- Type:
ndarray, shape (npt, n)
- interp_set_fval
Objective function values at the interpolation points
- Type:
ndarray, shape (npt,)
- x_opt_idx
Index of the interpolation point with the lowest objective function value.
- Type:
int
- x_anchor_idx
Index of the interpolation point used as the center for solving the trust region subproblem.
- Type:
int
- Attributes:
- Parameters:
n (int)
npt (int | None)
Methods
Retrieve the interpolation point used as the center for solving the trust region subproblem.
get_interp_set([idx])Retrieve the interpolation points or a single point at a specified index.
get_opt()Retrieve the interpolation point with the lowest objective function value from the interpolation set.
init_interp_set(fun, center[, step_size])Initialize the interpolation set and computes the corresponding objective function values.
init_interp_set_Y(center[, step_size])Initialize the interpolation set.
set_interp_set_fval(fvals)Set objective function values of the interpolation set to the provided values
fvals.update_point_on_idx(x, idx, fval[, ...])Replace the interpolation point at index
idxwith a new pointxand update its corresponding function valuefval.Update the optimal interpolation point (
self.x_opt) based on the current objective function values in the interpolation set.- property f_anchor: float
value of objective function at the anchor point (
self.x_anchor)
- property f_opt: float
The lowest objective function value in the interpolation set
- get_anchor()[source]
Retrieve the interpolation point used as the center for solving the trust region subproblem.
- Returns:
ndarray, shape (n,) – Interpolation point serving as the center for the trust region subproblem
float – Value of objective function at the anchor point
- Return type:
Tuple[ndarray, float]
- get_interp_set(idx=None)[source]
Retrieve the interpolation points or a single point at a specified index.
- Parameters:
idx (int, optional) – Index of the interpolation point to retrieve. If not provided, return the entire interpolation set.
- Returns:
ndarray, shape (npt, n) or ndarray, shape (n,) – The interpolation set if
idxis None, or the interpolation point at the specified indexidx.ndarray, shape (npt,) or float – Values of objective function for the interpolation set if
idxis None, or objective function value at the interpolation point with indexidx.
- Return type:
Tuple[ndarray, ndarray | float]
- get_opt()[source]
Retrieve the interpolation point with the lowest objective function value from the interpolation set.
- Returns:
ndarray, shape (n,) – Interpolation point with the lowest objective function value.
float – The lowest objective function value found in the interpolation set.
- Return type:
Tuple[ndarray, float]
- init_interp_set(fun, center, step_size=1.0)[source]
Initialize the interpolation set and computes the corresponding objective function values.
- Parameters:
fun (callable) –
A callable that takes a point in the interpolation set and returns its objective function value.
fun(x: np.ndarray) -> (ndarray | float)It should internally track its evaluation count and raise
MaxEvalNumReachedif the maximum number of evaluations is exceeded, orValueErrorif the function returns an invalid value.center (ndarray, shape (n,)) – Center point around which the interpolation set is generated.
step_size (float, default=1.0) – The step size used to generate interpolation points around the center point.
- init_interp_set_Y(center, step_size=1.0)[source]
Initialize the interpolation set.
The center point is set to
center, the next2npoints are constructed as a CFD (central finite difference) stencil, and the remaining points are generated by combining previous points.- Parameters:
center (ndarray of shape (n,)) – Center point of the interpolation set. Must have size equal to the problem dimension.
step_size (float, default=1.0) – Step size used to generate interpolation points around the center.
- Raises:
AssertionError – If
centersize does not match problem dimension- Return type:
None
References
[1]Tom M. Ragonneau. Model-Based Derivative-Free Optimization Methods and Software. PhD thesis, Department of Applied Mathematics, The Hong Kong Polytechnic University, Hong Kong, China, 2022. URL: https://theses.lib.polyu.edu.hk/handle/200/12294.
Notes
The interpolation set is constructed as follows:
Point 0:
centerPoints 1 to n:
center + step_size * e_iPoints n+1 to 2*n:
center - step_size * e_iPoints > 2*n: combinations of previous points
- set_interp_set_fval(fvals)[source]
Set objective function values of the interpolation set to the provided values
fvals.- Parameters:
fval (nd.array, shape (npt,)) – The user-defined objective function values for the interpolation points.
fvals (ndarray)
- Return type:
None
- update_point_on_idx(x, idx, fval, update_x_anchor=True)[source]
Replace the interpolation point at index
idxwith a new pointxand update its corresponding function valuefval.- Parameters:
x (ndarray, shape (n,)) – The new point to be added to the interpolation set.
idx (int) – The index of the point to be updated.
fval (float) – The function value at
x.update_x_anchor (bool)
- Return type:
None
- update_x_opt()[source]
Update the optimal interpolation point (
self.x_opt) based on the current objective function values in the interpolation set.If
self.x_anchoris None, it will be updated toself.x_opt.- Return type:
None
- property x_anchor: ndarray
The interpolation point serving as the center for the trust region subproblem
- property x_opt: ndarray
The interpolation point with the lowest objective function value
- class upoqa.utils.interp_set.OverallInterpSet(n, set_size, max_size, ele_interp_sets, proj_onto_ele, ele_names)[source]
Bases:
objectThe interpolation set for an overall surrogate model of the partially separable form \(m(x) = \sum_i m_i(x)\), where each \(m_i\) is an element model.
Note that this interpolation set does not directly use the stored points to construct the interpolation system; its primary functionality is to manage the elemental interpolation sets.
- Parameters:
n (int) – Problem dimension
set_size (int) – The initial number of points in the interpolation set
max_size (int) – Maximum number of points that the interpolation set can hold
ele_interp_sets (dict) – A dictionary of interpolation sets, one for each elemental surrogate model.
proj_onto_ele (callable) –
Projection function mapping full-space points to element subspaces:
(full_point: ndarray, element_name: Any) -> element_point: ndarray
For example,
proj_onto_ele([1.0, 2.0], "f_1")yields[1.0,], where"f_1"denotes a function that depends only on the first variable ofx.ele_names (List[Any])
- n
Problem dimension
- Type:
int
- ele_interp_sets
A dictionary of interpolation sets, one for each element model.
- Type:
dict
- proj_onto_ele
Projection function mapping full-space points to element subspaces:
(full_point: ndarray, element_name: Any) -> element_point: ndarray
For example,
proj_onto_ele([1.0, 2.0], "f_1")yields[1.0,], where"f_1"denotes a function that depends only on the first variable ofx.- Type:
callable
- ele_names
List of names for each element function
- Type:
list
- ele_idxs
List of indices corresponding to the element functions
- Type:
list of int
- npts
Number of points in each elemental interpolation set
- Type:
np.ndarray, shape (ele_num,)
- interp_set_Y
Interpolation points for the overall surrogate model
- Type:
ndarray, shape (set_size, n)
- interp_set_fval
Objective function values at the interpolation points for the overall surrogate model
- Type:
ndarray, shape (set_size,)
- interp_set_fval_eles
Values of element functions at the interpolation points for the overall surrogate model
- Type:
list of list of float
- interp_set_extra_fval
Values of the extra objective function at the interpolation points for the overall surrogate model. The “extra function” represents the differentiable white-box component of the objective.
- Type:
ndarray
- x_opt_idx
Index of the point with the lowest objective function value in the overall interpolation set
- Type:
int
- enable
Boolean array indicating whether each point in the interpolation set is enabled (True) or disabled (False).
- Type:
np.ndarray, shape (set_size,)
- unallocated_idx
See
unallocated_idxproperty
- allocated_idx
See
allocated_idxproperty
- Attributes:
allocated_idxIndices of allocated (enabled) points in the interpolation set
capacityTotal capacity of the interpolation set, including enabled and disabled points.
disableBoolean array indicating which points are disabled in the interpolation set
ele_numNumber of elements
f_optLowest objective function value in the interpolation set
set_sizeNumber of currently enabled points in the interpolation set
unallocated_idxIndices of unallocated (disabled) points in the interpolation set
x_optInterpolation point with the lowest objective function value
- Parameters:
n (int)
set_size (int)
max_size (int)
ele_interp_sets (List[InterpSet])
proj_onto_ele (Callable[[ndarray, Any], ndarray])
ele_names (List[Any])
Methods
append(x, fval, fval_eles[, extra_fval])Add a new point and its associated function values to the interpolation set.
clear()Clear the entire interpolation set, resetting all points and associated data to their default values.
Delete the worst points from the interpolation set to ensure the number of maintained points does not exceed
self.max_size.Delete all but the optimal point from the interpolation set, leaving only one point.
delete_points(delete_idx)Delete specified points from the interpolation set.
get_ele_fvals(idx)Retrieve the element function values and the extra function value at a specific index in the interpolation set.
get_interp_set([idx])Retrieve the interpolation points or a single point at a specific index.
get_opt([verbose])Retrieve the interpolation point with the lowest objective function value and its associated data.
Update all objective function values in the interpolation set based on new weights and transformations.
update_point_on_idx(x, idx_eles, fval_eles, fval)Update the interpolation set with a new point, including its element function values and overall objective function value.
Update the optimal interpolation point (
self.x_opt) based on the current objective function values in the interpolation set.- property allocated_idx: ndarray
Indices of allocated (enabled) points in the interpolation set
- append(x, fval, fval_eles, extra_fval=0.0)[source]
Add a new point and its associated function values to the interpolation set.
- Parameters:
x (ndarray, shape (n,)) – The point to be added to the interpolation set
fval (float) – Value of objective function at
xfval_eles (list of float) – Values of element functions at
xextra_fval (float, default=0.0) – Value of extra objective function at
x. The “extra function” represents the differentiable white-box component of the objective.
- Returns:
The index (or indices) where the point and its values were stored in the interpolation set.
- Return type:
ndarray
- property capacity: int
Total capacity of the interpolation set, including enabled and disabled points.
- clear()[source]
Clear the entire interpolation set, resetting all points and associated data to their default values.
- Return type:
None
- clear_invalid_point()[source]
Delete the worst points from the interpolation set to ensure the number of maintained points does not exceed
self.max_size.Notes
Points are deleted based on their objective function values, with the worst (highest) values being removed first.
- Return type:
None
- clear_with_only_one_left()[source]
Delete all but the optimal point from the interpolation set, leaving only one point.
Notes
The optimal point is determined by the lowest objective function value.
- Return type:
None
- delete_points(delete_idx)[source]
Delete specified points from the interpolation set.
- Parameters:
delete_idx (array_like or int) – Indices (or a single index) of the points to delete from the interpolation set.
- Raises:
RuntimeWarning – If attempting to delete a point that does not exist in the interpolation set.
- Return type:
None
- property disable: ndarray
Boolean array indicating which points are disabled in the interpolation set
- property ele_num: int
Number of elements
- property f_opt: float
Lowest objective function value in the interpolation set
- get_ele_fvals(idx)[source]
Retrieve the element function values and the extra function value at a specific index in the interpolation set.
- Parameters:
idx (int) – Index of the point in the interpolation set
- Returns:
list of float or None – Values of element functions at the interpolation point of the specified index. If the point does not exist, return None.
float or None – Value of the extra function at the interpolation point of the specified index. If the point does not exist, return None.
- Return type:
Tuple[List[float] | List[None] | None, float | None]
- get_interp_set(idx=None)[source]
Retrieve the interpolation points or a single point at a specific index.
- Parameters:
idx (int, optional) – Index of the interpolation point to retrieve. If None, return the entire interpolation set.
- Returns:
ndarray, shape (set_size, n) or ndarray, shape (n,) – The interpolation point set if
idxis None, or the interpolation point at the specified indexidx.ndarray, shape (set_size,) or float – Objective function values at the interpolation points if
idxis None, or objective function value at the interpolation point with indexidx.
- Return type:
Tuple[ndarray, ndarray | float]
- get_opt(verbose=False)[source]
Retrieve the interpolation point with the lowest objective function value and its associated data.
- Parameters:
verbose (bool, default=False) – If True, return additional information including the element function values and the extra function value.
- Returns:
x_opt (ndarray, shape (n,)) – Interpolation point with the lowest objective function value
f_opt (float) – The lowest objective function value found in the interpolation set
fval_eles (list of float, optional) – Values of element functions at the optimal point, if
verbose=True.extra_fval (float, optional) – Value of the extra function at the optimal point, if
verbose=True.
- Return type:
Tuple[None, None] | Tuple[None, None, None, None] | Tuple[ndarray, float] | Tuple[ndarray, float, List[float], float]
- property set_size: int
Number of currently enabled points in the interpolation set
- property unallocated_idx: ndarray
Indices of unallocated (disabled) points in the interpolation set
- update_fval_with_new_weights_and_xforms(weights, xforms, extra_fun_eval)[source]
Update all objective function values in the interpolation set based on new weights and transformations.
- Parameters:
weights (ndarray, shape (ele_num,)) – Weights applied to the element function values
xforms (list of list of callables) – A list of transformations applied to the element function values. Each transformation is a callable function or None if no transformation is applied.
extra_fun_eval (callable) –
The “extra function” included in the objective function, representing the differentiable white-box component of the objective:
extra_fun_eval(x: ndarray) -> floatwhere
xis a 1-D array with shape (n,).
- Returns:
True if the optimal point (
x_opt) changes after the update, otherwise False.- Return type:
bool
- update_point_on_idx(x, idx_eles, fval_eles, fval, extra_fval=0.0, need_update=None)[source]
Update the interpolation set with a new point, including its element function values and overall objective function value.
- Parameters:
x (ndarray) – New point to be added to the interpolation set
idx_eles (list of int) – Point indices in the elemental interpolation sets corresponding to the new point.
fval_eles (list of float) – Element function values at the new point
fval (float) – Overall objective function value at the new point
extra_fval (float, default=0.0) – Extra objective function value at the new point. The “extra function” represents the differentiable white-box component of the objective.
need_update (list of bool, optional) – List of boolean flags indicating which elemental interpolation sets should be updated. If None, all elemental sets are updated. Default is None.
- Returns:
The index of the updated point in the overall interpolation set.
- Return type:
int
- update_x_opt()[source]
Update the optimal interpolation point (
self.x_opt) based on the current objective function values in the interpolation set.- Return type:
None
- property x_opt: ndarray
Interpolation point with the lowest objective function value
upoqa.utils.manager module
Algorithm Manager
Maintain two classes that command the behaviors of the UPOQA algorithm.
- class upoqa.utils.manager.ExitInfo(flag, msg, exception=None, traceback=None)[source]
Bases:
objectClass that contains the exit information.
- Parameters:
flag (int)
msg (str)
exception (Exception | None)
traceback (str | None)
- flag
Flag that represents the type of the cause of an exit, and should be one of the following flags:
ExitStatus.AUTO_DETECT_RESTART_WARNING = 2 # warning, auto-detected restart criteria ExitStatus.SLOW_WARNING = 1 # warning, maximum number of slow (successful) iterations reached ExitStatus.SUCCESS = 0 # successful finish (rho = resolution_final, or reach maxiter or maxfev) ExitStatus.INPUT_ERROR = -1 # error, bad inputs ExitStatus.TR_INCREASE_ERROR = -2 # error, trust region step increased model value ExitStatus.LINALG_ERROR = -3 # error, linalg error (singular matrix encountered) ExitStatus.INVALID_EVAL_ERROR = -4 # error, funtion values with invalid shape or type or NaN encountered ExitStatus.UNKNOWN_ERROR = -5 # error, raised by unknown error source
- Type:
int
- msg
Message that describes the detailed reason for an exit.
- Type:
str
- exception
The exception that caused the exit.
- Type:
Exception, optional
- traceback
The traceback of the exception that caused the exit.
- Type:
str, optional
Methods
message([with_stem])Return the message that describes the detailed reason for an exit.
- message(with_stem=True)[source]
Return the message that describes the detailed reason for an exit.
- Returns:
The message that describes the detailed reason for an exit. if
with_stemis True, the returned message will be prefixed with “Error” or “Warning”.- Return type:
str
- Parameters:
with_stem (bool)
- class upoqa.utils.manager.ExitStatus[source]
Bases:
object- AUTO_DETECT_RESTART_WARNING = 2
- INPUT_ERROR = -1
- INVALID_EVAL_ERROR = -4
- LINALG_ERROR = -3
- SLOW_WARNING = 1
- SUCCESS = 0
- TR_INCREASE_ERROR = -2
- UNKNOWN_ERROR = -5
- exception upoqa.utils.manager.MaxEvalNumReached(ele_idx=None, ele_name='')[source]
Bases:
ExceptionException raised when the maximum number of evaluations of the objective function has been reached.
- Parameters:
ele_idx (int | None)
ele_name (Any)
- Return type:
None
- class upoqa.utils.manager.UPOQAManager(resolution_init, resolution_final, maxiter, maxfev, proj_onto_ele, coords, xform_bounds, params, ele_names, disp=0)[source]
Bases:
objectManage the execution and state of the UPOQA algorithm.
- Parameters:
resolution_init (float) – The initial value of trust region resolution and elemental radii.
resolution_final (float) – The final value of trust region resolution and elemental radii.
maxiter (int) – Maximum number of iterations to go before termination.
maxfev (list of int) – Maximum function evaluations per element (list parallel to
ele_names).proj_onto_ele (callable) –
Projection function mapping full-space points to element subspaces:
(full_point: ndarray, element_name: Any) -> element_point: ndarray
For example,
proj_onto_ele([1.0, 2.0], "f_1")yields[1.0,], where"f_1"denotes a function that depends only on the first variable ofx.coords (list of 1D arrays) – Variable indices for each element’s subspace (parallel to
ele_names).xform_bounds (list of tuples) – List in which each member is a tuple of the form
(l_val, h_val), representing the interval type domain of each transformation function inxforms.params (
UPOQAParameterList) – Parameter list used for the algorithm.ele_names (list) – List containing the names of all elements in order
disp (int, optional) –
Verbosity level:
disp=0: silentdisp=1: briefdisp=2: verbosedisp=3: verbose with debug information
- n
Problem dimension
- Type:
int
- npts
Number of interpolation points for each element
- Type:
ndarray, shape (ele_num,)
- resolution
Current trust region resolution
- Type:
float
- radii
Trust-region radii for each element.
- Type:
ndarray
- ele_names
List containing the names of all elements in order
- Type:
list
- ele_num
Number of elements
- Type:
int
- proj_onto_ele
Projection function mapping full-space points to element subspaces:
(full_point: ndarray, element_name: Any) -> element_point: ndarray
For example,
proj_onto_ele([1.0, 2.0], "f_1")yields[1.0,], where"f_1"denotes a function that depends only on the first variable ofx.- Type:
callable
- interp_set
The interpolation set used by the algorithm
- Type:
- model
The surrogate model used by the algorithm
- Type:
- ele_models
See
ele_modelsproperty
- params
Parameter list used for the algorithm
- Type:
- maxiter
Maximum number of iterations to go before termination
- Type:
int
- maxfev
Maximum function evaluations per element (list parallel to
ele_names).- Type:
list of int
- nfev
Numbers of element function evaluations performed so far
- Type:
list of int
- nrun
Number of runs (restarts) performed so far
- Type:
int
- disp
Verbosity level:
disp=0: silentdisp=1: briefdisp=2: verbosedisp=3: verbose with debug information
- Type:
int
- Attributes:
average_radiusThe average of elemental trust-region radii
ele_interp_setsList of elemental interpolation sets
ele_modelsList of elemental surrogate models
extra_funThe white-box component \(f_0\) of the objective function, which is a list of length 3 containing callables to evaluate the function values, gradients, and Hessians.
f_optValue of objective function at the point with the lowest objective function value among the currently tracked iterates in the overall interpolation point set.
max_radiusThe maximum of elemental trust-region radii
nProblem dimension
nptsNumber of interpolation points for each element
weightsWeights \(w_1, \ldots, w_q\) for each element in the model
x_optThe point with the lowest objective function value among the currently tracked iterates in the overall interpolation point set.
xformsTransformations \(h_1, \ldots, h_q\) applied to element function outputs
- Parameters:
resolution_init (float)
resolution_final (float)
maxiter (int)
maxfev (List[int])
proj_onto_ele (Callable[[ndarray, Any], ndarray])
coords (List[List | ndarray])
xform_bounds (List[Tuple[float]])
params (UPOQAParameterList)
ele_names (List[Any])
disp (int)
Methods
build_fval(fval_eles, extra_fval)Construct overall function value from element contributions.
calc_detailed_decrease(xk, x_hold, ...)Obtain extra information about the decrease of values of surrogate model and objective function for each element.
Monitor optimization progress and detect slow convergence patterns.
couple_with_utils(interp_set, model)Connect the manager with interpolation set and surrogate model utilities.
Decide whether to restart using the auto-detection strategy.
Refine the user-specified starting point by exploring multiple solutions to a Constraint Satisfaction Problem (CSP) using the Minimum Remaining Values (MRV) heuristic.
get_geometry_step(idx_eles, want_GI)Compute and return the geometry-improving steps for the elemental interpolation sets.
get_index_to_remove([x, center])Select the interpolation points to be deleted for each element based on possibly provided new point
x.get_reduction_ratio(fval, old_fval, ...[, rreg])Compute the trust-region reduction ratios between actual and predicted decreases, both for the overall objective function and for each element function.
get_update_requests(idx_to_replaced_eles, ...)Determine whether each elemental model and interpolation set need to be updated based on the trust region step sizes in each elemental subspace and the given intermediate quantities about the determinant when updating the KKT matrix for each elemental interpolation system.
Check if any element has exceeded its maximum function evaluation limit.
Initialize adjacency relationships between elements based on their coordinate overlaps.
prepare_for_a_restart([force])Prepare for a restart.
Reduce the resolution of the trust-region.
Reset tracking information used for automatic restart detection.
set_radius(radius[, ele_idx])Set the trust region radius to
radius.set_resolution(resolution)Set the trust region resolution to
resolution.set_weights(weights)Update the weights of all elements in the model.
set_xforms(xforms)Update the transformation functions for element outputs.
Shift the centers of overall and element models to
xif movement exceeds threshold, then update the coefficients of the models and the interpolation systems.soft_restart(funs)Perform a soft restart of the optimization process by resetting key parameters and improving the geometry of the interpolation set.
update_radius(tr_vios, ratio, ratios, ...[, ...])Update elemental trust region radii based on the combined separation criterion suggested by [1].
update_restart_track_info(xk, old_grad, old_hess)Track gradient and Hessian changes to detect when a restart is needed.
update_weights(weights)Update model weights, handling both list and dictionary inputs.
update_xforms(xforms)Update transformation functions with input validation.
wrap_xforms_with_bounds(xforms)Wrap transformation functions with bounds checking for each element.
- property average_radius: float
The average of elemental trust-region radii
- build_fval(fval_eles, extra_fval)[source]
Construct overall function value from element contributions.
Combines element values (optionally transformed) with weights and extra value. Performs detailed NaN checking when debug mode is enabled.
- Parameters:
fval_eles (list of float) – Function values from each element.
extra_fval (float) – Function value of the white-box component to include in the total.
- Returns:
The calculated overall function value.
- Return type:
float
- Raises:
ValueError – If NaN values are detected in element contributions or final result.
- calc_detailed_decrease(xk, x_hold, old_fval_eles, fval_eles)[source]
Obtain extra information about the decrease of values of surrogate model and objective function for each element. The returned values are then used to determine whether to expand or shrink the trust-region radius for each element.
- Parameters:
xk (ndarray) – The current iterate of the optimization process
x_hold (ndarray) – The trial point
xk + sk, whereskis the trust-region step.old_fval_eles (list of float) – Objective function values of each element at the iterate
xk.fval_eles (list of float) – Objective function values of each element at the trial point
xk + sk.
- Returns:
decrease_eles (list of float) – Predicted decrease of each element model.
decrease_wx_m_eles (list of float) – Decrease of the weighted and transformed value of each element model. Note that for each element, the default weight is 1.0, and the default transformation (xform) is the identity.
decrease_wx_f_eles (list of float) – Decrease of the weighted and transformed value of each element function.
fval_wx_eles (list of float) – Weighted and transformed values of each element function at the trial point
xk + sk.
- Return type:
Tuple[List[float], List[float], List[float], List[float]]
- check_slow_iteration()[source]
Monitor optimization progress and detect slow convergence patterns.
- Returns:
bool – True if current iteration shows insufficient progress (slow iteration), False otherwise. Progress is measured by comparing the current function value with the value from
history_for_slowiterations ago.bool – True if maximum allowed consecutive slow iterations (
max_slow_iters) has been reached, indicating potential need for restart/termination, False otherwise.
- Return type:
Tuple[bool, bool]
- couple_with_utils(interp_set, model)[source]
Connect the manager with interpolation set and surrogate model utilities.
- Parameters:
interp_set (
OverallInterpSet) – The overall interpolation set.model (
OverallSurrogate) – The overall surrogate model.
- Return type:
None
- decide_to_restart_due_to_exploding_coeff()[source]
Decide whether to restart using the auto-detection strategy.
- Returns:
True if restart conditions are met, False otherwise.
- Return type:
bool
Notes
The decision is based on three criteria:
Trust region radius history analysis (flat/down/up trends)
Gradient coefficient growth rate (log-linear regression slope)
Hessian coefficient growth rate (log-linear regression slope)
A restart is triggered when:
No radius increases observed (only flat/decreasing)
Both gradient and Hessian coefficients show significant increasing trends
Correlation coefficients exceed minimum thresholds
- property ele_models: List[QuadSurrogate] | None
List of elemental surrogate models
- property extra_fun: List[Callable[[ndarray], ndarray | list | float]] | None
The white-box component \(f_0\) of the objective function, which is a list of length 3 containing callables to evaluate the function values, gradients, and Hessians.
- property f_opt: float | None
Value of objective function at the point with the lowest objective function value among the currently tracked iterates in the overall interpolation point set.
- find_best_start_point()[source]
Refine the user-specified starting point by exploring multiple solutions to a Constraint Satisfaction Problem (CSP) using the Minimum Remaining Values (MRV) heuristic. In this CSP:
A state is a complete point
x.- The constraints require that for any element index
ele_idx, the projection
self.proj_onto_ele(x, ele_idx)must lie within theele_idx-th interpolation set.
By default, we search for at most
self.params("init.search_opt_x0_max_trials")points.- Return type:
Tuple[ndarray, float, List[float], float, ExitInfo | None]
- The constraints require that for any element index
- get_geometry_step(idx_eles, want_GI)[source]
Compute and return the geometry-improving steps for the elemental interpolation sets.
- Parameters:
idx_eles (list of int) – List of indices indicating target positions in each elemental interpolation set where new points will be placed.
want_GI (list of bool) – List of flags indicating whether a geometry improvement step is needed for each elemental interpolation set.
- Returns:
list – List of computed geometry-improving step vectors for each elemental interpolation set. Return None for elements where geometry improvement is not needed.
list – List of cached outputs of
ele_model.get_determinant_ratio(x_GI)for each element modelele_model, wherex_GIdenotes the geometry-improving point for the current element. Return None for elements where geometry improvement is not needed.
- Return type:
Tuple[List[ndarray | None], List[Tuple[ndarray, float, ndarray, ndarray, ndarray] | None]]
Notes
The method computes two potential steps for each element:
A constrained Cauchy step
A step along lines connecting interpolation points
The step with better geometry improvement (larger absolute sigma value) is selected. If both steps result in zero determinant ratio, raises a
SurrogateLinAlgError.
- get_index_to_remove(x=None, center=None)[source]
Select the interpolation points to be deleted for each element based on possibly provided new point
x.- Parameters:
x (ndarray, shape (n,), optional) –
The new candidate point to be added to the elemental interpolation sets.
If None, for each
ele_idx-th element, selects the farthest point in the elemental interpolation set fromself.proj_onto_ele(center, ele_idx).If provided, for each
ele_idx-th element, calculates sigma values (the denominator of the update formula of the interpolation system) by replacing each existing interpolation point withself.proj_onto_ele(x, ele_idx)and selects the one with the maximum (sigma * distance_to_center ** 4) metric.center (ndarray, shape (n,), optional) – Center point used for distance calculations. If None, uses the optimal point from the overall interpolation set as the center.
- Returns:
list of int – List of indices of points to be removed from each elemental interpolation set.
list of float – List of distances between the center point (
self.proj_onto_ele(center, ele_idx)forele_idx-th element) and the points to be removed, for each elemental interpolation set.list –
List of cached outputs of
ele_model.get_determinant_ratio(self.proj_onto_ele(x, ele_idx))for each element model
ele_modelifxis provided, else None for each element.
- Return type:
Tuple[List[int], List[float], List[Tuple[ndarray, float, ndarray, ndarray, ndarray] | None]]
- get_reduction_ratio(fval, old_fval, decrease, fval_eles, old_fval_eles, decrease_eles, rreg=None)[source]
Compute the trust-region reduction ratios between actual and predicted decreases, both for the overall objective function and for each element function.
- Parameters:
fval (float) – Objective function value at the trial point
xk + sk, whereskis the trust-region step.old_fval (float) – Objective function value at the iterate
xk.decrease (float) – Decrease of the overall model.
fval_eles (list of float) – Objective function values of each element at the trial point
xk + sk.old_fval_eles (list of float) – Objective function values of each element at the iterate
xk.decrease_eles (list of float) – Decrease of each element model.
rreg (float, optional) – Regularization term to stabilize ratio calculation. Defaults to machine epsilon scaled by the function values.
- Returns:
ratio (float) – The reduction ratio for the objective function. Return
-1if the denominator (predicted decrease) is invalid (positive or too close to zero).ratios (list of float) – The reduction ratios for each element function. Return
-1if the denominator (predicted decrease) is invalid (positive or too close to zero).exit_info (
ExitInfoor None) – AnExitInfoobject with anExitStatus.TR_INCREASE_ERRORflag if the denominator is invalid, None otherwise.
- Return type:
Tuple[float, List[float], ExitInfo | None]
- get_update_requests(idx_to_replaced_eles, tr_vios, cached_kkt_info_eles)[source]
Determine whether each elemental model and interpolation set need to be updated based on the trust region step sizes in each elemental subspace and the given intermediate quantities about the determinant when updating the KKT matrix for each elemental interpolation system.
The decision is based on a scoring strategy, which is largely empirical. Updates that significantly degrade the numerical stability and the poisedness of the interpolation set are rejected, while others are accepted. The threshold is controlled by the parameter
general.filter_element_point_thresh.- Parameters:
idx_to_replaced_eles (list of int) – A list where each value represents the index of an interpolation point in the interpolation set of the corresponding element. The algorithm needs to decide whether to replace the marked interpolation point with a new one for each element.
tr_vios (ndarray, shape (ele_num,)) – Trust region violation ratios for each element. Trust region violation ratio denotes the ratio of the norm of the trust region step to the radius.
cached_kkt_info_eles (list of tuples) –
List of cached outputs of
ele_model.get_determinant_ratio(self.proj_onto_ele(xk + sk, ele_idx))for each element model
ele_model, wherexkdenotes the current iterate,skdenotes the trust-region step.
- Returns:
flags (list of bool) – A list indicating whether each elemental model and interpolation set need to be updated.
exit_info (
ExitInfoor None) – Return None if any update is accepted. Otherwise, return anExitInfoobject with anExitStatus.LINALG_ERRORflag and terminate the algorithm.
- Return type:
Tuple[List[bool], ExitInfo | None]
- has_reached_maxfev()[source]
Check if any element has exceeded its maximum function evaluation limit.
- Return type:
bool
- init_coord_overlap_relation()[source]
Initialize adjacency relationships between elements based on their coordinate overlaps.
- property max_radius: float
The maximum of elemental trust-region radii
- property n: int | None
Problem dimension
- property npts: ndarray | None
Number of interpolation points for each element
- prepare_for_a_restart(force=False)[source]
Prepare for a restart. This function should be called before launching a restart, and the return value is to be checked to determine whether a restart is allowed.
- reduce_resolution()[source]
Reduce the resolution of the trust-region.
- Returns:
True if resolution has already reached its minimum value (
radius_final), False otherwise.- Return type:
bool
- reset_restart_track_info()[source]
Reset tracking information used for automatic restart detection.
- Return type:
None
- set_radius(radius, ele_idx=None)[source]
Set the trust region radius to
radius.If
ele_idxis specified, only updates the radius for that element. Otherwise, updates all elements’ trust region radii. The radius is constrained to be: At least the resolution (if below θ₂ * resolution) At most θ₆ * resolution- Parameters:
radius (float) – The new trust region radius.
ele_idx (int, optional) – Index of the element whose radius is to be updated. (None means update all elements.)
- Return type:
None
- set_resolution(resolution)[source]
Set the trust region resolution to
resolution.- Parameters:
radius (float) – The new trust region resolution.
resolution (float)
- Return type:
None
- set_weights(weights)[source]
Update the weights of all elements in the model.
- Parameters:
weights (ndarray)
- Return type:
None
- set_xforms(xforms)[source]
Update the transformation functions for element outputs.
- Parameters:
xforms (List[List[Callable[[ndarray], ndarray | float]] | None])
- Return type:
None
- shift_center_to(x)[source]
Shift the centers of overall and element models to
xif movement exceeds threshold, then update the coefficients of the models and the interpolation systems.- Parameters:
x (ndarray, shape (n,)) – The new model center.
- Return type:
None
- soft_restart(funs)[source]
Perform a soft restart of the optimization process by resetting key parameters and improving the geometry of the interpolation set.
- Parameters:
funs (callable) –
The element functions and the extra function (white-box component) to evaluate:
funs(x: ndarray) -> tuple[list[float], float]Where
xis the input vector, and the function returns a tuple containing:List of element function evaluations (list of floats)
Extra function evaluation (float)
The element functions should be properly wrapped to be capable of raising an exception
MaxEvalNumReachedif the maximum number of evaluations has been reached.- Returns:
exit_info – None if restart succeeds, otherwise an
ExitInfoobject containing the reason for the failure.- Return type:
ExitInfoor None
- update_radius(tr_vios, ratio, ratios, decrease_mval, decrease_extra_fval, decrease_wx_m_eles, decrease_wx_f_eles, spherical_step_radius=1.0, use_spherical_tr=False)[source]
Update elemental trust region radii based on the combined separation criterion suggested by [1].
- Parameters:
tr_vios (ndarray, shape (ele_num,)) – Trust region violation ratios for each element. Trust region violation ratio denotes the ratio of the norm of the trust region step to the radius.
ratio (float) – Actual-to-predicted reduction ratio of the objective function value.
ratios (list of float) – Actual-to-predicted reduction ratios of the function value of each element function.
decrease_mval (float) – Decrease of the overall model value.
decrease_extra_fval (float) – Decrease of the extra function value. The extra function is the white-box component to include in the total objective function.
decrease_wx_m_eles (list of float) – Decrease of the weighted and transformed value of each element model. Note that for each element, the default weight is 1.0, and the default transformation (xform) is the identity.
decrease_wx_f_eles (list of float) – Decrease of the weighted and transformed value of each element function. Note that for each element, the default weight is 1.0, and the default transformation (xform) is the identity.
spherical_step_radius (float, default=1.0) – The ratio of the norm of the trust region step to the radius. This is only used when
use_spherical_tr=True.use_spherical_tr (bool, default=False) – Whether the spherical trust region is enabled. If enabled, the truncated conjugate gradient method is used to solve the trust region subproblem.
- Return type:
None
References
- update_restart_track_info(xk, old_grad, old_hess)[source]
Track gradient and Hessian changes to detect when a restart is needed.
- Parameters:
xk (ndarray) – The current iterate of the optimization process.
old_grad (ndarray, shape (n,)) – Gradient from previous iteration (
self.model.model_grad).old_hess (ndarray, shape (n, n)) – Hessian from previous iteration (
self.model.model_hess).
- Return type:
None
Notes
Maintains three circular buffers tracking:
Trust region radius history (
restart_auto_detect_delta)Gradient change norms (
restart_auto_detect_chg_grad)Hessian change norms (
restart_auto_detect_chg_hess)
- update_weights(weights)[source]
Update model weights, handling both list and dictionary inputs.
If
weightsis a dictionary, it maps element names to weight values. Missing elements retain their current weights.- Parameters:
weights (list or dict or ndarray) – New weights for model elements. Can be a 1D array, or a list of weights for all elements or a dictionary mapping element names to weights.
- Return type:
None
- update_xforms(xforms)[source]
Update transformation functions with input validation.
Handle both list and dictionary inputs. For dictionary inputs: Validate each value is a list of 3 callables, which are function, gradient and hessian. Missing elements retain their current transformations
Apply bounds wrapping if
xform_boundsis enabled.- Parameters:
xforms (list or dict) –
New transformation functions. Can be either:
A dictionary mapping element names to transformation lists
A list of transformation lists for all elements
each transformation list should contain exactly 3 callables:
[function, gradient, hessian].- Return type:
None
- property weights: ndarray | None
Weights \(w_1, \ldots, w_q\) for each element in the model
- wrap_xforms_with_bounds(xforms)[source]
Wrap transformation functions with bounds checking for each element.
- Parameters:
xforms (list) – List of lists containing transformation functions for each element. Each inner list contains one element’s transformation’s function, gradient and hessian.
- Returns:
List of lists containing wrapped transformation functions with bounds checking. Return None for elements with no transformations.
- Return type:
list
- property x_opt: ndarray | None
The point with the lowest objective function value among the currently tracked iterates in the overall interpolation point set.
- property xforms: List[List[Callable[[ndarray], ndarray | float]] | None] | None
Transformations \(h_1, \ldots, h_q\) applied to element function outputs
upoqa.utils.model module
Interpolation Surrogate Model
Maintain two core classes:
QuadSurrogate: Quadratic interpolation surrogate model using the derivative-free symmetric Broyden update [1]. Serves as the element model in UPOQA.OverallSurrogate: Overall surrogate model formed by assembling multiple element models.
References
Michael J. D. Powell. 2004. Least Frobenius norm updating of quadratic models that satisfy interpolation conditions. Math. Program. 100, 1 (May 2004), 183-215.
- class upoqa.utils.model.OverallSurrogate(n, interp_set, proj_onto_ele, coords, ele_models, extra_fun, ele_names, params)[source]
Bases:
objectOverall surrogate model formed by assembling multiple elemental surrogate models.
Represent a full-system surrogate as a sum of component models, each defined on a subspace of the full design space.
- Parameters:
n (int)
interp_set (OverallInterpSet)
proj_onto_ele (Callable[[ndarray, Any], ndarray])
coords (List[List | ndarray])
ele_models (List[QuadSurrogate])
extra_fun (List[Callable[[ndarray], float]])
ele_names (List[Any])
params (UPOQAParameterList)
- n
Problem dimension
- Type:
int
- interp_set
Interpolation set for the overall surrogate model
- Type:
- proj_onto_ele
Projection function mapping full-space points to element subspaces:
(full_point: ndarray, element_name: Any) -> element_point: ndarray
For example,
proj_onto_ele([1.0, 2.0], "f_1")yields[1.0,], where"f_1"denotes a function that depends only on the first variable ofx.- Type:
callable
- coords
Variable indices for each element’s subspace (parallel to
ele_names).- Type:
list of 1D arrays
- ele_models
List of elemental surrogate models
- Type:
list
- xforms
Transformations \(h_1, \ldots, h_q\) applied to element function outputs
- Type:
list
- weights
Weights \(w_1, \ldots, w_q\) for each element in the model
- Type:
list
- extra_fun
The white-box component \(f_0\) of the objective function, which is a list of length 3 containing callables to evaluate the function value, gradient, and Hessian.
- Type:
list
- params
Parameter list used for the algorithm
- Type:
- ele_names
List containing the names of all elements in order
- Type:
list
- model_center
Center of the surrogate model
- Type:
ndarray, shape (n,)
Methods
fun_eval(x)Evaluate the overall surrogate model at given point.
grad_eval(x)Evaluate the gradient vector of the overall surrogate model at given point.
hess_eval(x)Evaluate the full Hessian matrix of the overall surrogate model at given point.
hess_operator(x, v)Compute Hessian-vector product for the overall surrogate model at given point.
set_weights(weights)Set the weights for the elements.
set_xforms(xforms)Set transformation functions for the elements.
Shift the model center to a new point
x.update(want_update[, cached_KKT_info_eles])Update the element models based on the provided update flags and cached KKT information.
update_weights(weights)Update the weights for the elements, handling both list and dictionary inputs.
update_xforms(xforms)Update transformation functions with input validation.
- fun_eval(x)[source]
Evaluate the overall surrogate model at given point.
- Parameters:
x (ndarray, shape (n,)) – Evaluation point.
- Returns:
fval – The model value of the overall surrogate model at the point
x.- Return type:
float
- grad_eval(x)[source]
Evaluate the gradient vector of the overall surrogate model at given point.
- Parameters:
x (ndarray, shape (n,)) – Evaluation point.
- Returns:
g – The gradient vector of the overall surrogate model at the point
x.- Return type:
ndarray, shape (n, )
- hess_eval(x)[source]
Evaluate the full Hessian matrix of the overall surrogate model at given point.
- Parameters:
x (ndarray, shape (n,)) – Evaluation point.
- Returns:
H – The Hessian matrix of the overall surrogate model at the point
x.- Return type:
ndarray, shape (n, n)
- hess_operator(x, v)[source]
Compute Hessian-vector product for the overall surrogate model at given point.
- Parameters:
x (ndarray, shape (n,)) – The point at which to evaluate the Hessian-vector product.
v (ndarray, shape (n, )) – The vector to multiply with the Hessian matrix.
- Returns:
Hv – The result of the Hessian-vector product.
- Return type:
ndarray, shape (n, )
- set_weights(weights)[source]
Set the weights for the elements.
- Parameters:
weights (-) – The weights to assign to each element.
- Return type:
None
- set_xforms(xforms)[source]
Set transformation functions for the elements.
- Parameters:
xforms (list) – Transformation functions to apply to each element.
- Return type:
None
- shift_center_to(x)[source]
Shift the model center to a new point
x. Note that this does not change the centers of the element models.- Parameters:
x (ndarray) – The new center point for the overall surrogate model.
- Return type:
None
- update(want_update, cached_KKT_info_eles=None)[source]
Update the element models based on the provided update flags and cached KKT information.
- Parameters:
want_update (list of bool) – A list of flags indicating whether each element model should be updated.
cached_KKT_info_eles (list of tuples, optional) – List of cached outputs of
ele_surrogate.get_determinant_ratio(x_ele_new)for each element modelele_surrogate, wherex_ele_newdenotes the new point just added into the corresponding elemental interpolation set.
- Return type:
None
- update_weights(weights)[source]
Update the weights for the elements, handling both list and dictionary inputs.
If
weightsis a dictionary, it maps element names to weight values. Missing elements retain their current weights.This method should only be called by
update_weights().- Parameters:
weights (list or dict or ndarray) – New weights for model elements. Can be a 1D array, or a list of weights for all elements or a dictionary mapping element names to weights.
- Return type:
None
- update_xforms(xforms)[source]
Update transformation functions with input validation. This method should only be called by
update_xforms().- Parameters:
xforms (list) – List of updated transformation lists for all elements. each transformation list should contain exactly 3 callables:
[function, gradient, hessian].- Return type:
None
- class upoqa.utils.model.QuadSurrogate(interp_set, center=None, n=None, ref_surrogate=None)[source]
Bases:
objectQuadratic interpolation surrogate model using the derivative-free symmetric Broyden update [1]. Serves as the elemental surrogate model in UPOQA.
- Parameters:
interp_set (
InterpSet) – The interpolation set. It must be provided in order for the interpolation surrogate model to be built based on the interpolation points.center (ndarray, optional) – The initial center of the surrogate model. If provided, it overrides the
nargument. Otherwise, the center will be set as a zero vector.n (int, optional) – Problem dimension. Required if
centeris not provided. Note that eithernorcentermust be provided.ref_surrogate (
QuadSurrogate, optional) – A reference surrogate model to help initialize the KKT coefficients and other cached information.
- n
Problem dimension
- Type:
int
- npt
Number of interpolation points in the interpolation set
- Type:
int
- model_center
Center of the surrogate model
- Type:
ndarray, shape (n,)
- model_cons
Constant term of the model at the center
- Type:
float
- model_grad
Gradient of the model at the center
- Type:
ndarray, shape (n,)
- model_hess_explicit
Explicit Hessian of the model
- Type:
ndarray, shape (n, n)
- model_hess_implicit
Implicit Hessian of the model
- Type:
ndarray, shape (npt,)
References
[1]Michael J. D. Powell. 2004. Least Frobenius norm updating of quadratic models that satisfy interpolation conditions. Math. Program. 100, 1 (May 2004), 183-215.
- Attributes:
model_hessFull Hessian matrix of the surrogate model
- Parameters:
interp_set (InterpSet)
center (ndarray | None)
n (int | None)
ref_surrogate (QuadSurrogate | None)
Methods
Evaluate gradient of a newly-instantiated surrogate model at point
x.fun_eval(X)Evaluate the surrogate model at given point(s).
get_determinant_ratio(x_new[, idx])Calculate key intermediate quantities (
alpha,beta,tau,sigma) for updating the inverse KKT matrix in the derivative-free symmetric Broyden update [1].grad_eval(X)Evaluate the gradient vector(s) of the surrogate model at given point(s).
hess_eval(*arg, **kwargs)Return the full Hessian matrix of the surrogate model.
Compute Hessian-vector product(s) for the surrogate model.
inv_kkt_matrix_dot(x[, w_star])Compute inv(W) @ x, where W is the coefficient matrix of the KKT equation generated by the derivative-free symmetric Broyden update [1], and return the vector formed by the first
nptand lastndimensions of the result.Compute inv(W) @ x, where W is the coefficient matrix of the KKT equation generated by the derivative-free symmetric Broyden update [1], and return the vector formed by the first
nptand lastndimensions of the result.reinit()Reinitialize the surrogate model by first resetting and then updating the coefficients.
reset(interp_set, center[, ref_surrogate])Reset the model with the given interpolation set, center point and dimension, and optionally a reference model.
Reset all surrogate model coefficients to zero values.
Shift the model center to
x, then update the KKT and model coefficients.solve_interpolation_KKT([f_Y_shift])Given the interpolation set, solves the first-order KKT optimality conditions for the derivative-free symmetric Broyden update subproblem and returns the solution: the Lagrange multipliers and the gradient update term.
update([cached_kkt_info])Update the KKT and model coefficients using the method detailed in [1], [2] and [3].
- alt_grad_eval(x)[source]
Evaluate gradient of a newly-instantiated surrogate model at point
x.Conceptually equivalent to:
Creating a new surrogate model with current interpolation set
Updating its coefficients from zero
Returning its gradient at
x
Avoids actual model reconstruction by leveraging KKT solution.
- Parameters:
x (ndarray, shape (n,)) – Evaluation point for gradient calculation.
- Returns:
g – Gradient vector of the conceptual surrogate model at
x.- Return type:
ndarray
- fun_eval(X)[source]
Evaluate the surrogate model at given point(s).
- Parameters:
x (ndarray, shape (n,) or ndarray, shape (n_samples, n)) – Evaluation point(s). Can be a single point (1D) or batch of points (2D).
X (ndarray)
- Returns:
fval – Model value(s). Scalar for single point input, array of shape (n_samples,) for multiple points.
- Return type:
float or ndarray, shape (n_samples,)
- get_determinant_ratio(x_new, idx=None)[source]
Calculate key intermediate quantities (
alpha,beta,tau,sigma) for updating the inverse KKT matrix in the derivative-free symmetric Broyden update [1]. These quantities are defined in [2], wheresigma(the denominator in the update equation) represents the ratio of the determinant of the inverse KKT matrix before and after update.- Parameters:
x_new (ndarray, shape (n,)) – New point to be added to the interpolation set (not yet added).
idx (int, optional) – Index of the interpolation point to be replaced by
x_new. If provided, returns scalar values at this index; otherwise returns full arrays.
- Returns:
alpha (ndarray or float) – Quantity
alpha. Returnsalpha[idx]ifidxis provided.beta (float) – Quantity
beta.tau (ndarray or float) – Quantity
tau. Returnstau[idx]ifidxis provided.sigma (ndarray or float) – Quantity
sigma. Returnssigma[idx]ifidxis provided.ndarray – Result of
self.inv_kkt_matrix_dot(w), where:w = [0.5 * ((x_new - center) @ (Y - center).T) ** 2, x_new - center], whereYis of shape(npt, n)which represents the interpolation points.
- Return type:
Tuple[ndarray | float, float, ndarray | float, ndarray | float, ndarray]
References
[1] (1,2)Michael J. D. Powell. 2004. Least Frobenius norm updating of quadratic models that satisfy interpolation conditions. Math. Program. 100, 1 (May 2004), 183-215.
[2]Michael J. D. Powell. 2004. On updating the inverse of a KKT matrix. Numerical Linear Algebra and Optimization, ed. Ya-xiang Yuan, Science Press (Beijing) (2004): 56-78.
- grad_eval(X)[source]
Evaluate the gradient vector(s) of the surrogate model at given point(s).
- Parameters:
x (ndarray, shape (n,) or ndarray, shape (n_samples, n)) – Evaluation point(s). Can be a single point (1D) or batch of points (2D).
X (ndarray)
- Returns:
g – Gradient vector(s). Shape (n,) for single point, (n_samples, n) for multiple points.
- Return type:
ndarray, shape (n, ) or ndarray, shape (n_samples, n)
- hess_eval(*arg, **kwargs)[source]
Return the full Hessian matrix of the surrogate model.
- Returns:
H – Hessian matrix (same as
model_hessproperty).- Return type:
ndarray, shape (n, n)
- hess_operator(V)[source]
Compute Hessian-vector product(s) for the surrogate model.
- Parameters:
V (ndarray, shape (n, ) or ndarray, shape (n_samples, n)) – Input vector(s). Single vector (1D) or batch of vectors (2D).
- Returns:
HV – Hessian-vector product(s). Shape (n,) for single input vector, (n_samples, n) for multiple vectors.
- Return type:
ndarray, shape (n, ) or ndarray, shape (n_samples, n)
- inv_kkt_matrix_dot(x, w_star=None)[source]
Compute inv(W) @ x, where W is the coefficient matrix of the KKT equation generated by the derivative-free symmetric Broyden update [1], and return the vector formed by the first
nptand lastndimensions of the result.In the multiplication, we can safely ignore the
(npt + 1)-th dimension ofx. Therefore, the input vectorxonly needs to include the firstnptand the lastnelements, instead of the full(npt + 1 + n)-dimensional vector.- Parameters:
x (ndarray, shape (npt + n,)) – The vector to be dotted with the inverse of the KKT matrix.
w_star (ndarray, shape (npt + n,), optional) –
The auxiliary vector to assist in the calculation. It is formed by
[0.5 * ((yi - center) @ (Y - center).T) ** 2, yi - center],where
Yis of shape(npt, n)which represents the interpolation points.yican be any point in the interpolation set, and isx_anchorby default.
- Returns:
The vector formed by the first
nptand lastndimensions of the result.- Return type:
ndarray, shape (npt + n,)
References
- inv_kkt_matrix_partial_dot(x)[source]
Compute inv(W) @ x, where W is the coefficient matrix of the KKT equation generated by the derivative-free symmetric Broyden update [1], and return the vector formed by the first
nptand lastndimensions of the result.Unlike
self.inv_kkt_matrix_dot(x), this method assumes thatxis zero for indices[npt:], allowing for a simplified calculation.- Parameters:
x (ndarray, shape (npt + n,)) – The vector to be dotted with the inverse of the KKT matrix.
- Returns:
The vector formed by the first
nptand lastndimensions of the result.- Return type:
ndarray, shape (npt + n,)
References
- property model_hess: ndarray
Full Hessian matrix of the surrogate model
- reinit()[source]
Reinitialize the surrogate model by first resetting and then updating the coefficients.
- Return type:
None
- reset(interp_set, center, ref_surrogate=None)[source]
Reset the model with the given interpolation set, center point and dimension, and optionally a reference model.
- Parameters:
interp_set (
InterpSet) – The interpolation set. It must be provided in order for the interpolation surrogate model to be built based on the interpolation points.center (ndarray) – The initial center of the surrogate model.
ref_surrogate (
QuadSurrogate, optional) – A reference surrogate model to help initialize the KKT coefficients and other cached information.
- Return type:
None
- reset_surrogate_coeff()[source]
Reset all surrogate model coefficients to zero values.
- Return type:
None
- shift_center_to(x)[source]
Shift the model center to
x, then update the KKT and model coefficients.- Parameters:
x (ndarray, shape (n,)) – The new model center.
- Return type:
None
Notes
The computational cost is O(npt^3) with BLAS-3 operations.
- solve_interpolation_KKT(f_Y_shift=<class 'numpy.ndarray'>)[source]
Given the interpolation set, solves the first-order KKT optimality conditions for the derivative-free symmetric Broyden update subproblem and returns the solution: the Lagrange multipliers and the gradient update term.
- Parameters:
f_Y_shift (ndarray, shape (npt,)) – Represents
fun(Y) - surrogate(Y), whereYdenotes the interpolation points.- Returns:
lambda (ndarray, shape (npt,)) – The Lagrange multipliers.
g (ndarray, shape (n,)) – The gradient update term.
- Return type:
Tuple[ndarray, ndarray]
- update(cached_kkt_info=None)[source]
Update the KKT and model coefficients using the method detailed in [1], [2] and [3].
- Parameters:
cached_kkt_info (tuple, optional) – If not None, it should be the cached output of
self.get_determinant_ratio(x_new), wherex_newdenotes the new point just added into the interpolation set.- Return type:
None
References
[1] (1,2)Michael J. D. Powell. 2004. Least Frobenius norm updating of quadratic models that satisfy interpolation conditions. Math. Program. 100, 1 (May 2004), 183-215.
- exception upoqa.utils.model.SurrogateLinAlgError(message, ele_idx=None, ele_name=None)[source]
Bases:
ExceptionException raised for linear algebra errors in the surrogate model
- Parameters:
message (str)
ele_idx (int | None)
ele_name (Any | None)
- Return type:
None
- ele_idx
Index of the element function that caused the error. If the error is not related to a specific element, this is None.
- Type:
int or None
- ele_name
Name of the element function that caused the error. If the error is not related to a specific element, this is None.
- Type:
Any or None
upoqa.utils.params module
- param A container for all parameter values.:
- class upoqa.utils.params.UPOQAParameterList(ele_names, ele_dims, maxfev, noise_level=0, seek_global_minimum=False, debug=False)[source]
Bases:
ParameterListParameter list for UPOQA.
- Parameters:
ele_names (list) – List containing the names of all elements in order.
ele_dims (list of int) – Elemental dimensions of the problem.
maxfev (list) – Maximum number of function evaluations to go before termination for each element.
noise_level (int, default=0) – Indicates how noisy the objective function is. Should be 0, 1, or 2. Default number of interpolation points (
npt) increases as thenoise_levelincreases, and whennoise_level> 0, restart mechanism is enabled.seek_global_minimum (bool, default=False) – Whether to seek a global minimum. Defaults to False. If True, restart mechanism is enabled, but the parameters for restart are different.
debug (bool, default=False) –
Whether to enable debug mode. Defaults to False. If True, the algorithm will check for NaN values in the objective function and the surrogate model.
Note that NaN check will incur additional runtime costs.
Methods
__call__(key[, new_value])Access or update a parameter value.
Check the types and boundary conditions of all parameters.
params_start_with([prefix])Return a dictionary of all parameters that start with the given prefix.
check_param
param_type
upoqa.utils.projection module
Projection Operator
Implement several projection operators for the use of the projected gradient method.
- upoqa.utils.projection.average_proj(s, coords, radii, tol=1e-08, maxit=None)[source]
Averaged projection onto the intersection of cylinders, see [1] for more details about its convergence.
- Parameters:
s (ndarray, shape (n,)) – Initial point.
coords (ndarray or list of 1D array) – Boolean masks or index arrays for each element’s variables.
radii (ndarray, shape (ele_num,)) – Trust-region radii.
tol (float, default=1e-8) – Tolerance for convergence.
maxit (int, optional) – Maximum iterations (default: 10*n).
- Returns:
ndarray, shape (n,) – Feasible point.
bool – True if projection actually moved the point.
References
[1]Lewis, A.S., Luke, D.R. & Malick, J. Local Linear Convergence for Alternating and Averaged Nonconvex Projections. Found Comput Math 9, 485-513 (2009). https://doi.org/10.1007/s10208-008-9036-y
- upoqa.utils.projection.dykstra_proj(s, coords_mask, radii, tol=1e-06, maxit=None)[source]
Dykstra’s projection method onto the intersection of cylinders, see [1].
- Parameters:
s (ndarray, shape (n,)) – Initial point.
coords_mask (ndarray, shape (ele_num, n)) – Boolean mask;
coords_mask[i]indicates the variables belonging to elementi.radii (ndarray, shape (ele_num,)) – Trust-region radii.
tol (float, default=1e-6) – Tolerance for convergence.
maxit (int, optional) – Maximum iterations (default: 10*n).
- Returns:
ndarray, shape (n,) – Feasible point.
bool – True if projection actually moved the point.
References
[1]Boyle, J. P.; Dykstr, R. L. (1986). “A Method for Finding Projections onto the Intersection of Convex Sets in Hilbert Spaces”. Advances in Order Restricted Statistical Inference. Lecture Notes in Statistics. Vol. 37. pp. 28-47. doi:10.1007/978-1-4613-9940-7_3. ISBN 978-0-387-96419-5.
- upoqa.utils.projection.get_violation_ratio(s, coords, radii)[source]
Compute the ratio ‖s[coord]‖₂ / radius for each element.
- Parameters:
s (ndarray, shape (n,)) – Trial point.
coords (ndarray, shape (ele_num, n) or list of ndarray) – List of index arrays, one for each element, or a mask array of which each line is a mask vector indicating the indices.
radii (ndarray, shape (ele_num,)) – Trust-region radii for each element.
- Returns:
Violation ratios (>1 means infeasible).
- Return type:
ndarray, shape (ele_num,)
- upoqa.utils.projection.steinmetz_comb_proj(s, coords_mask, radii, pretol=1e-08, preit=4)[source]
Combined averaged and Steinmetz projection onto the intersection of cylinders.
First applies a small number of averaged-projection iterations, then switches to the Steinmetz shrinking procedure.
- Parameters:
s (ndarray, shape (n,)) – Point to project.
coords_mask (ndarray, shape (ele_num, n)) – Boolean mask;
coords_mask[i]indicates the variables belonging to elementi.radii (ndarray, shape (ele_num,)) – Trust-region radii.
pretol (float) – Tolerance for the initial averaged projection.
preit (int) – Maximum iterations for the initial averaged projection.
- Returns:
ndarray, shape (n,) – Feasible point.
bool – True if projection actually moved the point.
- upoqa.utils.projection.steinmetz_proj(s, coords_mask, radii)[source]
Approximate projection onto the intersection of cylinders (which forms a Steinmetz solid in the special case of two cylinders) using a shrinking strategy.
- Parameters:
s (ndarray, shape (n,)) – Point to project.
coords_mask (ndarray, shape (ele_num, n)) – Boolean masks for each element’s variables.
radii (ndarray, shape (ele_num,)) – Trust-region radii.
- Returns:
ndarray, shape (n,) – Feasible point.
bool – True if projection actually moved the point.
upoqa.utils.result module
Optimization Result
Adapted from scipy.optimize.OptimizeResult with minor modifications.
- class upoqa.utils.result.OptimizeResult[source]
Bases:
dictRepresent the optimization result.
- message
Description of the cause of the termination.
- Type:
str
- success
Whether the optimization procedure terminated successfully.
- Type:
bool
- exception
Exception raised during the optimization that caused the termination, if any.
- Type:
Exception, optional
- traceback
Traceback of the exception, if any.
- Type:
str, optional
- fun
Value of objective function evaluated at the solution.
- Type:
float
- funs
Values of the element functions evaluated at the solution.
The type matches the input
fun:If
funwas a dictionary, return a dictionary mapping element names to their function values.If
funwas a list, return a list of function values in the same order as the input list.
- Type:
dict or list
- extra_fun
Value of the extra function at the solution.
The “extra function” represents the differentiable white-box component of the objective.
- Type:
float
- x
The solution of the optimization.
- Type:
ndarray
- jac, hess
Values of objective function’s Jacobian and its Hessian at
x. The Hessian is an approximation.- Type:
ndarray
- nit
Number of iterations performed by the optimizer
- Type:
int
- nfev
Number of evaluations of element functions.
The type matches the input
fun:If
funwas a dictionary, return a dictionary mapping element names to their numbers of function evaluations.If
funwas a list, return a list of evaluation numbers in the same order as the input list.
- Type:
dict or list
- max_nfev
Maximum number of evaluations of element functions.
- Type:
int
- avg_nfev
Average number of evaluations of element functions.
- Type:
float
- nrun
Number of runs (when enabling restarts).
- Type:
int
- manager
Algorithm manager (debugging only). Included only when
debug=Trueorreturn_internals=True.- Type:
UPOQAManager, optional
- interp_set
Interpolation point set (debugging only). Included only when
debug=Trueorreturn_internals=True.- Type:
OverallInterpSet, optional
- model
Surrogate model (debugging only). Included only when
debug=Trueorreturn_internals=True.- Type:
OverallSurrogate, optional
Methods
clear()copy()fromkeys(iterable[, value])Create a new dictionary with keys from iterable and values set to value.
get(key[, default])Return the value for key if key is in the dictionary, else default.
items()keys()pop(key[, default])If the key is not found, return the default if given; otherwise, raise a KeyError.
popitem(/)Remove and return a (key, value) pair as a 2-tuple.
setdefault(key[, default])Insert key with a value of default if key is not in the dictionary.
update([E, ]**F)If E is present and has a .keys() method, then does: for k in E: D[k] = E[k] If E is present and lacks a .keys() method, then does: for k, v in E: D[k] = v In either case, this is followed by: for k in F: D[k] = F[k]
values()
upoqa.utils.sub_solver module
Trust Region Subproblem Solver
Implement several sub-routines for approximately solving trust-region sub-problems that arise in derivative-free optimization with partially separable structures:
cauchy_geometry/spider_geometry: geometry-improving steps along Cauchy / straight-line directions.tangential_byrd_omojokun: truncated conjugate-gradient step on the tangential space.conjugate_gradient_proj_steinmetz: modified projected CG for structured trust-regions using steinmetz projections.
The cauchy_geometry, spider_geometry, and tangential_byrd_omojokun solvers
are drawn from the COBYQA library; see the
original implementation at
here.
- upoqa.utils.sub_solver.cauchy_geometry(const, grad, curv, delta)[source]
Maximize approximately the absolute value of a quadratic function subject to bound constraints in a trust region.
This function solves approximately
\[\begin{split}\max_{s \in \mathbb{R}^n} \quad \bigg\lvert c + g^{\mathsf{T}} s + \frac{1}{2} s^{\mathsf{T}} H s \bigg\rvert \quad \text{s.t.} \quad \left\{ \begin{array}{l} l \le s \le u,\\ \lVert s \rVert \le \Delta, \end{array} \right.\end{split}\]by maximizing the objective function along the constrained Cauchy direction.
- Parameters:
const (float) – Constant \(c\) as shown above.
grad (ndarray, shape (n,)) – Gradient \(g\) as shown above.
curv (callable) –
Curvature of \(H\) along any vector.
curv(s) -> floatreturns \(s^{\mathsf{T}} H s\).
delta (float) – Trust-region radius \(\Delta\) as shown above.
- Returns:
Approximate solution \(s\).
- Return type:
ndarray, shape (n,)
Notes
This function is described as the first alternative in Section 6.5 of [1]. It is assumed that the origin is feasible with respect to the bound constraints and that
deltais finite and positive.References
[1]Tom M. Ragonneau. Model-Based Derivative-Free Optimization Methods and Software. PhD thesis, Department of Applied Mathematics, The Hong Kong Polytechnic University, Hong Kong, China, 2022. URL: https://theses.lib.polyu.edu.hk/handle/200/12294.
- upoqa.utils.sub_solver.conjugate_gradient_proj_steinmetz(fun, grad, hess_prod, coords_mask, n, deltas, envelope_delta, maxiter=None, **kwargs)[source]
Modified projected conjugate gradient method for a quadratic trust-region sub-problem with a structured trust-region.
The feasible region is the intersection of \(q\) cylinders
\[\mathcal{S} = \bigcap_{i=1}^{q} \left\{ s\in\mathbb{R}^{n} : \lVert s^{\mathcal{I}_i} \rVert_{2} \le \Delta_i \right\}.\]This function solves approximately
\[\min_{s\in\mathcal{S}}\quad f(s).\]- Parameters:
fun (callable) – Overall surrogate model function \(f\) as shown above.
grad (callable) –
Gradient \(\nabla f\) of the model function:
grad(x) -> ndarray, shape (n,)returns the gradient at
x.hess_prod (callable) –
Product of the Hessian matrix \(\nabla^2 f(x)\) with any vector \(v\):
hess_prod(x, v) -> ndarray, shape (n,)returns the product \(\nabla^2 f(x) v\).
coords_mask (ndarray, shape (q, n)) – Bool mask indicating variable indices \(\mathcal{I}_i\) for each element.
n (int) – Problem dimension
deltas (ndarray) – Elemental trust-region radii \(\{\Delta_i\}_{i=1}^q\) as shown above.
envelope_delta (float) – A sufficiently large radius ensuring that the structured trust-region lies entirely within the ball centered at the origin with this radius.
maxiter (int, optional) – Maximum number of iterations. Default is None, which sets it to \(\max(20, 3n)\) in which \(2n\) steps are reserved for projection steps.
- Returns:
Approximate solution to the trust-region subproblem.
- Return type:
ndarray
Notes
This method uses Steinmetz projection (
steinmetz_proj()) and combined projection (steinmetz_comb_proj()) to maintain feasibility with respect to the cylindrical constraints.
- upoqa.utils.sub_solver.spider_geometry(const, grad, curv, xpt, delta)[source]
Maximize approximately the absolute value of a quadratic function subject to bound constraints in a trust region.
This function solves approximately
\[\begin{split}\max_{s \in \mathbb{R}^n} \quad \bigg\lvert c + g^{\mathsf{T}} s + \frac{1}{2} s^{\mathsf{T}} H s \bigg\rvert \quad \text{s.t.} \quad \left\{ \begin{array}{l} l \le s \le u,\\ \lVert s \rVert \le \Delta, \end{array} \right.\end{split}\]by maximizing the objective function along given straight lines.
- Parameters:
const (float) – Constant \(c\) as shown above.
grad (ndarray, shape (n,)) – Gradient \(g\) as shown above.
curv (callable) –
Curvature of \(H\) along any vector.
curv(s) -> floatreturns \(s^{\mathsf{T}} H s\).
xpt (ndarray, shape (npt, n)) – Points defining the straight lines. The straight lines considered are the ones passing through the origin and the points in
xpt.delta (float) – Trust-region radius \(\Delta\) as shown above.
- Returns:
Approximate solution \(s\).
- Return type:
ndarray, shape (n,)
Notes
This function is described as the second alternative in Section 6.5 of [1]. It is assumed that the origin is feasible with respect to the bound constraints and that
deltais finite and positive.References
[1]Tom M. Ragonneau. Model-Based Derivative-Free Optimization Methods and Software. PhD thesis, Department of Applied Mathematics, The Hong Kong Polytechnic University, Hong Kong, China, 2022. URL: https://theses.lib.polyu.edu.hk/handle/200/12294.
- upoqa.utils.sub_solver.tangential_byrd_omojokun(fun, grad, hess_prod, delta, n, **kwargs)[source]
Minimize approximately a quadratic function subject to bound constraints in a trust region.
This function solves approximately
\[\min_{s\in\mathbb{R}^n}\quad f(s) \quad \text{s.t.} \quad \lVert s \rVert \le \Delta.\]using a variation of the truncated conjugate gradient method.
- Parameters:
fun (callable) – Overall surrogate model function \(f\) as shown above.
grad (callable) –
Gradient \(\nabla f\) of the model function:
grad(x) -> ndarray, shape (n,)returns the gradient at
x.hess_prod (callable) –
Product of the Hessian matrix \(\nabla^2 f(x)\) with any vector \(v\):
hess_prod(x, v) -> ndarray, shape (n,)returns the product \(\nabla^2 f(x) v\).
delta (float) – Trust-region radius \(\Delta\) as shown above.
n (int) – Dimension of the function.
improve_tcg (bool, optional) – If True, a solution generated by the truncated conjugate gradient method that is on the boundary of the trust region is improved by moving around the trust-region boundary on the two-dimensional space spanned by the solution and the gradient of the quadratic function at the solution (default is True).
- Returns:
Approximate solution \(s\).
- Return type:
ndarray, shape (n,)
Notes
This function originally implements Algorithm 6.2 of [1] and is modified to adapt to the UPOQA solver. It is assumed that the origin is feasible with respect to the bound constraints and that
deltais finite and positive.References
[1]Tom M. Ragonneau. Model-Based Derivative-Free Optimization Methods and Software. PhD thesis, Department of Applied Mathematics, The Hong Kong Polytechnic University, Hong Kong, China, 2022. URL: https://theses.lib.polyu.edu.hk/handle/200/12294.