src/numericalnim/optimize

Optimization

This module implements optimization and curve fitting routines.

Optimization

These optimization methods are provided:

  • lbfgs (recommended): A quasi-Newton method that strikes a good balance between accuracy and performance.
  • bfgs: A quasi-Newton method. lbfgs is a limited-memory version of this.
  • newton: Newton's method. Fast for small problems but struggles when the number of variables increases.
  • steepestDescent: The classical gradient descent. Has slow convergence compared to the others.

By default the gradients are approximated using finite differences. Optionally an analytical gradient can be supplied with the analyticGradient argument; a sketch of this follows the example below.

Example:

import numericalnim
import arraymancer
# f(x, y) = x^2 + y^2
proc f(x: Tensor[float]): float =
    x[0]*x[0] + x[1]*x[1]

let guess = [1.0, 1.0].toTensor
let sol = lbfgs(f, guess)
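
The analytic gradient of f(x, y) = x^2 + y^2 is (2x, 2y), so it can be supplied explicitly instead of relying on finite differences. A minimal sketch building on the example above (the names gradf and solAnalytic are only illustrative):

# Analytic gradient of f
proc gradf(x: Tensor[float]): Tensor[float] =
    [2.0*x[0], 2.0*x[1]].toTensor

let solAnalytic = lbfgs(f, guess, analyticGradient = gradf)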

Curve fitting

A Levenberg-Marquardt non-linear least squares solver is provided in levmarq. The curve to fit is provided as a function taking in a 1D Tensor with all parameters and the value of the independent variable to evaluate it at.

Optionally the y-errors can be supplied to the yError argument to take the uncertainties of each of the points into account. The uncertainties of the fitted parameters can be obtained by calling paramUncertainties.

Example:

import numericalnim
import arraymancer
# f(x) = a*x + b
proc f(params: Tensor[float], x: float): float =
    let a = params[0]
    let b = params[1]
    result = a*x + b

# Generate measurements
let x = linspace(0.0, 10.0, 10)
let y = 3.14 * x +. 2.81 + randomNormalTensor(10, 0.0, 0.05)
let yError = 0.05 * ones[float](10)
# Solve for best fit parameters
let guess = [1.0, 1.0].toTensor
let solution = levmarq(f, guess, x, y, yError=yError)
# Calculate uncertainties in fitted parameters
let uncertainties = sqrt(paramUncertainties(solution, f, x, y, yError))
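
The solver can be tuned by passing an options object constructed with levmarqOptions, for example to change the initial damping parameter lambda0 or the tolerance. A minimal sketch continuing from the example above (the values and the name opts are only illustrative):

# Larger initial damping and a tighter tolerance
let opts = levmarqOptions[float](lambda0 = 10.0, tol = 1e-10)
let solutionCustom = levmarq(f, guess, x, y, options = opts, yError = yError)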

Types

LBFGSOptions[U] = object
  savedIterations*: int
LevMarqOptions[U] = object
  lambda0*: U
LineSearchCriterion = enum
  Armijo, Wolfe, WolfeStrong, NoLineSearch
OptimOptions[U; AO] = object
  tol*, alpha*: U
  fastMode*: bool
  maxIterations*: int
  lineSearchCriterion*: LineSearchCriterion
  algoOptions*: AO
StandardOptions = object

Procs

proc bfgs[U; T: not Tensor](f: proc (x: Tensor[U]): T; x0: Tensor[U];
                            options: OptimOptions[U, StandardOptions] = bfgsOptions[U]();
                            analyticGradient: proc (x: Tensor[U]): Tensor[T] = nil): Tensor[U]

BFGS (Broyden–Fletcher–Goldfarb–Shanno) method for optimization.

Inputs:

  • f: The function to optimize. It should take as input a 1D Tensor of the input variables and return a scalar.
  • x0: The starting point for the optimization as a 1D Tensor.
  • options: Options object (see bfgsOptions for constructing one)
  • analyticGradient: The analytic gradient of f taking in and returning a 1D Tensor. If not provided, a finite difference approximation will be performed instead.

Returns:

  • The final solution for the parameters. Either because a (local) minimum was found or because the maximum number of iterations was reached.
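
A minimal usage sketch on a simple quadratic, enabling a Wolfe line search through bfgsOptions (the function and the option values are chosen only for illustration):

import numericalnim
import arraymancer

# f(x, y) = x^2 + y^2 with minimum at the origin
proc f(x: Tensor[float]): float =
    x[0]*x[0] + x[1]*x[1]

let x0 = [1.0, 1.0].toTensor
let opts = bfgsOptions[float](lineSearchCriterion = Wolfe)
let xMin = bfgs(f, x0, options = opts)
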
proc bfgs_old[U; T: not Tensor](f: proc (x: Tensor[U]): T; x0: Tensor[U];
                                alpha: U = U(1); tol: U = U(0.000001);
                                fastMode: bool = false;
                                analyticGradient: proc (x: Tensor[U]): Tensor[T] = nil): Tensor[U]
proc bfgsOptions[U](tol: U = U(0.000001); alpha: U = U(1);
                    fastMode: bool = false; maxIterations: int = 10000;
                    lineSearchCriterion: LineSearchCriterion = NoLineSearch): OptimOptions[U, StandardOptions]
Returns a BFGS OptimOptions
  • tol: The tolerance used. This is the criterion for convergence: gradNorm < tol*(1 + fNorm).
  • alpha: The step size.
  • fastMode: If true, a faster but only first order accurate finite difference approximation of the derivative will be used. Otherwise a more accurate but slower second order finite difference scheme is used.
  • maxIterations: The maximum number of iterations before returning if convergence hasn't been reached.
  • lineSearchCriterion: Which line search method to use.
proc conjugate_gradient[T](A, b, x_0: Tensor[T]; tolerance: float64): Tensor[T]

Conjugate Gradient method. Given a symmetric, positive-definite matrix A, solves the linear system Ax = b. Symmetric matrix: a square matrix that is equal to its transpose, transpose(A) == A. Positive definite: a square matrix such that transpose(x) * A * x > 0 for all non-zero x in R^n.

Input:

  • A: NxN square matrix
  • b: The vector on the right-hand side of Ax = b
  • x_0: Initial guess vector
  • tolerance: Convergence tolerance for the solver

Returns:

  • The solution x of the linear system as a Tensor.
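
A minimal sketch solving a small symmetric positive-definite system. The matrix and vectors are illustrative, and b and x_0 are written as column vectors of shape [N, 1], which is an assumption about the expected layout:

import numericalnim
import arraymancer

# 2x2 symmetric positive-definite system A*x = b
let A = [[4.0, 1.0],
         [1.0, 3.0]].toTensor
let b = [[1.0], [2.0]].toTensor
let x0 = [[0.0], [0.0]].toTensor
let x = conjugate_gradient(A, b, x0, tolerance = 1e-8)
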
proc lbfgs[U; T: not Tensor](f: proc (x: Tensor[U]): T; x0: Tensor[U];
                             options: OptimOptions[U, LBFGSOptions[U]] = lbfgsOptions[U]();
                             analyticGradient: proc (x: Tensor[U]): Tensor[T] = nil): Tensor[U]

LBFGS (Limited-memory Broyden–Fletcher–Goldfarb–Shanno) method for optimization.

Inputs:

  • f: The function to optimize. It should take as input a 1D Tensor of the input variables and return a scalar.
  • x0: The starting point for the optimization as a 1D Tensor.
  • options: Options object (see lbfgsOptions for constructing one)
  • analyticGradient: The analytic gradient of f taking in and returning a 1D Tensor. If not provided, a finite difference approximation will be performed instead.

Returns:

  • The final solution for the parameters. Either because a (local) minimum was found or because the maximum number of iterations was reached.
proc lbfgsOptions[U](savedIterations: int = 10; tol: U = U(0.000001);
                     alpha: U = U(1); fastMode: bool = false;
                     maxIterations: int = 10000;
                     lineSearchCriterion: LineSearchCriterion = NoLineSearch): OptimOptions[U, LBFGSOptions[U]]
Returns a LBFGS OptimOptions
  • tol: The tolerance used. This is the criterion for convergence: gradNorm < tol*(1 + fNorm).
  • alpha: The step size.
  • fastMode: If true, a faster but only first order accurate finite difference approximation of the derivative will be used. Otherwise a more accurate but slower second order finite difference scheme is used.
  • maxIterations: The maximum number of iterations before returning if convergence hasn't been reached.
  • lineSearchCriterion: Which line search method to use.
  • savedIterations: The number of past iterations to save. The higher the value, the better the steps, but each step becomes slower to compute.
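
A minimal sketch constructing non-default LBFGS options (the chosen values are only illustrative):

import numericalnim
import arraymancer

proc f(x: Tensor[float]): float =
    x[0]*x[0] + x[1]*x[1]

# Keep more history and require the strong Wolfe conditions in the line search
let opts = lbfgsOptions[float](savedIterations = 20,
                               lineSearchCriterion = WolfeStrong)
let xMin = lbfgs(f, [1.0, 1.0].toTensor, options = opts)
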
proc levmarq[U; T: not Tensor](f: proc (params: Tensor[U]; x: U): T;
                               params0: Tensor[U]; xData: Tensor[U]; yData: Tensor[T];
                               options: OptimOptions[U, LevMarqOptions[U]] = levmarqOptions[U]();
                               yError: Tensor[T] = ones_like(yData)): Tensor[U]

Levenberg-Marquardt method for non-linear least squares problems. It fits the parameters of a function to data samples.

Input:

  • f: The function you want to fit the data to. The first argument should be a 1D Tensor with the values of the parameters and the second argument is the value of the independent variable to evaluate the function at.
  • params0: The starting guess for the parameter values as a 1D Tensor.
  • xData: The values of the independent variable as a 1D Tensor.
  • yData: The measured values of the dependent variable as a 1D Tensor.
  • options: Object with all the options like tol and lambda0. (see levmarqOptions)
  • yError: The uncertainties of the yData as 1D Tensor. Ideally these should be the 1σ standard deviation.

Returns:

  • The final solution for the parameters. Either because a (local) minimum was found or because the maximum number of iterations was reached.
proc levmarqOptions[U](lambda0: U = U(1); tol: U = U(0.000001); alpha: U = U(1);
                       fastMode: bool = false; maxIterations: int = 10000;
                       lineSearchCriterion: LineSearchCriterion = NoLineSearch): OptimOptions[U, LevMarqOptions[U]]
Returns a levmarq OptimOptions
  • tol: The tolerance used. This is the criterion for convergence: gradNorm < tol*(1 + fNorm).
  • alpha: The step size.
  • fastMode: If true, a faster but only first order accurate finite difference approximation of the derivative will be used. Otherwise a more accurate but slower second order finite difference scheme is used.
  • maxIterations: The maximum number of iterations before returning if convergence hasn't been reached.
  • lineSearchCriterion: Which line search method to use.
  • lambda0: The starting value of the damping parameter.
proc line_search[U, T](alpha: var U; p: Tensor[T]; x0: Tensor[U];
                       f: proc (x: Tensor[U]): T;
                       criterion: LineSearchCriterion; fastMode: bool = false)
proc newton[U; T: not Tensor](f: proc (x: Tensor[U]): T; x0: Tensor[U];
                              options: OptimOptions[U, StandardOptions] = newtonOptions[U]();
                              analyticGradient: proc (x: Tensor[U]): Tensor[T] = nil): Tensor[U]

Newton's method for optimization.

Inputs:

  • f: The function to optimize. It should take as input a 1D Tensor of the input variables and return a scalar.
  • x0: The starting point for the optimization as a 1D Tensor.
  • options: Options object (see newtonOptions for constructing one)
  • analyticGradient: The analytic gradient of f taking in and returning a 1D Tensor. If not provided, a finite difference approximation will be performed instead.

Returns:

  • The final solution for the parameters. Either because a (local) minimum was found or because the maximum number of iterations was reached.
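
A minimal usage sketch on a quadratic with a known minimum (the function and names are chosen only for illustration):

import numericalnim
import arraymancer

# f(x, y) = (x - 1)^2 + (y + 2)^2 has its minimum at (1, -2)
proc f(x: Tensor[float]): float =
    (x[0] - 1.0)*(x[0] - 1.0) + (x[1] + 2.0)*(x[1] + 2.0)

let x0 = [0.0, 0.0].toTensor
let xMin = newton(f, x0)
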
proc newtonOptions[U](tol: U = U(0.000001); alpha: U = U(1);
                      fastMode: bool = false; maxIterations: int = 10000;
                      lineSearchCriterion: LineSearchCriterion = NoLineSearch): OptimOptions[U, StandardOptions]
Returns a Newton OptimOptions
  • tol: The tolerance used. This is the criterion for convergence: gradNorm < tol*(1 + fNorm).
  • alpha: The step size.
  • fastMode: If true, a faster but only first order accurate finite difference approximation of the derivative will be used. Otherwise a more accurate but slower second order finite difference scheme is used.
  • maxIterations: The maximum number of iterations before returning if convergence hasn't been reached.
  • lineSearchCriterion: Which line search method to use.
proc newtons(f: proc (x: float64): float64; deriv: proc (x: float64): float64;
             start: float64; precision: float64 = 0.00001;
             max_iters: Natural = 1000): float64 {....raises: [ArithmeticError],
    effectsOf: [f, deriv], ...tags: [], forbids: [].}
Newton-Raphson root-finding method for 1-dimensional functions.
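
A minimal sketch finding a root of a 1-dimensional function from its derivative (the function is chosen only for illustration):

import numericalnim

# Find the positive root of f(x) = x^2 - 2, i.e. sqrt(2)
proc f(x: float): float = x*x - 2.0
proc df(x: float): float = 2.0*x

let root = newtons(f, df, start = 1.0)
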
proc optimOptions[U](tol: U = U(0.000001); alpha: U = U(1);
                     fastMode: bool = false; maxIterations: int = 10000;
                     lineSearchCriterion: LineSearchCriterion = NoLineSearch): OptimOptions[U, StandardOptions]
Returns a vanilla OptimOptions
  • tol: The tolerance used. This is the criterion for convergence: gradNorm < tol*(1 + fNorm).
  • alpha: The step size.
  • fastMode: If true, a faster but only first order accurate finite difference approximation of the derivative will be used. Otherwise a more accurate but slower second order finite difference scheme is used.
  • maxIterations: The maximum number of iterations before returning if convergence hasn't been reached.
  • lineSearchCriterion: Which line search method to use.
proc paramUncertainties[U; T](params: Tensor[U];
                              fitFunc: proc (params: Tensor[U]; x: U): T;
                              xData: Tensor[U]; yData: Tensor[T];
                              yError: Tensor[T]; returnFullCov = false): Tensor[T]

Returns the whole covariance matrix or only the diagonal elements for the parameters in params.

Inputs:

  • params: The parameters in a 1D Tensor that the uncertainties are wanted for.
  • fitFunc: The function used for fitting the parameters. (see levmarq for more)
  • xData: The values of the independent variable as 1D Tensor.
  • yData: The measured values of the dependent variable as 1D Tensor.
  • yError: The uncertainties of the yData as 1D Tensor. Ideally these should be the 1σ standard deviation.
  • returnFullCov: If true, the full covariance matrix will be returned as a 2D Tensor, else only the diagonal elements will be returned as a 1D Tensor.

Returns:

The uncertainties of the parameters in the form of a covariance matrix (or only the diagonal elements).

Note: it is the covariance that is returned, so if you want the standard deviation you have to take the square root of it.

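
A minimal sketch requesting the full covariance matrix instead of only the variances (the model, data values and names are purely illustrative):

import numericalnim
import arraymancer

# Straight-line model y = a*x + b
proc fitFunc(params: Tensor[float], x: float): float =
    params[0]*x + params[1]

let xData = [0.0, 1.0, 2.0, 3.0, 4.0].toTensor
let yData = [1.1, 3.9, 7.2, 9.8, 13.1].toTensor
let yError = 0.1 * ones[float](5)

let params = levmarq(fitFunc, [1.0, 1.0].toTensor, xData, yData, yError = yError)
# Full covariance matrix of the fitted parameters as a 2D Tensor
let cov = paramUncertainties(params, fitFunc, xData, yData, yError,
                             returnFullCov = true)
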
proc secant(f: proc (x: float64): float64; start: array[2, float64];
            precision: float64 = 0.00001; max_iters: Natural = 1000): float64 {.
    ...raises: [Exception], tags: [RootEffect], forbids: [].}
proc steepest_descent(deriv: proc (x: float64): float64; start: float64;
                      gamma: float64 = 0.01; precision: float64 = 0.00001;
                      max_iters: Natural = 1000): float64 {.inline,
    ...raises: [Exception], tags: [RootEffect], forbids: [].}

Gradient descent optimization algorithm for finding local minima of a function with derivative deriv.

Assuming that the function F is defined and differentiable near a minimum, F(x) decreases fastest when moving in the direction of the negative gradient of F, similar to how water traverses down a hill following the path of least resistance. The method can benefit from preconditioning if the problem is ill-conditioned.

Input:

  • deriv: The derivative of the function F
  • start: starting point near F's minimum
  • gamma: step size multiplier, used to control the step size between iterations
  • precision: numerical precision
  • max_iters: maximum iterations

Returns:

  • The x value of the approximate local minimum as a float64.
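
A minimal sketch minimizing a 1-dimensional function given only its derivative (the function and step size are chosen only for illustration):

import numericalnim

# Minimize f(x) = (x - 3)^2 using its derivative f'(x) = 2*(x - 3)
proc deriv(x: float): float = 2.0*(x - 3.0)

let xMin = steepest_descent(deriv, start = 0.0, gamma = 0.1)
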
proc steepestDescent[U; T: not Tensor](f: proc (x: Tensor[U]): T; x0: Tensor[U];
    options: OptimOptions[U, StandardOptions] = steepestDescentOptions[U]();
    analyticGradient: proc (x: Tensor[U]): Tensor[T] = nil): Tensor[U]

Steepest descent method for optimization.

Inputs:

  • f: The function to optimize. It should take as input a 1D Tensor of the input variables and return a scalar.
  • x0: The starting point for the optimization as a 1D Tensor.
  • options: Options object (see steepestDescentOptions for constructing one)
  • analyticGradient: The analytic gradient of f taking in and returning a 1D Tensor. If not provided, a finite difference approximation will be performed instead.

Returns:

  • The final solution for the parameters. Either because a (local) minimum was found or because the maximum number of iterations was reached.
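
A minimal usage sketch; since steepest descent is sensitive to the step size, an Armijo line search is enabled through steepestDescentOptions (the values are chosen only for illustration):

import numericalnim
import arraymancer

proc f(x: Tensor[float]): float =
    x[0]*x[0] + x[1]*x[1]

let opts = steepestDescentOptions[float](alpha = 0.01, lineSearchCriterion = Armijo)
let xMin = steepestDescent(f, [1.0, 1.0].toTensor, options = opts)
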
proc steepestDescentOptions[U](tol: U = U(0.000001); alpha: U = U(0.001);
                               fastMode: bool = false; maxIterations: int = 10000;
                               lineSearchCriterion: LineSearchCriterion = NoLineSearch): OptimOptions[U, StandardOptions]
Returns a Steepest Descent OptimOptions
  • tol: The tolerance used. This is the criterion for convergence: gradNorm < tol*(1 + fNorm).
  • alpha: The step size.
  • fastMode: If true, a faster but only first order accurate finite difference approximation of the derivative will be used. Otherwise a more accurate but slower second order finite difference scheme is used.
  • maxIterations: The maximum number of iterations before returning if convergence hasn't been reached.
  • lineSearchCriterion: Which line search method to use.
proc vectorNorm[T](v: Tensor[T]): T
Calculates the norm of the vector, i.e. sqrt(Σ vᵢ²)