src/numericalnim/optimize

Optimization

This module implements optimization and curve fitting routines.

Optimization

These optimization methods are provided:

  • lbfgs (recommended): A quasi-Newton method that strikes a good balance between accuracy and performance.
  • bfgs: A quasi-Newton method. lbfgs is a limited-memory version of this.
  • newton: Newton's method. Fast for small problems but struggles when the number of variables increases.
  • steepestDescent: The classical gradient descent. Has slow convergence compared to the others.

By default the gradients are approximated using finite differences. Optionally an analytical gradient can be supplied with the analyticGradient argument; a sketch of this follows the example below.

Example:

import numericalnim
import arraymancer
# f(x, y) = x^2 + y^2
proc f(x: Tensor[float]): float =
    x[0]*x[0] + x[1]*x[1]

let guess = [1.0, 1.0].toTensor
let sol = lbfgs(f, guess)
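
The analytic gradient of f(x, y) = x^2 + y^2 is (2x, 2y), so it can be supplied explicitly instead of relying on finite differences. A minimal sketch building on the example above (the names gradf and solAnalytic are only illustrative):

# Analytic gradient of f
proc gradf(x: Tensor[float]): Tensor[float] =
    [2.0*x[0], 2.0*x[1]].toTensor

let solAnalytic = lbfgs(f, guess, analyticGradient = gradf)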

Curve fitting

A Levenberg-Marquardt non-linear least squares solver is provided in levmarq. The curve to fit is provided as a function taking in a 1D Tensor with all parameters and the value of the independent variable to evaluate it at.

Optionally the y-errors can be supplied to the yError argument to take the uncertainties of each of the points into account. The uncertainties of the fitted parameters can be obtained by calling paramUncertainties.

Example:

import numericalnim
import arraymancer
# f(x) = a*x + b
proc f(params: Tensor[float], x: float): float =
    let a = params[0]
    let b = params[1]
    result = a*x + b

# Generate measurements
let x = linspace(0.0, 10.0, 10)
let y = 3.14 * x +. 2.81 + randomNormalTensor(10, 0.0, 0.05)
let yError = 0.05 * ones[float](10)
# Solve for best fit parameters
let guess = [1.0, 1.0].toTensor
let solution = levmarq(f, guess, x, y, yError=yError)
# Calculate uncertainties in fitted parameters
let uncertainties = sqrt(paramUncertainties(solution, f, x, y, yError))
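
The solver can be tuned by passing an options object constructed with levmarqOptions, for example to change the initial damping parameter lambda0 or the tolerance. A minimal sketch continuing from the example above (the values and the name opts are only illustrative):

# Larger initial damping and a tighter tolerance
let opts = levmarqOptions[float](lambda0 = 10.0, tol = 1e-10)
let solutionCustom = levmarq(f, guess, x, y, options = opts, yError = yError)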

Types

LBFGSOptions[U] = object
  savedIterations*: int
LevMarqOptions[U] = object
  lambda0*: U
LineSearchCriterion = enum
  Armijo, Wolfe, WolfeStrong, NoLineSearch
OptimOptions[U; AO] = object
  tol*, alpha*: U
  fastMode*: bool
  maxIterations*: int
  lineSearchCriterion*: LineSearchCriterion
  algoOptions*: AO
StandardOptions = object

Procs

proc bfgs[U; T: not Tensor](f: proc (x: Tensor[U]): T; x0: Tensor[U];
                            options: OptimOptions[U, StandardOptions] = bfgsOptions[U]();
                            analyticGradient: proc (x: Tensor[U]): Tensor[T] = nil): Tensor[U]

BFGS (Broyden–Fletcher–Goldfarb–Shanno) method for optimization.

Inputs:

  • f: The function to optimize. It should take as input a 1D Tensor of the input variables and return a scalar.
  • x0: The starting point for the optimization as a 1D Tensor.
  • options: Options object (see bfgsOptions for constructing one)
  • analyticGradient: The analytic gradient of f taking in and returning a 1D Tensor. If not provided, a finite difference approximation will be performed instead.

Returns:

  • The final solution for the parameters. Either because a (local) minimum was found or because the maximum number of iterations was reached.
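
A minimal usage sketch on a simple quadratic, enabling a Wolfe line search through bfgsOptions (the function and the option values are chosen only for illustration):

import numericalnim
import arraymancer

# f(x, y) = x^2 + y^2 with minimum at the origin
proc f(x: Tensor[float]): float =
    x[0]*x[0] + x[1]*x[1]

let x0 = [1.0, 1.0].toTensor
let opts = bfgsOptions[float](lineSearchCriterion = Wolfe)
let xMin = bfgs(f, x0, options = opts)
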
proc bfgs_old[U; T: not Tensor](f: proc (x: Tensor[U]): T; x0: Tensor[U];
                                alpha: U = U(1); tol: U = U(0.000001);
                                fastMode: bool = false;
                                analyticGradient: proc (x: Tensor[U]): Tensor[T] = nil): Tensor[U]
proc bfgsOptions[U](tol: U = U(0.000001); alpha: U = U(1);
                    fastMode: bool = false; maxIterations: int = 10000;
                    lineSearchCriterion: LineSearchCriterion = NoLineSearch): OptimOptions[U, StandardOptions]
Returns a BFGS OptimOptions
  • tol: The tolerance used. This is the criterion for convergence: gradNorm < tol*(1 + fNorm).
  • alpha: The step size.
  • fastMode: If true, a faster but only first order accurate finite difference approximation of the derivative will be used. Otherwise a more accurate but slower second order finite difference scheme is used.
  • maxIterations: The maximum number of iterations before returning if convergence hasn't been reached.
  • lineSearchCriterion: Which line search method to use.
proc conjugate_gradient[T](A, b, x_0: Tensor[T]; tolerance: float64): Tensor[T]

Conjugate Gradient method. Given a symmetric, positive-definite matrix A, solves the linear system Ax = b. Symmetric matrix: a square matrix that is equal to its transpose, transpose(A) == A. Positive definite: a square matrix such that transpose(x) * A * x > 0 for all non-zero x in R^n.

Input:

  • A: NxN square matrix
  • b: The vector on the right-hand side of Ax = b
  • x_0: Initial guess vector
  • tolerance: Convergence tolerance for the solver

Returns:

  • The solution x of the linear system as a Tensor.
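
A minimal sketch solving a small symmetric positive-definite system. The matrix and vectors are illustrative, and b and x_0 are written as column vectors of shape [N, 1], which is an assumption about the expected layout:

import numericalnim
import arraymancer

# 2x2 symmetric positive-definite system A*x = b
let A = [[4.0, 1.0],
         [1.0, 3.0]].toTensor
let b = [[1.0], [2.0]].toTensor
let x0 = [[0.0], [0.0]].toTensor
let x = conjugate_gradient(A, b, x0, tolerance = 1e-8)
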
proc lbfgs[U; T: not Tensor](f: proc (x: Tensor[U]): T; x0: Tensor[U];
                             options: OptimOptions[U, LBFGSOptions[U]] = lbfgsOptions[U]();
                             analyticGradient: proc (x: Tensor[U]): Tensor[T] = nil): Tensor[U]

LBFGS (Limited-memory Broyden–Fletcher–Goldfarb–Shanno) method for optimization.

Inputs:

  • f: The function to optimize. It should take as input a 1D Tensor of the input variables and return a scalar.
  • x0: The starting point for the optimization as a 1D Tensor.
  • options: Options object (see lbfgsOptions for constructing one)
  • analyticGradient: The analytic gradient of f taking in and returning a 1D Tensor. If not provided, a finite difference approximation will be performed instead.

Returns:

  • The final solution for the parameters. Either because a (local) minimum was found or because the maximum number of iterations was reached.
proc lbfgsOptions[U](savedIterations: int = 10; tol: U = U(0.000001);
                     alpha: U = U(1); fastMode: bool = false;
                     maxIterations: int = 10000;
                     lineSearchCriterion: LineSearchCriterion = NoLineSearch): OptimOptions[U, LBFGSOptions[U]]
Returns a LBFGS OptimOptions
  • tol: The tolerance used. This is the criterion for convergence: gradNorm < tol*(1 + fNorm).
  • alpha: The step size.
  • fastMode: If true, a faster but only first order accurate finite difference approximation of the derivative will be used. Otherwise a more accurate but slower second order finite difference scheme is used.
  • maxIterations: The maximum number of iterations before returning if convergence hasn't been reached.
  • lineSearchCriterion: Which line search method to use.
  • savedIterations: The number of past iterations to save. The higher the value, the better the steps, but each step becomes slower to compute.
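
A minimal sketch constructing non-default LBFGS options (the chosen values are only illustrative):

import numericalnim
import arraymancer

proc f(x: Tensor[float]): float =
    x[0]*x[0] + x[1]*x[1]

# Keep more history and require the strong Wolfe conditions in the line search
let opts = lbfgsOptions[float](savedIterations = 20,
                               lineSearchCriterion = WolfeStrong)
let xMin = lbfgs(f, [1.0, 1.0].toTensor, options = opts)
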
proc levmarq[U; T: not Tensor](f: proc (params: Tensor[U]; x: U): T;
                               params0: Tensor[U]; xData: Tensor[U]; yData: Tensor[T];
                               options: OptimOptions[U, LevMarqOptions[U]] = levmarqOptions[U]();
                               yError: Tensor[T] = ones_like(yData)): Tensor[U]

Levenberg-Marquardt method for non-linear least squares problems. It fits the parameters of a function to data samples.

Input:

  • f: The function you want to fit the data to. The first argument should be a 1D Tensor with the values of the parameters and the second argument is the value of the independent variable to evaluate the function at.
  • params0: The starting guess for the parameter values as a 1D Tensor.
  • xData: The values of the independent variable as a 1D Tensor.
  • yData: The measured values of the dependent variable as a 1D Tensor.
  • options: Object with all the options like tol and lambda0. (see levmarqOptions)
  • yError: The uncertainties of the yData as 1D Tensor. Ideally these should be the 1σ standard deviation.

Returns:

  • The final solution for the parameters. Either because a (local) minimum was found or because the maximum number of iterations was reached.
proc levmarqOptions[U](lambda0: U = U(1); tol: U = U(0.000001); alpha: U = U(1);
                       fastMode: bool = false; maxIterations: int = 10000;
                       lineSearchCriterion: LineSearchCriterion = NoLineSearch): OptimOptions[U, LevMarqOptions[U]]
Returns a levmarq OptimOptions
  • tol: The tolerance used. This is the criterion for convergence: gradNorm < tol*(1 + fNorm).
  • alpha: The step size.
  • fastMode: If true, a faster but only first order accurate finite difference approximation of the derivative will be used. Otherwise a more accurate but slower second order finite difference scheme is used.
  • maxIterations: The maximum number of iterations before returning if convergence hasn't been reached.
  • lineSearchCriterion: Which line search method to use.
  • lambda0: The starting value of the damping parameter.
proc line_search[U, T](alpha: var U; p: Tensor[T]; x0: Tensor[U];
                       f: proc (x: Tensor[U]): T;
                       criterion: LineSearchCriterion; fastMode: bool = false)
proc newton[U; T: not Tensor](f: proc (x: Tensor[U]): T; x0: Tensor[U];
                              options: OptimOptions[U, StandardOptions] = newtonOptions[U]();
                              analyticGradient: proc (x: Tensor[U]): Tensor[T] = nil): Tensor[U]

Newton's method for optimization.

Inputs:

  • f: The function to optimize. It should take as input a 1D Tensor of the input variables and return a scalar.
  • x0: The starting point for the optimization as a 1D Tensor.
  • options: Options object (see newtonOptions for constructing one)
  • analyticGradient: The analytic gradient of f taking in and returning a 1D Tensor. If not provided, a finite difference approximation will be performed instead.

Returns:

  • The final solution for the parameters. Either because a (local) minimum was found or because the maximum number of iterations was reached.
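
A minimal usage sketch on a quadratic with a known minimum (the function and names are chosen only for illustration):

import numericalnim
import arraymancer

# f(x, y) = (x - 1)^2 + (y + 2)^2 has its minimum at (1, -2)
proc f(x: Tensor[float]): float =
    (x[0] - 1.0)*(x[0] - 1.0) + (x[1] + 2.0)*(x[1] + 2.0)

let x0 = [0.0, 0.0].toTensor
let xMin = newton(f, x0)
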
proc newtonOptions[U](tol: U = U(0.000001); alpha: U = U(1);
                      fastMode: bool = false; maxIterations: int = 10000;
                      lineSearchCriterion: LineSearchCriterion = NoLineSearch): OptimOptions[U, StandardOptions]
Returns a Newton OptimOptions
  • tol: The tolerance used. This is the criterion for convergence: gradNorm < tol*(1 + fNorm).
  • alpha: The step size.
  • fastMode: If true, a faster but only first order accurate finite difference approximation of the derivative will be used. Otherwise a more accurate but slower second order finite difference scheme is used.
  • maxIterations: The maximum number of iterations before returning if convergence hasn't been reached.
  • lineSearchCriterion: Which line search method to use.
proc newtons(f: proc (x: float64): float64; deriv: proc (x: float64): float64;
             start: float64; precision: float64 = 0.00001;
             max_iters: Natural = 1000): float64 {....raises: [ArithmeticError],
    effectsOf: [f, deriv], ...tags: [], forbids: [].}
Newton-Raphson root-finding method for 1-dimensional functions.
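
A minimal sketch finding a root of a 1-dimensional function from its derivative (the function is chosen only for illustration):

import numericalnim

# Find the positive root of f(x) = x^2 - 2, i.e. sqrt(2)
proc f(x: float): float = x*x - 2.0
proc df(x: float): float = 2.0*x

let root = newtons(f, df, start = 1.0)
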
proc optimOptions[U](tol: U = U(0.000001); alpha: U = U(1);
                     fastMode: bool = false; maxIterations: int = 10000;
                     lineSearchCriterion: LineSearchCriterion = NoLineSearch): OptimOptions[U, StandardOptions]
Returns a vanilla OptimOptions
  • tol: The tolerance used. This is the criterion for convergence: gradNorm < tol*(1 + fNorm).
  • alpha: The step size.
  • fastMode: If true, a faster but only first order accurate finite difference approximation of the derivative will be used. Otherwise a more accurate but slower second order finite difference scheme is used.
  • maxIterations: The maximum number of iterations before returning if convergence hasn't been reached.
  • lineSearchCriterion: Which line search method to use.
proc paramUncertainties[U; T](params: Tensor[U];
                              fitFunc: proc (params: Tensor[U]; x: U): T;
                              xData: Tensor[U]; yData: Tensor[T];
                              yError: Tensor[T]; returnFullCov = false): Tensor[T]

Returns the whole covariance matrix or only the diagonal elements for the parameters in params.

Inputs:

  • params: The parameters in a 1D Tensor that the uncertainties are wanted for.
  • fitFunc: The function used for fitting the parameters. (see levmarq for more)
  • xData: The values of the independent variable as 1D Tensor.
  • yData: The measured values of the dependent variable as 1D Tensor.
  • yError: The uncertainties of the yData as 1D Tensor. Ideally these should be the 1σ standard deviation.
  • returnFullCov: If true, the full covariance matrix will be returned as a 2D Tensor, else only the diagonal elements will be returned as a 1D Tensor.

Returns:

The uncertainties of the parameters in the form of a covariance matrix (or only the diagonal elements).

Note: it is the covariance that is returned, so if you want the standard deviation you have to take the square root of it.

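
A minimal sketch requesting the full covariance matrix instead of only the variances (the model, data values and names are purely illustrative):

import numericalnim
import arraymancer

# Straight-line model y = a*x + b
proc fitFunc(params: Tensor[float], x: float): float =
    params[0]*x + params[1]

let xData = [0.0, 1.0, 2.0, 3.0, 4.0].toTensor
let yData = [1.1, 3.9, 7.2, 9.8, 13.1].toTensor
let yError = 0.1 * ones[float](5)

let params = levmarq(fitFunc, [1.0, 1.0].toTensor, xData, yData, yError = yError)
# Full covariance matrix of the fitted parameters as a 2D Tensor
let cov = paramUncertainties(params, fitFunc, xData, yData, yError,
                             returnFullCov = true)
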
proc secant(f: proc (x: float64): float64; start: array[2, float64];
            precision: float64 = 0.00001; max_iters: Natural = 1000): float64 {.
    ...raises: [Exception], tags: [RootEffect], forbids: [].}
proc steepest_descent(deriv: proc (x: float64): float64; start: float64;
                      gamma: float64 = 0.01; precision: float64 = 0.00001;
                      max_iters: Natural = 1000): float64 {.inline,
    ...raises: [Exception], tags: [RootEffect], forbids: [].}

Gradient descent optimization algorithm for finding local minima of a function with derivative deriv.

Assuming that the function F is defined and differentiable near a minimum, F(x) decreases fastest when moving in the direction of the negative gradient of F, similar to how water traverses down a hill following the path of least resistance. The method can benefit from preconditioning if the problem is ill-conditioned.

Input:

  • deriv: The derivative of the function F
  • start: starting point near F's minimum
  • gamma: step size multiplier, used to control the step size between iterations
  • precision: numerical precision
  • max_iters: maximum iterations

Returns:

  • The x value of the approximate local minimum as a float64.
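
A minimal sketch minimizing a 1-dimensional function given only its derivative (the function and step size are chosen only for illustration):

import numericalnim

# Minimize f(x) = (x - 3)^2 using its derivative f'(x) = 2*(x - 3)
proc deriv(x: float): float = 2.0*(x - 3.0)

let xMin = steepest_descent(deriv, start = 0.0, gamma = 0.1)
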
proc steepestDescent[U; T: not Tensor](f: proc (x: Tensor[U]): T; x0: Tensor[U];
    options: OptimOptions[U, StandardOptions] = steepestDescentOptions[U]();
    analyticGradient: proc (x: Tensor[U]): Tensor[T] = nil): Tensor[U]

Steepest descent method for optimization.

Inputs:

  • f: The function to optimize. It should take as input a 1D Tensor of the input variables and return a scalar.
  • x0: The starting point for the optimization as a 1D Tensor.
  • options: Options object (see steepestDescentOptions for constructing one)
  • analyticGradient: The analytic gradient of f taking in and returning a 1D Tensor. If not provided, a finite difference approximation will be performed instead.

Returns:

  • The final solution for the parameters. Either because a (local) minimum was found or because the maximum number of iterations was reached.
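
A minimal usage sketch; since steepest descent is sensitive to the step size, an Armijo line search is enabled through steepestDescentOptions (the values are chosen only for illustration):

import numericalnim
import arraymancer

proc f(x: Tensor[float]): float =
    x[0]*x[0] + x[1]*x[1]

let opts = steepestDescentOptions[float](alpha = 0.01, lineSearchCriterion = Armijo)
let xMin = steepestDescent(f, [1.0, 1.0].toTensor, options = opts)
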
proc steepestDescentOptions[U](tol: U = U(0.000001); alpha: U = U(0.001);
                               fastMode: bool = false; maxIterations: int = 10000;
                               lineSearchCriterion: LineSearchCriterion = NoLineSearch): OptimOptions[U, StandardOptions]
Returns a Steepest Descent OptimOptions
  • tol: The tolerance used. This is the criterion for convergence: gradNorm < tol*(1 + fNorm).
  • alpha: The step size.
  • fastMode: If true, a faster but only first order accurate finite difference approximation of the derivative will be used. Otherwise a more accurate but slower second order finite difference scheme is used.
  • maxIterations: The maximum number of iterations before returning if convergence hasn't been reached.
  • lineSearchCriterion: Which line search method to use.
proc vectorNorm[T](v: Tensor[T]): T
Calculates the norm of the vector, i.e. sqrt(Σ vᵢ²)