LSQ: How to implement
Consider the data and model from the Simple example:
X = collect(1:10)
Y = [1.0, 1.78, 3.64, 3.72, 5.33, 2.73, 7.52, 9.19, 6.73, 8.95]
ΔY = [0.38, 0.86, 0.29, 0.45, 0.66, 2.46, 0.39, 0.45, 1.62, 1.54]
data = FittingData(X,Y,ΔY)
model = ModelFunctions((x,λ)-> λ*x)
Objective functions
Use lsq_objective to construct the uncertainty-weighted least squares objective:
weighted_lsq = lsq_objective(data,model)
#9 (generic function with 1 method)
The returned objective function takes the model parameter (array) λ as its argument: weighted_lsq(λ).
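The objective can, for instance, be evaluated at a trial parameter value (the value 1.1 is an arbitrary choice used here for illustration):
weighted_lsq(1.1)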
To obtain the standard least squares objective, the errors must be set to 1, e.g. by using the shortened constructor (see The FittingData struct):
data_no_errors = FittingData(X,Y)
standard_lsq = lsq_objective(data_no_errors,model)
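Since the shortened constructor corresponds to unit errors, the standard objective should agree with an uncertainty-weighted objective built from explicit unit errors. A quick consistency sketch, assuming the constructors behave as described above:
standard_lsq(1.1) ≈ lsq_objective(FittingData(X, Y, ones(length(Y))), model)(1.1)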
Partial derivatives and gradients
To obtain partial derivatives or the gradient of the least squares objective function, the partial derivatives of the model function need to be added to the ModelFunctions object (cf. The ModelFunctions struct):
model = ModelFunctions((x,λ)->λ*x , partials = [(x,λ)-> x])
ModelFunctions(Main.var"#3#5"(), [Main.var"#4#6"()])
The partial derivatives of the least squares objective can be obtained with lsq_partials:
∂_weighted_lsq = lsq_partials(data,model)
1-element Vector{Function}:
#17 (generic function with 1 method)
Note that lsq_partials returns the partial derivatives as a vector of functions taking λ as argument, even in the 1-dimensional case.
∂_weighted_lsq[1](1.1)
42.62381855699296
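For models with more than one parameter, each element of the returned vector corresponds to the partial derivative with respect to the respective parameter. A hedged sketch for a hypothetical two-parameter model (the offset model and its partial derivatives below are illustrative assumptions, not part of the example above):
model_2p = ModelFunctions((x,λ)-> λ[1]*x + λ[2], partials = [(x,λ)-> x, (x,λ)-> 1.0])
∂_2p = lsq_partials(data, model_2p)
∂_2p[2]([1.1, 0.5]) # partial derivative w.r.t. the second parameter (the offset)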
The gradient of the least squares objective can be obtained with lsq_gradient:
∇_weighted_lsq = lsq_gradient(data,model)
#23 (generic function with 1 method)
The returned gradient function has the signature (grad_vector,λ). The argument grad_vector must be a mutable vector of appropriate type and length.
∇_weighted_lsq([0.0],1.1)
1-element Vector{Float64}:
42.62381855699296
In some optimization algorithms, the gradient function is called multiple times during each iteration. Mutating a preallocated array reduces the memory-allocation overhead of creating a new gradient array for every call.
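A minimal gradient-descent sketch that reuses a single preallocated gradient buffer (illustrative only; the initial guess, step size, and iteration count are arbitrary choices):
function descend(∇f, λ0; steps = 100, η = 1e-4)
    λ = λ0
    grad = [0.0]          # allocated once, mutated in every iteration
    for _ in 1:steps
        ∇f(grad, λ)       # fill grad in place instead of allocating a new array
        λ -= η * grad[1]  # gradient-descent step for the scalar parameter
    end
    return λ
end

descend(∇_weighted_lsq, 0.5)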