LSQ: How to implement
Consider the data and model from the Simple example:
X = collect(1:10)
Y = [1.0, 1.78, 3.64, 3.72, 5.33, 2.73, 7.52, 9.19, 6.73, 8.95]
ΔY = [0.38, 0.86, 0.29, 0.45, 0.66, 2.46, 0.39, 0.45, 1.62, 1.54]
data = FittingData(X,Y,ΔY)
model = ModelFunctions((x,λ)-> λ*x)
Objective functions
Use lsq_objective to construct the uncertainty-weighted least squares objective:
weighted_lsq = lsq_objective(data,model)
#9 (generic function with 1 method)
The returned objective function takes the model parameter (array) λ as its argument: weighted_lsq(λ).
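The objective can, for instance, be evaluated at a trial parameter value (the value 1.1 is an arbitrary choice used here for illustration):
weighted_lsq(1.1)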
To obtain the standard least squares objective, the errors must be set to 1, e.g. by using the shortened constructor (see The FittingData struct):
data_no_errors = FittingData(X,Y)
standard_lsq = lsq_objective(data_no_errors,model)
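Since the shortened constructor corresponds to unit errors, the standard objective should agree with an uncertainty-weighted objective built from explicit unit errors. A quick consistency sketch, assuming the constructors behave as described above:
standard_lsq(1.1) ≈ lsq_objective(FittingData(X, Y, ones(length(Y))), model)(1.1)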
Partial derivatives and gradients
To obtain partial derivatives or the gradient of the least squares objective function, the partial derivatives of the model function need to be added to the ModelFunctions object (cf. The ModelFunctions struct):
model = ModelFunctions((x,λ)->λ*x , partials = [(x,λ)-> x])
ModelFunctions(Main.var"#3#5"(), [Main.var"#4#6"()])
The partial derivatives of the least squares objective can be obtained with lsq_partials:
∂_weighted_lsq = lsq_partials(data,model)
1-element Vector{Function}:
#17 (generic function with 1 method)
Note that lsq_partials returns the partial derivatives as a vector of functions taking λ as argument, even in the 1-dimensional case.
∂_weighted_lsq[1](1.1)
42.62381855699296
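For models with more than one parameter, each element of the returned vector corresponds to the partial derivative with respect to the respective parameter. A hedged sketch for a hypothetical two-parameter model (the offset model and its partial derivatives below are illustrative assumptions, not part of the example above):
model_2p = ModelFunctions((x,λ)-> λ[1]*x + λ[2], partials = [(x,λ)-> x, (x,λ)-> 1.0])
∂_2p = lsq_partials(data, model_2p)
∂_2p[2]([1.1, 0.5]) # partial derivative w.r.t. the second parameter (the offset)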
The gradient of the least squares objective can be obtained with lsq_gradient:
∇_weighted_lsq = lsq_gradient(data,model)
#23 (generic function with 1 method)
The returned gradient function has the signature (grad_vector,λ). The argument grad_vector must be a mutable vector of appropriate type and length.
∇_weighted_lsq([0.0],1.1)
1-element Vector{Float64}:
42.62381855699296
In some optimization algorithms, the gradient function is called multiple times during each iteration. Mutating a preallocated array reduces the memory-allocation overhead of creating a new gradient array for every call.
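A minimal gradient-descent sketch that reuses a single preallocated gradient buffer (illustrative only; the initial guess, step size, and iteration count are arbitrary choices):
function descend(∇f, λ0; steps = 100, η = 1e-4)
    λ = λ0
    grad = [0.0]          # allocated once, mutated in every iteration
    for _ in 1:steps
        ∇f(grad, λ)       # fill grad in place instead of allocating a new array
        λ -= η * grad[1]  # gradient-descent step for the scalar parameter
    end
    return λ
end

descend(∇_weighted_lsq, 0.5)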