lsqfit - Nonlinear Least Squares Fitting¶
Introduction¶
This package contains tools for nonlinear least-squares curve fitting of data. In general a fit has four inputs:
- The dependent data
ythat is to be fit — typicallyyis a Python dictionary in anlsqfitanalysis. Its valuesy[k]are eithergvar.GVars or arrays (any shape or dimension) ofgvar.GVars that specify the values of the dependent variables and their errors.- A collection
xof independent data —xcan have any structure and contain any data, or it can be omitted.- A fit function
f(x, p)whose parameterspare adjusted by the fit untilf(x, p)equalsyto withinys errors — parameters p` are usually specified by a dictionary whose valuesp[k]are individual parameters or (numpy) arrays of parameters. The fit function is assumed independent ofx(that is,f(p)) ifx = False(or ifxis omitted from the input data).- Initial estimates or priors for each parameter in
p— priors are usually specified using a dictionarypriorwhose valuesprior[k]aregvar.GVars or arrays ofgvar.GVars that give initial estimates (values and errors) for parametersp[k].
A typical code sequence has the structure:
... collect x, y, prior ...
def f(x, p):
... compute fit to y[k], for all k in y, using x, p ...
... return dictionary containing the fit values for the y[k]s ...
fit = lsqfit.nonlinear_fit(data=(x, y), prior=prior, fcn=f)
print(fit) # variable fit is of type nonlinear_fit
The parameters p[k] are varied until the chi**2 for the fit is
minimized.
The best-fit values for the parameters are recovered after fitting
using, for example, p=fit.p. Then the p[k] are gvar.GVars or
arrays of gvar.GVars that give best-fit estimates and fit uncertainties
in those estimates. The print(fit) statement prints a summary of
the fit results.
The dependent variable y above could be an array instead of a
dictionary, which is less flexible in general but possibly more
convenient in simpler fits. Then the approximate y returned by fit
function f(x, p) must be an array with the same shape as the dependent
variable. The prior prior could also be represented by an array
instead of a dictionary.
By default priors are Gaussian/normal distributions, represented by
gvar.GVars. Setting nonlinear_fit parameter extend=True
allows for log-normal and sqrt-normal distributions as well. The
latter are indicated by replacing the prior (in a dictionary prior)
with key c, for example, by a prior for the parameter’s logarithm
or square root, with key log(c) or sqrt(c), respectively.
nonlinear_fit adds parameter c to the parameter
dictionary, deriving its value from parameter log(c) or
sqrt(c). The fit function can be expressed directly in terms of
parameter c and so is the same no matter which distribution is
used for c. Note that a sqrt-normal distribution with zero mean is
equivalent to an exponential distribution. Additional distributions
can be added using gvar.add_parameter_distribution().
The lsqfit tutorial contains extended explanations and examples.
The first appendix in the paper at http://arxiv.org/abs/arXiv:1406.2279
provides conceptual background on the techniques used in this
module for fits and, especially, error budgets.
nonlinear_fit Objects¶
-
class
lsqfit.nonlinear_fit(data, fcn, prior=None, p0=None, extend=False, svdcut=1e-12, debug=False, tol=1e-8, maxit=1000, fitter='gsl_multifit', **fitterargs)¶ Nonlinear least-squares fit.
lsqfit.nonlinear_fitfits a (nonlinear) functionf(x, p)to datayby varying parametersp, and stores the results: for example,fit = nonlinear_fit(data=(x, y), fcn=f, prior=prior) # do fit print(fit) # print fit results
The best-fit values for the parameters are in
fit.p, while thechi**2, the number of degrees of freedom, the logarithm of Gaussian Bayes Factor, the number of iterations (or function evaluations), and the cpu time needed for the fit are infit.chi2,fit.dof,fit.logGBF,fit.nit, andfit.time, respectively. Results for individual parameters infit.pare of typegvar.GVar, and therefore carry information about errors and correlations with other parameters. The fit data and prior can be recovered usingfit.x(equalsFalseif there is nox),fit.y, andfit.prior; the data and prior are corrected for the SVD cut, if there is one (that is, their covariance matrices have been modified in accordance with the SVD cut).Parameters: - data (dict, array or tuple) –
Data to be fit by
lsqfit.nonlinear_fitcan have any of the following forms:data = x, yxis the independent data that is passed to the fit function with the fit parameters:fcn(x, p).yis a dictionary (or array) ofgvar.GVars that encode the means and covariance matrix for the data that is to be fit being fit. The fit function must return a result having the same layout asy.data = yyis a dictionary (or array) ofgvar.GVars that encode the means and covariance matrix for the data being fit. There is no independent data so the fit function depends only upon the fit parameters:fit(p). The fit function must return a result having the same layout asy.data = x, ymean, ycovxis the independent data that is passed to the fit function with the fit parameters:fcn(x, p).ymeanis an array containing the mean values of the fit data.ycovis an array containing the covariance matrix of the fit data;ycov.shapeequals2*ymean.shape. The fit function must return an array having the same shape asymean.data = x, ymean, ysdevxis the independent data that is passed to the fit function with the fit parameters:fcn(x, p).ymeanis an array containing the mean values of the fit data.ysdevis an array containing the standard deviations of the fit data;ysdev.shapeequalsymean.shape. The data are assumed to be uncorrelated. The fit function must return an array having the same shape asymean.
Setting
x=Falsein the first, third or fourth of these formats implies that the fit function depends only on the fit parameters: that is,fcn(p)instead offcn(x, p). (This is not assumed ifx=None.) - fcn (callable) – The function to be fit to
data. It is either a function of the independent dataxand the fit parametersp(fcn(x, p)), or a function of just the fit parameters (fcn(p)) when there is noxdata orx=False. The parameters are tuned in the fit until the function returns values that agree with theydata to within theys’ errors. The function’s return value must have the same layout as theydata (a dictionary or an array). The fit parameterspare either: 1) a dictionary where eachp[k]is a single parameter or an array of parameters (any shape); or, 2) a single array of parameters. The layout of the parameters is the same as that of priorpriorif it is specified; otherwise, it is inferred from of the starting valuep0for the fit. - prior (dict, array, str, gvar.GVar or None) – A dictionary (or array)
containing a priori estimates for all parameters
pused by fit functionfcn(x, p)(orfcn(p)). Fit parameterspare stored in a dictionary (or array) with the same keys and structure (or shape) asprior. The default value isNone;priormust be defined ifp0isNone. - p0 (dict, array, float or None) – Starting values for fit
parameters in fit.
lsqfit.nonlinear_fitadjustsp0to make it consistent in shape and structure withpriorwhen the latter is specified: elements missing fromp0are filled in usingprior, and elements inp0that are not inpriorare discarded. Ifp0is a string, it is taken as a file name andlsqfit.nonlinear_fitattempts to read starting values from that file; best-fit parameter values are written out to the same file after the fit (for priming future fits). Ifp0isNoneor the attempt to read the file fails, starting values are extracted fromprior. The default value isNone;p0must be defined ifpriorisNone. - svdcut (float or None) – If
svdcutis nonzero (but notNone), SVD cuts are applied to every block-diagonal sub-matrix of the covariance matrix for the datayandprior(if there is a prior). The blocks are first rescaled so that all diagonal elements equal 1 – that is, the blocks are replaced by the correlation matrices for the corresponding subsets of variables. Then, ifsvdcut > 0, eigenvalues of the rescaled matrices that are smaller thansvdcuttimes the maximum eigenvalue are replaced bysvdcuttimes the maximum eigenvalue. This makes the covariance matrix less singular and less susceptible to roundoff error. Whensvdcut < 0, eigenvalues smaller than|svdcut|times the maximum eigenvalue are discarded and the corresponding components inyandpriorare zeroed out. Default is 1e-12. - extend (bool) – Log-normal and sqrt-normal distributions can be used
for fit priors when
extend=True, provided the parameters are specified by a dictionary (as opposed to an array). To use such a distribution for a parameter'c'in the fit prior, replaceprior['c']with a prior specifying its logarithm or square root, designated byprior['log(c)']orprior['sqrt(c)'], respectively. The dictionaries containing parameters generated bylsqfit.nonlinear_fitwill have entries for both'c'and'log(c)'or'sqrt(c)', so only the prior need be changed to switch to log-normal/sqrt-normal distributions. Settingextend=False(the default) restricts all parameters to Gaussian distributions. Additional distributions can be added usinggvar.add_parameter_distribution(). - udata (dict, array or tuple) – Same as
databut instructs the fitter to ignore correlations between different pieces of data. This speeds up the fit, particularly for large amounts of data, but ignores potentially valuable information if the data actually are correlated. Only one ofdataorudatashould be specified. (Default isNone.) - fitter (str or None) – Fitter code. Options if GSL is installed
include:
'gsl_multifit'(default) and'gsl_v1_multifit'(original fitter). Options ifscipyis installed include:'scipy_least_squares'(default if GSL not installed).gsl_multifithas many options, providing extensive user control.scipy_least_squarescan be used for fits where the parameters are bounded. (Bounded parameters can also be implemented, for any of the fitters, using non-Gaussian priors — see the tutorial.) - tol (float or tuple) –
Assigning
tol=(xtol, gtol, ftol)causes the fit to stop searching for a minimum when any ofxtol >=relative change in parameters between iterationsgtol >=relative size of gradient ofchi**2functionftol >=relative change inchi**2between iterations
is satisfied. See the fitter documentation for detailed definitions of these stopping conditions. Typically one sets
xtol=1/10**dwheredis the number of digits of precision desired in the result, whilegtol<<1andftol<<1. Settingtol=epswhereepsis a number is equivalent to settingtol=(eps,1e-10,1e-10). Settingtol=(eps1,eps2)is equivalent to settingtol=(eps1,eps2,1e-10). Default istol=1e-8. (Note: theftoloption is disabled in some versions of the GSL library.) - maxit (int) – Maximum number of algorithm iterations (or function evaluations for some fitters) in search for minimum; default is 1000.
- debug (bool) – Set to
Truefor extra debugging of the fit function and a check for roundoff errors. (Default isFalse.) - fitterargs (dict) – Dictionary of additional arguments passed through to the underlying fitter. Different fitters offer different parameters; see the documentation for each.
Objects of type
lsqfit.nonlinear_fithave the following attributes:-
chi2¶ float – The minimum
chi**2for the fit.fit.chi2 / fit.dofis usually of order one in good fits; values much less than one suggest that the actual standard deviations in the input data and/or priors are smaller than the standard deviations used in the fit.
-
cov¶ array – Covariance matrix of the best-fit parameters from the fit.
-
dof¶ int – Number of degrees of freedom in the fit, which equals the number of pieces of data being fit when priors are specified for the fit parameters. Without priors, it is the number of pieces of data minus the number of fit parameters.
-
error¶ str – Error message generated by the underlying fitter when an error occurs.
Noneotherwise.
-
fitter_results¶ Results returned by the underlying fitter. Refer to the appropriate fitter’s documentation for details.
-
logGBF¶ float or None – The logarithm of the probability (density) of obtaining the fit data by randomly sampling the parameter model (priors plus fit function) used in the fit — that is, it is
P(data|model). This quantity is useful for comparing fits of the same data to different models, with different priors and/or fit functions. The model with the largest value offit.logGBFis the one preferred by the data. The exponential of the difference infit.logGBFbetween two models is the ratio of probabilities (Bayes factor) for those models. Differences infit.logGBFsmaller than 1 are not very significant. Gaussian statistics are assumed when computingfit.logGBF.
-
p¶ dict, array or gvar.GVar – Best-fit parameters from fit. Depending upon what was used for the prior (or
p0), it is either: a dictionary (gvar.BufferDict) ofgvar.GVars and/or arrays ofgvar.GVars; or an array (numpy.ndarray) ofgvar.GVars.fit.prepresents a multi-dimensional Gaussian distribution which, in Bayesian terminology, is the posterior probability distribution of the fit parameters.
-
pmean¶ dict, array or float – Means of the best-fit parameters from fit.
-
psdev¶ dict, array or float – Standard deviations of the best-fit parameters from fit.
-
palt¶ dict, array or gvar.GVar – Same as
fit.pexcept that the errors are computed directly fromfit.cov. This is faster but means that no information about correlations with the input data is retained (unlike infit.p); and, therefore,fit.paltcannot be used to generate error budgets.fit.pandfit.paltgive the same means and normally give the same errors for each parameter. They differ only when the input data’s covariance matrix is too singular to invert accurately (because of roundoff error), in which case an SVD cut is advisable.
-
p0¶ dict, array or float – The parameter values used to start the fit. This will differ from the input
p0if the latter was incomplete.
-
prior¶ dict, array, gvar.GVar or None – Prior used in the fit. This may differ from the input prior if an SVD cut is used. It is either a dictionary (
gvar.BufferDict) or an array (numpy.ndarray), depending upon the input. EqualsNoneif no prior was specified.
-
Q¶ float or None – The probability that the
chi**2from the fit could have been larger, by chance, assuming the best-fit model is correct. Good fits haveQvalues larger than 0.1 or so. Also called the p-value of the fit.
-
stopping_criterion¶ int – Criterion used to stop fit:
0: didn’t converge
1:
xtol >=relative change in parameters between iterations2:
gtol >=relative size of gradient ofchi**23:
ftol >=relative change inchi**2between iterations
-
svdcorrection¶ gvar.GVar – Sum of all SVD corrections, if any, added to the fit data
yor the priorprior.
-
svdn¶ int – Number of eigenmodes modified (and/or deleted) by the SVD cut.
-
time¶ float – CPU time (in secs) taken by fit.
-
tol¶ tuple – Tolerance used in fit. This differs from the input tolerance if the latter was incompletely specified.
-
x¶ obj – The first field in the input
data. This is sometimes the independent variable (as in ‘y vs x’ plot), but may be anything. It is set equal toFalseif thexfield is omitted from the inputdata. (This also means that the fit function has noxargument: sof(p)rather thanf(x,p).)
-
y¶ dict, array or gvar.GVar – Fit data used in the fit. This may differ from the input data if an SVD cut is used. It is either a dictionary (
gvar.BufferDict) or an array (numpy.ndarray), depending upon the input.
-
nblocks¶ dict –
nblocks[s]equals the number of block-diagonal sub-matrices of they–priorcovariance matrix that are sizes-by-s. This is sometimes useful for debugging.
The global defaults used by
lsqfit.nonlinear_fitcan be changed by changing entries in dictionarylsqfit.nonlinear_fit.DEFAULTSfor keys ‘extend’, ‘svdcut’, ‘debug’, tol, ‘maxit’, and ‘fitter’. Additional defaults can be added to that dictionary to be are passed throughlsqfit.nonlinear_fitto the underlying fitter (via dictionaryfitterargs).Additional methods are provided for printing out detailed information about the fit, testing fits with simulated data, doing bootstrap analyses of the fit errors, dumping (for later use) and loading parameter values, and checking for roundoff errors in the final error estimates:
-
format(maxline=0, pstyle='v')¶ Formats fit output details into a string for printing.
The output tabulates the
chi**2per degree of freedom of the fit (chi2/dof), the number of degrees of freedom, the logarithm of the Gaussian Bayes Factor for the fit (logGBF), and the number of fit- algorithm iterations needed by the fit. Optionally, it will also list the best-fit values for the fit parameters together with the prior for each (in[]on each line). Lines for parameters that deviate from their prior by more than one (prior) standard deviation are marked with asterisks, with the number of asterisks equal to the number of standard deviations (up to five).formatcan also list all of the data and the corresponding values from the fit, again with asterisks on lines where there is a significant discrepancy. At the end it lists the SVD cut, the number of eigenmodes modified by the SVD cut, the tolerances used in the fit, and the time in seconds needed to do the fit. The tolerance used to terminate the fit is marked with an asterisk.Parameters: - maxline (integer or bool) – Maximum number of data points for which fit
results and input data are tabulated.
maxline<0implies that onlychi2,Q,logGBF, anditnsare tabulated; no parameter values are included. Settingmaxline=Trueprints all data points; setting it equalNoneorFalseis the same as setting it equal to-1. Default ismaxline=0. - pstyle (‘vv’, ‘v’, ‘m’, or
None) – Style used for parameter list. Supported values are ‘vv’ for very verbose, ‘v’ for verbose, and ‘m’ for minimal. When ‘m’ is set, only parameters whose values differ from their prior values are listed. Settingpstyle=Noneimplies no parameters are listed.
Returns: String containing detailed information about fit.
- maxline (integer or bool) – Maximum number of data points for which fit
results and input data are tabulated.
-
simulated_fit_iter(n=None, pexact=None, **kargs)¶ Iterator that returns simulation copies of a fit.
Fit reliability can be tested using simulated data which replaces the mean values in
self.ywith random numbers drawn from a distribution whose mean equalsself.fcn(pexact)and whose covariance matrix is the same asself.y’s. Simulated data is very similar to the original fit data,self.y, but corresponds to a world where the correct values for the parameters (i.e., averaged over many simulated data sets) are given bypexact.pexactis usually taken equal tofit.pmean.Each iteration of the iterator creates new simulated data, with different random numbers, and fits it, returning the the
lsqfit.nonlinear_fitthat results. The simulated data has the same covariance matrix asfit.y. Typical usage is:... fit = nonlinear_fit(...) ... for sfit in fit.simulated_fit_iter(n=3): ... verify that sfit.p agrees with pexact=fit.pmean within errors ...
Only a few iterations are needed to get a sense of the fit’s reliability since we know the correct answer in each case. The simulated fit’s output results should agree with
pexact(=fit.pmeanhere) within the simulated fit’s errors.Simulated fits can also be used to estimate biases in the fit’s output parameters or functions of them, should non-Gaussian behavior arise. This is possible, again, because we know the correct value for every parameter before we do the fit. Again only a few iterations may be needed for reliable estimates.
The (possibly non-Gaussian) probability distributions for parameters, or functions of them, can be explored in more detail by setting option
bootstrap=Trueand collecting results from a large number of simulated fits. Withbootstrap=True, the means of the priors are also varied from fit to fit, as in a bootstrap simulation; the new prior means are chosen at random from the prior distribution. Variations in the best-fit parameters (or functions of them) from fit to fit define the probability distributions for those quantities. For example, one would use the following code to analyze the distribution of functiong(p)of the fit parameters:fit = nonlinear_fit(...) ... glist = [] for sfit in fit.simulated_fit_iter(n=100, bootstrap=True): glist.append(g(sfit.pmean)) ... analyze samples glist[i] from g(p) distribution ...
This code generates
n=100samplesglist[i]from the probability distribution ofg(p). If everything is Gaussian, the mean and standard deviation ofglist[i]should agree withg(fit.p).meanandg(fit.p).sdev.The only difference between simulated fits with
bootstrap=Trueandbootstrap=False(the default) is that the prior means are varied. It is essential that they be varied in a bootstrap analysis since one wants to capture the impact of the priors on the final distributions, but it is not necessary and probably not desirable when simply testing a fit’s reliability.Parameters: - n (integer or
None) – Maximum number of iterations (equals infinity ifNone). - pexact (
Noneor array or dictionary of numbers) – Fit-parameter values for the underlying distribution used to generate simulated data; replaced byself.pmeanif isNone(default). - bootstrap (bool) – Vary prior means if
True; otherwise vary only the means inself.y(default).
Returns: An iterator that returns
lsqfit.nonlinear_fits for different simulated data.Note that additional keywords can be added to overwrite keyword arguments in
lsqfit.nonlinear_fit.- n (integer or
-
bootstrap_iter(n=None, datalist=None)¶ Iterator that returns bootstrap copies of a fit.
A bootstrap analysis involves three steps: 1) make a large number of “bootstrap copies” of the original input data and prior that differ from each other by random amounts characteristic of the underlying randomness in the original data; 2) repeat the entire fit analysis for each bootstrap copy of the data, extracting fit results from each; and 3) use the variation of the fit results from bootstrap copy to bootstrap copy to determine an approximate probability distribution (possibly non-gaussian) for the fit parameters and/or functions of them: the results from each bootstrap fit are samples from that distribution.
Bootstrap copies of the data for step 2 are provided in
datalist. IfdatalistisNone, they are generated instead from the means and covariance matrix of the fit data (assuming gaussian statistics). The maximum number of bootstrap copies considered is specified byn(Noneimplies no limit).Variations in the best-fit parameters (or functions of them) from bootstrap fit to bootstrap fit define the probability distributions for those quantities. For example, one could use the following code to analyze the distribution of function
g(p)of the fit parameters:fit = nonlinear_fit(...) ... glist = [] for sfit in fit.bootstrapped_fit_iter( n=100, datalist=datalist, bootstrap=True ): glist.append(g(sfit.pmean)) ... analyze samples glist[i] from g(p) distribution ...
This code generates
n=100samplesglist[i]from the probability distribution ofg(p). If everything is Gaussian, the mean and standard deviation ofglist[i]should agree withg(fit.p).meanandg(fit.p).sdev.Parameters: - n (integer) – Maximum number of iterations if
nis notNone; otherwise there is no maximum. - datalist (sequence or iterator or
None) – Collection of bootstrapdatasets for fitter.
Returns: Iterator that returns an
lsqfit.nonlinear_fitobject containing results from the fit to the next data set indatalist- n (integer) – Maximum number of iterations if
-
dump_p(filename)¶ Dump parameter values (
fit.p) into filefilename.fit.dump_p(filename)saves the best-fit parameter values (fit.p) from anonlinear_fitcalledfit. These values are recovered usingp = nonlinear_fit.load_parameters(filename)wherep’s layout is the same as that offit.p.
-
dump_pmean(filename)¶ Dump parameter means (
fit.pmean) into filefilename.fit.dump_pmean(filename)saves the means of the best-fit parameter values (fit.pmean) from anonlinear_fitcalledfit. These values are recovered usingp0 = nonlinear_fit.load_parameters(filename)wherep0’s layout is the same asfit.pmean. The saved values can be used to initialize a later fit (nonlinear_fitparameterp0).
-
static
load_parameters(filename)¶ Load parameters stored in file
filename.p = nonlinear_fit.load_p(filename)is used to recover the values of fit parameters dumped usingfit.dump_p(filename)(orfit.dump_pmean(filename)) wherefitis of typelsqfit.nonlinear_fit. The layout of the returned parameterspis the same as that offit.p(orfit.pmean).
-
check_roundoff(rtol=0.25, atol=1e-6)¶ Check for roundoff errors in fit.p.
Compares standard deviations from fit.p and fit.palt to see if they agree to within relative tolerance
rtoland absolute toleranceatol. Generates a warning if they do not (in which case an SVD cut might be advisable).
-
static
set(clear=False, **defaults)¶ Set default parameters for
lsqfit.nonlinear_fit.Use to set default values for parameters:
extend,svdcut,debug,tol,maxit, andfitter. Can also set parameters specific to the fitter specified by thefitterargument.Sample usage:
import lsqfit old_defaults = lsqfit.nonlinear_fit.set( fitter='gsl_multifit', alg='subspace2D', solver='cholesky', tol=1e-10, debug=True, )
nonlinear_fit.set()without arguments returns a dictionary containing the current defaults.Parameters: - clear (bool) – If
Trueremove earlier settings, restoring the original defaults, before adding new defaults. The default value isclear=False.nonlinear_fit.set(clear=True)restores the original defaults. - defaults (dict) – Dictionary containing new defaults.
Returns: A dictionary containing the old defaults, before they were updated. These can be restored using
nonlinear_fit.set(old_defaults)whereold_defaultsis the dictionary containint the old defaults.- clear (bool) – If
- data (dict, array or tuple) –
Functions¶
-
lsqfit.empbayes_fit(z0, fitargs, **minargs)¶ Return fit and
zcorresponding to the fitlsqfit.nonlinear_fit(**fitargs(z))that maximizeslogGBF.This function maximizes the logarithm of the Bayes Factor from fit
lsqfit.nonlinear_fit(**fitargs(z))by varyingz, starting atz0. The fit is redone for each value ofzthat is tried, in order to determinelogGBF.The Bayes Factor is proportional to the probability that the data came from the model (fit function and priors) used in the fit.
empbayes_fit()finds the model or data that maximizes this probability.One application is illustrated by the following code:
import numpy as np import gvar as gv import lsqfit # fit data x = np.array([1., 2., 3., 4.]) y = np.array([3.4422, 1.2929, 0.4798, 0.1725]) # prior prior = gv.gvar(['10(1)', '1.0(1)']) # fit function def fcn(x, p): return p[0] * gv.exp( - p[1] * x) # find optimal dy def fitargs(z): dy = y * z newy = gv.gvar(y, dy) return dict(data=(x, newy), fcn=fcn, prior=prior) fit, z = lsqfit.empbayes_fit(0.1, fitargs) print fit.format(True)
Here we want to fit data
ywith fit functionfcnbut we don’t know the uncertainties in ouryvalues. We assume that the relative errors arex-independent and uncorrelated. We add the errordythat maximizes the Bayes Factor, as this is the most likely choice. This fit gives the following output:Least Square Fit: chi2/dof [dof] = 0.58 [4] Q = 0.67 logGBF = 7.4834 Parameters: 0 9.44 (18) [ 10.0 (1.0) ] 1 0.9979 (69) [ 1.00 (10) ] Fit: x[k] y[k] f(x[k],p) --------------------------------------- 1 3.442 (54) 3.481 (45) 2 1.293 (20) 1.283 (11) 3 0.4798 (75) 0.4731 (41) 4 0.1725 (27) 0.1744 (23) Settings: svdcut/n = 1e-12/0 tol = (1e-08*,1e-10,1e-10) (itns/time = 3/0.0)
We have, in effect, used the variation in the data relative to the best fit curve to estimate that the uncertainty in each data point is of order 1.6%.
Parameters: - z0 (number, array or dict) – Starting point for search.
- fitargs (callable) – Function of
zthat returns a dictionaryargscontaining thelsqfit.nonlinear_fitarguments corresponding toz.zshould have the same layout (number, array or dictionary) asz0.fitargs(z)can instead return a tuple(args, plausibility), whereargsis again the dictionary forlsqfit.nonlinear_fit.plausibilityis the logarithm of the a priori probabilitiy thatzis sensible. Whenplausibilityis provided,lsqfit.empbayes_fit()maximizes the sumlogGBF + plausibility. Specifyingplausibilityis a way of steering selections away from completely implausible values forz. - minargs (dict) – Optional argument dictionary, passed on to
lsqfit.gsl_multiminex(orlsqfit.scipy_multiminex), which finds the minimum.
Returns: A tuple containing the best fit (object of type
lsqfit.nonlinear_fit) and the optimal value for parameterz.
-
lsqfit.wavg(dataseq, prior=None, fast=False, **fitterargs)¶ Weighted average of
gvar.GVars or arrays/dicts ofgvar.GVars.The weighted average of several
gvar.GVars is what one obtains from a least-squares fit of the collection ofgvar.GVars to the one-parameter fit functiondef f(p): return N * [p[0]]
where
Nis the number ofgvar.GVars. The average is the best-fit value forp[0].gvar.GVars with smaller standard deviations carry more weight than those with larger standard deviations. The averages computed bywavgtake account of correlations between thegvar.GVars.If
prioris notNone, it is added to the list of data used in the average. Thuswavg([x2, x3], prior=x1)is the same aswavg([x1, x2, x3]).Typical usage is
x1 = gvar.gvar(...) x2 = gvar.gvar(...) x3 = gvar.gvar(...) xavg = wavg([x1, x2, x3]) # weighted average of x1, x2 and x3
where the result
xavgis agvar.GVarcontaining the weighted average.The individual
gvar.GVars in the last example can be replaced by multidimensional distributions, represented by arrays ofgvar.GVars or dictionaries ofgvar.GVars (or arrays ofgvar.GVars). For example,x1 = [gvar.gvar(...), gvar.gvar(...)] x2 = [gvar.gvar(...), gvar.gvar(...)] x3 = [gvar.gvar(...), gvar.gvar(...)] xavg = wavg([x1, x2, x3]) # xavg[i] is wgtd avg of x1[i], x2[i], x3[i]
where each array
x1,x2… must have the same shape. The resultxavgin this case is an array ofgvar.GVars, where the shape of the array is the same as that ofx1, etc.Another example is
x1 = dict(a=[gvar.gvar(...), gvar.gvar(...)], b=gvar.gvar(...)) x2 = dict(a=[gvar.gvar(...), gvar.gvar(...)], b=gvar.gvar(...)) x3 = dict(a=[gvar.gvar(...), gvar.gvar(...)]) xavg = wavg([x1, x2, x3]) # xavg['a'][i] is wgtd avg of x1['a'][i], x2['a'][i], x3['a'][i] # xavg['b'] is gtd avg of x1['b'], x2['b']
where different dictionaries can have (some) different keys. Here the result
xavgis agvar.BufferDict`having the same keys asx1, etc.Weighted averages can become costly when the number of random samples being averaged is large (100s or more). In such cases it might be useful to set parameter
fast=True. This causeswavgto estimate the weighted average by incorporating the random samples one at a time into a running average:result = prior for dataseq_i in dataseq: result = wavg([result, dataseq_i], ...)
This method is much faster when
len(dataseq)is large, and gives the exact result when there are no correlations between different elements of listdataseq. The results are approximately correct whendataseq[i]anddataseq[j]are correlated fori!=j.Parameters: - dataseq (list) – The
gvar.GVars to be averaged.dataseqis a one-dimensional sequence ofgvar.GVars, or of arrays ofgvar.GVars, or of dictionaries containinggvar.GVars and/or arrays ofgvar.GVars. Alldataseq[i]must have the same shape. - prior (dict, array or gvar.GVar) – Prior values for the averages, to
be included in the weighted average. Default value is
None, in which caseprioris ignored. - fast (bool) – Setting
fast=Truecauseswavgto compute an approximation to the weighted average that is much faster to calculate when averaging a large number of samples (100s or more). The default isfast=False. - fitterargs (dict) – Additional arguments (e.g.,
svdcut) for thelsqfit.nonlinear_fitfitter used to do the averaging.
Results returned by
gvar.wavg()have the following extra attributes describing the average:chi2 -
chi**2for weighted average.dof - Effective number of degrees of freedom.
- Q - The probability that the
chi**2could have been larger, by chance, assuming that the data are all Gaussian and consistent with each other. Values smaller than 0.1 or so suggest that the data are not Gaussian or are inconsistent with each other. Also called the p-value.
Quality factor Q (or p-value) for fit.
time - Time required to do average.
- svdcorrection - The svd corrections made to the data
- when
svdcutis notNone.
fit - Fit output from average.
- dataseq (list) – The
-
lsqfit.gammaQ()¶ Return the normalized incomplete gamma function
Q(a,x) = 1-P(a,x).Q(a, x) = 1/Gamma(a) * \int_x^\infty dt exp(-t) t ** (a-1) = 1 - P(a, x)Note that
gammaQ(ndof/2., chi2/2.)is the probabilty that one could get achi**2larger thanchi2withndofdegrees of freedom even if the model used to constructchi2is correct.
-
gvar.add_parameter_distribution()¶ Add new parameter distribution for use in fits.
This function adds new distributions for the parameters used in
lsqfit.nonlinear_fit. For example, the codeimport gvar as gv gv.add_parameter_distribution('log', gv.exp)
enables the use of log-normal distributions for parameters. The log-normal distribution is invoked for a parameter
pby includinglog(p)rather thanpitself in the fit prior. log-normal, sqrt-normal, and erfinv-normal distributions are included by default. (Setting a priorprior[erfinv(w)]equal togv.gvar('0(1)') / gv.sqrt(2)means that the prior probability forwis distributed uniformly between -1 and 1, and is zero elsewhere.)These distributions are implemented by replacing a fit parameter
pby a new fit parameterfcn(p)wherefcnis some function.fcn(p)is assumed to have a Gaussian distribution, and parameterpis recovered using the inverse functioninvfcnwherep=invfcn(fcn(p)).Parameters: - name (str) – Distribution’s name.
- invfcn – Inverse of the transformation function.
-
gvar.del_parameter_distribution()¶ Delete parameter distribution
name.
-
gvar.add_parameter_parentheses()¶ Return dictionary with proper keys for parameter distributions (legacy code).
This utility function helps fix legacy code that uses parameter keys like
logporsqrtpinstead oflog(p)orsqrt(p), as now required. This method creates a copy of dictionaryp'' but with keys like ``logporsqrtpreplaced bylog(p)orsqrt(p). So settingp = add_parameter_parentheses(p)
fixes the keys in
pfor log-normal and sqrt-normal parameters.
Classes for Bayesian Integrals¶
lsqfit provides support for doing Bayesian integrals, using results
from a least-squares fit to optimize the multi-dimensional integral. This
is useful for severely non-Gaussian situations. Module vegas is
used to do the integrals, using an adaptive Monte Carlo algorithm.
The integrator class is:
-
class
lsqfit.BayesIntegrator(fit, limit=1e15, scale=1, pdf=None, adapt_to_pdf=True, svdcut=1e-15)¶ vegasintegrator for Bayesian fit integrals.Parameters: - fit – Fit from
nonlinear_fit. - limit (positive float) – Limits the integrations to a finite
region of size
limittimes the standard deviation on either side of the mean. This can be useful if the functions being integrated misbehave for large parameter values (e.g.,numpy.expoverflows for a large range of arguments). Default is1e15. - scale (positive float) – The integration variables are
rescaled to emphasize parameter values of order
scaletimes the corresponding standard deviations. The rescaling does not change the value of the integral but it can reduce uncertainties in thevegasestimates. Default is1.0. - pdf (callable) – Probability density function
pdf(p)of the fit parameters to use in place of the normal PDF associated with the least-squares fit used to create the integrator. - adapt_to_pdf (bool) –
vegasadapts to the PDF ifTrue(default); otherwise it adapts tof(p)times the PDF. - svdcut (non-negative float or None) – If not
None, replace covariance matrix ofgwith a new matrix whose small eigenvalues are modified: eigenvalues smaller thansvdcuttimes the maximum eigenvalueeig_maxare replaced bysvdcut*eig_max. This can ameliorate problems caused by roundoff errors when inverting the covariance matrix. It increases the uncertainty associated with the modified eigenvalues and so is conservative. Settingsvdcut=Noneorsvdcut=0leaves the covariance matrix unchanged. Default is1e-15.
BayesIntegrator(fit)is avegasintegrator that evaluates expectation values for the multi-dimensional Bayesian distribution associated withnonlinear_fitfit: the probability density function is the exponential of thechi**2function (times-1/2), for data and priors, used in the fit. For linear fits, it is equivalent tovegas.PDFIntegrator(fit.p), since thechi**2function is quadratic in the fit parameters; but they can differ significantly for nonlinear fits.BayesIntegratorintegrates over the entire parameter space but first re-expresses the integrals in terms of variables that diagonalize the covariance matrix of the best-fit parametersfit.pfromnonlinear_fitand are centered at the best-fit values. This greatly facilitates the integration usingvegas, making integrals over 10s or more of parameters feasible. (Thevegasmodule must be installed separately in order to useBayesIntegrator.)A simple illustration of
BayesIntegratoris given by the following code, which we use to evaluate the mean and standard deviation fors*gwheresandgare fit parameters:import lsqfit import gvar as gv import numpy as np # least-squares fit x = np.array([0.1, 1.2, 1.9, 3.5]) y = gv.gvar(['1.2(1.0)', '2.4(1)', '2.0(1.2)', '5.2(3.2)']) prior = gv.gvar(dict(a='0(5)', s='0(2)', g='2(2)')) def f(x, p): return p['a'] + p['s'] * x ** p['g'] fit = lsqfit.nonlinear_fit(data=(x,y), prior=prior, fcn=f, debug=True) print(fit) # Bayesian integral to evaluate expectation value of s*g def g(p): sg = p['s'] * p['g'] return [sg, sg**2] expval = lsqfit.BayesIntegrator(fit, limit=20.) warmup = expval(neval=4000, nitn=10) results = expval(g, neval=4000, nitn=15, adapt=False) print(results.summary()) print('results =', results, '\n') sg, sg2 = results sg_sdev = (sg2 - sg**2) ** 0.5 print('s*g from Bayes integral: mean =', sg, ' sdev =', sg_sdev) print('s*g from fit:', fit.p['s'] * fit.p['g'])
where the
warmupcalls to the integrator are used to adapt it to probability density function from the fit, and then the integrator is used to evaluate the expectation value ofg(p), which is returned in arrayresults. Herenevalis the (approximate) number of function calls per iteration of thevegasalgorithm andnitnis the number of iterations. We use the integrator to calculated the expectation value ofs*gand(s*g)**2so we can compute a mean and standard deviation.The output from this code shows that the Gaussian approximation for
s*g(0.76(66)) is somewhat different from the result obtained from a Bayesian integral (0.48(54)):Least Square Fit: chi2/dof [dof] = 0.32 [4] Q = 0.87 logGBF = -9.2027 Parameters: a 1.61 (90) [ 0.0 (5.0) ] s 0.62 (81) [ 0.0 (2.0) ] g 1.2 (1.1) [ 2.0 (2.0) ] Settings: svdcut/n = 1e-15/0 reltol/abstol = 0.0001/0* (itns/time = 10/0.0) itn integral average chi2/dof Q ------------------------------------------------------- 1 1.034(21) 1.034(21) 0.00 1.00 2 1.034(21) 1.034(15) 0.56 0.64 3 1.024(18) 1.030(12) 0.37 0.90 4 1.010(18) 1.0254(98) 0.47 0.89 5 1.005(17) 1.0213(85) 0.55 0.88 6 1.013(19) 1.0199(78) 0.69 0.80 7 0.987(16) 1.0152(70) 0.78 0.72 8 1.002(18) 1.0135(66) 0.90 0.59 9 1.036(20) 1.0160(62) 0.86 0.66 10 1.060(20) 1.0204(60) 0.94 0.55 results = [0.4837(32) 0.5259(47)] s*g from Bayes integral: mean = 0.4837(32) sdev = 0.5403(25) s*g from fit: 0.78(66)
The table shows estimates of the probability density function’s normalization from each of the
vegasiterations used by the integrator to estimate the final results.In general functions being integrated can return a number, or an array of numbers, or a dictionary whose values are numbers or arrays of numbers. This allows multiple expectation values to be evaluated simultaneously.
See the documentation with the
vegasmodule for more details on its use, and on the attributes and methods associated with integrators. The example above setsadapt=Falsewhen computing final results. This gives more reliable error estimates whennevalis small. Note thatnevalmay need to be much larger (tens or hundreds of thousands) for more difficult high-dimension integrals.-
__call__(f=None, pdf=None, adapt_to_pdf=None, **kargs)¶ Estimate expectation value of function
f(p).Uses multi-dimensional integration modules
vegasto estimate the expectation value off(p)with respect to the probability density function associated withnonlinear_fitfit.Parameters: - f (callable) – Function
f(p)to integrate. Integral is the expectation value of the function with respect to the distribution. The function can return a number, an array of numbers, or a dictionary whose values are numbers or arrays of numbers. Its argumentphas the same format asself.fit.pmean(that is, either a number, an array, or a dictionary). Omittingf(or setting it toNone) implies that only the PDF is integrated. - pdf (callable) – Probability density function
pdf(p)of the fit parameters to use in place of the normal PDF associated with the least-squares fit used to create the integrator. The PDF need not be normalized; vegas will normalize it. Ignored ifpdf=None(the default). - adapt_to_pdf (bool) –
vegasadapts to the PDF ifTrue(default); otherwise it adapts tof(p)times the PDF.
All other keyword arguments are passed on to a
vegasintegrator; see thevegasdocumentation for further information.The results returned are similar to what
vegasreturns but with an extra attribute:results.norm, which contains thevegasestimate for the norm of the PDF. This should equal 1 within errors if the PDF is normalized (and so can serve as a check on the integration in those cases).- f (callable) – Function
- fit – Fit from
A class that describes the Bayesian probability distribution associated with a fit is:
-
class
lsqfit.BayesPDF(fit, svdcut=1e-15)¶ Bayesian probability density function corresponding to
nonlinear_fitfit.The probability density function is the exponential of
-1/2times thechi**2function (data and priors) used infitdivided bynorm.Parameters: - fit – Fit from
nonlinear_fit. - svdcut (non-negative float or None) – If not
None, replace covariance matrix ofgwith a new matrix whose small eigenvalues are modified: eigenvalues smaller thansvdcuttimes the maximum eigenvalueeig_maxare replaced bysvdcut*eig_max. This can ameliorate problems caused by roundoff errors when inverting the covariance matrix. It increases the uncertainty associated with the modified eigenvalues and so is conservative. Settingsvdcut=Noneorsvdcut=0leaves the covariance matrix unchanged. Default is1e-15.
-
__call__(p)¶ Probability density function evaluated at
p.
-
logpdf(p)¶ Logarithm of the probability density function evaluated at
p.
- fit – Fit from
lsqfit.MultiFitter Classes¶
lsqfit.MultiFitter provides a framework for fitting multiple pieces
of data using a set of custom-designed models, derived from
lsqfit.MultiFitterModel, each of which encapsulates a particular fit
function. This framework was developed to support the corrfitter
module, but is more general. Instances of model classes associate specific
subsets of the fit data with specific subsets of the fit parameters. This
allows fit problems to be broken down down into more manageable pieces, which
are then aggregated by lsqfit.MultiFitter into a single fit.
A trivial example of a model would be one that encapsulates a linear fit function:
import numpy as np
import lsqfit
class Linear(lsqfit.MultiFitterModel):
def __init__(self, datatag, x, intercept, slope):
super(Linear, self).__init__(datatag)
# the independent variable
self.x = np.array(x)
# keys used to find the intercept and slope in a parameter dictionary
self.intercept = intercept
self.slope = slope
def fitfcn(self, p):
if self.slope in p:
return p[self.intercept] + p[self.slope] * self.x
else:
# slope parameter marginalized
return len(self.x) * [p[self.intercept]]
def buildprior(self, prior, mopt=None, extend=False):
" Extract the model's parameters from prior. "
newprior = {}
newprior[self.intercept] = prior[self.intercept]
if mopt is None:
# slope parameter marginalized if mopt is not None
newprior[self.slope] = prior[self.slope]
return newprior
def builddata(self, data):
" Extract the model's fit data from data. "
return data[self.datatag]
Imagine four sets of data, each corresponding to x=1,2,3,4, all of which
have the same intercept but different slopes:
data = gv.gvar(dict(
d1=['1.154(10)', '2.107(16)', '3.042(22)', '3.978(29)'],
d2=['0.692(10)', '1.196(16)', '1.657(22)', '2.189(29)'],
d3=['0.107(10)', '0.030(16)', '-0.027(22)', '-0.149(29)'],
d4=['0.002(10)', '-0.197(16)', '-0.382(22)', '-0.627(29)'],
))
To find the common intercept, we define a model for each set of data:
models = [
Linear('d1', x=[1,2,3,4], intercept='a', slope='s1'),
Linear('d2', x=[1,2,3,4], intercept='a', slope='s2'),
Linear('d3', x=[1,2,3,4], intercept='a', slope='s3'),
Linear('d4', x=[1,2,3,4], intercept='a', slope='s4'),
]
This says that data['d3'], for example, should be fit with function
p['a'] + p['s3'] * np.array([1,2,3,4]) where p is a dictionary of fit
parameters. The models here all share the same intercept, but have different
slopes. Assume that we know a priori that the intercept and slopes are all
order one:
prior = gv.gvar(dict(a='0(1)', s1='0(1)', s2='0(1)', s3='0(1)', s4='0(1)'))
Then we can fit all the data to determine the intercept:
fitter = lsqfit.MultiFitter(models=models)
fit = fitter.lsqfit(data=data, prior=prior)
print(fit)
print('intercept =', fit.p['a'])
The output from this code is:
Least Square Fit:
chi2/dof [dof] = 0.49 [16] Q = 0.95 logGBF = 18.793
Parameters:
a 0.2012 (78) [ 0.0 (1.0) ]
s1 0.9485 (53) [ 0.0 (1.0) ]
s2 0.4927 (53) [ 0.0 (1.0) ]
s3 -0.0847 (53) [ 0.0 (1.0) ]
s4 -0.2001 (53) [ 0.0 (1.0) ]
Settings:
svdcut/n = 1e-12/0 tol = (1e-08*,1e-10,1e-10) (itns/time = 5/0.0)
intercept = 0.2012(78)
Model class Linear is configured to allow
marginalization of the slope parameter, if desired. Calling
fitter.lsqfit(data=data, prior=prior, mopt=True) moves the slope
parameters into the data (by subtracting m.x * prior[m.slope]
from the data for each model m), and does a single-parameter fit for the
intercept:
Least Square Fit:
chi2/dof [dof] = 0.49 [16] Q = 0.95 logGBF = 18.793
Parameters:
a 0.2012 (78) [ 0.0 (1.0) ]
Settings:
svdcut/n = 1e-12/0 tol = (1e-08*,1e-10,1e-10) (itns/time = 4/0.0)
intercept = 0.2012(78)
Marginalization can be useful when fitting large data sets since it reduces the number of fit parameters and simplifies the fit.
Another variation is to replace the simultaneous fit of the four models by a chained fit, where one model is fit at a time and its results are fed into the next fit through that fit’s prior. Replacing the fit code by
fitter = lsqfit.MultiFitter(models=models)
fit = fitter.chained_lsqfit(data=data, prior=prior)
print(fit.formatall())
print('slope =', fit.p['a'])
gives the following output:
========== d1
Least Square Fit:
chi2/dof [dof] = 0.32 [4] Q = 0.86 logGBF = 2.0969
Parameters:
a 0.213 (16) [ 0.0 (1.0) ]
s1 0.9432 (82) [ 0.0 (1.0) ]
Settings:
svdcut/n = 1e-12/0 tol = (1e-08*,1e-10,1e-10) (itns/time = 5/0.0)
========== d2
Least Square Fit:
chi2/dof [dof] = 0.58 [4] Q = 0.67 logGBF = 5.3792
Parameters:
a 0.206 (11) [ 0.213 (16) ]
s2 0.4904 (64) [ 0.0 (1.0) ]
s1 0.9462 (64) [ 0.9432 (82) ]
Settings:
svdcut/n = 1e-12/0 tol = (1e-08*,1e-10,1e-10) (itns/time = 5/0.0)
========== d3
Least Square Fit:
chi2/dof [dof] = 0.66 [4] Q = 0.62 logGBF = 5.3767
Parameters:
a 0.1995 (90) [ 0.206 (11) ]
s3 -0.0840 (57) [ 0.0 (1.0) ]
s1 0.9493 (57) [ 0.9462 (64) ]
s2 0.4934 (57) [ 0.4904 (64) ]
Settings:
svdcut/n = 1e-12/0 tol = (1e-08*,1e-10,1e-10) (itns/time = 4/0.0)
========== d4
Least Square Fit:
chi2/dof [dof] = 0.41 [4] Q = 0.81 logGBF = 5.9402
Parameters:
a 0.2012 (78) [ 0.1995 (90) ]
s4 -0.2001 (53) [ 0.0 (1.0) ]
s1 0.9485 (53) [ 0.9493 (57) ]
s2 0.4927 (53) [ 0.4934 (57) ]
s3 -0.0847 (53) [ -0.0840 (57) ]
Settings:
svdcut/n = 1e-12/0 tol = (1e-08*,1e-10,1e-10) (itns/time = 4/0.0)
intercept = 0.2012(78)
Note how the value for s1 improves with each fit despite the fact that
it appears only in the first fit function. This happens because its value
is correlated with that of the intercept a, which appears in every fit
function.
Chained fits are most useful
with very large data sets when it is possible to break the data into
smaller, more manageable chunks. There are a variety of options for
organizing the chain of fits; these are discussed in the
MultiFitter.chained_lsqfit() documentation.
-
class
lsqfit.MultiFitter(models, mopt=None, ratio=False, fast=True, extend=False, **fitterargs)¶ Nonlinear least-squares fitter for a collection of models.
Parameters: - models – List of models, derived from
modelfitter.MultiFitterModel, to be fit to the data. Individual models in the list can be replaced by lists of models or tuples of models; see below. - mopt (object) – Marginalization options. If not
None, marginalization is used to reduce the number of fit parameters. Objectmoptis passed to the models when constructing the prior for a fit; it typically indicates the degree of marginalization (in a model-dependent fashion). Settingmopt=Noneimplies no marginalization. - ratio (bool) – If
True, implement marginalization using ratios:data_marg = data * fitfcn(prior_marg) / fitfcn(prior). IfFalse(default), implement using differences:data_marg = data + (fitfcn(prior_marg) - fitfcn(prior)). - fast (bool) – Setting
fast=True(default) strips any variable not required by the fit from the prior. This speeds fits but loses information about correlations between variables in the fit and those that are not. The information can be restored usinglsqfit.wavgafter the fit. - extend (bool) – If
Truesupports log-normal and other non-Gaussian priors. Seelsqfitdocumentation for details. Default isFalse. - fitname (callable or
None) – Individual fits in a chained fit are assigned default names, constructed from the datatags of the corresponding models, for access and reporting. These names get unwieldy when lots of models are involved. Whenfitnameis notNone(default), each default namednameis replaced byfitname(dname). - wavg_svdcut (float) – SVD cut used for the weighted averages that
combine results from parallel sub-fits in a chained fit (see
MultiFitter.chained_lsqfit()). Default value isNonewhich sets the SVD cut equal to 10x the SVD cut used for other fits (specified by keywordsvdcut). Weighted averages often need larger SVD cuts than the other fits. - fitterargs – Additional arguments for the
lsqfit.nonlinear_fitobject used to do the fits. These can includetol,maxit,svdcut,fitter, etc., as needed.
-
lsqfit(data=None, prior=None, pdata=None, p0=None, **kargs)¶ Compute least-squares fit of models to data.
MultiFitter.lsqfit()fits all of the models together, in a single fit. It returns thelsqfit.nonlinear_fitobject from the fit.To see plots of the fit data divided by the fit function with the best-fit parameters use
fit.show_plots()Plotting requires module
matplotlib.Parameters: - data – Input data. One of
dataorpdatamust be specified but not both.pdatais obtained fromdataby collecting the output fromm.builddata(data)for each modelmand storing it in a dictionary with keym.datatag. - pdata – Input data that has been processed by the
models using
MultiFitter.process_data()orMultiFitter.process_dataset(). One ofdataorpdatamust be specified but not both. - prior – Bayesian prior for fit parameters used by the models.
- p0 – Dictionary , indexed by parameter labels, containing
initial values for the parameters in the fit. Setting
p0=Noneimplies that initial values are extracted from the prior. Settingp0="filename"causes the fitter to look in the file with name"filename"for initial values and to write out best-fit parameter values after the fit (for the next call toself.lsqfit()). - wavg_svdcut (float) – SVD cut used in weighted averages for parallel fits.
- kargs – Arguments that override parameters specified when
the
MultiFitterwas created. Can also include additional arguments to be passed through to thelsqfitfitter.
- data – Input data. One of
-
chained_lsqfit(data=None, pdata=None, prior=None, p0=None, **kargs)¶ Compute chained least-squares fit of models to data.
In a chained fit to models
[s1, s2, ...], the models are fit one at a time, with the fit output from one being fed into the prior for the next. This can be much faster than fitting the models together, simultaneously. The final result comes from the last fit in the chain, and includes parameters from all of the models.The most general chain has the structure
[s1, s2, s3 ...]where eachsnis one of:- a model (derived from
multifitter.MultiFitterModel); - a tuple
(m1, m2, m3)of models, to be fit together in - a single fit (i.e., simultaneously);
- a tuple
- a list
[p1, p2, p3 ...]where eachpnis either - a model or a tuple of models (see #2). The
pnare fit separately, and independently of each other (i.e., in parallel). Results from the separate fits are averaged at the end to provide a single composite result for the collection of fits.
- a list
The final result
fitreturned byMultiFitter.chained_fit()has an extra attributefit.chained_fitswhich is an ordered dictionary containing fit results from each linksnin the chain, and keyed by the models’datatags. If any of these involves parallel fits (see #3 above), it will have an extra attributefit.chained_fits[fittag].sub_fitsthat contains results from the separate parallel fits. To list results from all the chained and parallel fits, useprint(fit.formatall())
To see plots of the fit data divided by the fit function with the best-fit parameters use
fit.show_plots()Plotting requires module
matplotlib.Parameters: - data – Input data. One of
dataorpdatamust be specified but not both.pdatais obtained fromdataby collecting the output fromm.builddata(data)for each modelmand storing it in a dictionary with keym.datatag. - pdata – Input data that has been processed by the
models using
MultiFitter.process_data()orMultiFitter.process_dataset(). One ofdataorpdatamust be specified but not both. - prior – Bayesian prior for fit parameters used by the models.
- p0 – Dictionary , indexed by parameter labels, containing
initial values for the parameters in the fit. Setting
p0=Noneimplies that initial values are extracted from the prior. Settingp0="filename"causes the fitter to look in the file with name"filename"for initial values and to write out best-fit parameter values after the fit (for the next call toself.lsqfit()). - kargs – Arguments that override parameters specified when
the
MultiFitterwas created. Can also include additional arguments to be passed through to thelsqfitfitter.
- a model (derived from
-
static
process_data(data, models)¶ Convert
datato processed data usingmodels.Data from dictionary
datais processed by each model in listmodels, and the results collected into a new dictionarypdatafor use inMultiFitter.lsqfit()andMultiFitter.chained_lsqft().
-
static
process_dataset(dataset, models, **kargs)¶ Convert
datasetto processed data usingmodels.gvar.dataset.Dataset(or similar dictionary) objectdatasetis processed by each model in listmodels, and the results collected into a new dictionarypdatafor use inMultiFitter.lsqfit()andMultiFitter.chained_lsqft(). Assumes that the models have defined methodMultiFitterModel.builddataset(). Keyword argumentskargsare passed on togvar.dataset.avg_data()when averaging the data.
-
static
show_plots(fitdata, fitval, x=None, save=False, view='ratio')¶ Show plots of
fitdata[k]/fitval[k]for each keykinfitval.Assumes
matplotlibis installed (to make the plots). Plots are shown for one correlator at a time. Press keynto see the next correlator; press keypto see the previous one; press keyqto quit the plot and return control to the calling program; press a digit to go directly to one of the first ten plots. Zoom, pan and save using the window controls.There are several different views available for each plot, specified by parameter
view:view='ratio': Data divided by fit (default).view='diff': Data minus fit, divided by data’s standard deviation.view='std': Data and fit.view='log':'std'with log scale on the vertical axis.view='loglog': ‘std’` with log scale on both axes.Press key
vto cycle through these views; or press keysr,d, orlfor the'ratio','diff', or'log'views, respectively.Copies of the plots that are viewed can be saved by setting parameter
save=fmtwherefmtis a string used to create file names: the file name for the plot corresponding to keykisfmt.format(k). It is important that the filename end with a suffix indicating the type of plot file desired: e.g.,fmt='plot-{}.pdf'.
-
static
flatten_models(models)¶ Create 1d-array containing all disctinct models from
models.
- models – List of models, derived from
lsqfit.MultiFitter models are derived from the following
class. Methods buildprior, builddata, fitfcn, and
builddataset are not implemented in this base
class. They need to be overwritten by the derived class (except
for builddataset which is optional).
-
class
lsqfit.MultiFitterModel(datatag, ncg=1)¶ Base class for MultiFitter models.
Derived classes must define methods
fitfcn,buildprior, andbuilddata, all of which are described below. In addition they have attributes:-
datatag¶ lsqfit.MultiFitterbuilds fit data for the correlator by extracting the data labelled bydatatag(eg, a string) from an input data set (eg, a dictionary). This label is stored in theMultiFitterModeland must be passed to its constructor. It must be a hashable quantity, like a string or number or tuple of strings and numbers.
-
ncg¶ When
ncg>1, fit data and functions are coarse-grained by breaking them up into bins of ofncgvalues and replacing each bin by its average. This can increase the fitting speed, because their is less data, without much loss of precision if the data elements within a bin are highly correlated.
Parameters: - datatag – Label used to identify model’s data.
- ncg (int) – Size of bins for coarse graining (default is
ncg=1).
-
buildprior(prior, mopt=None, extend=False)¶ Extract fit prior from
prior.Returns a dictionary containing the part of dictionary
priorthat is relevant to this model’s fit. The code could be as simple as collecting the appropriate pieces: e.g.,def buildprior(self, prior, mopt=None, extend=False): mprior = gv.BufferDict() model_keys = [...] for k in model_keys: mprior[k] = prior[k] return mprior
where
model_keysis a list of keys corresponding to the model’s parameters. Supporting theextendoption requires a slight modification: e.g.,def buildprior(self, prior, mopt=None, extend=False): mprior = gv.BufferDict() model_keys = [...] for k in self.get_prior_keys(prior, model_keys, extend): mprior[k] = prior[k] return mprior
Marginalization involves omitting some of the fit parameters from the model’s prior.
mopt=Noneimplies no marginalization. Otherwisemoptwill typically contain information about what and how much to marginalize.Parameters: - prior – Dictionary containing a priori estimates of all fit parameters.
- mopt (object) – Marginalization options. Ignore if
None. Otherwise marginalize fit parameters as specified bymopt.moptcan be any type of Python object; it is used only inbuildpriorand is passed through to it unchanged. - extend (bool) – If
Truesupports log-normal and other non-Gaussian priors. Seelsqfitdocumentation for details.
-
builddata(data)¶ Extract fit data corresponding to this model from data set
data.The fit data is returned in a 1-dimensional array; the fitfcn must return arrays of the same length.
Parameters: data – Data set containing the fit data for all models. This is typically a dictionary, whose keys are the datatags of the models.
-
fitfcn(p)¶ Compute fit function fit for parameters
p.Results are returned in a 1-dimensional array the same length as (and corresponding to) the fit data returned by
self.builddata(data).If marginalization is supported,
fitfcnmust work with or without the marginalized parameters.Parameters: p – Dictionary of parameter values.
-
builddataset(dataset)¶ Extract fit dataset from
gvar.dataset.Datasetdataset.The code
import gvar as gv data = gv.dataset.avg_data(m.builddataset(dataset))
that builds data for model
mshould be functionally equivalent toimport gvar as gv data = m.builddata(gv.dataset.avg_data(dataset))
This method is optional. It is used only by
MultiFitter.process_dataset().Parameters: dataset – gvar.dataset.Dataset(or similar dictionary) dataset containing the fit data for all models. This is typically a dictionary, whose keys are thedatatags of the models.
-
static
get_prior_keys(prior, keys, extend=False)¶ Return list of keys in dictionary
priorfor keys in listkeys.List
keysis returned ifextend=False. Otherwise the keys returned may differ from those inkeys. For example, a prior that has a keylog(x)would return that key in place of a keyxin listkeys. This support non-Gaussian priors as discussed in thelsqfitdocumentation.
-
static
prior_key()¶ Find base key in
priorcorresponding tok.
-