gnuplot - linear fit does not adjust b independently of a

Question

I am using the following gnuplot script to plot a linear fit:

#!/usr/bin/gnuplot
set term cairolatex
set output "linear_fit.tex"
c = 299792458.
x(x) = c / x
y(x) = x
h(x) = a * x + b
fit h(x) "linear_fit.dat" u (x($1)):(y($2)) via a,b
plot "linear_fit.dat" u (x($1)):(y($2)) w points title "", \
    (h(x)) with lines linecolor rgb "black" title "Linear Fit"

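However, after the iterations converge, b is always 1.0: https://dpaste.de/ozReq/

How do I get gnuplot to adjust b as well as a?

Update: Alternating/repeating the fit command with via a and via b a few hundred times does yield good results, but this cannot be the way it is supposed to be done.

Update 2: Here is the data in linear_fit.dat: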

# lambda, V
360e-9 1.119
360e-9 1.148
360e-9 1.145
400e-9 0.949
400e-9 0.993
400e-9 0.971
440e-9 0.883
440e-9 0.875
440e-9 0.863
490e-9 0.737
490e-9 0.728
490e-9 0.755
540e-9 0.575
540e-9 0.571
540e-9 0.592
590e-9 0.457
590e-9 0.455
590e-9 0.482

score 3 · Accepted Answer

I think your troubles stem from the fact that your x-values are very large (c/lambda ranges from roughly 5e14 to 8e14 for your data).

If you do not provide gnuplot with an initial guess for a and b, it will assume a=1 and b=1 as starting points for the fit. However, this is a poor initial guess:

[Figure: the data plotted together with h(x) for the default a=1, b=1]

Please note the log scale on both the x- and y-axis.
From the gnuplot documentation:

fit may, and often will get "lost" if started far from a solution, where SSR is large and changing slowly as the parameters are varied, or it may reach a numerically unstable region (e.g., too large a number causing a floating point overflow) which results in an "undefined value" message or gnuplot halting.

To improve the chances of finding the global optimum, you should set the starting values at least roughly in the vicinity of the solution, e.g., within an order of magnitude, if possible. The closer your starting values are to the solution, the less chance of stopping at another minimum. One way to find starting values is to plot data and the fitting function on the same graph and change parameter values and replot until reasonable similarity is reached. The same plot is also useful to check whether the fit stopped at a minimum with a poor fit.

In your case, such starting values could be:

a = 1e-15
b = -0.5

I obtained these values by eye-balling your range of values.
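The eye-balling step described in the documentation quote can be sketched in gnuplot as follows. This is only an illustration: it uses the starting values suggested above, and the second slope value is an arbitrary example of adjusting a guess, not a recommended number.

```gnuplot
# Sketch: compare a starting guess against the data before calling fit.
c = 299792458.
h(x) = a * x + b

# Starting values suggested in this answer.
a = 1e-15
b = -0.5
plot "linear_fit.dat" using (c/$1):2 with points title "data", \
     h(x) with lines title "starting guess"

# Change a parameter and redraw until the line roughly follows the points.
# (2e-15 is just an example of a second try.)
a = 2e-15
replot
```

Once the line passes roughly through the points, the current a and b can be used as starting values for fit.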
With those starting values, the linear fit results in:

Final set of parameters            Asymptotic Standard Error
=======================            ==========================

a               = 1.97355e-015     +/- 6.237e-017   (3.161%)
b               = -0.5             +/- 0.04153      (8.306%)

Which looks like this:

[Figure: the data with the fitted line]

You can play with the control setting of fit (such as setting FIT_LIMIT = 1.e-35) or the starting values to achieve a better fit than this.
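Since the suspected cause is the size of the x-values, another thing to try is rescaling them before the fit so that slope and intercept are both of order 1. The sketch below is not taken from the original script and has not been checked against this data set; the names scale and a_s are made up for illustration, and 1e14 is an assumed, convenient unit.

```gnuplot
# Sketch: fit with x in units of 1e14 Hz instead of Hz.
# `scale` and `a_s` are illustrative names, not part of the original script.
c = 299792458.
scale = 1e14
h(x) = a_s * x + b

# Rough starting guesses in the rescaled units.
a_s = 0.1
b   = -0.5
fit h(x) "linear_fit.dat" using (c/$1/scale):2 via a_s, b

# Convert the slope back to the original units (per Hz).
a = a_s / scale
print a, b
```

The intercept b is unaffected by the rescaling; only the slope has to be divided by scale afterwards.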

EDIT

While I still have not been able to coax gnuplot into modifying both parameters a, b at the same time, I found an alternate approach using R. I am aware that there are many other (scripting) languages that can perform a linear fit and this question was about gnuplot. However, the required effort with R appeared to be minimal.
Here's an example, which, when saved as linear_fit.R and called with

R CMD BATCH linear_fit.R

will provide the two coefficients of the linear fit that gnuplot failed to provide.

y <- c(1.119, 1.148, 1.145, 0.949, 0.993, 0.971, 0.883, 0.875, 0.863, 
       0.737, 0.728, 0.755, 0.575, 0.571, 0.592, 0.457, 0.455, 0.482)
x <- c(3.60E-007, 3.60E-007, 3.60E-007, 4.00E-007, 4.00E-007, 
       4.00E-007, 4.40E-007, 4.40E-007, 4.40E-007, 4.90E-007, 
       4.90E-007, 4.90E-007, 5.40E-007, 5.40E-007, 5.40E-007, 
       5.90E-007, 5.90E-007, 5.90E-007)
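# Speed of light in m/s (calls to c() above and below still resolve to the function).
c = 299792458.
# Convert wavelength to frequency.
x <- c/x
# Ordinary least-squares fit of y = a*x + b.
lm.out <- lm(y ~ x)
svg("linear_fit.svg")
plot(x,y) 
abline(lm.out,col="red")
# Close the SVG device so the file is written out completely.
dev.off()
summary(lm.out)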

You will end up with an SVG file that contains the plot and a linear_fit.Rout text file. In there you'll find the following coefficients:

Coefficients:
              Estimate Std. Error t value Pr(>|t|)    
(Intercept) -5.429e-01  4.012e-02  -13.53 3.55e-10 ***
x            2.037e-15  6.026e-17   33.80 2.61e-16 ***

So, in the terminology of the original question, we obtain:

a =  2.037e-15
b = -5.429e-01

These values are very close to the values you quoted from alternating the fit.
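For a purely linear model, the same kind of closed-form least-squares result may also be obtainable without leaving gnuplot. The sketch below assumes gnuplot 4.6 or later, where the stats command sets STATS_slope and STATS_intercept for two-column input; it was not part of the original answer, and whether it reproduces the R coefficients above for this data has not been verified here.

```gnuplot
# Sketch: let stats compute slope and intercept of the linear regression.
c = 299792458.
stats "linear_fit.dat" using (c/$1):2 nooutput

# stats stores the regression line y = slope*x + intercept in these variables.
a = STATS_slope
b = STATS_intercept
print sprintf("a = %.4g, b = %.4g", a, b)

plot "linear_fit.dat" using (c/$1):2 with points title "", \
     a*x + b with lines linecolor rgb "black" title "Linear Fit"
```

Unlike fit, stats does not iterate, so it needs no starting values and does not report asymptotic standard errors.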

In case the comments get purged, these questions were identified as related:

What is gnuplot's internal representation of floating point numbers?

Gnuplot behaves oddly in polynomial fit. Why is that?
