Regression when x and y each have uncertainties

I have a set of $N$ points $(x_i, y_i)$. Both $X$ and $Y$ carry some noise due to measurement inaccuracy; however, the underlying true values (i.e. what we would see if we could remove the noise) should satisfy $y = mx + c$, where $m$ and $c$ are constants.

However, due to measurement inaccuracies in both $X$ and $Y$, I will get uncertainties in both my $m$ and $c$ values.

1) If I assume that my measurement inaccuracies for both $X$ and $Y$ are Gaussian distributed, $\epsilon \sim N(0,\sigma)$, how do I obtain the most likely $(m, c)$ and the uncertainties/confidence in both?

2) If I instead know that the uncertainties differ between $X$ and $Y$, so that $\sigma_x \neq \sigma_y$ with $\epsilon_x \sim N(0,\sigma_x)$ and $\epsilon_y \sim N(0,\sigma_y)$, can I get a different estimate of $(m, c)$ and the uncertainties/confidence in both?


2 Answers


In both cases you want to use Deming regression. Case 1 is a special case of Deming regression called orthogonal regression, which minimizes the sum of squared perpendicular distances from the data points to the regression line. For case 2, the general case, you will need an estimate of the ratio $\delta = \sigma^2_y / \sigma^2_x$ for the problem to be solvable.
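As a sketch of what this looks like in practice (my own NumPy implementation of the standard closed-form Deming slope, not from the question or any particular library):

```python
import numpy as np

def deming(x, y, delta=1.0):
    """Deming regression of y on x.

    delta is the ratio of error variances sigma_y**2 / sigma_x**2;
    delta=1 gives orthogonal (total least squares) regression.
    Returns the slope m and intercept c of y = m*x + c.
    """
    x, y = np.asarray(x, float), np.asarray(y, float)
    sxx = np.var(x, ddof=1)
    syy = np.var(y, ddof=1)
    sxy = np.cov(x, y, ddof=1)[0, 1]
    # Closed-form slope from the Deming normal equations.
    m = (syy - delta * sxx
         + np.sqrt((syy - delta * sxx) ** 2 + 4 * delta * sxy ** 2)) / (2 * sxy)
    c = y.mean() - m * x.mean()
    return m, c
```

With noisy $x$, ordinary least squares would underestimate the slope slightly, while `deming` with the right `delta` recovers it; note the formula requires $s_{xy} \neq 0$.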

  • Interesting. Thanks for your response – I will look at the links and see if they answer my question. – piccolo 7 hours ago
  • Thanks. The wiki page doesn't mention anything about calculating the uncertainties in the estimates of $m$ and $c$. – piccolo 6 hours ago
  • I've heard this recommendation before, but this paper (Smith 2009) argues that error in X is a bad criterion for choosing RMA regression (which I think is the same as Deming regression). It argues instead that symmetry is a better criterion for whether to choose OLS or RMA, and proposes a few alternatives for dealing with the problem of error in X. – mkt 6 hours ago

As a general concept, the problem of error in X is called measurement error.

In linear regression analysis it causes attenuation bias, which is considered one of the sources of endogeneity. Measurement error shrinks the coefficient of the right-hand-side variable measured with error towards zero. The result is not mere uncertainty in the estimator, but inconsistency.

While the Deming regression mentioned in the other answer is a two-variable concept, multivariate solutions include the instrumental variables method as the preferred option.

Formulas for the attenuation bias in linear regression are derived precisely, for example here. This means that if you have a clue about the variance of the error, you can estimate the severity of the problem and possibly correct for it.
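A quick simulation (my own illustration, assuming NumPy) of the standard attenuation factor $\lambda = \sigma_x^2/(\sigma_x^2+\sigma_u^2)$: with equal signal and error variance, a true slope of 2 is estimated near $2\lambda = 1$, and dividing by $\lambda$ corrects it:

```python
import numpy as np

# True model: y = 2*x + 1, but we only observe x with additive noise u.
rng = np.random.default_rng(0)
n = 100_000
x_true = rng.normal(0.0, 1.0, n)   # sigma_x^2 = 1
u = rng.normal(0.0, 1.0, n)        # sigma_u^2 = 1
x_obs = x_true + u
y = 2.0 * x_true + 1.0

# OLS slope of y on the noisy regressor is attenuated towards zero.
m_naive = np.cov(x_obs, y, ddof=1)[0, 1] / np.var(x_obs, ddof=1)

# Reliability ratio lambda = sigma_x^2 / (sigma_x^2 + sigma_u^2) = 0.5,
# so m_naive converges to 2 * 0.5 = 1; dividing by lambda corrects it.
lam = 1.0 / (1.0 + 1.0)
m_corrected = m_naive / lam
```

The correction of course requires knowing (or estimating) the error variance $\sigma_u^2$, which is exactly the "clue about the variance of the error" mentioned above.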

Measurement error in $Y$ is less of a problem for linear regression, since the model already assumes random error in the dependent variable. It worsens prediction and inflates the residual variance, but it does not bias the coefficients in any way.

  • Thanks. How would you get an uncertainty in the estimates of $m$ and $c$ when you use Deming regression? – piccolo 6 hours ago
  • As for Deming regression, here are some nice notes which include variance formulas, if a variance or sd is the measure of uncertainty you are looking for: ncss-wpengine.netdna-ssl.com/wp-content/themes/ncss/pdf/… For a technical solution, I would suggest R: r-bloggers.com/deming-and-passing-bablok-regression-in-r – cure 6 hours ago
  • The last link also suggests the Passing–Bablok method. I did not mention it in my answer, but I guess it would be an interesting area of investigation here. – cure 6 hours ago
  • A word of warning about Deming regression: confidence intervals for the slope parameter are not guaranteed to be bounded, i.e. they can include $\pm\infty$. This can occur when the signal-to-noise ratio in the data is too low. – Estacionario 5 hours ago

