Steiner, F., Hajagos, B.: Irresistance of the classical correlation coefficient

1
If the rate of correlation between the data sets $x_i$ and $y_i$ [...]

$$r_c = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2 \, \sum_{i=1}^{n} (y_i - \bar{y})^2}} \qquad (1)$$

($\bar{x}$ and $\bar{y}$ are the mean values of the data $x_i$ and $y_i$, respectively) [...]

## 2. The concepts of robustness and resistance

Some decades ago two new concepts appeared in modern statistics: robustness and resistance (see e.g. Huber 1981). A statistical method is robust if it can be applied with sufficiently high statistical efficiency over a not too narrow range of probability distribution types; the practitioner needs the method to be applicable to all types that can occur in his discipline. In the geosciences even the Cauchy type can occur, and Tarantola 1987 proposes the use of statistical algorithms that are able to handle the Cauchy type even if this distribution is not characteristic of the actual problem but it can be justifiably suspected that outliers occur in a non-negligible percentage. Our statistical methods must therefore also be applicable to data distributed according to the Cauchy type.

In the paper Steiner and Hajagos 2008 it is shown that Eq. 1 cannot be applied if the x [...]

The goal of the present article is, however, to investigate the resistance of Eq. 1, i.e., the resistance of the classical correlation coefficient. A statistical algorithm is resistant against outliers if its result is not (or not significantly) influenced by outliers, i.e., by such data [...]
On page 184 of the book Steiner (ed.) 1997 the figure shows that the most frequent value, denoted by M (as an estimate of location), using k=2 tolerates even J=10 such outliers: up to this limit the M-value is less than 0.02 (instead of the true value zero). The calculation method of the most frequent value can therefore justifiably be called a resistant statistical algorithm.

## 3. Distortion of the classical correlation coefficient caused by outliers

Let us assume the simplest case: the x [...] respectively; the x [...] If 99 (x [...] For the outlier-free case the f(r [...]

In the present paper the regularity of the distortions caused by the outliers shown in Fig. 1 is to be determined accurately enough; therefore, for each outlier, 99 Gaussian points were generated 101 times, 101 r [...]

## 4. A short note on the meaning of the word „outlier”

The expression „outlier” has a pejorative meaning if we have to characterise the whole set of data (or the whole set of data pairs, as in the cases discussed above): we have seen that the result can be misleading even in the presence of a single outlier. The outliers are, however, not always „wrong” data (e.g., erroneously typed ones); on the contrary, in some disciplines of science they contain the most valuable information. Fortunately, the following paper of J. Verő 2008 also gives concrete examples of this type. Consequently, there is every reason to hope that in this way the reader is correctly informed about the Janus-faced character of outliers.

A last remark: in some cases
the most frequent value M and the dihesion ε of the x [...] (2)

The expert of the discipline decides whether this way of selecting and evaluating the far-lying data is convenient for him or not.
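The numerical experiment of Section 3 (99 standard Gaussian value-pairs plus a single outlier with equal x and y, repeated 101 times) can be sketched in Python as follows. This is only a minimal sketch and not the authors' program: the function names, the use of NumPy, the fixed seed, and the choice of the median of the 101 values as the "typical" result are assumptions made for illustration (the text itself speaks of "most probable" values).

```python
import numpy as np


def classical_r(x, y):
    """Classical correlation coefficient of two equally long arrays (Eq. 1)."""
    dx = x - x.mean()
    dy = y - y.mean()
    return float(np.sum(dx * dy) / np.sqrt(np.sum(dx**2) * np.sum(dy**2)))


def typical_r_with_outlier(d, n_gauss=99, n_rep=101, seed=0):
    """Median of the classical r over n_rep repetitions.

    Each repetition draws n_gauss independent standard Gaussian pairs
    and appends one outlier at (d, d).
    Assumption: the median stands in for the 'most probable' value.
    """
    rng = np.random.default_rng(seed)
    r_values = []
    for _ in range(n_rep):
        x = np.append(rng.standard_normal(n_gauss), d)
        y = np.append(rng.standard_normal(n_gauss), d)
        r_values.append(classical_r(x, y))
    return float(np.median(r_values))


if __name__ == "__main__":
    # Outlier distances are illustrative; 5 and 10 are the cases quoted for Fig. 2.
    for d in (0.0, 5.0, 10.0):
        print(f"outlier at x=y={d}: typical r = {typical_r_with_outlier(d):.3f}")
```

Calling `typical_r_with_outlier(10.0)` and `typical_r_with_outlier(5.0)` allows a direct comparison with the values quoted in the caption of Fig. 2; a different seed or a different summary of the 101 values will shift the result slightly.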

Captions
Fig. 1. The points lie within the circle of radius 3σ with 99.7% probability if they represent the value-pairs of two independent standard Gaussian distributions. Consequently, the points marked with serial numbers are outliers in this case. (Near the origin two „inliers” are also given, without serial numbers.)

Fig. 2. The most probable $r_c$-values calculated according to the classical formula of the correlation coefficient (see Eq. 1) if 99 value-pairs are generated according to the standard Gaussian rule (see the circle in Fig. 1) and only a single outlying point is taken into consideration (characterised by the same value of x and y, see Fig. 1). Fig. 1 in Hajagos and Steiner 2008 corresponds to the case x=y=10, resulting in an $r_c$-value of 0.5; but even if the outlier is only half as far, i.e., x=y=5, $r_c$=0.2 holds instead of zero.
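The two values quoted for Fig. 2 can be made plausible by a rough estimate. The following is only a sketch under stated assumptions, not the authors' derivation: let the sample consist of $n-1$ independent standard Gaussian pairs and one outlier at $x = y = d$, and replace the Gaussian sums by their expected values ($\sum x_i^2 \approx n-1$, $\sum x_i y_i \approx 0$, $\sum x_i \approx 0$). The symbols $n$ and $d$ are introduced here for illustration.

```latex
\bar{x} = \bar{y} \approx \frac{d}{n}, \qquad
\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})
  = \sum_{i=1}^{n} x_i y_i - n\,\bar{x}\,\bar{y}
  \approx d^2 - \frac{d^2}{n} = \frac{(n-1)\,d^2}{n},

\sum_{i=1}^{n} (x_i - \bar{x})^2
  = \sum_{i=1}^{n} x_i^2 - n\,\bar{x}^2
  \approx (n-1) + d^2 - \frac{d^2}{n} = \frac{(n-1)(n + d^2)}{n},

r_c \approx \frac{(n-1)\,d^2 / n}{(n-1)(n + d^2) / n} = \frac{d^2}{n + d^2}.
```

With $n = 100$ this gives $r_c \approx 100/200 = 0.5$ for $d = 10$ and $r_c \approx 25/125 = 0.2$ for $d = 5$, in agreement with the values in the caption.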

## References
Andrews, D. F., Bickel, P. J., Hampel, F. R., Huber, P. J., Rogers, W. H., Tukey, J. W. 1972: