(Irrelevant?) Correlations...

Software discovers correlations between unrelated events. [http://www.popsci.com/article/science/algorithm-reveals-link-between-sour-cream-and-traffic-accidents?dom=PSC&loc=recent&lnk=4&con=algorithm-reveals-link-between-sour-cream-and-traffic-accidents, http://www.tylervigen.com/]

1st example: Divorces in Maine - Margarine consumption


Correlation: 0.992558

2nd example: Number of people dying from being tangled in their bedsheets - Ski facility revenues


Correlation: 0.969724

3rd example: Imports of oil from Norway - Drivers who died in train collisions


Correlation: 0.954509

4th example: Number of copyrighted art works - Women in New York who died because they slipped

But if there is a correlation why are they… unrelated?

Even after all these years of “progress”, the best way to “see” something is still just to… choose to see it.
Without any reason. Or maybe there is a reason? Let me check the imports of Norwegian oil…

What really makes us see correlations in some places and not in others, EVEN THOUGH the data say so?

Our beliefs seem to drive our knowledge…

First, an aside: Who used the word “unrelated”? Neither of the links you posted seems to contain the word (they use words like “spurious” and “unexpected”).

I do think there’s a sense in which these events are “unrelated”. It’s possible to have events that are unrelated in a particular sense or relatively unrelated, and it seems useful and meaningful to make that distinction. For instance, all humans are in some sense family, but the word is still used to talk about those humans who are related to you. Even more broadly, all animals are family, and indeed it is likely that all matter emerged from the same singularity. And yet it still seems a worthwhile project to divide the world along any number of lines to make sense of our experiences.

These events are unrelated in the sense that they are not causally related (neither causes the other), and the common cause that creates the correlation is quite abstracted, and has little to do with the particulars of the things themselves and more to do with some common feature that is likely shared by a large set of other phenomena.

For instance, divorce and margarine consumption are likely correlated because both are affected by the economy (a downturn puts stress on marriages and also pushes people towards a cheaper butter alternative). They are “related” in the sense that they are both influenced by the general affluence of the population, but the relation is attenuated and non-obvious, making it meaningful to say that the two concepts are in most senses “unrelated”.

Note the following propositions:

1. Everything is a plenum.
2. All things within a plenum are related by inclusion in that plenum.
3. Some things are related by sharing a commonality.
4. There are no things which do not share a commonality within a given space and time.
5. Some things in different spatio-temporal contexts may be related, with no commonality.
6. Relevance is either necessary or contingent.
7. Contingent correlations may be relevant or irrelevant.
8. Necessary correlations are always relevant.
9. Necessary relevance is based on logical correlations.
10. Contingent relevance is based on context, perspective, and definition.
11. Necessary correlations are always defined such that, if not for that relation, the definition would not and could not apply, hence it would become irrelevant.

Conclusion: meaning, relevance, and co-relation are defined by boundary situations (contexts), within which proximity (spatial-temporal coincidence), prescription (formally arranged commonality), and postscription (a potential relevance of all points within that boundary) are signified.

Indeed, this is central to Bayesian probability calculations and an important point to make.

Let’s use an example directly from your OP:

Let’s say I have a belief that Divorces in Maine have a near-direct causal relationship with US Per Capita Margarine Consumption.

Now, without looking closely at the details of my model of the world and why I think those are intricately causally linked, we can just say ‘That belief is probably untrue’. We might even give a probability that it’s true – let’s say it has a probability of .00001 (1/100000).

At this point, we should specify (in this case, make up) a couple of other probabilities that you’ll see the utility of soon. We should specify the probability that, IF my belief is true, we will see a strong correlation between those two variables. Let’s just say that’s something like .9 – ie it’s very likely that, if my belief is true, we will see a strong correlation.

We should specify the probability that, IF my belief is false, we will see a strong correlation between those two variables. Let’s just say that’s something like .001 – very unlikely to see a correlation between the two if the belief is false.

Now, someone produces a graph and does a statistical analysis and shows that, lo and behold, over the last 9 years there has been a strong correlation. Surely that proves my belief right!! Wrong. Let’s do the bayesian calculation.

A stands for ‘my belief is true’, ~A stands for ‘my belief is false’
B stands for ‘there is a strong correlation’ and ~B stands for ‘there is not a strong correlation’.

Bayes’ formula is P(A|B) = (P(B|A) * P(A)) / (P(B|A) * P(A) + P(B|~A) * P(~A))

P(A|B) literally means ‘the probability that my belief is true given that there is a strong correlation’, which is exactly what we’re trying to find – we’ve seen that there is a correlation, so we want to know if that proves that my belief is true.

P(B|A) means the probability that there would be a correlation if my belief were true – we specified that above as .9
P(A) means the probability that my belief is true – we specified that above as .00001

P(B|~A) means the probability that there would be a correlation if my belief were false – we specified that above as .001.
P(~A) is just 1 - P(A) which is 1-.00001, which is .99999

put it all together, we get (.9 * .00001) / (.9 * .00001 + .001 * .99999)
which, according to google, is .0089…

So slightly less than 1%

Which means that the probability that my belief is true started out at .00001, and after we saw the evidence of a strong correlation, it rose to just under .01. But even with that evidence, because of the immense unlikelihood that it was true in the first place, it’s still more likely that the correlation was due to random coincidence than that it was due to my belief being true.
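The arithmetic above can be sketched in a few lines of Python (the numbers are the made-up ones from this thread, not real data):

```python
# Hedged sketch of the calculation above, with this thread's made-up
# numbers: prior = .00001, P(correlation | belief true) = .9,
# P(correlation | belief false) = .001.

def posterior(prior, p_b_given_a, p_b_given_not_a):
    """P(A|B) via Bayes' theorem for a binary hypothesis A."""
    numerator = p_b_given_a * prior
    denominator = numerator + p_b_given_not_a * (1 - prior)
    return numerator / denominator

p = posterior(prior=0.00001, p_b_given_a=0.9, p_b_given_not_a=0.001)
print(round(p, 4))  # roughly 0.0089 -- just under 1%
```

So the posterior is about 900 times the prior, yet still far below even-odds, which is the whole point of the example.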

Lesson: If you start out with an extremely unlikely statement, even seemingly strong evidence for that statement can still leave it very unlikely.

And so, when we’re analyzing a statement that we very strongly disbelieve, even if we find evidence for it, that doesn’t mean we should throw away our disbelief. That means we should consider the relative strength of the evidence, see how that affects the probabilities, and see where that statement stands afterward.

So yes, our beliefs do drive our knowledge – and necessarily so. Your prior beliefs form your calculations as to how you should update your beliefs in Bayesianism, and that’s frankly correct and unavoidable. Obviously the worse your prior beliefs are, the more evidence you’ll need to come to the correct beliefs, but if you start out with reasonable priors you’ll be alright. :wink:

The problem with using probabilities is that you have to have a given probability before you start. Your opinion, in this case, is biased toward saying that the unexamined probability is “0.00001”. But where did that come from but a mind already set against the idea?

If you were asked, “What is the probability that there is a God?”, you would begin that same assessment process by declaring that the unexamined initial probability is extremely low, perhaps another “0.00001” - unless you already believed in God, in which case you would give it a much higher probability, perhaps “0.5” (50/50).

Doesn’t it seem reasonable that if you are to assume no knowledge concerning something, in order to examine all of the evidence, you should begin with a 50/50 chance of the idea being true or false?

I agree. There is a correlation between definition (meaning), knowledge, and probability. The only real uncertainty lies in what drives this train. Probability is a function of variables which may be various mixes of facts and usage, whereby knowledge may be a derivative of that, or may be derived from it. Meaning also may play a determining, driving role, or be driven by such knowledge.

It makes a difference where the primary intent is coming from, in order to ascertain what degree of relevance or irrelevance plays a part, and what that is. (For example, the gradual loss of meaning through disuse of a language may imply a dissociation between the above-noted elements, and consequently a shift in focus or relevance.)

Undue relevance may break up the continuity (associability) of the meaning as a consequence, and change the probability table, resulting from a lack of specific and viable correlation.

For example, in a case where a girl wears a certain kind of hair band to attract the attention of a certain boy in class, not knowing that in that newly formed class (let’s say) all the girls have the same or similar idea, toward other boys or even the same boy, her chances of getting the attention of that boy change, by virtue of the diminished chance of that boy noticing her.

Conversely, undue emphasis on that particular feature (a surplus value) may change the attitude of a boy who doesn’t happen to value such trifles. In this example, it seems, focusing on knowledge of the relationship between probable use and attitudes regarding the wearing of trinkets, as a general proposition, would tend to offer the best hope of correlating the boy with the girl, thereby establishing a relevance (their relationship). On this ground, common sense would tend to minimize irrelevance and the consequent dissociation.

Can common sense be understood as a matrix of possible and probable associations?

No. Occam’s Razor is a good reason not to do this: the more complex the thing being suggested, the less likely it is a priori, and the more evidence it needs before a sane mind is willing to accept it exists.

Take the most complex thing I know of: a human, brain included. If I were just a soul on the earth for the first time, but I somehow knew English, and out of a straw hat of 3^^3^^3 random concepts (some of which may exist, most of which don’t) I pulled out a card that described in full detail a human being, I would have no reason to think it exists on this planet where I find myself. It’s a very, very complex thing - why would I think it’s even 50/50 that it exists? I would think it very, very unlikely that it exists on this planet, and then when I see a human, or see evidence of humans, at that point the probability starts rising. But no, it doesn’t start at 50/50. That’s absurd, and it’s a complete mathematical absurdity, actually.

To show why it’s a mathematical absurdity to say that every statement should start out with a probability of 50/50, take the statements ‘a’, ‘b’, and ‘c’. If every statement has a 50/50 probability, then
a has a likelihood of .5
b has a likelihood of .5
c has a likelihood of .5

a and b has a likelihood of .5
a and c has a likelihood of .5
b and c has a likelihood of .5

a and b and c has a likelihood of .5

a and b and ~c has a likelihood of .5

And you may not have seen it, but that’s not mathematically possible already.

if (a and b) has a likelihood of .5, and (a and b and c) has a likelihood of .5, then there’s only 1 possible situation to allow that:
when a and b are true, c has a likelihood of 1.

but then that would contradict “a and b and ~c has a likelihood of .5”

So, the statement you made that all statements should start out a priori with a 50/50 probability is actually provably disallowed by basic probability theory. Some statements by necessity must start out with less or more probability, because the relationships between statements in probability theory make certain sets of assignments impossible (such as the one I described above).
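The contradiction comes down to one line of arithmetic: ‘a and b and c’ and ‘a and b and ~c’ together partition ‘a and b’, so their probabilities must sum to P(a and b). A minimal check, using the claimed 50/50 values:

```python
# Sketch of the contradiction: (a and b and c) and (a and b and not-c)
# partition (a and b), so by the law of total probability their
# probabilities must sum to P(a and b).
p_ab      = 0.5  # claimed probability of (a and b)
p_abc     = 0.5  # claimed probability of (a and b and c)
p_ab_notc = 0.5  # claimed probability of (a and b and not-c)

# Required: P(a and b) == P(a and b and c) + P(a and b and not-c)
print(p_abc + p_ab_notc == p_ab)  # False: 1.0 != 0.5
```

No assignment of 0.5 to all three compound statements can satisfy the partition identity, which is the contradiction in the post above, made explicit.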

Occam’s Razor doesn’t apply. One has to not only examine the item before applying it (invalidating the exercise), but also has to have an equally probable alternative. Occam’s Razor requires two theories from which to choose. It has nothing to do with any one theory being possibly true or false.

The margarine to divorce association might be very simple, or it might be very complex, or perhaps not really exist at all except by coincidence.

Emmm… ??? … noooo.

If A has 0.5 and B has 0.5, “A and B” has 0.25.
And “A and B and C” has 0.125.

Ah, so you agree that it’s impossible that every statement starts with a probability of .5. Good start.

And interestingly, that ties in to Occam’s Razor. The idea is this: if you have a probability for the statement A and a probability for the statement B, then the statement ‘A and B’ must be AT MOST as probable as the less probable of the two (and in the vast majority of cases less probable than either). That is the beginning of understanding Occam’s Razor. ‘A and B’ is a strictly more complex proposition than ‘A’, and also more complex than ‘B’, and that’s why ‘A and B’ is always at most as likely as the less likely of A and B, and almost always less likely even than that.
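That conjunction rule holds for any joint distribution whatsoever, which a quick brute-force check can illustrate (a sketch, not a proof; the joint distributions here are randomly generated):

```python
# Numerical check of the conjunction rule: for any joint distribution
# over two events A and B, P(A and B) never exceeds min(P(A), P(B)).
import random

random.seed(0)
for _ in range(1000):
    # Random joint distribution over the four outcomes
    # (A,B), (A,~B), (~A,B), (~A,~B), normalized to sum to 1.
    w = [random.random() for _ in range(4)]
    total = sum(w)
    p_ab, p_a_nb, p_na_b, p_na_nb = (x / total for x in w)
    p_a = p_ab + p_a_nb  # marginal probability of A
    p_b = p_ab + p_na_b  # marginal probability of B
    assert p_ab <= min(p_a, p_b) + 1e-12
print("conjunction rule held in all 1000 trials")
```

The inequality follows directly from the marginals being the conjunction probability plus a non-negative term, which is why the brute-force check can never find a counterexample.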

Flannel, isn’t it the truth value of a proposition, rather than the proposition itself, that starts with a .5 chance (of relevance)? A proposition either is or is not applicable or relevant as correlated, but relevance can only be established on the basis of such co-relation. When two things are related, the chances are .5 that they are. However, two things can be said to be relevant only if they pass that test.

Relevance, or the lack of it, can be said to occur only if two propositions are co-related. However, even with two co-related propositions, there may be no basis to speak of a probability for relevance.
Relevance has to be based on other factors than co-relation.

‘relevance’ and ‘co-relation’ are not mutually inclusive propositions. Therefore, although the razor can apply in one sense, it cannot in another.

That didn’t make sense to me, sorry.

Simply pointing out that relevance and correlation are propositionally not on the same plane. It is due to the double meaning of ‘relevance’. One is relate-ability; the other is the degree of truth in the effect such co-relation will have.

Two propositions may be correlates, while at the same time not relevant in the sense of worthy of such co-relation. There may be no value in co-relating two propositions, because essential relevance may not be possible to demonstrate.

Probability does apply co-relationally on a 50-50 probability scale; however, the probable value in doing so may not. I think it is the latter which you signify. I may be wrong.
Propositional form and content are usually not dissociated, even in common-sense language.

You’re stringing words together in a way that I find alien and impossible to understand. The only thing I got in that post was your middle paragraph (and even that was worded in a very alien way to me). In that middle paragraph I think you’re saying just the obvious thing: statistical correlation can happen even when two things don’t have any causal relationship between each other. Apart from that middle paragraph, it’s all Chinese to me.

Usually in these cases, there is a 3rd extraneous variable accounting for both. I was given an example of this back in my school days: ice cream sales in the US are positively correlated with deaths in India. The connection? Summer heat.
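A toy simulation can illustrate the common-cause pattern (all numbers here are invented for illustration, and the variable names are just placeholders): a hidden “heat” variable drives two otherwise unconnected series, and a strong correlation appears between them.

```python
# Illustrative simulation of a confounding variable: summer heat drives
# both series, so they correlate despite no direct causal link.
import math
import random

random.seed(42)
heat = [random.uniform(0, 35) for _ in range(120)]  # temperature proxy

# Each series = shared driver + its own independent noise.
ice_cream = [2.0 * h + random.gauss(0, 5) for h in heat]
heat_deaths = [0.5 * h + random.gauss(0, 3) for h in heat]

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length series."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

r = pearson(ice_cream, heat_deaths)
print(round(r, 2))  # strong positive correlation, from the shared driver alone
```

Remove the shared `heat` term from either series and the correlation collapses toward zero, which is exactly the signature of a third extraneous variable.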

Not quite. If two things do not have a causal relation, how can a statistical relationship happen? If that were true, or a truism, any two unrelated things could have a statistical relationship. But two non-causal things or events can’t possibly have a statistical relationship, by definition.

This is why a common ground, some kind of causality, must be demonstrated between them in order for statistics to be concluded between them relationally. That common ground is whether the two things are either mutually inclusive or mutually exclusive. In the former case the one thing = the other thing, so there is no relationship. The latter case demonstrates a total exclusivity, where their relationship is purely formal, having no ground or basis for correlation. This is purely tossing a coin.

There is no sense in which the seeking of this ground for probability ever exceeds or lessens the equal probability of having a basis for a statistical analysis for relevance.
The third possibility, in case this ground is established, is where there is some chance that the two are in fact correlates (relatable on some basis). Here, statistical analysis is possible, using greater or lesser chances than in the establishment of the ground.

It is the language surrounding the use of relevance, which introduces this confusion.

I think we are pretty much on the same page, Flannel, Gib, it’s just the way we use the words which seems to present more of a challenge. It creates artificial differing points of view between possibility and probability.

You seem to have no idea of what Occam’s Razor or probability is actually about.
I’m going to leave it at that.

That can be one explanation for some correlations, but I think the pictures in the OP really are just quite coincidentally correlated for a few years. You’ll notice they all span something like 10 years - that’s probably by design. It was a short-lived, incredibly rare coincidence that the software was designed to dig for.

If two things do not have a causal relation, statistical correlation can happen by coincidence. Unrelated things can have a statistical correlation - that’s the whole point of the software in the OP: to use data mining to find correlations between unrelated things. And that is absolutely and positively NOT the definition. The definition of statistical relationship has nothing to do with causality - it is a mathematical relationship between two data sets. The two data sets absolutely do not have to have some causal relationship; to have a high correlation, they merely have to have some level of similarity in the shape of their graphs. For a lot of the graphs that we see that have high correlation, that correlation will be explainable by causal factors - but not by definition. Not at all. Correlation can happen by coincidence. Case in point: OP.
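A toy sketch of the data-mining idea (purely illustrative, and not the OP software’s actual method): generate many independent random-walk series and search for the best-correlated pair. With enough candidate pairs, a high correlation turns up by coincidence alone.

```python
# Toy illustration of mining for coincidental correlations: many
# independent random walks, searched for the best-correlated pair.
import math
import random
from itertools import combinations

random.seed(7)

def random_walk(n=10):
    """A short series of cumulative Gaussian steps (10 'years' of data)."""
    walk, x = [], 0.0
    for _ in range(n):
        x += random.gauss(0, 1)
        walk.append(x)
    return walk

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length series."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    sx = math.sqrt(sum((a - mx) ** 2 for a in xs))
    sy = math.sqrt(sum((b - my) ** 2 for b in ys))
    return cov / (sx * sy)

# 200 series give ~20,000 candidate pairs to mine through.
series = [random_walk() for _ in range(200)]
best = max(abs(pearson(a, b)) for a, b in combinations(series, 2))
print(round(best, 3))  # a very high correlation, by coincidence alone
```

The short window matters: with only ~10 data points per series and thousands of pairs to choose from, near-perfect matches are practically guaranteed, which is consistent with the observation above that the OP’s graphs all span about 10 years by design.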

That may be true but I still think the likelihood of a 3rd extraneous variable will be higher than the likelihood of sheer coincidence.