A scientist modelling a scientist modelling science

This is a follow-up to Nathaniel’s post. One use of probabilities of probabilities is in asking which experiments would be best for a scientist to do. We can do this because scientists would like to have a logically consistent system that describes the world, but they make measurements which are not completely certain, so the interpretation of probability as uncertain logic is justified.

Let’s make a probabilistic model of scientific inquiry. To do this, the first component we need is a model of “what science knows”, or equivalently, “what the literature says”. For the purposes here, I will only consider what science knows about one statement: “The literature says X is true”. I’ll write this as $p(X|L)$ and its negation as $p(\bar{X}|L) = 1 - p(X|L)$. This is a really minimal example.

Of course, what the literature says is determined by whoever is reading it, not absolutely. In this example, that person is the scientist who is deciding what experiments to do. The scientist’s state of knowledge about the literature can be written $p(p(X|L)|S)$. It is a distribution over the probability representing what the literature thinks. In this situation, it doesn’t matter what the literature actually says, just what the scientist thinks that the literature says.

The scientist can choose to do experiments and make a contribution to the literature, but which experiment should they choose? I reckon the one which provides the most information to the literature. Let’s take two experiments, A and B, and consider what the scientist thinks the literature will say after each experiment is added, represented by the probability densities $p(p(X|AL)|S)$ and $p(p(X|BL)|S)$.

We can measure how much information the experimenter expects will be added to the literature. Mathematically, the quantity of interest is the expectation of the Kullback-Leibler divergence (information gain) from what the scientist thinks the literature says now to what the scientist thinks the literature will say after an experiment. For the two experiments we compare (dropping the $L$ and $S$ for clarity) $\mathbb{E}_{p(x)}[\mathbb{E}_{p(x|A)}[D_{KL}(p(x) || p(x|A))]]$ and $\mathbb{E}_{p(x)}[\mathbb{E}_{p(x|B)}[D_{KL}(p(x) || p(x|B))]]$. (Note: $p(p(X|A)) \neq p(p(X)|A)$; the latter refers to what the scientist thinks the literature said before experiment A, in hindsight, after A is performed.)
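
To make the nested expectation concrete, here is a minimal Python sketch for the case where the scientist's beliefs are discrete (a handful of possible values of the literature's probability, each with a weight). The function names, the list-of-pairs representation and the numbers are my own assumptions for illustration, not something from the post; the sketch also assumes the two expectations are taken independently and keeps the argument order of $D_{KL}$ as written above.

```python
import math


def bernoulli_kl(p, q):
    """D_KL(p || q) in nats for a binary statement, in the post's argument order.

    p is the current probability of X, q the probability after the experiment.
    Assumes 0 < q < 1 so that the logarithms stay finite.
    """
    total = 0.0
    for a, b in ((p, q), (1.0 - p, 1.0 - q)):
        if a > 0.0:  # terms with zero weight contribute nothing
            total += a * math.log(a / b)
    return total


def expected_information_gain(current_belief, predicted_belief):
    """Average bernoulli_kl over two discrete beliefs.

    Each belief is a list of (probability_of_X, weight) pairs whose weights
    sum to 1. Assumption: the two beliefs are averaged independently, which
    is how the nested expectation is read in this sketch.
    """
    return sum(
        w_now * w_after * bernoulli_kl(p_now, p_after)
        for p_now, w_now in current_belief
        for p_after, w_after in predicted_belief
    )


# Hypothetical numbers, for illustration only.
current = [(0.5, 0.5), (0.7, 0.5)]  # scientist unsure where the literature stands
after_a = [(0.9, 0.6), (0.2, 0.4)]  # predicted outcomes of a decisive experiment
after_b = [(0.6, 1.0)]              # an experiment expected to nudge only slightly
print(expected_information_gain(current, after_a))
print(expected_information_gain(current, after_b))
```

With continuous densities the double sum would become a double integral, but the comparison between experiments works the same way: compute the number for each candidate and pick the larger.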

These equations are a bit complex, so I’ll break them down. The information gain ($D_{KL}$), viewed as a function of the probability after the experiment, is minimal at $p(X)$ and is convex. Here are some example curves:
[Figure: example information gain curves]

These show that the greatest information gain occurs when the experiment changes what the literature thinks the most. The most information comes from a change in opposition to the current state, but all change is good.
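
The shape of these curves can be checked directly for a single binary statement. As a sketch, write $p$ for the current $p(X)$ and $q$ for the value after the experiment; these two symbols are my shorthand rather than the post's notation, natural logarithms are assumed, and the argument order of $D_{KL}$ follows the expressions earlier in the post.

```latex
\begin{align*}
D_{KL}(p \,\|\, q) &= p \ln\frac{p}{q} + (1-p)\ln\frac{1-p}{1-q}, \\
\frac{\partial D_{KL}}{\partial q} &= -\frac{p}{q} + \frac{1-p}{1-q} = \frac{q-p}{q(1-q)}, \\
\frac{\partial^2 D_{KL}}{\partial q^2} &= \frac{p}{q^2} + \frac{1-p}{(1-q)^2} > 0 .
\end{align*}
```

The first derivative vanishes only at $q = p$, where the divergence is zero, and the second derivative is positive for $0 < q < 1$, so the curve is convex with its minimum at the current value. The $\ln(1/q)$ term is weighted by $p$ and the $\ln(1/(1-q))$ term by $1-p$, so when $p > 0.5$ a move towards $q = 0$ costs more than an equally extreme move towards $q = 1$: change against the currently favoured answer counts for more.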

Consider a topic where the scientist is completely sure that the literature does not say anything about X: $p(p(X)) = \delta(p(x)-0.5)$ ($\delta$ is the Dirac delta, i.e. all probability is at $p(X)=0.5$). There are two experiments. A: the scientist thinks that it will confirm or reject X almost conclusively with equal chance, $p(p(X|A)) = 0.5\delta(p(x|A)-0.999) + 0.5\delta(p(x|A)-0.001)$. B: the scientist thinks that it will either confirm X to the same degree or not say anything, with equal chance, $p(p(X|B)) = 0.5\delta(p(x|B)-0.999) + 0.5\delta(p(x|B)-0.5)$. Some simple algebra will show that the expected information gain is twice as much for A, the experiment that will confirm or reject. The procedure seems to work fairly intuitively and is quite general. I’ll leave it to you to try other examples (of which there are many).
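
The “simple algebra” can be written out as a sketch. It assumes natural logarithms (so the units are nats; the base does not affect the ratio) and uses the same argument order for $D_{KL}$ as the expressions earlier in the post.

```latex
\begin{align*}
D_{KL}(0.5 \,\|\, 0.999) &= 0.5\ln\frac{0.5}{0.999} + 0.5\ln\frac{0.5}{0.001} \approx 2.76, \\
D_{KL}(0.5 \,\|\, 0.001) &= D_{KL}(0.5 \,\|\, 0.999) \approx 2.76 \quad \text{(by symmetry)}, \\
D_{KL}(0.5 \,\|\, 0.5) &= 0, \\
\text{A:}\quad 0.5\,(2.76) + 0.5\,(2.76) &\approx 2.76, \\
\text{B:}\quad 0.5\,(2.76) + 0.5\,(0) &\approx 1.38 .
\end{align*}
```

The ratio is exactly 2 whatever the base of the logarithm, because B swaps one of A's two equally weighted outcomes for an outcome that contributes nothing.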

As the information gain is never negative, the same is true of its expectation. If there is no expected information gain, there is no point in doing the experiment (this is only possible when the scientist thinks the experiment won’t show anything). The question of falsifiability is related. If the scientist thinks that there is no experiment that will cause a lower $p(X)$, there still might be an informative experiment (see the example above). However, if the scientist thinks that there is no experiment that has the potential to change $p(X)$, then they shouldn’t bother doing anything.

The really important thing about this is that there is no notion of scientific truth or falsehood: just a subjective scientist trying to reason the best they can about how to inform their community (the literature) about something.

EDIT: Also, notice that the biggest gain in information is when the literature is made to “change its mind” on something.

