Wednesday, October 12, 2011

Justifying Probabilities: Reliable Decision-Making in Contexts of Biodiversity Catastrophe

Probably needless to say, my ideas for a thesis have changed. Though the grand picture is still unclear, here is something specific I've been working on.

Many conservation biologists (Ellison 1996; Goodman 2002; Wade 2000, 2002) and philosophers of biology (Sarkar 2005) argue that, because uncertainty is ubiquitous in conservation contexts, a Bayesian future is in store. From the perspective of uncertainty, Bayesian methodologies have many advantages over frequentist ones, from choosing data sets and estimating parameters to updating model structure and providing a formal framework for decision-making. One particular branch of Bayesian methodologies, causal Bayes nets, has become increasingly popular in the literature, not only as a framework for accounting for various types of uncertainty but also as a reaction against deterministic modeling more generally (Reckhow 1999).

An essential aspect (and central point of contention) of Bayesian methodologies in general and causal Bayes nets in particular is the estimation of prior probability distributions. Typically, negative responses to Bayesian methodologies have tried to undercut this practice by arguing that it is, at best, arbitrary to assign prior probability distributions before confronting the data. Objective Bayesians generally respond by assigning uniform probability distributions over the variables of interest, whereas subjective Bayesians opt to make use of "background assumptions." In other words, to be a subjective Bayesian in the context of conservation biology is to use information from the biological domain to assign prior probability distributions.

Communicating to decision-makers about how and why prior probability distributions were assigned the way they were is important whether one is an objective or a subjective Bayesian. In the philosophy of climate modeling, Parker (2010) has provided a list of three conditions that ought to be met before scientists present a probability distribution to decision-makers. The first is ownership. Scientists must be willing to claim the probability distribution as a representation of their own degree of belief and uncertainty before presenting it to decision-makers. The second is justification. Scientists must justify why they have chosen this particular distribution as opposed to others. Finally, the third is robustness. Scientists must present probability distributions that do not rely on contentious assumptions.

All three conditions apply to the subjective Bayesian and the last two apply to the objective Bayesian. When conservation biologists present their causal Bayes nets to decision-makers, they should not only be explicit about posterior probability distributions, they should also be explicit about prior probability distributions. In particular, whether one is of the objective or subjective camp, one ought to provide justification for how and why one assigned prior probability distributions and be ready to show that such an assignment is robust. I focus on analyzing Parker's notion of justification in these contexts, and I aim to develop one strategy scientists could use to justify the assignment of prior probability distributions given a particular context.

My argument is as follows. Though some causes C have low probabilities, C may have large consequences for the probability distribution P(V) over the values v of the variable V estimating an attribute of the species of interest. Given some data set of C and V, one might be able to estimate the prior P(V). However, C is, by definition, practically incalculable because we typically have only a small sample of observations of it. Supposing C causes V, if C is practically incalculable, then P(V|C) will be practically incalculable as well, and any estimate of the prior P(V) will likely overestimate the expectation E(V) and underestimate both the variance Var(V) and the risk of making a decision E(V)U(V), where U is a utility function of V. Hence, the prior P(V) must be assumed rather than generated from data.
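To make the small-sample worry concrete, here is a minimal simulation sketch. Everything in it is an assumption chosen for illustration and not drawn from any data set: V is treated as something like a population size that sits near 100 in ordinary years and collapses to about 5 when the rare cause C occurs, C has probability 0.02 per observation, and each hypothetical study collects 20 observations. The sketch counts how often such a study overestimates E(V) and underestimates Var(V) relative to the true values of the assumed mixture.

```python
import random
import statistics

# All numbers below are illustrative assumptions, not estimates from real data.
P_C = 0.02                 # probability that the rare cause C occurs in one observation
ORDINARY = (100.0, 5.0)    # (mean, sd) of V when C is absent
CATASTROPHE = (5.0, 2.0)   # (mean, sd) of V when C is present
N_OBS = 20                 # observations per hypothetical study
N_TRIALS = 2000            # number of simulated studies


def draw_v(rng):
    """Draw one value of V from the assumed two-component mixture."""
    mu, sd = CATASTROPHE if rng.random() < P_C else ORDINARY
    return rng.gauss(mu, sd)


def true_moments():
    """Exact mean and variance of the assumed mixture distribution of V."""
    (m0, s0), (m1, s1) = ORDINARY, CATASTROPHE
    mean = (1 - P_C) * m0 + P_C * m1
    var = ((1 - P_C) * (s0 ** 2 + (m0 - mean) ** 2)
           + P_C * (s1 ** 2 + (m1 - mean) ** 2))
    return mean, var


def simulate(seed=0):
    """Return the fraction of small studies that overestimate E(V)
    and the fraction that underestimate Var(V)."""
    rng = random.Random(seed)
    true_mean, true_var = true_moments()
    over_mean = 0
    under_var = 0
    for _ in range(N_TRIALS):
        sample = [draw_v(rng) for _ in range(N_OBS)]
        over_mean += statistics.mean(sample) > true_mean
        under_var += statistics.variance(sample) < true_var
    return over_mean / N_TRIALS, under_var / N_TRIALS


if __name__ == "__main__":
    frac_mean, frac_var = simulate()
    print("studies overestimating E(V):   ", frac_mean)
    print("studies underestimating Var(V):", frac_var)
```

Under these assumptions most studies never observe C at all, so their sample moments describe only the ordinary regime. The sketch says nothing about which prior to adopt; it only illustrates why moments estimated from a short record can mislead when a rare, high-impact cause is in play.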

One way to justify assuming a P(V) is by picking the one that makes decision-making reliable given this domain of C. That is, given the problem of decision-making under risk of C causing V, P(V) can be justified by making decision-making reliable. A Gaussian distribution is risky in these contexts because it is unreliable in accounting for C. A power law distribution (e.g., Zipf's law distribution) is less risky and more reliable in accounting for C than the Gaussian distribution. Assuming a power law distribution is more reliable in the following sense. Whereas one cannot discover that the true P(V) is a power law distribution by initially assuming it is a Gaussian distribution, one can discover that the true P(V) is Gaussian by assuming it is a power law distribution. Therefore, we have good reason to assume a power law distribution for our prior P(V) given this domain of C.
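As a rough illustration of why a Gaussian is risky when extreme events matter, the sketch below compares how quickly tail probability decays under a Gaussian and under a power law. The Pareto form, the exponent alpha = 2, and the scale x_min = 1 are assumptions made only for this example, and the two distributions are not matched in mean or variance, so only the rate of decay is meaningful, not the absolute numbers.

```python
import math


def gaussian_tail(k):
    """P(X > mu + k*sigma) for a Gaussian with mean mu and sd sigma."""
    return 0.5 * math.erfc(k / math.sqrt(2))


def pareto_tail(x, x_min=1.0, alpha=2.0):
    """P(X > x) for a Pareto (power law) distribution.

    x_min and alpha are illustrative assumptions, not fitted values.
    """
    if x < x_min:
        return 1.0
    return (x_min / x) ** alpha


if __name__ == "__main__":
    # k is read as "k standard deviations out" for the Gaussian and as
    # "k times the minimum scale" for the power law.
    for k in (2, 5, 10):
        print(k, gaussian_tail(k), pareto_tail(k))
```

The Gaussian tail shrinks faster than exponentially, so events far from the mean are treated as effectively impossible, whereas the power law tail shrinks only polynomially and keeps extreme values of V on the table for the decision-maker.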
