Thursday, March 29, 2012

First-year Review

As my first year at ASU comes to a close, I am told to submit some materials for review. Here's some of the information I'm submitting.

Early Draft: Explaining Success and Assessing Optimality of Species Distribution Modeling in Conservation Biology (email me)
One of the tasks of conservation biology is to prioritize places in a region for their incorporation into a conservation area network such that species of interest are represented. Current practice assumes that prioritizing places to meet representation goals corresponds to solving the inductive problem of modeling actual species distributions in the region. In addition, current practice assumes that standard methods such as maximum likelihood estimation are up to the job. In other words, conservation biologists assume that models of actual species distributions inferred from statistical and machine learning methods are the inputs to place prioritization algorithms. Recently, Sarkar and colleagues have challenged this assumption and claimed to offer a better solution. However, the challenge is left largely underdeveloped. This paper tries to make the skeptical sentiment of Sarkar and colleagues more explicit in order to (i) explain under what conditions species distribution modeling is successful and (ii) assess its optimality within conservation biology.


Research Proposal

I am mainly interested in how methodological choices in conservation biology can possibly be rational given the cognitive aims and background assumptions of conservation biologists. Currently, I am interested in exploring this broad topic with respect to species distribution modeling and population viability analysis. However, for dissertation purposes, I will likely choose one or the other. So, I see two possible tracks for a dissertation.



After the introductory chapter, the next two chapters will probably map out the current paradigm in conservation biology (i.e., adaptive management) and my approach to rational choice in either species distribution modeling or population viability analysis. This approach will likely concern how to apply decision-theoretic criteria, such as admissibility, and multi-objective decision-making frameworks to methodological choices. So, the emphasis is on how to make choices regarding inductive methods rather than, say, how to make direct management or policy decisions. Of course, choices with respect to inductive methods can help or hinder management or policy choices. As such, much of the discussion will concern which cognitive aims reflect the value of conserving biodiversity. The next four chapters would differ depending on whether I choose species distribution modeling or population viability analysis as my main topic of interest.



If I choose species distribution modeling, then the focus of chapters four and five will be the graphical/interventionist theory of causation and causal interpretations of statistical/machine learning models, respectively. The problem species distribution modelers set out to solve is a system structure problem. They are attempting to understand the causes of species distributions. As such, I see chapters four and five as laying out the theoretical framework behind such modeling. Chapters six and seven would then try to explain the rational choices species distribution modelers should make given the background assumptions and cognitive aims associated with solving system structure problems in conservation biology qua adaptive management. Chapter six would focus on actual methods used to estimate causal parameters (MLE, Bayesian estimators, etc.). Chapter seven would focus on actual methods used to select causal models of species distributions (AIC, BIC, etc.).


On the other hand, if I choose population viability analysis, then the focus of chapters four and five will be the graphical/interventionist theory of causation and theories of actual causation, which have been somewhat of an afterthought in the graphical/interventionist theory of causation. The problem population viability analysts set out to solve is an actual cause problem. They are attempting to understand not only the causes of species extinction but also the actual course of the fate of the species through time (i.e., dynamics). That is, population viability analysts are interested in understanding the actual causes of species extinction. As such, I see chapters four and five, again, as laying out the theoretical framework behind such analysis. Chapters six and seven would then try to explain the rational methodological choices population viability analysts should make given the background assumptions and cognitive aims associated with solving actual cause problems in conservation biology qua adaptive management. Chapter six would focus on actual methods used to estimate parameters. Chapter seven would focus on actual methods used to select models of population viability (AIC, BIC, etc.).


Plan of Study

Fall 2012
Research Prospectus Writing
Mathematical Population Biology
Stochastic Modeling in Biology
Choice and Belief


Spring 2013
Biotic Distributions
Bayesian Modeling for Life Sciences
Biology and Society Lab


Fall 2013
History of Science I
Statistical Learning


Spring 2014
History of Science II
Computability


That's a wrap! Good luck with finals everyone!

Monday, January 9, 2012

Update, Again

This might be the form of posts for a while. At the end of this semester, I hope to have 1 chapter, if not 2 chapters, drafted up.

(1) I am interested in the realism and reliability of population viability analysis in conservation biology. Population viability analysis (PVA) includes many kinds of models and methods, which typically purport to estimate and predict, given hypothetical conservation plans, either the probability distribution over the time to extinction or the cumulative probability that some population becomes extinct within some time interval. My specific interests are in (i) making explicit the objective of PVA and whether such an objective is realistic given practical constraints and limited resources, (ii) making explicit how PVA models do and realistically should represent causal knowledge and the uncertainty that results from various kinds of stochasticity, (iii) analyzing the reliability of PVA methods for causal inference and the propagation of uncertainty through time, and (iv) analyzing the reliability of PVA algorithms for designing or recommending conservation plans. In my dissertation work, I plan to work on this set of issues. (2) At a much more abstract level, I am interested in theory structure, and, in particular, the algorithmic structure one finds in conservation biology (and, perhaps, other sciences such as economics). Such structure does not immediately appear equivalent to the proof-theoretic or model-theoretic structure philosophers have typically paid attention to. Finally, I am also interested in (3) environmental philosophy and (4) environmental law. In the interest of scope, however, topics (2)-(4) will most likely have to be relegated to post-dissertation work.
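To make the second quantity concrete, here is a minimal sketch of how the cumulative probability of extinction within a time interval could be estimated by simulation. Everything specific in it is an assumption for illustration: the random-walk growth model on the log scale, the quasi-extinction threshold, the two "plans", and all parameter values. It is not taken from any particular PVA package or study.

```python
import math
import random


def extinction_probability(n0, mean_log_growth, sd_log_growth, horizon,
                           threshold=1.0, n_runs=5000, seed=0):
    """Monte Carlo estimate of the cumulative probability that a population
    drops below `threshold` at some point within `horizon` time steps."""
    rng = random.Random(seed)  # fixed seed so runs are repeatable
    log_threshold = math.log(threshold)
    extinct_runs = 0
    for _ in range(n_runs):
        log_n = math.log(n0)
        for _ in range(horizon):
            # Environmental stochasticity: a random log growth rate each step.
            log_n += rng.gauss(mean_log_growth, sd_log_growth)
            if log_n < log_threshold:
                extinct_runs += 1
                break
    return extinct_runs / n_runs


if __name__ == "__main__":
    # Two hypothetical conservation plans that differ only in mean growth.
    for label, growth in [("no action", -0.02), ("habitat protection", 0.02)]:
        p = extinction_probability(n0=50, mean_log_growth=growth,
                                   sd_log_growth=0.3, horizon=50)
        print(f"{label}: P(below threshold within 50 steps) ~ {p:.3f}")
```

The sketch only captures environmental stochasticity in one parameter; demographic stochasticity, parameter uncertainty, and causal structure, which are the issues in (ii) and (iii), would all need to be modeled separately.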

Sunday, November 27, 2011

Beginnings

I have the beginnings of a first chapter. If you want to have a look, email me!

Wednesday, November 23, 2011

Update of Research Statement


This isn't what I promised in the last post, but it's an update of something!

First, I am interested in the reliability of risk assessment methods in dealing with uncertainty. Specifically, I am focused on the reliability of such methods in population viability analysis. "Population viability analysis" is a term that includes many kinds of risk assessment methods, which typically purport to measure either the probability distribution over the time to (quasi-)extinction or the cumulative probability that some population becomes (quasi-)extinct within some time frame. Currently, Bayesian frameworks for population viability analysis are on the rise, and I am interested in the reliability of such systematic approaches to population viability analysis, Bayesian or otherwise. I am also interested in a logically prior question to the question about the reliability of such methods: what is the proper "target" of conservation, and hence, population viability analysis? Second, and at a much more abstract level, I am interested in theory structure, and, in particular, the inferentialist or algorithmic structure one finds in, for example, conservation biology. In my dissertation work, I plan to work on these two sets of problems. Finally, I am also interested in decision making under uncertainty and looking at theory structure in other sciences such as, say, economics. In the interest of scope, however, such topics will most likely have to be relegated to post-dissertation work.

Monday, November 21, 2011

Plan of Attack

I'll probably be able to update this blog with another lengthy post about where my dissertation research is headed in a couple of weeks. Until then, here's a list of books I plan on reading, and a list of courses I plan on taking in the near future. In other words, my plan of attack over the next year or so.

Courses List
Spring 2012 Courses
Probability
Statistical Modeling
Research/Independent Study

Fall 2012 Courses
Stochastic Processes in Biology
Dynamic Modeling in Biology
Prospectus
Complex Adaptive Systems?

Spring 2013 Courses
Bayesian Modeling in the Life Sciences
Computability and Incompleteness OR Applied Multivariate Analysis
Research

Reading List (Recommendations Helpful!)
Currently Reading
Idea of Biodiversity (Takacs)
Data Analysis with Regression (Gelman)
Computability (Cutland)
Logic of Reliable Inquiry (Kelly)
Articulating Reasons (Brandom)
Making it Explicit (Brandom)

Future Reading
Introduction to Stochastic Processes (Allen)
Bayesian Choice (Robert)
Statistical Decision Theory and Bayesian Analysis (Berger)
Bayesian Data Analysis (Gelman et al)
Probability Theory (Jaynes)
Risk Assessment in Conservation Biology (Burgman et al)
Matrix Population Models (Caswell)
Likelihood (Edwards)
Model Selection and Multimodel Inference (Burnham & Anderson)
Ecological Detective (Hilborn & Mangel)
Nature of Scientific Evidence (Taper & Lele)
Statistical Decision Functions (Wald)
Games and Information (Rasmusen)
Evolution of the Social Contract (Skyrms)
Enterprise of Knowledge (Levi)
Unified Neutral Theory of Biodiversity and Biogeography (Hubbell)
Scientific Image (van Fraassen)
Dappled World (Cartwright)
Method in Ecology (Shrader-Frechette)
Unsimple Truths (Mitchell)
Stochastic Population Dynamics in Ecology and Conservation (Lande et al)
Mathematical Modeling in Biology (Edelstein-Keshet)
Complexity (Mitchell)
Scientific Reasoning (Howson & Urbach)
Linear Causal Modeling with Structural Equations (Mulaik)
Structural Equations with Latent Variables (Bollen)

Thursday, October 13, 2011

Big Picture: Extinction Systems and Reliable Decision-Making

My last post was very specific. I said that the big picture wasn't quite clear yet. Well, it's not. But it's not so unclear that it can't be blogged about! Here's a brief outline of what I'm thinking now.

So, the first chapter studies the problem of making reliable decisions given priors and hypothetical inputs (management plans or intended outcomes) to a Bayesian model and a complex system of extinction (perhaps in general or something like frogs in particular). (The previous post details where I'm heading with respect to this chapter.)

The second and third chapters go together. We will take the decision-making process one step further by considering the adaptive management framework in conservation biology. So, now we can make plans and decisions by considering updated priors and new hypothetical inputs. In the second chapter we will introduce adaptive management and discuss the conditions a system must meet to be amenable to adaptive management. In the third chapter we will discuss Bayesian adaptive management and some other popular framework(s) (so, we might have to include these in the first chapter discussion as well) and compare them for optimal decision-making in the adaptive management framework. One might be optimal for this context, the other for that, etc. Whatever.

The fourth and fifth chapters go together. We will take note that the previous chapters have the beginnings of an algorithm or heuristic device for making decisions under various kinds of uncertainty (from the highest level of uncertainty to lesser degrees of it). In the fourth chapter we will make the case that current algorithms in conservation biology are good but lacking/bad but not unsalvageable. Again, whatever. In the fifth chapter we will present our algorithm and consider extensions to other forms of uncertainty and systems relevant to extinction concerns.

Finally, in the sixth chapter we will examine how this algorithmic structure of conservation biology fits with accounts of discipline/theory structure from the philosophy of science. It's not laws, it's not syntactic, it's not semantic, it's algorithmic. Input a particular problem and system of uncertainty, and output a reliable decision-making protocol.

Obviously this will change (and it has changed; I just haven't shared every version). But, it's a start.

Wednesday, October 12, 2011

Justifying Probabilities: Reliable Decision-Making in Contexts of Biodiversity Catastrophe

Probably needless to say, my ideas for a thesis have changed. Though the grand picture is still unclear, here is something specific I've been working on.

Many conservation biologists (Ellison 1996; Goodman 2002; Wade 2000, 2002) and philosophers of biology (Sarkar 2005) argue that, given that uncertainty is ubiquitous in conservation contexts, a Bayesian future is in store. From the perspective of uncertainty, Bayesian methodologies have many advantages over frequentist methodologies, from choosing data sets and estimating parameters to updating model structure and providing a formal framework for decision-making. One particular branch of Bayesian methodologies, causal Bayes nets, has become increasingly popular in the literature not only as a framework for accounting for various types of uncertainty, but also as a reaction against deterministic modeling more generally (Reckhow 1999).

An essential aspect (and central point of contention) of Bayesian methodologies in general and causal Bayes nets in particular is the estimation of prior probability distributions. Typically, negative responses to Bayesian methodologies have focused on trying to undercut this practice by arguing that it is, at best, arbitrary to assign prior probability distributions before confronting the data. Objective Bayesians generally respond by assigning uniform probability distributions over the variables of interest, whereas subjective Bayesians opt to make use of "background assumptions." In other words, to be a subjective Bayesian in the context of conservation biology implies using information from the biological domain to assign prior probability distributions.

Communicating to decision-makers about how and why prior probability distributions were assigned the way they were is important whether one is an objective or a subjective Bayesian. In the philosophy of climate modeling, Parker (2010) has provided a list of three conditions that ought to be met before scientists present a probability distribution to decision-makers. The first is ownership. Scientists must be willing to claim the probability distribution as a representation of their own degree of belief and uncertainty before presenting it to decision-makers. The second is justification. Scientists must justify why they have chosen this particular distribution as opposed to others. Finally, the third is robustness. Scientists must present probability distributions that do not rely on contentious assumptions.

All three conditions apply to the subjective Bayesian and the last two apply to the objective Bayesian. When conservation biologists present their causal Bayes nets to decision-makers, they should not only be explicit about posterior probability distributions, they should also be explicit about prior probability distributions. In particular, whether one is of the objective or subjective camp, one ought to provide justification for how and why one assigned prior probability distributions and be ready to show that such an assignment is robust. I focus on analyzing Parker's notion of justification in these contexts, and I aim to develop one strategy scientists could use to justify the assignment of prior probability distributions given a particular context.
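To illustrate what being explicit about priors and showing robustness might look like in the simplest possible case, here is a minimal sketch. It assumes a textbook Beta-Binomial model (say, the proportion of surveyed sites at which a species is detected), a uniform Beta(1, 1) prior standing in for the objective choice, and an informative Beta(2, 8) prior standing in for a subjective one. The model, the priors, and the data are all hypothetical and are not drawn from Parker (2010) or from any conservation study.

```python
def posterior_mean(prior_a, prior_b, successes, trials):
    """Posterior mean of a proportion under a Beta(prior_a, prior_b) prior
    after observing `successes` out of `trials` (conjugate update)."""
    return (prior_a + successes) / (prior_a + prior_b + trials)


def prior_sensitivity(priors, successes, trials):
    """Spread of posterior means across candidate priors; a small spread
    suggests the conclusion does not hinge on the choice of prior."""
    means = [posterior_mean(a, b, successes, trials) for a, b in priors]
    return max(means) - min(means)


if __name__ == "__main__":
    priors = [(1, 1), (2, 8)]  # uniform vs. informative (both hypothetical)
    for successes, trials in [(3, 10), (300, 1000)]:
        spread = prior_sensitivity(priors, successes, trials)
        print(f"{successes}/{trials} detections: spread = {spread:.4f}")
```

With few observations the two priors lead to noticeably different posterior means, while with many observations they nearly agree. Reporting a spread like this alongside the posterior is one simple way of showing decision-makers how much the prior matters.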

My argument is as follows. Though some causes C have low probabilities, C may have large consequences for the probability distribution P(V) over the values v of the variable V estimating an attribute of the species of interest. Given some data set of C and V, one might be able to estimate the prior P(V). However, C is, by definition, practically incalculable because we typically have only a small sample of observations of it. Supposing C causes V, if C is practically incalculable, then P(V|C) will be practically incalculable, and any estimate of the prior P(V) will likely overestimate the expectation E(V) and underestimate both the variance Var(V) and the risk of making a decision, E(V)U(V), where U is a utility function of V. Hence, when assigning the prior P(V), P(V) must be assumed, not generated from data.
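A worked version of the overestimation and underestimation claims may help. It is a sketch under simplifying assumptions that are not in the argument above: C is treated as a binary event with small probability p, the data set happens to contain no occurrences of C, C lowers V on average, and V is at least as variable when C occurs as when it does not. The symbols p, \mu_0, \mu_1, \sigma_0^2, and \sigma_1^2 are introduced here only for illustration.

```latex
% Assumed notation: p = P(C), \mu_1 = E(V \mid C), \mu_0 = E(V \mid \neg C),
% \sigma_1^2 = \mathrm{Var}(V \mid C), \sigma_0^2 = \mathrm{Var}(V \mid \neg C).
\begin{align*}
P(V) &= p\,P(V \mid C) + (1-p)\,P(V \mid \neg C), \\
E(V) &= p\,\mu_1 + (1-p)\,\mu_0, \\
\mathrm{Var}(V) &= p\,\sigma_1^2 + (1-p)\,\sigma_0^2 + p(1-p)\,(\mu_1 - \mu_0)^2 .
\end{align*}
% A sample with no occurrences of C estimates only \mu_0 and \sigma_0^2, so
\begin{align*}
\mu_0 - E(V) &= p\,(\mu_0 - \mu_1) > 0
  && \text{if } \mu_1 < \mu_0, \\
\mathrm{Var}(V) - \sigma_0^2 &= p\,(\sigma_1^2 - \sigma_0^2) + p(1-p)\,(\mu_1 - \mu_0)^2 > 0
  && \text{if } \sigma_1^2 \ge \sigma_0^2 \text{ and } \mu_1 \ne \mu_0 .
\end{align*}
```

Under these assumptions, the data-driven estimate overstates E(V) by p(\mu_0 - \mu_1) and understates Var(V), and the shortfall in variance grows with the squared size of the effect of C even when p is small.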

One way to justify assuming a P(V) is by picking the one that makes decision-making reliable given this domain of C. That is, given the problem of decision-making under risk of C causing V, P(V) can be justified by making decision-making reliable. A Gaussian distribution is risky in these contexts because it is unreliable in accounting for C. A power law distribution (e.g., Zipf's law distribution) is less risky and more reliable in accounting for C than the Gaussian distribution. Assuming a power law distribution is more reliable in the following sense. Whereas one cannot discover that the true P(V) is a power law distribution by initially assuming it is a Gaussian distribution, one can discover that the true P(V) is Gaussian by assuming it is a power law distribution. Therefore, we have good reason to assume a power law distribution for our prior P(V) given this domain of C.
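The sense in which a Gaussian is "risky" for rare, high-consequence events can be seen by comparing tail probabilities. The sketch below assumes a standard Gaussian and a Pareto distribution (a continuous power law) with minimum value 1 and exponent 2. These parameter choices are arbitrary and the two distributions are not matched in scale, so the numbers only illustrate how differently the tails behave. The sketch does not address the further claim about which assumption can be discovered to be wrong.

```python
import math


def gaussian_tail(x, mean=0.0, sd=1.0):
    """P(X > x) for a Gaussian with the given mean and standard deviation."""
    z = (x - mean) / sd
    return 0.5 * math.erfc(z / math.sqrt(2))


def pareto_tail(x, x_min=1.0, alpha=2.0):
    """P(X > x) for a Pareto (power law) distribution with scale x_min
    and tail exponent alpha."""
    if x <= x_min:
        return 1.0
    return (x_min / x) ** alpha


if __name__ == "__main__":
    for x in [2, 5, 10]:
        print(f"x = {x:>2}: Gaussian tail = {gaussian_tail(x):.3e}, "
              f"power law tail = {pareto_tail(x):.3e}")
```

The Gaussian tail shrinks faster than exponentially, so extreme values are treated as effectively impossible, while the power law tail shrinks only polynomially and keeps extreme values in play.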