class: title-slide, middle <style type="text/css"> .title-slide { background-image: url('assets/img/bg.jpg'); background-color: #23373B; background-size: contain; border: 0px; background-position: 600px 0; line-height: 1; } </style> # Day 1. Probabilities. <hr width="65%" align="left" size="0.3" color="orange"></hr> ## Solution to exercises <hr width="65%" align="left" size="0.3" color="orange" style="margin-bottom:40px;" alt="@Martin Sanchez"></hr> .instructors[ **ECL707/807** - Dominique Gravel ] <img src="assets/img/logo.png" width="25%" style="margin-top:20px;"></img> <img src="assets/img/Bios2_reverse.png" width="23%" style="margin-top:20px;margin-left:35px"></img> --- # Solution 1.1. Marginal probability : `\(P(A=1) = P(A_{10})+P(A_{11}) = 0.15 + 0.05 = 0.2\)` Conditional probability : `\(P(A=1|B=0) = \frac{P(A=1,B=0)}{P(B=0)} = \frac{0.15}{1 - (0.20 + 0.05)} = 0.2\)` --- # Exercise 1.3. What is the probability there is not a single seed ? `\(P(s=0) = (1-0.01)\times(1-0.2)\times(1-0.17)\times(1-0.24)\times(1-0.06)\)` `\(P(s=0) = 0.47\)` What is the probability to pick at least one seed in the trap ? `\(P(n>0) = 1-P(s>0) = 0.53\)` --- # Exercise 1.4. Under this hypothesis, sites and species are considered independent. The probability that a given species `\(i\)` is located at both sites is therefore obtained by the square of regional prevalence. We have to sum them across species to get `\(N_{11}\)`, such that we get : `\(N_{11} = 0.22*0.22 + 0.15*0.15+0.37*0.37 = 0.21\)` We could do the same with the complements to obtain `\(N_{10}\)` and `\(N_{01}\)`. Note these are symetrical : `\(N_{10} = 0.22*(1-0.22)+0.15*(1-0.15)+0.37*(1-0.37)=0.53\)` And the resulting expected Jaccard index is : `\(J(A,B) = \frac{0.21}{0.21+.53+.53}=0.17\)` --- # Solution 1.4. In this situation composition of the two sites will depend on `\(D_{xy}\)`. We will therefore define a function to compute the different quantities necessary for the Jaccard. Occurrence of species `\(i\)` at two sites is a joint probability. We will consider that the occurrence probability at site `\(A\)` is given by regional prevalence and then use conditional probability (the function) to derive the joint probability : `\(P(A_i = 1, B_i = 1) = P(B_i = 1 |A_i = 1)P(A_i = 1)\)` --- # Solution 1.4. Then we need to get the complement, for only site A but not site B : `\(P(A_i = 1, B_i = 0) = P(B_i = 0 |A_i)P(A_i = 1)\)` And the inverse : `\(P(A_i = 0, B_i = 1) = P(B_i = 1 |A_i = 0)P(A_i = 0)\)` We finally follow the previous recipe to compute the Jacccard index. --- # Solution 1.5. Marginal occurrence probability at optimal climate is `\(P(X=1|E^+) = 0.2\)` Conditional occurrence probability at optimal climate and habitat is `\(P(X1|E^+, H^+)=0.8\)` Conditional occurrence probability at optimal climate but poor habitat is `\(P(X1|E^+, H^-)=0.05\)` --- # Solution 1.5. Using rules of probability, we get `\(P(H^+\)` as follows : `\(P(X=1|E^+) = P(X=1|E^+,H^+)P(H^+)+ P(X=1|E^+,H^-)P(H^-)\)` `\(P(X=1|E^+) = P(X=1|E^+,H^+)P(H^+)+ P(X=1|E^+,H^-)(1-P(H^+))\)` `\(P(H^+) = \frac{P(X|E^+)-P(X|E^+,H^-)}{P(X|E^+,H^+)-P(X|E^+,H^-)}\)` `\(P(H^+) = \frac{0.20-0.05}{0.80-0.05}\)` `\(P(H^+) = 0.2\)` --- # Solution 1.6. We first have to define the quantities properly. First thing, define what you are looking for : `\(P(link | obs)\)` : the probability there is an interaction given the observation And then the additional data given in the problem : `\(P(obs | link)\)` : the probability of the observation given the hypothesis there is a link `\(P(link)\)` : prior knowledge on the probability there is a link --- # Solution 1.6. Using Bayes theorem, we have the following solution : `\(P(link | obs) = \frac{P(obs|link)P(link)}{P(obs)}\)` We define the occurrence of an interaction a success of a bernouilli trial. For the numerator, we use the binomial distribution with `\(5\)` observations to find that the probability of observing no interaction with associated probability `\(P(link)\)` is : $P(obs|link) = 0.33 $ --- # Solution 1.6. The trick is often to compute the denominator. It's easy to conceive the first part : $P(obs|link)*P(link) = 0.33 x 0.20 $ The second one needs more thought but it's obvious at the end : `\(P(obs|link = 0)P(link = 0) = 1 x 0.80\)` Which gives the denominator : `\(P(obs) = 0.33 x 0.20 + 1 x 0.80\)` --- # Solution 1.6. Collecting the pieces together, we obtain the following answer : `\(P(link | obs) = \frac{0.33 x 0.20}{0.33 x 0.20 + 1 x 0.8} = 0.076\)` This is the probability the two species interact given the observations derived using prior information. **Congratulations, you discovered the essence of Bayes statistics without knowing it !**