ID: 3666069 • Letter: C
Question
Consider the following method of estimating λ for a Poisson distribution. Observe that p0 = P(X = 0) = e^(−λ). Letting Y denote the number of zeros from an i.i.d. sample of size n, λ might be estimated by λ̃ = −log(Y/n). Note that Y ~ bin(n, p0).
Also write an R program to compare, with discussion, these two estimators using the Monte Carlo simulation method, at λ = 0.1 and λ = 1. Do this for sample sizes n = 50 and n = 100, with M = 1000 Monte Carlo replicates. Within the R code, calculate the efficiency (the ratio of variances) and comment on how well the simulated values agree with the propagation-of-errors method above.
Explanation / Answer
We have data x that we regard as being generated by a probability distribution F(x|θ), which depends on a parameter θ. We wish to know E h(X, θ) for some function h. For example, if θ itself is estimated from the data as θ̂(x) and h(X, θ) = [θ̂(X) − θ]², then E h(X, θ) is the mean square error of the estimate. As another example, if h(X, θ) = 1 when |θ̂(X) − θ| > Δ and 0 otherwise, then E h(X, θ) is the probability that |θ̂(X) − θ| > Δ. If θ were known, we could use the computer to generate independent random variables X_1, X_2, ..., X_B from F(x|θ) and then appeal to the law of large numbers:

E h(X, θ) ≈ (1/B) · sum_{i=1}^{B} h(X_i, θ)

This approximation can be made arbitrarily precise by choosing B sufficiently large. The parametric bootstrap principle is to perform this Monte Carlo simulation with θ̂ in place of the unknown θ, that is, using F(x|θ̂) to generate the X_i. It is difficult to give a concise answer to the natural question of how much error is introduced by using θ̂ in place of θ. The answer depends on the continuity of E h(X, θ) as a function of θ: if small changes in θ can give rise to large changes in E h(X, θ), the parametric bootstrap will not work well.
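The propagation-of-error (delta-method) calculation referred to in the question can be sketched as follows, writing p̂ = Y/n and λ̃ = g(p̂) with g(p) = −log p:

```latex
% g'(p) = -1/p and g''(p) = 1/p^2, with E(\hat p) = p_0 = e^{-\lambda}
% and Var(\hat p) = p_0 (1 - p_0)/n, since Y \sim \mathrm{bin}(n, p_0).
\operatorname{Var}(\tilde\lambda) \approx g'(p_0)^2 \operatorname{Var}(\hat p)
  = \frac{1}{p_0^2} \cdot \frac{p_0 (1 - p_0)}{n}
  = \frac{1 - p_0}{n p_0}
  = \frac{e^{\lambda} - 1}{n},
\qquad
\operatorname{bias}(\tilde\lambda) \approx \tfrac{1}{2}\, g''(p_0) \operatorname{Var}(\hat p)
  = \frac{e^{\lambda} - 1}{2n}.
```

Since the mle (the sample mean) has variance λ/n, the efficiency of λ̃ relative to the mle is approximately λ/(e^λ − 1): about 0.95 at λ = 0.1 but only about 0.58 at λ = 1, so λ̃ loses little for small λ and much more as λ grows.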
It is useful to know that the 300 intervals were distributed over various hours of the day and various days of the week. 8.10 Problems n Frequency 0 1 2 3 4 5 6 7 8 9 10 11 12 13+ 313 14 30 36 68 43 43 30 14 10 6 4 1 1 0 3. One of the earliest applications of the Poisson distribution was made by Student (1907) in studying errors made in counting yeast cells or blood corpuscles with a haemacytometer. In this study, yeast cells were killed and mixed with water and gelatin; the mixture was then spread on a glass and allowed to cool. Four different concentrations were used. Counts were made on 400 squares, and the data are summarized in the following table: Number of Cells Concentration 1 Concentration 2 Concentration 3 Concentration 4 0 1 2 3 4 5 6 7 8 9 10 11 12 213 128 37 18 3 1 0 0 0 0 0 0 0 103 143 98 42 8 4 2 0 0 0 0 0 0 75 103 121 54 30 13 2 1 0 1 0 0 0 0 20 43 53 86 70 54 37 18 10 5 2 2 a. Estimate the parameter for each of the four sets of data. b. Find an approximate 95% condence interval for each estimate. c. Compare observed and expected counts. 4. Suppose that X is a discrete random variable with P(X = 0) = 2 3 314 Chapter 8 Estimation of Parameters and Fitting of Probability Distributions 1 3 2 P(X = 2) = (1 ) 3 1 P(X = 3) = (1 ) 3 P(X = 1) = where 0 1 is a parameter. The following 10 independent observations were taken from such a distribution: (3, 0, 2, 1, 3, 2, 1, 0, 2, 1). a. Find the method of moments estimate of . b. Find an approximate standard error for your estimate. c. What is the maximum likelihood estimate of ? d. What is an approximate standard error of the maximum likelihood estimate? e. If the prior distribution of is uniform on [0, 1], what is the posterior density? Plot it. What is the mode of the posterior? 5. Suppose that X is a discrete random variable with P(X = 1) = and P(X = 2) = 1 . Three independent observations of X are made: x1 = 1, x2 = 2, x3 = 2. a. Find the method of moments estimate of . b. What is the likelihood function? 
c. What is the maximum likelihood estimate of ? d. If has a prior distribution that is uniform on [0, 1], what is its posterior density? 6. Suppose that X bin(n, p). ˆ a. Show that the mle of p is p = X/n. b. Show that mle of part (a) attains the Cram´ r-Rao lower bound. e c. If n = 10 and X = 5, plot the log likelihood function. 7. Suppose that X follows a geometric distribution, P(X = k) = p(1 p)k1 and assume an i.i.d. sample of size n. a. Find the method of moments estimate of p. b. Find the mle of p. c. Find the asymptotic variance of the mle. d. Let p have a uniform prior distribution on [0, 1]. What is the posterior distribution of p? What is the posterior mean? 8. In an ecological study of the feeding behavior of birds, the number of hops between ights was counted for several birds. For the following data, (a) t a geometric distribution, (b) nd an approximate 95% condence interval for p, (c) 8.10 Problems 315 examine goodness of t. (d) If a uniform prior is used for p, what is the posterior distribution and what are the posterior mean and standard deviation? Number of Hops Frequency 1 2 3 4 5 6 7 8 9 10 11 12 48 31 20 9 6 5 4 2 1 1 2 1 9. How would you respond to the following argument? This talk of sampling distributions is ridiculous! Consider Example A of Section 8.4. The experimenter found the mean number of bers to be 24.9. How can this be a “random variable” with an associated “probability distribution” when it’s just a number? The author of this book is guilty of deliberate mystication! 10. Use the normal approximation of the Poisson distribution to sketch the approxiˆ mate sampling distribution of of Example A of Section 8.4. According to this ˆ approximation, what is P(|0 | > ) for = .5, 1, 1.5, 2, and 2.5, where 0 denotes the true value of ? 11. 
In Example A of Section 8.4, we used knowledge of the exact form of the sampling ˆ distribution of to estimate its standard error by s = ˆ ˆ n This was arrived at by realizing that X i follows a Poisson distribution with parameter n0 . Now suppose we hadn’t realized this but had used the bootstrap, letting the computer do our work for us by generating B samples of size n = 23 of Poisson random variables with parameter = 24.9, forming the mle of from each sample, and then nally computing the standard deviation of the resulting ˆ collection of estimates and taking this as an estimate of the standard error of . Argue that as B , the standard error estimated in this way will tend to s . ˆ 12. Suppose that you had to choose either the method of moments estimates or the maximum likelihood estimates in Example C of Section 8.4 and C of Section 8.5. Which would you choose and why? 13. In Example D of Section 8.4, the method of moments estimate was found to be = 3X . In this problem, you will consider the sampling distribution of . ˆ ˆ a. Show that E() = —that is, that the estimate is unbiased. ˆ 316 Chapter 8 Estimation of Parameters and Fitting of Probability Distributions b. Show that Var() = (3 2 )/n. [Hint: What is Var(X )?] ˆ c. Use the central limit theorem to deduce a normal approximation to the sampling distribution of . According to this approximation, if n = 25 and = 0, ˆ what is the P(|| > .5)? ˆ 14. In Example C of Section 8.5, how could you use the bootstrap to estimate the ˆ following measures of the accuracy of : (a) P(| 0 | > .05), (b) E(| 0 |), ˆ ˆ (c) that number such that P(| 0 | > ) = .5. ˆ 15. The upper quartile of a distribution with cumulative distribution F is that point q.25 such that F(q.25 ) = .75. For a gamma distribution, the upper quartile depends on and , so denote it as q(, ). 
If a gamma distribution is t to data as in ˆ Example C of Section 8.5 and the parameters and are estimated by and , ˆ ˆ the upper quartile could then be estimated by q = q(, ). Explain how to use ˆ ˆ ˆ the bootstrap to estimate the standard error of q. 16. Consider an i.i.d. sample of random variables with density function f (x| ) = a. b. c. d. |x| 1 exp 2 Find the method of moments estimate of . Find the maximum likelihood estimate of . Find the asymptotic variance of the mle. Find a sufcient statistic for . 17. Suppose that X 1 , X 2 , . . . , X n are i.i.d. random variables on the interval [0, 1] with the density function f (x|) = (2) [x(1 x)]1 ()2 where > 0 is a parameter to be estimated from the sample. It can be shown that E(X ) = Var(X ) = a. b. c. d. e. 1 2 1 4(2 + 1) How does the shape of the density depend on ? How can the method of moments be used to estimate ? What equation does the mle of satisfy? What is the asymptotic variance of the mle? Find a sufcient statistic for . 18. Suppose that X 1 , X 2 , . . . , X n are i.i.d. random variables on the interval [0, 1] with the density function f (x|) = (3) x 1 (1 x)21 () (2) where > 0 is a parameter to be estimated from the sample. It can be shown 8.10 Problems 317 that E(X ) = Var(X ) = a. b. c. d. 1 3 2 9(3 + 1) How could the method of moments be used to estimate ? What equation does the mle of satisfy? What is the asymptotic variance of the mle? Find a sufcient statistic for . 19. Suppose that X 1 , X 2 , . . . , X n are i.i.d. N (µ, 2 ). a. If µ is known, what is the mle of ? b. If is known, what is the mle of µ? c. In the case above ( known), does any other unbiased estimate of µ have smaller variance? 20. Suppose that X 1 , X 2 , . . . , X 25 are i.i.d. N (µ, 2 ), where µ = 0 and = 10. Plot ˆ the sampling distributions of X and 2 . 21. Suppose that X 1 , X 2 , . . . , X n are i.i.d. with density function f (x| ) = e(x ) , x and f (x| ) = 0 otherwise. a. Find the method of moments estimate of . b. 
Find the mle of . (Hint: Be careful, and don’t differentiate before thinking. For what values of is the likelihood positive?) c. Find a sufcient statistic for . 22. The Weibull distribution was dened in Problem 67 of Chapter 2. This distribution is sometimes t to lifetimes. Describe how to t this distribution to data and how to nd approximate standard errors of the parameter estimates. 23. A company has manufactured certain objects and has printed a serial number on each manufactured object. The serial numbers start at 1 and end at N , where N is the number of objects that have been manufactured. One of these objects is selected at random, and the serial number of that object is 888. What is the method of moments estimate of N ? What is the mle of N ? 24. Find a very new shiny penny. Hold it on its edge and spin it. Do this 20 times and count the number of times it comes to rest heads up. Letting denote the probability of a head, graph the log likelihood of . Next, repeat the experiment in a slightly different way: This time spin the coin until 10 heads come up. Again, graph the log likelihood of . 25. If a thumbtack is tossed in the air, it can come to rest on the ground with either the point up or the point touching the ground. Find a thumbtack. Before doing any experiment, what do you think , the probability of it landing point up, is? Next, toss the thumbtack 20 times and graph the log likelihood of . Then do 318 Chapter 8 Estimation of Parameters and Fitting of Probability Distributions another experiment: Toss the thumbtack until it lands point up 5 times, and graph the log likelihood of based on this experiment. Find and graph the posterior distribution arising from a uniform prior on . Find the posterior mean and standard deviation and compare the posterior with a normal distribution with that mean and standard deviation. Finally, toss the thumbtack 20 more times and compare the posterior distribution based on all 40 tosses to that based on the rst 20. 26. 
In an effort to determine the size of an animal population, 100 animals are captured and tagged. Some time later, another 50 animals are captured, and it is found that 20 of them are tagged. How would you estimate the population size? What assumptions about the capture/recapture process do you need to make? (See Example I of Section 1.4.2.) 27. Suppose that certain electronic components have lifetimes that are exponentially distributed: f (t| ) = (1/ ) exp(t/ ), t 0. Five new components are put on test, the rst one fails at 100 days, and no further observations are recorded. a. What is the likelihood function of ? b. What is the mle of ? c. What is the sampling distribution of the mle? d. What is the standard error of the mle? (Hint: See Example A of Section 3.7.) 28. Why do the intervals in the left panel of Figure 8.8 have different centers? Why do they have different lengths? 29. Are the estimates of 2 at the centers of the condence intervals shown in the right panel of Figure 8.8? Why are some intervals so short and others so long? For which of the samples that produced these condence intervals was 2 smallest? ˆ 30. The exponential distribution is f (x; ) = ex and E(X ) = 1 . The cumulative distribution function is F(x) = P(X x) = 1 ex . Three observations are made by an instrument that reports x1 = 5 and x2 = 3, but x3 is too large for the instrument to measure and it reports only that x3 > 10. (The largest value the instrument can measure is 10.0.) a. What is the likelihood function? b. What is the mle of ? 31. George spins a coin three times and observes no heads. He then gives the coin to Hilary. She spins it until the rst head occurs, and ends up spinning it four times total. Let denote the probability the coin comes up heads. a. What is the likelihood of ? b. What is the MLE of ? 32. 
The following 16 numbers came from normal random number generator on a computer: 5.3299 6.5941 4.1547 4.2537 3.5281 2.2799 3.1502 4.7433 3.7032 0.1077 1.6070 1.5977 6.3923 5.4920 3.1181 1.7220 319 8.10 Problems a. What would you guess the mean and variance (µ and 2 ) of the generating normal distribution were? b. Give 90%, 95%, and 99% condence intervals for µ and 2 . c. Give 90%, 95%, and 99% condence intervals for . d. How much larger a sample do you think you would need to halve the length of the condence interval for µ? 33. Suppose that X 1 , X 2 , . . . , X n are i.i.d. N (µ, 2 ), where µ and are unknown. How should the constant c be chosen so that the interval (, X + c) is a 95% condence interval for µ; that is, c should be chosen so that P( < µ X + c) = .95. 2 34. Suppose that X 1 , X 2 , . . . , X n are i.i.d. N (µ0 , 0 ) and µ and 2 are estimated by the method of maximum likelihood, with resulting estimates µ and 2 . Suppose ˆ ˆ the bootstrap is used to estimate the sampling distribution of µ. ˆ 2 a. Explain why the bootstrap estimate of the distribution of µ is N (µ, n ). ˆ ˆ ˆ ˆ2 b. Explain why the bootstrap estimate of the distribution of µ µ0 is N (0, n ). ˆ c. According to the result of the previous part, what is the form of the bootstrap condence interval for µ, and how does it compare to the exact condence interval based on the bootstrap condence interval for µ, and how does it compare to the exact condence interval based on the t distribution? 35. (Bootstrap in Example A of Section 8.5.1) Let U1 , U2 , . . . , U1029 be independent uniformly distributed random variables. Let X 1 equal the number of Ui less than .331, X 2 equal the number between .331 and .820, and X 3 equal the number greater than .820. Why is the joint distribution of X 1 , X 2 , and X 3 multinomial with probabilities .331, .489, and .180 and n = 1029? 36. 
How do the approximate 90% condence intervals in Example E of Section 8.5.3 compare to those that would be obtained approximating the sampling distributions ˆ of and by normal distributions with standard deviations given by s and s ˆ ˆ ˆ as in Example C of Section 8.5? 37. Using the notation of Section 8.5.3, suppose that and are lower and upper quantiles of the distribution of . Show that the bootstrap condence interval ˆ ˆ for can be written as (2 , 2 ). 38. Continuing Problem 37, show that if the sampling distribution of is symmetric ˆ about , then the bootstrap condence interval is ( , ). 39. In Section 8.5.3, the bootstrap condence interval was derived from consideration ˆ of the sampling distribution of 0 . Suppose that we had started with considering ˆ the distribution of / . How would the argument have proceeded, and would the bootstrap interval that was nally arrived at have been different? 40. In Example A of Section 8.5.1, how could you use the bootstrap to estimate the ˆ ˆ ˆ following measures of the accuracy of : (a) P(| 0 | > .01), (b) E(| 0 |), ˆ 0 | > ) = .5? (c) that number such that P(| 41. What are the relative efciencies of the method of moments and maximum likelihood estimates of and in Example C of Section 8.4 and Example C of Section 8.5? 320 Chapter 8 Estimation of Parameters and Fitting of Probability Distributions 42. The le gamma-ray contains a small quantity of data collected from the Compton Gamma Ray Observatory, a satellite launched by NASA in 1991 (http://cossc.gsfc.nasa.gov/). For each of 100 sequential time intervals of variable lengths (given in seconds), the number of gamma rays originating in a particular area of the sky was recorded. Assuming a model that the arrival times are a Poisson process with constant emission rate ( = events per second), estimate . What is the estimated standard error? How might you informally check the assumption that the emission rate is constant? 
What is the posterior distribution of if an improper gamma prior is used? 43. The le gamma-arrivals contains another set of gamma-ray data, this one consisting of the times between arrivals (interarrival times) of 3,935 photons (units are seconds). a. Make a histogram of the interarrival times. Does it appear that a gamma distribution would be a plausible model? b. Fit the parameters by the method of moments and by maximum likelihood. How do the estimates compare? c. Plot the two tted gamma densities on top of the histogram. Do the ts look reasonable? d. For both maximum likelihood and the method of moments, use the bootstrap to estimate the standard errors of the parameter estimates. How do the estimated standard errors of the two methods compare? e. For both maximum likelihood and the method of moments, use the bootstrap to form approximate condence intervals for the parameters. How do the condence intervals for the two methods compare? f. Is the interarrival time distribution consistent with a Poisson process model for the arrival times? 44. The le bodytemp contains normal body temperature readings (degrees Fahrenheit) and heart rates (beats per minute) of 65 males (coded by 1) and 65 females (coded by 2) from Shoemaker (1996). Assuming that the population distributions are normal (an assumption that will be investigated in a later chapter), estimate the means and standard deviations of the males and females. Form 95% condence intervals for the means. Standard folklore is that the average body temperature is 98.6 degrees Fahrenheit. Does this appear to be the case? 45. A Random Walk Model for Chromatin. A human chromosome is a very large molecule, about 2 or 3 centimeters long, containing 100 million base pairs (Mbp). The cell nucleus, where the chromosome is contained, is in contrast only about a thousandth of a centimeter in diameter. 
The chromosome is packed in a series of coils, called chromatin, in association with special proteins (histones), forming a string of microscopic beads. It is a mixture of DNA and protein. In the G0/G1 phase of the cell cycle, between mitosis and the onset of DNA replication, the mitotic chromosomes diffuse into the interphase nucleus. At this stage, a number of important processes related to chromosome function take place. For example, DNA is made accessible for transcription and is duplicated, and repairs are made of DNA strand breaks. By the time of the next mitosis, the chromosomes have been duplicated. The complexity of these and other processes raises many 8.10 Problems 321 questions about the large-scale spatial organization of chromosomes and how this organization relates to cell function. Fundamentally, it is puzzling how these processes can unfold in such a spatially restricted environment. At a scale of about 103 Mbp, the DNA forms a chromatin ber about 30 nm in diameter; at a scale of about 101 Mbp the chromatin may form loops. Very little is known about the spatial organization beyond this scale. Various models have been proposed, ranging from highly random to highly organized, including irregularly folded bers, giant loops, radial loop structures, systematic organization to make the chromatin readily accessible to transcription and replication machinery, and stochastic congurations based on random walk models for polymers. A series of experiments (Sachs et al., 1995; Yokota et al., 1995) were conducted to learn more about spatial organization on larger scales. Pairs of small DNA sequences (size about 40 kbp) at specied locations on human chromosome 4 were ourescently labeled in a large number of cells. The distances between the members of these pairs were then determined by ourescence microscopy. (The distances measured were actually two-dimensional distances between the projections of the paired locations onto a plane.) 
The empirical distribution of these distances provides information about the nature of large-scale organization. There has long been a tradition in chemistry of modeling the congurations of polymers by the theory of random walks. As a consequence of such a model, the two-dimensional distance should follow a Rayleigh distribution f (r | ) = r exp 2 r 2 2 2 Basically, the reason for this is as follows: The random walk model implies that the joint distribution of the locations of the pair in R 3 is multivariate Gaussian; by properties of the multivariate Gaussian, it can be shown the joint distribution of the locations of the projections onto a plane is bivariate Gaussian. As in Example A of Section 3.6.2 of the text, it can be shown that the distance between the points follows a Rayleigh distribution. In this exercise, you will t the Rayleigh distribution to some of the experimental results and examine the goodness of t. The entire data set comprises 36 experiments in which the separation between the pairs of ourescently tagged locations ranged from 10 Mbp to 192 Mbp. In each such experimental condition, about 100–200 measurements of two-dimensional distances were determined. This exercise will be concerned just with the data from three experiments (short, medium, and long separation). The measurements from these experiments is contained in the les Chromatin/short, Chromatin/medium, Chromatin/long. a. What is the maximum likelihood estimate of for a sample from a Rayleigh distribution? b. What is the method of moments estimate? c. What are the approximate variances of the mle and the method of moments estimate? 322 Chapter 8 Estimation of Parameters and Fitting of Probability Distributions d. For each of the three experiments, plot the likelihood functions and nd the mle’s and their approximate variances. e. Find the method of moments estimates and the approximate variances. f. 
For each experiment, make a histogram (with unit area) of the measurements and plot the tted densities on top. Do the ts look reasonable? Is there any appreciable difference between the maximum likelihood ts and the method of moments ts? g. Does there appear to be any relationship between your estimates and the genomic separation of the points? h. For one of the experiments, compare the asymptotic variances to the results obtained from a parametric bootstrap. In order to do this, you will have to generate random variables from a Rayleigh distribution with parameter . Show that if X follows a Rayleigh distribution with = 1, then Y = X follows a Rayleigh distribution with parameter . Thus it is sufcient to gure out how to generate random variables that are Rayleigh, = 1. Show how Proposition D of Section 2.3 of the text can be applied to accomplish this. B = 100 bootstrap samples should sufce for this problem. Make a histogram of the values of the . Does the distribution appear roughly normal? Do you think that the large sample theory can be reasonably applied here? Compare the standard deviation calculated from the bootstrap to the standard errors you found previously. i. For one of the experiments, use the bootstrap to construct an approximate 95% condence interval for using B = 1000 bootstrap samples. Compare this interval to that obtained using large sample theory. 46. The data of this exercise were gathered as part of a study to estimate the population size of the bowhead whale (Raftery and Zeh 1993). The statistical procedures for estimating the population size along with an assessment of the variability of the estimate were quite involved, and this problem deals with only one aspect of the problem—a study of the distribution of whale swimming speeds. Pairs of sightings and corresponding locations that could be reliably attributed to the same whale were collected, thus providing an estimate of velocity for each whale. The velocities, v1 , v2 , . . . 
, v210 (km/h), were converted into times t1 , t2 , . . . , t210 to swim 1 km—ti = 1/vi . The distribution of the ti was then t by a gamma distribution. The times are contained in the le whales. a. Make a histogram of the 210 values of ti . Does it appear that a gamma distribution would be a plausible model to t? b. Fit the parameters of the gamma distribution by the method of moments. c. Fit the parameters of the gamma distribution by maximum likelihood. How do these values compare to those found before? d. Plot the two gamma densities on top of the histogram. Do the ts look reasonable? e. Estimate the sampling distributions and the standard errors of the parameters t by the method of moments by using the bootstrap. f. Estimate the sampling distributions and the standard errors of the parameters t by maximum likelihood by using the bootstrap. How do they compare to the results found previously? 8.10 Problems 323 g. Find approximate condence intervals for the parameters estimated by maximum likelihood. 47. The Pareto distribution has been used in economics as a model for a density function with a slowly decaying tail: f (x|x0 , ) = x0 x 1 , x x0 , >1 Assume that x0 > 0 is given and that X 1 , X 2 , . . . , X n is an i.i.d. sample. a. Find the method of moments estimate of . b. Find the mle of . c. Find the asymptotic variance of the mle. d. Find a sufcient statistic for . 48. Consider the following method of estimating for a Poisson distribution. Observe that p0 = P(X = 0) = e Letting Y denote the number of zeros from an i.i.d. sample of size n, might be estimated by Y ˜ = log n Use the method of propagation of error to obtain approximate expressions for the variance and the bias of this estimate. Compare the variance of this estimate to the variance of the mle, computing relative efciencies for various values of . Note that Y bin(n, p0 ). 49. 
For the example on muon decay in Section 8.4, suppose that instead of recording x = cos , only whether the electron goes backward (x < 0) or forward (x > 0) is recorded. a. How could be estimated from n independent observations of this type? (Hint: Use the binomial distribution.) b. What is the variance of this estimate and its efciency relative to the method of moments estimate and the mle for = 0, .1, .2, .3, .4, .5, .6, .7, .8, .9? 50. Let X 1 , . . . , X n be an i.i.d. sample from a Rayleigh distribution with parameter > 0: x 2 2 x 0 f (x| ) = 2 ex /(2 ) , (This is an alternative parametrization of that of Example A in Section 3.6.2.) a. Find the method of moments estimate of . b. Find the mle of . c. Find the asymptotic variance of the mle. 51. The double exponential distribution is 1 < x < f (x| ) = e|x | , 2 For an i.i.d. sample of size n = 2m + 1, show that the mle of is the median of the sample. (The observation such that half of the rest of the observations are 324 Chapter 8 Estimation of Parameters and Fitting of Probability Distributions smaller and half are larger.) [Hint: The function g(x) = |x| is not differentiable. Draw a picture for a small value of n to try to understand what is going on.] 52. Let X 1 , . . . , X n be i.i.d. random variables with the density function 0x 1 f (x| ) = ( + 1)x , a. b. c. d. Find the method of moments estimate of . Find the mle of . Find the asymptotic variance of the mle. Find a sufcient statistic for . 53. Let X 1 , . . . , X n be i.i.d. uniform on [0, ]. a. Find the method of moments estimate of and its mean and variance. b. Find the mle of . c. Find the probability density of the mle, and calculate its mean and variance. Compare the variance, the bias, and the mean squared error to those of the method of moments estimate. d. Find a modication of the mle that renders it unbiased. 54. Suppose that an i.i.d. sample of size 15 from a normal distribution gives X = 10 and s 2 = 25. 
Find 90% condence intervals for µ and 2 . 55. For two factors—starchy or sugary, and green base leaf or white base leaf—the following counts for the progeny of self-fertilized heterozygotes were observed (Fisher 1958): Type Count Starchy green Starchy white Sugary green Sugary white 1997 906 904 32 According to genetic theory, the cell probabilities are .25(2 + ), .25(1 ), .25(1 ), and .25, where (0 < < 1) is a parameter related to the linkage of the factors. a. Find the mle of and its asymptotic variance. b. Form an approximate 95% condence interval for based on part (a). c. Use the bootstrap to nd the approximate standard deviation of the mle and compare to the result of part (a). d. Use the bootstrap to nd an approximate 95% condence interval and compare to part (b). 56. Referring to Problem 55, consider two other estimates of . (1) The expected number of counts in the rst cell is n(2 + )/4; if this expected number is equated to the count X 1 , the following estimate is obtained: 4X 1 ˜ 1 = 2 n 8.10 Problems 325 (2) The same procedure done for the last cell yields 4X 4 ˜ 2 = n Compute these estimates. Using that X 1 and X 4 are binomial random variables, show that these estimates are unbiased, and obtain expressions for their variances. Evaluate the estimated standard errors and compare them to the estimated standard error of the mle. 57. This problem is concerned with the estimation of the variance of a normal distribution with unknown mean from a sample X 1 , . . . , X n of i.i.d. normal random variables. In answering the following questions, use the fact that (from Theorem B of Section 6.3) (n 1)s 2 2 n1 2 and that the mean and variance of a chi-square random variable with r df are r and 2r , respectively. a. Which of the following estimates is unbiased? s2 = 1 n1 n i=1 (X i X )2 2 = ˆ 1 n n i=1 (X i X )2 b. Which of the estimates given in part (a) has the smaller MSE? n c. For what value of does i=1 (X i X )2 have the minimal MSE? 58. 
If gene frequencies are in equilibrium, the genotypes A A, Aa, and aa occur with probabilities (1 )2 , 2 (1 ), and 2 , respectively. Plato et al. (1964) published the following data on haptoglobin type in a sample of 190 people: Haptoglobin Type Hp1-1 10 Hp1-2 68 Hp2-2 112 a. b. c. d. Find the mle of . Find the asymptotic variance of the mle. Find an approximate 99% condence interval for . Use the bootstrap to nd the approximate standard deviation of the mle and compare to the result of part (b). e. Use the bootstrap to nd an approximate 99% condence interval and compare to part (c). 59. Suppose that in the population of twins, males (M) and females (F) are equally likely to occur and that the probability that twins are identical is . If twins are not identical, their genes are independent. a. Show that 1+ 1 P(MM) = P(FF) = P(MF) = 4 2 326 Chapter 8 Estimation of Parameters and Fitting of Probability Distributions b. Suppose that n twins are sampled. It is found that n 1 are MM, n 2 are FF, and n 3 are MF, but it is not known which twins are identical. Find the mle of and its variance. 60. Let X 1 , . . . , X n be an i.i.d. sample from an exponential distribution with the density function f (x| ) = 1 x/ e , 0x < a. Find the mle of . b. What is the exact sampling distribution of the mle? c. Use the central limit theorem to nd a normal approximation to the sampling distribution. d. Show that the mle is unbiased, and nd its exact variance. (Hint: The sum of the X i follows a gamma distribution.) e. Is there any other unbiased estimate with smaller variance? f. Find the form of an approximate condence interval for . g. Find the form of an exact condence interval for . 61. Laplace’s rule of succession. Laplace claimed that when an event happens n times in a row and never fails to happen, the probability that the event will occur the next time is (n + 1)/(n + 2). Can you suggest a rationale for this claim? 62. 
Show that the gamma distribution is a conjugate prior for the exponential distribution. Suppose that the waiting time in a queue is modeled as an exponential random variable with unknown parameter λ, and that the average time to serve a random sample of 20 customers is 5.1 minutes. A gamma distribution is used as a prior for λ. Consider two cases: (1) the mean of the gamma is 0.5 and the standard deviation is 1, and (2) the mean is 10 and the standard deviation is 20. Plot the two posterior distributions and compare them. Find the two posterior means and compare them. Explain the differences.

63. Suppose that 100 items are sampled from a manufacturing process and 3 are found to be defective. A beta prior is used for the unknown proportion θ of defective items. Consider two cases: (1) a = b = 1, and (2) a = 0.5 and b = 5. Plot the two posterior distributions and compare them. Find the two posterior means and compare them. Explain the differences.

64. This is a continuation of the previous problem. Let X = 0 or 1 according to whether an item is defective. For each choice of the prior, what is the marginal distribution of X before the sample is taken? What are the marginal distributions after the sample is taken? (Hint: for the second question, use the posterior distribution of θ.)

65. Suppose that a random sample of size 20 is taken from a normal distribution with unknown mean and known variance equal to 1, and the mean is found to be x̄ = 10. A normal distribution was used as the prior for the mean, and it was found that the posterior mean was 15 and the posterior standard deviation was 0.1. What were the mean and standard deviation of the prior?

66. Let the unknown probability that a basketball player makes a shot successfully be θ. Suppose your prior on θ is uniform on [0, 1] and that she then makes two shots in a row. Assume that the outcomes of the two shots are independent.
a. What is the posterior density of θ?
b.
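A sketch of the conjugate update in Problem 62, assuming the gamma prior is placed on the rate λ (so a Gamma(a, b) prior, shape a and rate b, combines with an exponential sample into a Gamma(a + n, b + Σxᵢ) posterior); the helper function name is my own:

```python
# Data: n = 20 customers with average service time 5.1 minutes.
n, xbar = 20, 5.1
sum_x = n * xbar

def posterior_mean(prior_mean, prior_sd):
    """Posterior mean of lambda under a Gamma(a, b) prior specified
    by its mean and standard deviation: a/b = mean, a/b^2 = sd^2."""
    a = (prior_mean / prior_sd) ** 2
    b = prior_mean / prior_sd ** 2
    return (a + n) / (b + sum_x)

m1 = posterior_mean(0.5, 1.0)    # prior case (1)
m2 = posterior_mean(10.0, 20.0)  # prior case (2)
# Both priors have shape a = 0.25 and are weak relative to 20
# observations, so both posterior means land near 1/xbar ~ 0.196.
```

The comparison the problem asks for falls out directly: with 20 observations the likelihood dominates either weak prior, so the two posterior means nearly coincide.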
What would you estimate the probability that she makes a third shot to be?

67. Evans (1953) considered fitting the negative binomial distribution and other distributions to a number of data sets that arose in ecological studies. Two of these sets will be used in this problem. The first data set gives frequency counts of Glaux maritima made in 500 contiguous 20-cm² quadrats. For the second data set, a plot of potato plants 48 rows wide and 96 ft long was examined. The area was split into 2304 sampling units consisting of 2-ft lengths of row, and in each unit the number of potato beetles was counted. Fit Poisson and negative binomial distributions, and comment on the goodness of fit. For these data, the method of moments should be fairly efficient.

    Count   Glaux maritima   Potato Beetles
      0            1              190
      1           15              264
      2           27              304
      3           42              260
      4           77              294
      5           77              219
      6           89              183
      7           57              150
      8           48              104
      9           24               90
     10           14               60
     11           16               46
     12            9               29
     13            3               36
     14            1               19
     15                            12
     16                            11
     17                             6
     18                            10
     19                             2
     20                             4
     21                             1
     22                             3
     23                             4
     24                             1
     25                             1
     26                             0
     27                             0
     28                             1

68. Let X₁, ..., Xₙ be an i.i.d. sample from a Poisson distribution with mean λ, and let T = Σᵢ₌₁ⁿ Xᵢ.
a. Show that the distribution of X₁, ..., Xₙ given T is independent of λ, and conclude that T is sufficient for λ.
b. Show that X₁ is not sufficient.
c. Use Theorem A of Section 8.8.1 to show that T is sufficient. Identify the functions g and h of that theorem.

69. Use the factorization theorem (Theorem A in Section 8.8.1) to conclude that T = Σᵢ₌₁ⁿ Xᵢ is a sufficient statistic when the Xᵢ are an i.i.d. sample from a geometric distribution.

70. Use the factorization theorem to find a sufficient statistic for the exponential distribution.

71. Let X₁, ..., Xₙ be an i.i.d. sample from a distribution with the density function

    f(x|θ) = θ/(1 + x)^(θ+1),    0 < θ < ∞ and 0 ≤ x < ∞

Find a sufficient statistic for θ.

72. Show that Σᵢ₌₁ⁿ Xᵢ and Πᵢ₌₁ⁿ Xᵢ are sufficient statistics for the gamma distribution.

73.
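One possible starting point for the method-of-moments fits in Problem 67, sketched here for the Glaux maritima counts only (the negative binomial is parameterized by mean μ and size r with variance μ + μ²/r; this is an illustration, not the book's solution):

```python
import numpy as np
from math import exp, factorial

# Glaux maritima frequencies for counts 0..14 (500 quadrats total).
freq = np.array([1, 15, 27, 42, 77, 77, 89, 57, 48, 24, 14, 16, 9, 3, 1])
k = np.arange(len(freq))
n = freq.sum()

# Method of moments: match the sample mean and variance.
mu = (k * freq).sum() / n
var = ((k - mu) ** 2 * freq).sum() / n

# Poisson fit uses lambda_hat = mu; the negative binomial "size" comes
# from solving var = mu + mu^2 / r for r.
r_hat = mu ** 2 / (var - mu)

# Expected Poisson counts for comparison with the observed table.
expected_pois = n * np.array(
    [exp(-mu) * mu ** int(j) / factorial(int(j)) for j in k])
# The sample variance (~6.4) barely exceeds the mean (~5.8), so the
# fitted negative binomial is nearly Poisson for this data set.
```

The potato beetle column can be handled the same way; there the variance exceeds the mean more substantially, which is where the negative binomial should earn its keep.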
Find a sufficient statistic for the Rayleigh density,

    f(x|θ) = (x/θ²) e^(−x²/(2θ²)),    x ≥ 0

74. Show that the binomial distribution belongs to the exponential family.
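As a sketch of the factorization argument these last problems call for, the joint Rayleigh density splits into a factor that depends on the data only through Σxᵢ² and a factor free of θ:

```latex
\prod_{i=1}^{n} \frac{x_i}{\theta^2}\, e^{-x_i^2/(2\theta^2)}
  = \underbrace{\Big(\prod_{i=1}^{n} x_i\Big)}_{h(x_1,\dots,x_n)}\;
    \underbrace{\theta^{-2n}
      \exp\!\Big(-\frac{1}{2\theta^2}\sum_{i=1}^{n} x_i^{2}\Big)}_{g\big(\sum_i x_i^2,\;\theta\big)}
```

so T = Σᵢ xᵢ² is sufficient for θ. Problem 74 follows the same pattern of rewriting: C(n, x) pˣ(1 − p)ⁿ⁻ˣ = C(n, x) exp{x log(p/(1 − p)) + n log(1 − p)} exhibits the binomial in exponential-family form with natural parameter log(p/(1 − p)).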