Differences
This page shows the differences between two revisions of this page.
disciplinas:verao2007:exercicios [2007/01/30 14:16] paulojus
disciplinas:verao2007:exercicios [2007/02/15 09:29] paulojus
  - Load the Paraná data-set from geoR using the command <code R>data(parana)</code> and inspect its documentation using <code R>help(parana)</code>. For these data, consider the same questions as were raised in Exercise 1.4.
  - Read Chapter 2 of Diggle & Ribeiro (2007) (you can get {{disciplinas:pdf:chapter2.pdf|this chapter here}}).
==== Week 2 ====
  - Inspect [[http://leg.ufpr.br/geoR/tutorials/Rcruciani.R|an example geostatistical analysis]] for the hydraulic conductivity data.
  - Consider the following two models for a set of responses, $Y_i : i=1,\ldots,n$, associated with a sequence of positions $x_i : i=1,\ldots,n$ along a one-dimensional spatial axis $x$.
    - <latex>$Y_i = \alpha + \beta x_i + Z_i$</latex>, where <latex>$\alpha$</latex> and <latex>$\beta$</latex> are parameters and the <latex>$Z_i$</latex> are mutually independent with mean zero and variance $\sigma_Z^2$.
    - <latex>$Y_i = A + B x_i + Z_i$</latex>, where the <latex>$Z_i$</latex> are as in (a) but //A// and //B// are now random variables, independent of each other and of the $Z_i$, each with mean zero and respective variances $\sigma_A^2$ and $\sigma_B^2$.\\ For each of these models, find the mean and variance of $Y_i$ and the covariance between $Y_i$ and $Y_j$ for any $j \neq i$. Given a single realisation of either model, would it be possible to distinguish between them?
  - Suppose that $Y=(Y_1,\ldots,Y_n)$ follows a multivariate Gaussian distribution with ${\rm E}[Y_i]=\mu$ and ${\rm Var}\{Y_i\}=\sigma^2$, and that the covariance matrix of $Y$ can be expressed as $V=\sigma^2 R(\phi)$. Write down the log-likelihood function for $\theta=(\mu,\sigma^2,\phi)$ based on a single realisation of $Y$ and obtain explicit expressions for the maximum likelihood estimators of $\mu$ and $\sigma^2$ when $\phi$ is known. Discuss how you would use these expressions to find maximum likelihood estimators numerically when $\phi$ is unknown.
  - Is the following a legitimate correlation function for a one-dimensional spatial process $S(x) : x \in \IR$? Give either a proof or a counter-example. <latex> $$
\rho(u) = \left\{
\begin{array}{rcl}
\end{array}
\right.
$$ </latex>
  - Consider the following method of simulating a realisation of a one-dimensional spatial process $S(x) : x \in \IR$, with mean zero, variance 1 and correlation function $\rho(u)$. Choose a set of points $x_i \in \IR : i=1,\ldots,n$. Let $R$ denote the correlation matrix of $S=\{S(x_1),\ldots,S(x_n)\}$. Obtain the singular value decomposition of $R$ as $R = D \Lambda D^\prime$, where $\Lambda$ is a diagonal matrix whose non-zero entries are the eigenvalues of $R$, in order from largest to smallest. Let $Y=\{Y_1,\ldots,Y_n\}$ be an independent random sample from the standard Gaussian distribution, ${\rm N}(0,1)$. Then the simulated realisation is <latex>$ S = D \Lambda^{\frac{1}{2}} Y $</latex>.
  - Write an ''R'' function to simulate realisations using the above method for any specified set of points $x_i$ and a range of correlation functions of your choice. Use your function to simulate a realisation of $S$ on (a discrete approximation to) the unit interval $(0,1)$.
  - Now investigate how the appearance of your realisation $S$ changes if in the equation above you replace the diagonal matrix $\Lambda$ by a truncated form in which you replace the last $k$ eigenvalues by zeros.
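As a pointer for the log-likelihood exercise above, the standard Gaussian-model result can be sketched as follows (notation as in the exercise):

<latex>$$
\ell(\mu,\sigma^2,\phi) = -\frac{n}{2}\log(2\pi\sigma^2) - \frac{1}{2}\log|R(\phi)| - \frac{1}{2\sigma^2}(y-\mu 1)^\prime R(\phi)^{-1}(y-\mu 1),
$$</latex>

so that, for fixed $\phi$,

<latex>$$
\hat\mu(\phi) = \frac{1^\prime R(\phi)^{-1} y}{1^\prime R(\phi)^{-1} 1}, \qquad \hat\sigma^2(\phi) = \frac{1}{n}\,(y-\hat\mu 1)^\prime R(\phi)^{-1}(y-\hat\mu 1).
$$</latex>

Substituting $\hat\mu(\phi)$ and $\hat\sigma^2(\phi)$ back into $\ell$ gives a profile log-likelihood in $\phi$ alone, which can be maximised numerically over a grid of $\phi$ values or with a one-dimensional optimiser.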
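A minimal base-R sketch of the simulation method above (the function name, the exponential correlation and the parameter values are illustrative choices, not prescribed by the exercise):

```r
# Simulate S = D Lambda^(1/2) Y from the SVD of the correlation matrix R.
# Assumes exponential correlation rho(u) = exp(-u/phi); k > 0 sets the
# last k eigenvalues to zero, as in the truncation exercise.
sim_svd <- function(x, phi = 0.25, k = 0) {
  R <- exp(-as.matrix(dist(x)) / phi)   # correlation matrix at the points x
  sv <- svd(R)                          # R = D Lambda D'
  lam <- sv$d                           # eigenvalues, largest to smallest
  if (k > 0) lam[seq(length(lam) - k + 1, length(lam))] <- 0
  drop(sv$u %*% (sqrt(lam) * rnorm(length(x))))
}

x <- seq(0, 1, length.out = 101)   # discrete approximation to (0, 1)
s <- sim_svd(x)                    # one realisation
s_trunc <- sim_svd(x, k = 80)      # heavily truncated version for comparison
```

Plotting `s` and `s_trunc` against `x` shows how truncating small eigenvalues smooths the realisation.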
==== Week 3 ====
  - Fit a model to the surface elevation data assuming a linear trend model on the coordinates and a Matérn correlation function with parameter $\kappa=2.5$. Use the fitted model as the true model and perform a simulation study (i.e. simulate from this model) to compare parameter estimation based on maximum likelihood, restricted maximum likelihood and variograms.
  - Simulate 200 points in the unit square from the Gaussian model without measurement error, constant mean equal to zero, unit variance and exponential correlation function with $\phi=0.25$ and anisotropy parameters $(\psi_A=\pi/3, \psi_R=2)$. Obtain parameter estimates (using maximum likelihood):
    * assuming an isotropic model
    * trying to estimate the anisotropy parameters \\ Compare the results and repeat the exercise for $\psi_R=4$.
  - Consider a stationary trans-Gaussian model with known transformation function $h(\cdot)$, let $x$ be an arbitrary location within the study region and define $T=h^{-1}\{S(x)\}$. Find explicit expressions for ${\rm P}(T>c|Y)$, where $Y=(Y_1,\ldots,Y_n)$ denotes the observed measurements on the untransformed scale, and:
    * <latex>$h(u)=u$</latex>
    * <latex>$h(u) = \log u$</latex>
    * <latex>$h(u) = \sqrt{u}$</latex>
  - Analyse the Paraná data-set, or any other data set of your choice, assuming priors and obtaining:
    * a map of the predicted values over the area
    * a map of the prediction standard errors over the area
    * a map of the probabilities of being above a certain (arbitrarily) chosen threshold over the area
    * a map of the 10th, 25th, 50th, 75th and 90th percentiles over the area
    * the predictive distribution of the proportion of the area with the value of the study variable below a certain threshold (as a suggestion, you can use the 30th percentile of the data as such a threshold).
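One round of the simulation study in the first exercise above might be sketched as follows, assuming the geoR package. In a full study the simulation and refitting would be repeated many times; the covariance parameter values below are placeholders, not the actual elevation-data fit.

```r
# One simulation round: simulate from a Matern model with kappa = 2.5,
# then re-estimate covariance parameters by ML, REML and a variogram fit.
# cov.pars = c(sigma^2, phi) = c(1, 0.25) are placeholder values.
library(geoR)
sim <- grf(100, cov.model = "matern", kappa = 2.5, cov.pars = c(1, 0.25))
ml <- likfit(sim, ini.cov.pars = c(1, 0.25), kappa = 2.5,
             lik.method = "ML", messages = FALSE)
reml <- likfit(sim, ini.cov.pars = c(1, 0.25), kappa = 2.5,
               lik.method = "REML", messages = FALSE)
vfit <- variofit(variog(sim, messages = FALSE), ini.cov.pars = c(1, 0.25),
                 cov.model = "matern", kappa = 2.5, messages = FALSE)
rbind(ML = ml$cov.pars, REML = reml$cov.pars, variogram = vfit$cov.pars)
```

Collecting the `cov.pars` rows over many rounds gives the sampling distributions of the three estimators.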
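For the anisotropy exercise, a possible geoR sketch is below; note that ''likfit'' keeps the anisotropy parameters fixed unless told otherwise, which is exactly the contrast the exercise asks for.

```r
# Simulate geometrically anisotropic data (angle psi_A = pi/3, ratio
# psi_R = 2) and fit by ML both ignoring and estimating the anisotropy.
library(geoR)
sim <- grf(200, cov.model = "exponential", cov.pars = c(1, 0.25),
           aniso.pars = c(pi / 3, 2))
iso <- likfit(sim, ini.cov.pars = c(1, 0.25), messages = FALSE)
aniso <- likfit(sim, ini.cov.pars = c(1, 0.25),
                fix.psiA = FALSE, fix.psiR = FALSE,  # estimate psi_A, psi_R
                messages = FALSE)
iso$cov.pars   # (sigma^2, phi) under the (misspecified) isotropic model
aniso          # printed fit includes the estimated anisotropy parameters
```

Repeating with `aniso.pars = c(pi / 3, 4)` covers the second part of the exercise.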
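For the trans-Gaussian exercise, writing $\mu(x)$ and $\sigma^2(x)$ for the mean and variance of the Gaussian predictive distribution of $S(x)$ given $Y$, the three cases reduce to (a sketch of the standard argument, with $c>0$ in the last two cases):

<latex>$$
h(u)=u: \quad {\rm P}(T>c|Y) = 1-\Phi\left(\frac{c-\mu(x)}{\sigma(x)}\right),
$$</latex>

<latex>$$
h(u)=\log u: \quad T = \exp\{S(x)\}, \quad {\rm P}(T>c|Y) = 1-\Phi\left(\frac{\log c-\mu(x)}{\sigma(x)}\right),
$$</latex>

<latex>$$
h(u)=\sqrt{u}: \quad T = S(x)^2, \quad {\rm P}(T>c|Y) = 1-\Phi\left(\frac{\sqrt{c}-\mu(x)}{\sigma(x)}\right)+\Phi\left(\frac{-\sqrt{c}-\mu(x)}{\sigma(x)}\right).
$$</latex>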
==== Week 4 ====
  - Consider the stationary Gaussian model in which $Y_i = \beta + S(x_i) + Z_i : i=1,\ldots,n$, where $S(x)$ is a stationary Gaussian process with mean zero, variance $\sigma^2$ and correlation function $\rho(u)$, whilst the $Z_i$ are mutually independent ${\rm N}(0,\tau^2)$ random variables. Assume that all parameters except $\beta$ are known. Derive the Bayesian predictive distribution of $S(x)$ for an arbitrary location $x$ when $\beta$ is assigned an improper uniform prior, $\pi(\beta)$ constant for all real $\beta$. Compare the result with the ordinary kriging formulae.
  - For the model assumed in the previous exercise, assuming a correlation function parametrised by a scalar parameter $\phi$, obtain the posterior distribution under:
    * a normal prior for $\beta$, assuming the remaining parameters are known
    * a normal-scaled-inverse-$\chi^2$ prior for $(\beta, \sigma^2)$, assuming the correlation parameter is known
    * a normal-scaled-inverse-$\chi^2$ prior for $(\beta, \sigma^2)|\phi$, together with a generic prior $p(\phi)$ for the correlation parameter.
  - Analyse the Paraná data-set, or any other data set of your choice, assuming priors for the model parameters and obtaining:
    * the posterior distribution of the model parameters
    * a map of the predictive mean over the area
    * a map of the predictive median over the area
    * the predictive distribution at three arbitrarily selected locations within the area
  - Obtain simulations from the Poisson model as shown in Figure 4.1 of the textbook for the course.
  - Try to reproduce or mimic the results shown in Figure 4.2 of the textbook for the course by simulating a data set and carrying out a similar analysis. **Note:** for the example in the book we have used //set.seed(34)//.
  - Reproduce the simulated binomial data shown in Figure 4.6. Use the package //geoRglm// in conjunction with priors of your choice to obtain predictive distributions for the signal $S(x)$ at locations $x=(0.6, 0.6)$ and $x=(0.9, 0.5)$. Compare the predictive inferences which you obtain with those obtained by fitting a linear Gaussian model to the empirical logit-transformed data, $\log\{(y+0.5)/(n-y+0.5)\}$, and comment generally on the results of the two analyses.
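A base-R sketch of simulating from the Poisson log-linear model of the kind shown in Figure 4.1 (the grid, correlation function and parameter values here are illustrative, not those used in the book):

```r
# Simulate a mean-zero, unit-variance Gaussian process S on a grid via
# Cholesky factorisation, then draw Y_i ~ Poisson(exp(S(x_i))).
xy <- expand.grid(x = seq(0, 1, length.out = 10),
                  y = seq(0, 1, length.out = 10))
R <- exp(-as.matrix(dist(xy)) / 0.2)      # exponential correlation, phi = 0.2
L <- t(chol(R + 1e-8 * diag(nrow(R))))    # small jitter for numerical stability
S <- drop(L %*% rnorm(nrow(xy)))          # one realisation of the process
y <- rpois(nrow(xy), lambda = exp(S))     # conditionally independent counts
```

The same construction with a binomial draw in the last line gives simulated data in the spirit of Figure 4.6.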
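For the Bayesian analysis of the Paraná data, geoR's ''krige.bayes'' is a possible starting point; the prediction grid spacing and the discrete support for $\phi$ below are arbitrary choices, not values recommended by the course.

```r
# Bayesian analysis of the Parana data with krige.bayes (sketch only).
# Grid spacing and the discrete prior support for phi are arbitrary.
library(geoR)
data(parana)
gr <- expand.grid(seq(150, 800, by = 25), seq(70, 620, by = 25))
kb <- krige.bayes(parana, locations = gr,
                  model = model.control(trend.d = "1st", trend.l = "1st"),
                  prior = prior.control(phi.discrete = seq(0, 150, by = 15)),
                  output = output.control(n.posterior = 1000,
                                          n.predictive = 1000))
# posterior samples of the model parameters: kb$posterior$sample
# predictive summaries for the maps:         kb$predictive
```

Maps of the predictive mean and median can then be drawn from `kb$predictive`, and the predictive distribution at individual locations read off from the predictive simulations.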
==== Week 5 ====
  - The //composite likelihood// (CL) is obtained as the product of independent bivariate distributions for pairs of variables at the data locations. Assume a Gaussian model with constant mean and isotropic exponential correlation function.
    * write down the expression of the CL and discuss how parameter estimates could be obtained
    * write code to obtain CL parameter estimates for the s100 data set and compare them with the ones given by ML and REML.
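A self-contained base-R sketch of the pairwise composite likelihood for this model (the function name, log-parametrisation and starting values are illustrative):

```r
# Negative log composite likelihood: sum of bivariate Gaussian
# log-densities over all pairs, with constant mean mu, variance
# sigma^2 (log-parametrised) and exponential correlation range phi.
negcl <- function(par, coords, y) {
  mu <- par[1]; s2 <- exp(par[2]); phi <- exp(par[3])
  D <- as.matrix(dist(coords)); z <- y - mu; n <- length(y)
  val <- 0
  for (i in 1:(n - 1)) for (j in (i + 1):n) {
    r <- exp(-D[i, j] / phi)                  # pairwise correlation
    q <- (z[i]^2 - 2 * r * z[i] * z[j] + z[j]^2) / (s2 * (1 - r^2))
    val <- val + log(2 * pi) + log(s2) + 0.5 * log(1 - r^2) + q / 2
  }
  val
}

# e.g. with the geoR s100 data (uncomment if geoR is installed):
# library(geoR); data(s100)
# fit <- optim(c(mean(s100$data), 0, log(0.2)), negcl,
#              coords = s100$coords, y = s100$data)
# c(mu = fit$par[1], sigma2 = exp(fit$par[2]), phi = exp(fit$par[3]))
```

The resulting estimates can be placed side by side with the `cov.pars` from ''likfit'' fits using `lik.method = "ML"` and `"REML"`.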