Differences
This page shows the differences between two revisions of this page.
disciplinas:verao2007:exercicios [2007/01/30 14:16] paulojus
disciplinas:verao2007:exercicios [2007/02/15 09:29] paulojus
  - Load the Paraná data-set from geoR using the command <code R>data(parana)</code> and inspect its documentation using <code R>help(parana)</code>. For these data, consider the same questions as were raised in Exercise 1.4.
  - Read Chapter 2 of Diggle & Ribeiro (2007) (you can get {{disciplinas:pdf:chapter2.pdf|this chapter here}}).
==== Week 2 ====
  - Inspect [[http://leg.ufpr.br/geoR/tutorials/Rcruciani.R|an example geostatistical analysis]] for the hydraulic conductivity data.
  - Consider the following two models for a set of responses, $Y_i : i=1,\ldots,n$, associated with a sequence of positions $x_i : i=1,\ldots,n$ along a one-dimensional spatial axis $x$.
    - <latex>$Y_i = \alpha + \beta x_i + Z_i$</latex>, where <latex>$\alpha$</latex> and <latex>$\beta$</latex> are parameters and the <latex>$Z_i$</latex> are mutually independent with mean zero and variance $\sigma_Z^2$.
    - <latex>$Y_i = A + B x_i + Z_i$</latex>, where the <latex>$Z_i$</latex> are as in (a) but //A// and //B// are now random variables, independent of each other and of the $Z_i$, each with mean zero and respective variances $\sigma_A^2$ and $\sigma_B^2$.\\ For each of these models, find the mean and variance of $Y_i$ and the covariance between $Y_i$ and $Y_j$ for any $j \neq i$. Given a single realisation of either model, would it be possible to distinguish between them?
  - Suppose that $Y=(Y_1,\ldots,Y_n)$ follows a multivariate Gaussian distribution with ${\rm E}[Y_i]=\mu$ and ${\rm Var}\{Y_i\}=\sigma^2$, and that the covariance matrix of $Y$ can be expressed as $V=\sigma^2 R(\phi)$. Write down the log-likelihood function for $\theta=(\mu,\sigma^2,\phi)$ based on a single realisation of $Y$ and obtain explicit expressions for the maximum likelihood estimators of $\mu$ and $\sigma^2$ when $\phi$ is known. Discuss how you would use these expressions to find maximum likelihood estimators numerically when $\phi$ is unknown.
  - Is the following a legitimate correlation function for a one-dimensional spatial process $S(x) : x \in \IR$? Give either a proof or a counter-example. <latex> $$
\rho(u) = \left\{
\begin{array}{rcl}
\end{array}
\right.
$$ </latex>
  - Consider the following method of simulating a realisation of a one-dimensional spatial process $S(x) : x \in \IR$, with mean zero, variance 1 and correlation function $\rho(u)$. Choose a set of points $x_i \in \IR : i=1,\ldots,n$. Let $R$ denote the correlation matrix of $S=\{S(x_1),\ldots,S(x_n)\}$. Obtain the singular value decomposition of $R$ as $R = D \Lambda D^\prime$, where $\Lambda$ is a diagonal matrix whose non-zero entries are the eigenvalues of $R$, in order from largest to smallest. Let $Y=\{Y_1,\ldots,Y_n\}$ be an independent random sample from the standard Gaussian distribution, ${\rm N}(0,1)$. Then the simulated realisation is <latex>$ S = D \Lambda^{\frac{1}{2}} Y $</latex>.
  - Write an ''R'' function to simulate realisations using the above method for any specified set of points $x_i$ and a range of correlation functions of your choice. Use your function to simulate a realisation of $S$ on (a discrete approximation to) the unit interval $(0,1)$.
  - Now investigate how the appearance of your realisation $S$ changes if in the equation above you replace the diagonal matrix $\Lambda$ by a truncated form in which you replace the last $k$ eigenvalues by zeros.
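As a pointer for the log-likelihood exercise above, the standard Gaussian-model result can be sketched as follows (notation as in the exercise):

<latex>$$
\ell(\mu,\sigma^2,\phi) = -\frac{n}{2}\log(2\pi\sigma^2) - \frac{1}{2}\log|R(\phi)| - \frac{1}{2\sigma^2}(y-\mu 1)^\prime R(\phi)^{-1}(y-\mu 1),
$$</latex>

so that, for fixed $\phi$,

<latex>$$
\hat\mu(\phi) = \frac{1^\prime R(\phi)^{-1} y}{1^\prime R(\phi)^{-1} 1}, \qquad \hat\sigma^2(\phi) = \frac{1}{n}\,(y-\hat\mu 1)^\prime R(\phi)^{-1}(y-\hat\mu 1).
$$</latex>

Substituting $\hat\mu(\phi)$ and $\hat\sigma^2(\phi)$ back into $\ell$ gives a profile log-likelihood in $\phi$ alone, which can be maximised numerically over a grid of $\phi$ values or with a one-dimensional optimiser.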
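A minimal base-R sketch of the simulation method above (the function name, the exponential correlation and the parameter values are illustrative choices, not prescribed by the exercise):

```r
# Simulate S = D Lambda^(1/2) Y from the SVD of the correlation matrix R.
# Assumes exponential correlation rho(u) = exp(-u/phi); k > 0 sets the
# last k eigenvalues to zero, as in the truncation exercise.
sim_svd <- function(x, phi = 0.25, k = 0) {
  R <- exp(-as.matrix(dist(x)) / phi)   # correlation matrix at the points x
  sv <- svd(R)                          # R = D Lambda D'
  lam <- sv$d                           # eigenvalues, largest to smallest
  if (k > 0) lam[seq(length(lam) - k + 1, length(lam))] <- 0
  drop(sv$u %*% (sqrt(lam) * rnorm(length(x))))
}

x <- seq(0, 1, length.out = 101)   # discrete approximation to (0, 1)
s <- sim_svd(x)                    # one realisation
s_trunc <- sim_svd(x, k = 80)      # heavily truncated version for comparison
```

Plotting `s` and `s_trunc` against `x` shows how truncating small eigenvalues smooths the realisation.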
==== Week 3 ====
  - Fit a model to the surface elevation data assuming a linear trend model on the coordinates and a Matérn correlation function with parameter $\kappa=2.5$. Use the fitted model as the true model and perform a simulation study (i.e. simulate from this model) to compare parameter estimation based on maximum likelihood, restricted maximum likelihood and variograms.
  - Simulate 200 points in the unit square from the Gaussian model without measurement error, constant mean equal to zero, unit variance and exponential correlation function with $\phi=0.25$ and anisotropy parameters $(\psi_A=\pi/3, \psi_R=2)$. Obtain parameter estimates (using maximum likelihood):
    * assuming an isotropic model
    * trying to estimate the anisotropy parameters \\ Compare the results and repeat the exercise for $\psi_R=4$.
  - Consider a stationary trans-Gaussian model with known transformation function $h(\cdot)$, let $x$ be an arbitrary location within the study region and define $T=h^{-1}\{S(x)\}$. Find explicit expressions for ${\rm P}(T>c|Y)$, where $Y=(Y_1,\ldots,Y_n)$ denotes the observed measurements on the untransformed scale, and:
    * <latex>$h(u)=u$</latex>
    * <latex>$h(u) = \log u$</latex>
    * <latex>$h(u) = \sqrt{u}$</latex>
  - Analyse the Paraná data-set, or any other data set of your choice, assuming priors and obtaining:
    * a map of the predicted values over the area
    * a map of the prediction standard errors over the area
    * a map of the probabilities of being above a certain (arbitrarily) chosen threshold over the area
    * a map of the 10th, 25th, 50th, 75th and 90th percentiles over the area
    * the predictive distribution of the proportion of the area with the value of the study variable below a certain threshold (as a suggestion, you can use the 30th percentile of the data as such a threshold).
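One round of the simulation study in the first exercise above might be sketched as follows, assuming the geoR package. In a full study the simulation and refitting would be repeated many times; the covariance parameter values below are placeholders, not the actual elevation-data fit.

```r
# One simulation round: simulate from a Matern model with kappa = 2.5,
# then re-estimate covariance parameters by ML, REML and a variogram fit.
# cov.pars = c(sigma^2, phi) = c(1, 0.25) are placeholder values.
library(geoR)
sim <- grf(100, cov.model = "matern", kappa = 2.5, cov.pars = c(1, 0.25))
ml <- likfit(sim, ini.cov.pars = c(1, 0.25), kappa = 2.5,
             lik.method = "ML", messages = FALSE)
reml <- likfit(sim, ini.cov.pars = c(1, 0.25), kappa = 2.5,
               lik.method = "REML", messages = FALSE)
vfit <- variofit(variog(sim, messages = FALSE), ini.cov.pars = c(1, 0.25),
                 cov.model = "matern", kappa = 2.5, messages = FALSE)
rbind(ML = ml$cov.pars, REML = reml$cov.pars, variogram = vfit$cov.pars)
```

Collecting the `cov.pars` rows over many rounds gives the sampling distributions of the three estimators.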
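For the anisotropy exercise, a possible geoR sketch is below; note that ''likfit'' keeps the anisotropy parameters fixed unless told otherwise, which is exactly the contrast the exercise asks for.

```r
# Simulate geometrically anisotropic data (angle psi_A = pi/3, ratio
# psi_R = 2) and fit by ML both ignoring and estimating the anisotropy.
library(geoR)
sim <- grf(200, cov.model = "exponential", cov.pars = c(1, 0.25),
           aniso.pars = c(pi / 3, 2))
iso <- likfit(sim, ini.cov.pars = c(1, 0.25), messages = FALSE)
aniso <- likfit(sim, ini.cov.pars = c(1, 0.25),
                fix.psiA = FALSE, fix.psiR = FALSE,  # estimate psi_A, psi_R
                messages = FALSE)
iso$cov.pars   # (sigma^2, phi) under the (misspecified) isotropic model
aniso          # printed fit includes the estimated anisotropy parameters
```

Repeating with `aniso.pars = c(pi / 3, 4)` covers the second part of the exercise.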
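For the trans-Gaussian exercise, writing $\mu(x)$ and $\sigma^2(x)$ for the mean and variance of the Gaussian predictive distribution of $S(x)$ given $Y$, the three cases reduce to (a sketch of the standard argument, with $c>0$ in the last two cases):

<latex>$$
h(u)=u: \quad {\rm P}(T>c|Y) = 1-\Phi\left(\frac{c-\mu(x)}{\sigma(x)}\right),
$$</latex>

<latex>$$
h(u)=\log u: \quad T = \exp\{S(x)\}, \quad {\rm P}(T>c|Y) = 1-\Phi\left(\frac{\log c-\mu(x)}{\sigma(x)}\right),
$$</latex>

<latex>$$
h(u)=\sqrt{u}: \quad T = S(x)^2, \quad {\rm P}(T>c|Y) = 1-\Phi\left(\frac{\sqrt{c}-\mu(x)}{\sigma(x)}\right)+\Phi\left(\frac{-\sqrt{c}-\mu(x)}{\sigma(x)}\right).
$$</latex>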
==== Week 4 ====
  - Consider the stationary Gaussian model in which $Y_i = \beta + S(x_i) + Z_i : i=1,\ldots,n$, where $S(x)$ is a stationary Gaussian process with mean zero, variance $\sigma^2$ and correlation function $\rho(u)$, whilst the $Z_i$ are mutually independent ${\rm N}(0,\tau^2)$ random variables. Assume that all parameters except $\beta$ are known. Derive the Bayesian predictive distribution of $S(x)$ for an arbitrary location $x$ when $\beta$ is assigned an improper uniform prior, $\pi(\beta)$ constant for all real $\beta$. Compare the result with the ordinary kriging formulae.
  - For the model assumed in the previous exercise, assuming a correlation function parametrised by a scalar parameter $\phi$, obtain the posterior distribution under:
    * a normal prior for $\beta$, assuming the remaining parameters are known
    * a normal-scaled-inverse-$\chi^2$ prior for $(\beta, \sigma^2)$, assuming the correlation parameter is known
    * a normal-scaled-inverse-$\chi^2$ prior for $(\beta, \sigma^2)|\phi$, together with a generic prior $p(\phi)$ for the correlation parameter.
  - Analyse the Paraná data-set, or any other data set of your choice, assuming priors for the model parameters and obtaining:
    * the posterior distribution of the model parameters
    * a map of the predictive mean over the area
    * a map of the predictive median over the area
    * the predictive distribution at three arbitrarily selected locations within the area
  - Obtain simulations from the Poisson model as shown in Figure 4.1 of the textbook for the course.
  - Try to reproduce or mimic the results shown in Figure 4.2 of the textbook for the course by simulating a data set and carrying out a similar analysis. **Note:** for the example in the book we have used //set.seed(34)//.
  - Reproduce the simulated binomial data shown in Figure 4.6. Use the package //geoRglm// in conjunction with priors of your choice to obtain predictive distributions for the signal $S(x)$ at locations $x=(0.6, 0.6)$ and $x=(0.9, 0.5)$. Compare the predictive inferences which you obtain with those obtained by fitting a linear Gaussian model to the empirical logit-transformed data, $\log\{(y+0.5)/(n-y+0.5)\}$, and comment generally on the results of the two analyses.
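A base-R sketch of simulating from the Poisson log-linear model of the kind shown in Figure 4.1 (the grid, correlation function and parameter values here are illustrative, not those used in the book):

```r
# Simulate a mean-zero, unit-variance Gaussian process S on a grid via
# Cholesky factorisation, then draw Y_i ~ Poisson(exp(S(x_i))).
xy <- expand.grid(x = seq(0, 1, length.out = 10),
                  y = seq(0, 1, length.out = 10))
R <- exp(-as.matrix(dist(xy)) / 0.2)      # exponential correlation, phi = 0.2
L <- t(chol(R + 1e-8 * diag(nrow(R))))    # small jitter for numerical stability
S <- drop(L %*% rnorm(nrow(xy)))          # one realisation of the process
y <- rpois(nrow(xy), lambda = exp(S))     # conditionally independent counts
```

The same construction with a binomial draw in the last line gives simulated data in the spirit of Figure 4.6.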
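For the Bayesian analysis of the Paraná data, geoR's ''krige.bayes'' is a possible starting point; the prediction grid spacing and the discrete support for $\phi$ below are arbitrary choices, not values recommended by the course.

```r
# Bayesian analysis of the Parana data with krige.bayes (sketch only).
# Grid spacing and the discrete prior support for phi are arbitrary.
library(geoR)
data(parana)
gr <- expand.grid(seq(150, 800, by = 25), seq(70, 620, by = 25))
kb <- krige.bayes(parana, locations = gr,
                  model = model.control(trend.d = "1st", trend.l = "1st"),
                  prior = prior.control(phi.discrete = seq(0, 150, by = 15)),
                  output = output.control(n.posterior = 1000,
                                          n.predictive = 1000))
# posterior samples of the model parameters: kb$posterior$sample
# predictive summaries for the maps:         kb$predictive
```

Maps of the predictive mean and median can then be drawn from `kb$predictive`, and the predictive distribution at individual locations read off from the predictive simulations.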
==== Week 5 ====
  - The //composite likelihood// (CL) is obtained as the product of independent bivariate distributions for pairs of variables at the data locations. Assume a Gaussian model with constant mean and isotropic exponential correlation function.
    * write down the expression of the CL and discuss how parameter estimates could be obtained
    * write code to obtain CL parameter estimates for the s100 data set and compare them with the ones given by ML and REML.
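A self-contained base-R sketch of the pairwise composite likelihood for this model (the function name, log-parametrisation and starting values are illustrative):

```r
# Negative log composite likelihood: sum of bivariate Gaussian
# log-densities over all pairs, with constant mean mu, variance
# sigma^2 (log-parametrised) and exponential correlation range phi.
negcl <- function(par, coords, y) {
  mu <- par[1]; s2 <- exp(par[2]); phi <- exp(par[3])
  D <- as.matrix(dist(coords)); z <- y - mu; n <- length(y)
  val <- 0
  for (i in 1:(n - 1)) for (j in (i + 1):n) {
    r <- exp(-D[i, j] / phi)                  # pairwise correlation
    q <- (z[i]^2 - 2 * r * z[i] * z[j] + z[j]^2) / (s2 * (1 - r^2))
    val <- val + log(2 * pi) + log(s2) + 0.5 * log(1 - r^2) + q / 2
  }
  val
}

# e.g. with the geoR s100 data (uncomment if geoR is installed):
# library(geoR); data(s100)
# fit <- optim(c(mean(s100$data), 0, log(0.2)), negcl,
#              coords = s100$coords, y = s100$data)
# c(mu = fit$par[1], sigma2 = exp(fit$par[2]), phi = exp(fit$par[3]))
```

The resulting estimates can be placed side by side with the `cov.pars` from ''likfit'' fits using `lik.method = "ML"` and `"REML"`.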