********** Announcements ************
· We expect most students to have MATH 181A preparation from Winter or Spring 2016. These two classes were taught from different textbooks (not a big problem) and covered different amounts of material (a bigger problem). In particular, the Spring 2016 class did not cover any material on hypothesis testing, nor use the statistical programming language and environment R in its homework. To make up for these gaps, the plan for this course is:
1) Start with hypothesis testing, mostly following the structure of Chapter 10 of Wackerly et al. (see below for the reference books);
2) The statistical programming language and environment R will be introduced during TA sessions at the start of the quarter, and will be used in homework assignments.
· In place of a syllabus: after make-up/review of general material on hypothesis testing (basic concepts, Z-tests, t-tests, likelihood ratio tests), we will focus on linear regression, with some elements of analysis of variance (ANOVA) as a special case. We will cover contingency tables and chi-squared tests towards the end. More detailed information will become available as the course progresses, as seen from the list of topics covered and the homework assignments.
· I will collect feedback during the course, i.e. a slip summarizing: 1) what you have learned; 2) what you don't understand. These can earn bonus points towards your total score in the class. The dates of collection and the points will be announced.
· The midterm will be on Friday, Oct. 28, in class. A cheat sheet is allowed (this is true for all exams). The material covered will be through the end of week 4. Bring sheets of paper (or a blue book) to write the exam on.
· On Friday 9/30, at the end of the lecture, I will collect feedback slips from section A01 students (i.e. those enrolled in the 5pm TA session) telling me: 1) when you took 181A; 2) what you know about hypothesis testing; 3) what you don't understand about what we have discussed so far. Students from A02 are welcome to provide feedback, but only those in A01 will receive 2 bonus points towards their total score (out of a maximum of 100 without bonus) for the quarter.
· On Friday 10/21, at the end of the lecture, I will collect feedback slips from section A02 students telling me: 0) your major; 1) when you took 181A; 2) what you have learned since the start of this quarter; 3) what you don't understand about what we have discussed so far. Students from A01 are welcome to provide feedback, but only those in A02 will receive 2 bonus points towards their total score.
· The Wednesday, Nov. 9 lecture will be replaced by a TA session, partly because Friday 11/11 is a holiday. Among other things, you will discuss the lm() function in R for linear regression.
· On Friday 11/18, at the end of the lecture, I will collect feedback slips from section A01 students telling me: 1) what you don't understand about what we have discussed so far; 2) any other feedback in general. Students from A02 are welcome to provide feedback, but only those in A01 will receive 2 bonus points towards their total scores for the quarter.
· On Wednesday 11/23, at the end of the lecture, I will collect feedback slips from section A02 students telling me: 1) what you don't understand about what we have discussed so far; 2) any other feedback in general. Students from A01 are welcome to provide feedback, but only those in A02 will receive 2 bonus points towards their total scores for the quarter.
· Just like for the midterm, please bring sheets of paper (or a blue book) to write the final exam on.
Lecture: MWF 4, PETER 102
Instructor: Ronghui (Lily) Xu
Office: APM 5856
Teaching Assistant: Jue (Marquis) Hou (click the link for the TA's webpage)
Office: APM 6442
Office Hours: see TA webpage
1. Wackerly et al., "Mathematical Statistics with Applications"
2. Larsen and Marx, "An Introduction to Mathematical Statistics and Its Applications" (5th ed.; for homework assignments)
3. Rice, "Mathematical Statistics and Data Analysis" (3rd ed.; for homework assignments)
Some R code is used in lecture; see the TA's webpage (link above) for more R examples.
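As a preview of the lm() function mentioned above for linear regression, here is a minimal sketch in R; the data are toy values invented for illustration, not from the course:

```r
# Toy data (hypothetical), roughly following y = 0.1 + 2x
x <- c(1, 2, 3, 4, 5)
y <- c(2.1, 3.9, 6.2, 8.1, 9.8)

fit <- lm(y ~ x)   # least-squares fit of y = b0 + b1*x
summary(fit)       # coefficient table with t-tests and R-squared
coef(fit)          # estimated intercept and slope
confint(fit)       # confidence intervals for b0 and b1
```

summary() prints the inference output; coef() and confint() extract the estimates and their confidence intervals for further use.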
· Hypothesis testing: basics
· Large sample Z-tests
· Sample size calculation
· Duality with confidence intervals
· One-sample and two-sample t-tests
· Nonparametric tests of one- and two-sample location
· Comparing two-sample variances
· Likelihood ratio test
· Neyman-Pearson lemma
· Uniformly most powerful tests
· Linear regression: least-squares estimation, matrix form of multiple linear regression
· Simple linear regression: inference, prediction, correlation
· Multiple linear regression: vector-valued random variables, inference, F-test, R-squared
· Analysis of variance, multiple comparisons using Bonferroni and Tukey’s method
· Contingency tables and Pearson’s chi-squared test, likelihood ratio test for Multinomial distribution
Homework: due each (following) week at the TA session or in the TA dropbox by the end of that day (check with the TA for the exact time). Be sure to turn in your R code (as applicable) as well as a complete solution including the setup of the problem, etc. The Wackerly Chapter 10 exercises will be on the A.S. soft reserves. Replace all 'Applet' exercises in the assignments with R exercises.
Week 1: Wackerly 10.6, 10.7, 10.19, 10.25; also:
1) Use R to redo Example 10.6 in the Wackerly book (see the lecture notes also) using the exact Binomial distribution. Compare with the results from the normal approximation.
2) Continue with Example 10.8 in the Wackerly book (see the lecture notes also): take a grid with increments of 0.1 on [15, 20], compute the power of the test at each grid value taken as the alternative hypothesis, and plot the power curve. What is the limit of the power as μ approaches 15?
3) Derive the sample size formula of Z test for the two-sided alternative hypothesis.
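For the R parts of Week 1, the following sketch shows the kinds of calls involved. The numbers here are hypothetical placeholders, not those of Examples 10.6 and 10.8 (substitute the values from the book), and the sample-size line uses the standard normal-theory approximation:

```r
# 1) Exact Binomial test vs. normal approximation
#    (toy values: 15 successes in n = 36 trials, H0: p = 0.5, two-sided)
exact  <- binom.test(15, 36, p = 0.5)                  # exact Binomial p-value
approx <- prop.test(15, 36, p = 0.5, correct = FALSE)  # normal approximation
exact$p.value
approx$p.value

# 2) Power of a one-sided Z-test over a grid of alternatives
#    (toy setup: H0: mu = 15 vs Ha: mu > 15, sigma = 3, n = 36, alpha = 0.05)
sigma <- 3; n <- 36; alpha <- 0.05; mu0 <- 15
mu.a  <- seq(15, 20, by = 0.1)                         # grid of alternatives
crit  <- mu0 + qnorm(1 - alpha) * sigma / sqrt(n)      # rejection cutoff for xbar
power <- 1 - pnorm((crit - mu.a) / (sigma / sqrt(n)))  # power at each alternative
plot(mu.a, power, type = "l", xlab = expression(mu[a]), ylab = "power")
# As mu.a decreases to 15, the power decreases to alpha = 0.05

# 3) The standard (approximate) two-sided sample-size formula in R
#    (toy target: power 0.8, i.e. beta = 0.2, effect size delta = 1)
beta  <- 0.2; delta <- 1
n.req <- ((qnorm(1 - alpha / 2) + qnorm(1 - beta)) * sigma / delta)^2
ceiling(n.req)
```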
Week 2: Wackerly 10.46, 10.54, 10.66, 10.71; also:
1) Redo 10.19, 10.25 (2-sided) using confidence intervals;
2) R simulation – repeat the following 100 times: set the sample size n = 6 and generate n data points from a) Normal(0, 1); b) Uniform(0, 1); c) Exponential(1); d) Poisson(5), and compute the t-statistic. Plot a histogram of the 100 t-statistic values for each of a)–d), superimposing a smoothed density on each histogram if you can, for visual purposes. Do they look like a t-distribution with 5 degrees of freedom? Finally, repeat c) with n = 15, then explain the purpose of this exercise and summarize the interpretation of the results.
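A minimal sketch of the simulation in 2), for case a) only; the other cases change the r*() call (runif, rexp, rpois) and center the t-statistic at that distribution's true mean:

```r
# Case a): 100 t-statistics from samples of size n = 6 from Normal(0, 1);
# the reference t-distribution has n - 1 = 5 degrees of freedom
set.seed(1)                            # for reproducibility
n <- 6
tstat <- replicate(100, {
  x <- rnorm(n, mean = 0, sd = 1)      # a) Normal(0, 1)
  (mean(x) - 0) / (sd(x) / sqrt(n))    # one-sample t-statistic at the true mean
})
hist(tstat, freq = FALSE, main = "t-statistics, Normal(0,1), n = 6")
lines(density(tstat))                  # smoothed density overlay
curve(dt(x, df = n - 1), add = TRUE, lty = 2)  # t with 5 df for comparison
```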
Week 3: Wackerly 10.81, 10.82, 15.7, 15.9, 15.12, 15.28; Larsen and Marx 6.4.14, 6.4.15
Week 4 (not due): Wackerly 10.107, 10.109a, 10.99ab, 10.97abc; also:
For testing two-sample means (2-sided) assuming equal variances, verify that the likelihood ratio test is equivalent to the two-sample t-test.
Week 5: Wackerly 10.97d, 10.99cd; also:
For a random sample of size n = 20 from Normal(μ, 1), with the null hypothesis μ = 0: plot, in the same figure, the power curves of the three Z-tests for the following three alternative hypotheses: 1) μ > 0; 2) μ < 0; 3) μ ≠ 0. Assume a 0.05 significance level. Comment on the relation of your plots to the UMP test.
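A sketch of the power computation for the Week 5 exercise, using the standard normal-theory formulas (the plotting choices are up to you):

```r
# Power of the three Z-tests for N(mu, 1), n = 20, H0: mu = 0, alpha = 0.05
n <- 20; alpha <- 0.05; se <- 1 / sqrt(n)
mu <- seq(-1, 1, by = 0.01)            # grid of alternatives
z  <- qnorm(1 - alpha)                 # one-sided critical value
z2 <- qnorm(1 - alpha / 2)             # two-sided critical value
pow.right <- 1 - pnorm(z - mu / se)                          # 1) Ha: mu > 0
pow.left  <- pnorm(-z - mu / se)                             # 2) Ha: mu < 0
pow.two   <- 1 - pnorm(z2 - mu / se) + pnorm(-z2 - mu / se)  # 3) Ha: mu != 0
matplot(mu, cbind(pow.right, pow.left, pow.two), type = "l",
        lty = 1:3, col = 1, xlab = expression(mu), ylab = "power")
legend("bottom", c("mu > 0", "mu < 0", "mu != 0"), lty = 1:3)
```

All three curves equal α = 0.05 at μ = 0; comparing the one-sided and two-sided curves on each side of 0 is what the UMP comment asks about.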
Week 6: Larsen and Marx 11.2.4-6, 11.2.14, 11.3.10; Use R to do the following –
also for each model that you fit, plot the residuals and comment on the plots:
Larsen and Marx 11.2.2, Wackerly 11.69, 11.72
Week 7: Larsen and Marx 11.3.24; also use R to do
Larsen and Marx 11.3.16: before you do the parts a) b) from the book, first do a scatter plot of the data, as well as a residual plot to assess the linear model assumption; test the null hypothesis β1 = 0 at 2-sided 0.05 significance level, and give the 95% confidence interval for β1. Write your solution as a mini-report including appropriate tables and/or figures as needed, and append the R codes at the end.
Week 8: Larsen and Marx 11.4.9; Rice chapter 14 – 11, 16; use R to do
Larsen and Marx 11.4.13, Wackerly 11.74
Week 9: Larsen and Marx 12.2.7, 12.2.8, 12.3.1 (this is the data set from the ANOVA example in class; see the .pdf file; do it both ways: using the table of the studentized range, and using R), 12.3.6, 12.3.7; use R to do:
Larsen and Marx 12.2.1, 12.3.1 (see above)
Week 10 (not due): Wackerly 10.97: 1) find the MLE of θ; 2) carry out the likelihood ratio (i.e. goodness-of-fit) test for the Trinomial distribution specified in this problem using θ.
Larsen and Marx 10.5.7: 1) use R to carry out Pearson's chi-squared test; 2) considering it as a comparison of two Binomial distributions, do the two parts (i.e. carry out the tests) of the last problem on the midterm using these data; 3) compare the likelihood ratio test in the first part of 2) with the likelihood ratio test under the Multinomial distribution (here k = 4): are they the same or not, and why?
Larsen and Marx 10.3.7
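For the Pearson chi-squared parts of Week 10, a minimal sketch of the R call on a hypothetical 2 × 2 table (toy counts, not the data of 10.5.7):

```r
# Toy 2 x 2 contingency table (hypothetical counts)
tab <- matrix(c(30, 20, 10, 40), nrow = 2)
out <- chisq.test(tab, correct = FALSE)  # Pearson's test, no continuity correction
out$statistic                            # chi-squared statistic
out$p.value
out$expected                             # expected counts under independence
```

With the default correct = TRUE, chisq.test() applies Yates' continuity correction on 2 × 2 tables, which changes the statistic slightly.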
Grading: 35% Homework (drop the 1 lowest score) + 25% Midterm + 40% Final