# Ed Psych 589

EdPsych/Psych/Soc 589

C.J. Anderson

Fall 2018

*Last revised: November 15, 2018*

**General Information**

**Announcements** (10/16: updated with SAS, R and data)

**Lecture notes**

**Homework and Exams (the final exam/project is posted below)**

**Handy program and links**

Questions or problems regarding this site should be sent to cja@illinois.edu.

- Syllabus (2018).
**Computing:** You will have 2 options (i.e., software I can provide help with):

- **SAS** statistical software: You can obtain a free educational version from SAS.com. Look for "SAS Software for Learning". There is a University Edition and an OnDemand version. The latter requires an internet connection, but it also includes more procedures (e.g., GRAPH, ETS and OR). Alternatively, you can purchase a license from webstore.
- **R**: It's free. There is more up-front learning with this and you will need to learn more packages and commands than with SAS; however, you can't beat the graphics! I still consider myself a learner of R and will include R scripts to go with lectures; however, these will be rolled out as the semester progresses. There is also some online R information that goes with the Agresti text.

**Announcements:**

- Oct 16: In lecture we went over graphing data using text labels rather than points. Posted: the SAS and R code to do this, the data to use for SAS (with code to create the data file), and a text file for R.
- If you are registered for the class, you should have received a note from me and piazza.com.

**Lecture Notes:**

General comments: Only print/download one lecture at a time. I am making changes to the notes for Fall 2018. Updated notes will have the solid block "I" logo rather than the column logo. The notes below are in a more printer-friendly format than the ones used in class.

Since I am adding R, I will roll out R scripts as I learn the packages for categorical data and work out R scripts of the examples in lecture (and eventually homeworks).

- Introduction.
- SAS:
- Poisson distribution: World Cup Soccer, 1994.
- Poisson distribution: World Cup Soccer, 1998.
- Figure of Poisson Distribution.
- Binomial example: Heights of candidates for US President.
- For the peer nominations of bullies see Lecture below on Poisson regression.

- R:
- Figure of likelihood of binomial
- Examples of Poisson distributions.
- Poisson and World Cup example.
- Tests for proportion. (exact only)
- Tests for proportion. (exact, asymptotic and more)
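As a rough illustration of what an exact test for a proportion computes, here is a minimal Python sketch (not one of the posted R scripts). It assumes the common two-sided rule of summing the binomial probabilities of all outcomes that are no more likely than the observed one. The function names and the counts (7 successes in 10 trials) are hypothetical.

```python
from math import comb


def binom_pmf(k, n, p):
    """Binomial probability of exactly k successes in n trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def exact_binomial_test(x, n, p0=0.5):
    """Two-sided exact p-value for H0: pi = p0.

    Sums the probabilities of all outcomes that are no more likely
    than the observed count x (assumed two-sided rule).
    """
    observed = binom_pmf(x, n, p0)
    tol = 1 + 1e-7  # small relative tolerance for floating-point ties
    total = sum(
        binom_pmf(k, n, p0)
        for k in range(n + 1)
        if binom_pmf(k, n, p0) <= observed * tol
    )
    return min(1.0, total)


# Hypothetical data: 7 successes in 10 trials, testing pi = 0.5.
p_value = exact_binomial_test(7, 10, 0.5)
print(p_value)
```

Other two-sided conventions exist (for example, doubling the smaller tail), so results can differ slightly between programs.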

- Introduction to 2-way tables and basic measures of association.
- Chi-squared tests.
- SAS:
- Test of Independence: FECHLD X MAPAID (GSS 1994 data). (includes how to save standardized residuals to a SAS data set and then create a table of them).
- Test of Independence: Admissions scandal.
- Test of homogeneous distributions: French skiers and effectiveness of vitamin C in preventing colds.
- Example of partitioning chi-square.

- R:
- Test of Independence in 4 x 5 table (GSS items on working moms):
- Test of Independence in 2 x 2 table (UofI Admission scandal)
- Testing homogeneous distributions: French skiers and effectiveness of vitamin C in preventing colds (same as "Working with 2-way tables" above).
- Example of partitioning chi-square.

- Tests of ordinal association.
- SAS:
- Two "Likert" type items: FECHLD X MAPAID (GSS 1994 data).
- A 2 x J table with ordered columns: Gender by Prestige using different scores.
- A 3 x 3 table with bad & good scores: School of Psychiatric Thought by Origin of schizophrenia (Agresti, 1990).
- Power computation and plots: Power computation.
- Test for trend: Cochran-Armitage of Framingham heart study data.

- R:
- Two "Likert" type items: 4 x 5 table (GSS items on working moms):
- CMH for ordinal association in 3 x 3 table. (schools of psychiatric thought x origins of schizophrenia).
- Trying different scores (Farmer et al. data). Includes how to get good scores from the data. (2 x 8 table).
- Power of Mantel-Haenszel for ordinal association vs general test. Note: This includes a function that I wrote that will compute power curves and tables. You can change the parameters input into the function for your own use. Same as the 1994 GSS example above.
- Cochran-Armitage Trend test in (2 x J) tables. Note there are multiple ways to test this. There are limitations to the CochranArmitageTest( ) function (i.e., I can't figure out whether or how to change scores). I also include an example of the prop.trend.test( ) function.
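Since changing the column scores is the sticking point noted above, the following is a minimal Python sketch of the Cochran-Armitage trend statistic for a 2 x J table with user-supplied scores. It is not the posted R script. It assumes the usual form of the statistic (a z score whose square is referred to a chi-square with 1 df), and the function name and counts are hypothetical.

```python
import math


def cochran_armitage(successes, totals, scores=None):
    """Cochran-Armitage trend test for a 2 x J table.

    successes: count of "successes" in each of the J ordered columns
    totals:    column totals (successes + failures)
    scores:    column scores; defaults to equally spaced 1, 2, ..., J
    Returns (z, two-sided p-value from the normal approximation).
    """
    if scores is None:
        scores = list(range(1, len(totals) + 1))
    n = sum(totals)
    p = sum(successes) / n  # overall proportion of successes
    xbar = sum(t * x for t, x in zip(totals, scores)) / n  # weighted mean score
    num = sum(s * (x - xbar) for s, x in zip(successes, scores))
    sxx = sum(t * (x - xbar) ** 2 for t, x in zip(totals, scores))
    z = num / math.sqrt(p * (1 - p) * sxx)
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical counts: 3 ordered groups of 100 with 10, 20, 30 successes.
successes = [10, 20, 30]
totals = [100, 100, 100]

z_equal, p_equal = cochran_armitage(successes, totals)               # scores 1, 2, 3
z_other, p_other = cochran_armitage(successes, totals, [1, 2, 10])   # unequal spacing
print(z_equal, p_equal)
print(z_other, p_other)
```

Any linear rescaling of the scores (e.g., 0, 5, 10 instead of 1, 2, 3) leaves z unchanged; only the relative spacing matters. Some programs use N - 1 rather than N in the variance, so their statistic differs by the factor (N - 1)/N.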

- Exact tests for small samples.
- Three-way tables.
- SAS:
- Conditional independence & marginal dependence: Cal graduate admissions
- Conditional independence & marginal dependence: Boy Scouts (hypothetical).
- Conditional independence & marginal dependence: Blue collar workers x management x supervisor.
- Conditional dependence & marginal independence: (Hypothetical) DIF.
- Conditional dependence & marginal dependence: Age x Smoking x Breathing Test result.
- Homogeneous association: Attitude toward media x year x race.
- Hurricane_Katrina

- R:
- Conditional independence & marginal dependence: Data: Cal graduate admissions
- Conditional independence & marginal dependence: R script: Cal graduate admissions
- Conditional independence & marginal dependence: Data: Boy Scouts (hypothetical).
- Conditional independence & marginal dependence: R script: Boy Scouts (hypothetical).
- Conditional independence & marginal dependence: Blue collar workers x management x supervisor.
- Conditional dependence & marginal independence: (Hypothetical) DIF.
- Conditional dependence & marginal dependence: Age x Smoking x Breathing Test result.
- Homogeneous association: Attitude toward media x year x race

- Introduction to Generalized Linear Models (GLMs).
- Supplemental reading on GLMs: Draft Chapter on GLMs from Anderson, Verkuilen & Johnson (in preparation). Applied Generalized Linear Mixed Models.
- SAS:
- Example of a variety of GLMs: T-cell and Hodgkin's disease.
- SAS graph used to make figures in lectures: Step and S shaped functions.
- More SAS graph (& data steps): Plotting cumulative distribution functions (normal & logistic).

- R:
- Example of a variety of GLMs: T-cell and Hodgkin's disease.

- Introduction to GLMs for Dichotomous data
- A description of the variables in the High School & Beyond data is here.
- SAS:
- hsb-data.sas.
- Linear probability, logit & probit fit to grouped data: High School and Beyond. Note: you need to first run the program that creates the data set (i.e., hsb-data.sas).
- High School and Beyond jittered data, loess and fitted values from logit model. Note: you need to first run the program that creates the data set (i.e., hsb-data.sas).

- R:
- Graphing data, fitted values, loess in both SAS and R:
- How to do jitter, loess and fitted values for both R and SAS (Word document with SAS code and R script).
- Donner Party data for SAS.
- Donner Party data for R (plain text file)

- Poisson regression
- SAS:
- Deaths due to AIDS.
- Crab data.
- Lung Cancer. This includes graphing using sgplot (data and model fitted values).

- R:
- Inference, model checking, and fixes for when things go wrong.
- SAS:
- R:

- The Basics of Logistic Regression
- SAS:
- hsb-data.sas.
- For a description of the variables, HSB coding.
- Logistic regression and model checking: High School & Beyond.
- Regression diagnostics: ESR data.
- SAS/MACRO (& example) for computing Range of Influence statistics: RangeInfluence.sas

- R:
- HSB data in csv format: HSB data.
- Logistic regression and model checking: High School & Beyond.
- ESR example.
- text file with ESR data.
- No R script for Range of Influence Statistic

- Multiple logistic Regression (lecture notes)
- SAS
- hsb-data.sas. For a description of the variables, HSB coding.
- High School & Beyond.
- csv file of Titanic passenger data.
- SAS program that imports data (you need to change path to where you save the data).
- SAS ESR Example of exact inference

- R:

- Log-linear Models
- SAS:
- GSS 1992 Presidential Election
- Blue Collar Worker data and how to compute the dissimilarity index in SAS
- Revised GSS 1992 Presidential Election that shows linear x linear model and a nominal x linear model. These are topics that really will be talked about later.
- SAS GENMOD example for doing **Bayesian** model fitting (easy).

- R:
- Blue collar data
- 4-way example
- data for 4-way example
- GSS Presidential election data
- R code using package glmnet to do **lasso** in Poisson and logit regression. This is a small example, so it does not really show the power of this method.
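The dissimilarity index mentioned with the Blue Collar Worker data above can be sketched in a few lines. This Python sketch assumes the usual definition, D = Σ|n_i − μ̂_i| / (2n), and uses the independence model for the fitted values purely for illustration. It is not the posted SAS or R code; the function names and the 2 x 2 counts are hypothetical.

```python
def independence_fit(table):
    """Fitted counts under independence: (row total x column total) / n."""
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    return [[r * c / n for c in col_totals] for r in row_totals]


def dissimilarity_index(observed, fitted):
    """D = sum |observed - fitted| / (2n).

    Roughly, the proportion of cases that would have to move to other
    cells for the model to fit perfectly.
    """
    n = sum(sum(row) for row in observed)
    abs_diff = sum(
        abs(o - f)
        for obs_row, fit_row in zip(observed, fitted)
        for o, f in zip(obs_row, fit_row)
    )
    return abs_diff / (2 * n)


# Hypothetical 2 x 2 table of counts.
table = [[30, 20], [10, 40]]
fitted = independence_fit(table)
d = dissimilarity_index(table, fitted)
print(d)
```

In practice the fitted values would come from whatever log-linear model was estimated (e.g., saved predicted counts), not necessarily from independence.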

- Model Building for Log-Linear & Logit models
- SAS
- Wickens Olzak data with SAS code (graphical models and linear by linear)
- HSB example of linear by linear & uniform association models
- SAS program showing effect of sampling 0's (hypothetical from Wickens)
- Health concerns by genders (data from Fienberg). How to handle structural zeros & include indicators for specific cells
- Fitting RC(1) to HSB using NLP...with & without ordinal restrictions, and other models too. This code with some minor changes can also be run in SAS/NLMIXED and probably HPNLMIX

- R:
- R for Wickens-Olzak (modeling signal detection data)
- Text file of Wickens-Olzak data (signal detection)
- R for linear x linear, RC(M) association model and correspondence analysis
- Text file of HSB data for linear by linear models as well as RC(M) association and correspondence analysis.
- Text file of HSB data
- R for GSS data
- Text file of GSS data
- R for 4-way table
- Text file of 4-way data set (marital status x premarital sex x extramarital sex x gender).
- R-code for dealing with zeros and modeling association.

- Multinomial logit models (baseline and conditional). Notes will be published within a day or 2, but everything below is here.
- Optional Extra reading:
- Anderson & Rutkowski (2008). Multinomial Logistic Regression. In Osborne "Best Practices in Quantitative Methods".
- Anderson (2009). Categorical Data Analysis with a Psychometric Twist. In Millsap & Maydeu-Olivares "The Sage Handbook of Quantitative Methods in Psychology". 311-336.
- Anderson, Kim and Keller (2014). Multilevel Modeling of Categorical Response Variables. In Rutkowski, von Davier, & Rutkowski "A Handbook of International Large-Scale Assessment: Background, Technical Issues, and Methods of Data Analysis".

- SAS
- High School and Beyond Data (and code to create plots).
- Baseline logistic models for High School and Beyond Data.
- Grouped NYLS data of employment x Father's education x Race. (to show connection between log-linear and logit models.)
- Conditional logit model of choice of chocolate. (attributes of objects as predictors).
- Conditional logit model of different combinations of choice set. (attributes of objects in sets as predictors).
- Conditional logit model of mode of transportation. (conditional logistic with predictors of the choice objects and predictors of the individual making the choice).

- R
- HSB: Baseline logistic models
- HSB loglinear and multinomial logistic regression models: categorical predictors
- Fathers and Sons: Logit and loglinear models.
- Discrete choice models. This is the R script for the chocolates, brands & prices, and mode of transportation examples.
- chocolates_data.txt Chocolates data (as text file)
- brands_price_data.txt Brands-price data in format for glm
- brands_mdc_data.txt Brands-price data in format for mlogit
- transportation_data.txt Mode of transportation in format for mlogit

SAS program code:

- McNemar Test (Siskel & Ebert and cell phone use).
- Proportional Log-linear models of quasi-independence, symmetry, quasi-symmetry and test of marginal homogeneity
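For reference, the McNemar test listed above depends only on the two discordant cells of a square 2 x 2 table of paired responses. The Python sketch below assumes the uncorrected large-sample statistic (b − c)² / (b + c) with 1 df. It is not the posted SAS program; the function name and counts are hypothetical.

```python
import math


def mcnemar_test(table):
    """McNemar test of marginal homogeneity for a paired 2 x 2 table.

    table = [[a, b], [c, d]], where b and c are the discordant counts.
    Returns (chi-square statistic with 1 df, p-value), with no
    continuity correction.
    """
    b = table[0][1]
    c = table[1][0]
    if b + c == 0:
        raise ValueError("No discordant pairs; the statistic is undefined.")
    statistic = (b - c) ** 2 / (b + c)
    # For 1 df, P(chi-square > s) equals P(|Z| > sqrt(s)) for standard normal Z.
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value


# Hypothetical paired ratings: 15 and 5 discordant pairs.
ratings = [[40, 15], [5, 40]]
stat, p = mcnemar_test(ratings)
print(stat, p)
```

With few discordant pairs, an exact binomial test on b out of b + c (with probability 0.5) is the usual alternative.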

**Homework and Exams**

Note: The SAS programs were ones that I used in creating the answer keys; that is, there may be extra things in them that were not needed, and I didn't write them with the intention of posting them on the web.

- Homework writing guide
- Homework 1
- Homework assignment 1 (due Thursday September 13)
- Answer key.
- SAS program for problem 1 (Agresti 1.8).
- SAS program for heights of presidential candidates.
- R script for answer to problems.

- Homework 2
- Homework assignment 2. (due September 20)
- Answer key for homework 2.
- SAS for Agresti (2007) 2.19.
- SAS for Agresti (2007) 2.27.
- SAS heart disease by coffee example
- R script for homework 2.

- Homework 3
- Homework 4 (due Thursday **October 11,** 2018)
- Homework assignment 4. (date on assignment sheet is wrong; correct date is Oct 11)
- Data from problems 3.11 and 3.12 ...to save you time, this is SAS code that creates the data set.
- Text file of data from problems 3.11 and 3.12
- Answer key for homework 4.
- SAS problem 1.
- SAS for Agresti (2007) 3.11 and 3.12.
- SAS for Agresti (2007) 3.18.
- R for homework 4

- Homework 5
- Homework assignment 5.
- Soccer data (plus a little SAS).
- Crab data (plus a little SAS).
- Soccer data as txt file.
- Crab data as txt file.
- Answer key.
- R markup of code for homework 5.
- R code for problem 1 plain text.
- R code problems 2, 3, and 4.
- SAS for Agresti (2007) 3.18.
- SAS for Agresti (2007) 3.13, 3.14 and ZIP.

- Homework 6
- Homework 7 (due November 15, 2018)
- Homework 8 (none for 2018)
- Final and Projects (final is posted). Due date is Friday December 14, 4pm.
- If you are doing a project, here is a description of what to include in your project. You can also look at the final, which shows in more detail what I would like to see.
- If you are taking the final:

**Handy Programs and Links:**

- Supplement to Agresti (2002, 2007): Software for categorical data (SAS, R/S-plus, Stata, others)
- Appendix to Agresti (2007): SAS and Data sets from Agresti (2007) and links to R and S-plus materials.
- College won't let me post executable code