# Exercise 1 with solution - New York City /r/ with Rbrul - GoldVarb-esque logistic regression with categorical predictors
#
# Start R.
# Load Rbrul.
source("http://www.danielezrajohnson.com/Rbrul.R")
# Start Rbrul.
rbrul()
#
# Open an Internet browser.
# Go to "http://www.danielezrajohnson.com/ds.csv".
# Save the page on the Desktop or in another directory as "ds.csv".
# From the Rbrul main menu, choose "1" to "load/save data".
1
# If prompted to save current data, press "Enter" for "No".
# Navigate to the Desktop or other directory.
[navigate]
# Enter the number to the left of the "ds.csv" file.
[number]
# "What separates the columns?" Enter "c" for commas.
c
# The MAIN MENU should appear again, with "current data structure" as follows:
# Current data structure:
# r (integer with 2 values): 1 0
# store (factor with 3 values): Saks Macy's Klein's
# emphasis (factor with 2 values): normal emphatic
# word (factor with 2 values): fouRth flooR
#
# This is the Labov "fourth floor" department store data collected in 1962.
#
# Question 1: Which of the factor groups have a statistically significant effect on the use of /r/, which of their factors favor and disfavor /r/, and to what extent?
#
# Enter "5" for "modeling" and the MODELING MENU will open.
5
# Enter "1" to "choose variables".
1
# Enter "1" to choose "r" as the response, or dependent variable.
1
# Press "Enter", since "r" is a binary response.
# Enter "1" to choose "1" (presence of /r/) as the application value.
1
# Enter "2", "3", "4" to choose "store", "emphasis", and "word" as the potential predictors.
2 3 4
# "Are any of these predictors continuous?" The answer is no, they are all categorical (factors), so press "Enter".
# "Any grouping factors (random effects)?" The answer is no, these are all ordinary fixed effects, so press "Enter".
# "Consider an/another pairwise interaction between predictors?" Why not? Let's consider the possible interaction of every pair of predictors. So enter "2", "3", "2", "4", "3", "4".
2 3 2 4 3 4
#
# The MODELING MENU should now show, under "Current variables are:"
# response.binary: r (1 vs. 0)
# fixed.factor: store emphasis word
# fixed.interaction: store:emphasis store:word emphasis:word
#
# Enter "5" for "step-up/step-down".
5
# A long output results. The most relevant part of the output, near the end, is:
#
# BEST STEP-UP MODEL WAS WITH store (1.08e-18) + word (8.18e-09) [A]
#
# STEP-UP AND STEP-DOWN MATCH!
#
# STEPPING DOWN:
#
# $store
#  factor  logodds tokens 1/1+0 centered weight
#    Saks    0.900    177 0.475           0.711
#  Macy's    0.436    336 0.372           0.607
# Klein's   -1.337    216 0.097           0.208
#
# $word
#  factor logodds tokens 1/1+0 centered weight
#   flooR   0.493    347 0.412           0.621
#  fouRth  -0.493    382 0.228           0.379
#
# In both step-up and step-down, the groups selected as significant are STORE and WORD. The p-values are shown in parentheses (for example, 1.08e-18 for STORE means p = 1.08 x 10 raised to the power of -18).
#
# EMPHASIS is not selected, since its p-value is only 0.07 (this figure can be seen at step-up Run 6 or step-down Run 9).
#
# The only interaction considered in the step-up model is STORE:WORD, after STORE and WORD are individually added. This interaction is not significant (p = 0.34).
# In the step-down model, all the interaction effects are dropped first, as well as the EMPHASIS main effect.
#
# The "centered weight" column shows the same output as GoldVarb gave in Exercise 1. For STORE, Saks strongly favors /r/ (0.711), Macy's mildly favors /r/ (0.607), and Klein's strongly disfavors /r/ (0.208). Within WORD, the unstressed "fourth" disfavors /r/ (0.379), while stressed "floor" favors /r/ (0.621).
#
# The Rbrul output also presents results in log-odds, and shows token numbers and response proportions in the same table as the factor weights.
#
# Enter "9" to return to the main menu.
9
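#
# The "logodds" and "centered weight" columns in the tables above carry the same information on two scales: each weight is the inverse logit of the corresponding log-odds value.
# The sketch below, run in plain R outside Rbrul, checks this with the figures copied from the step-down output. The variable names are illustrative only.
# The final part is an assumption rather than Rbrul's documented internals: it supposes that a base-R glm() with sum contrasts gives a comparable main-effects fit, and it only runs if "ds.csv" is in the working directory.

```r
# Log-odds copied from the step-down tables above.
store_logodds <- c(Saks = 0.900, "Macy's" = 0.436, "Klein's" = -1.337)
word_logodds  <- c(flooR = 0.493, fouRth = -0.493)

# plogis() is base R's inverse logit: exp(x) / (1 + exp(x)).
store_weights <- round(plogis(store_logodds), 3)
word_weights  <- round(plogis(word_logodds), 3)

print(store_weights)
print(word_weights)

# Assumption: a main-effects logistic regression with sum contrasts,
# fitted only when the data file from this exercise is available.
if (file.exists("ds.csv")) {
  ds <- read.csv("ds.csv", stringsAsFactors = TRUE)
  fit <- glm(r ~ store + word, data = ds, family = binomial,
             contrasts = list(store = "contr.sum", word = "contr.sum"))
  # With contr.sum, the last level of each factor has no coefficient of
  # its own; its log-odds is minus the sum of the other levels.
  print(coef(fit))
}
```

# The rounded weights should agree with the "centered weight" column, which is a quick way to confirm that the two scales have been read correctly.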