Notes on Further Customizing Graphics for Assignment 1

Make sure you have installed the lattice package, and then load it into memory with the library(lattice) function.

A good explanation of the lattice package is found at http://www.statmethods.net/advgraphs/trellis.html

library(lattice)
## Warning: package 'lattice' was built under R version 3.1.3
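
If you are not sure whether lattice is already installed, a minimal sketch like the following checks first and installs only when needed. It assumes your R session can reach a CRAN mirror if an install turns out to be necessary; lattice usually ships with R, so the install step is rarely triggered.

```r
# Install lattice only if it is missing, then attach it.
if (!requireNamespace("lattice", quietly = TRUE)) {
  install.packages("lattice")
}
library(lattice)

# Confirm the package is attached to the search path.
"package:lattice" %in% search()
```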

Load the data

site<-"http://faculty.gvsu.edu/kilburnw/PLS300_files/Lab2data.RData" 
load(file=url(site)) # Loads it into R

Then we'll jump ahead to a histogram, identifying the dataset and variable with poll$hclintonft, where poll is the dataset (or 'data frame' in R-speak) and hclintonft is the variable:

histogram(~poll$hclintonft, main="Histogram, Feelings toward Hillary Clinton, 2004", xlab="feeling thermometer score")


Change the number of intervals (bins), and with it the bin width, using nint=. Using 5, as in this example, is not a good idea, though!

histogram(~poll$hclintonft, main="Histogram, Feelings toward Hillary Clinton, 2004", xlab="feeling thermometer score", nint=5)

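To see why a very small nint hides detail, it can help to draw the same variable with several bin counts side by side. The sketch below uses simulated 0 to 100 scores (the data frame sim and its column ft are made up for illustration, not the course data) and the split= and more= arguments of print() for lattice plots.

```r
library(lattice)

# Simulated stand-in for a 0-100 feeling thermometer (NOT the course data).
set.seed(1)
sim <- data.frame(ft = sample(seq(0, 100, by = 5), 500, replace = TRUE))

# The same variable drawn with three different numbers of bins.
h5  <- histogram(~ ft, data = sim, nint = 5,  main = "nint = 5")
h10 <- histogram(~ ft, data = sim, nint = 10, main = "nint = 10")
h20 <- histogram(~ ft, data = sim, nint = 20, main = "nint = 20")

# Place the three plots in one row: split = c(column, row, ncolumns, nrows).
print(h5,  split = c(1, 1, 3, 1), more = TRUE)
print(h10, split = c(2, 1, 3, 1), more = TRUE)
print(h20, split = c(3, 1, 3, 1))
```

Saving each plot to an object first and then calling print() is what makes the side-by-side layout possible.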

Feelings toward H Clinton by gender:

histogram( ~poll$hclintonft | factor(poll$gender), main="Histogram, Feelings toward Hillary Clinton, 2004", xlab="feeling thermometer score")


Then we can structure a three-way comparison by adding another conditioning variable after factor(poll$gender), in the form FACTOR1 * FACTOR2, which here is | factor(poll$gender) * factor(poll$party).

histogram( ~ poll$hclintonft |factor(poll$gender) * factor(poll$party), main="Histogram, Feelings toward Hillary Clinton, 2004", xlab="feeling thermometer score")

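The FACTOR1 * FACTOR2 form can also be written with data=, which shortens the formula. Here is a rough sketch on simulated data; the data frame sim and its columns ft, gender, and party are hypothetical stand-ins for the poll variables, and layout= is an optional extra that fixes the grid at 2 columns by 3 rows.

```r
library(lattice)

# Simulated stand-in data (NOT the course data).
set.seed(2)
n <- 600
sim <- data.frame(
  ft     = sample(seq(0, 100, by = 5), n, replace = TRUE),
  gender = factor(sample(c("Female", "Male"), n, replace = TRUE),
                  levels = c("Female", "Male")),
  party  = factor(sample(c("Democrat", "Independent", "Republican"), n, replace = TRUE),
                  levels = c("Democrat", "Independent", "Republican"))
)

# One panel per gender-by-party combination.
p <- histogram(~ ft | gender * party, data = sim,
               layout = c(2, 3),   # 2 columns, 3 rows
               main = "Simulated feeling thermometer by gender and party",
               xlab = "feeling thermometer score")
print(p)

# Number of levels in each conditioning variable.
lengths(p$condlevels)
```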

# trying out black and white: 
trellis.device(color=FALSE)
histogram( ~ poll$hclintonft |factor(poll$gender) * factor(poll$party), main="Histogram, Feelings toward Hillary Clinton, 2004", xlab="feeling thermometer score")


# back to color: 
trellis.device(color=TRUE)

On to density plots. We'll switch back to using data=poll, which saves a little typing, since we are adding multiple social groups to the density plot:

densityplot(~ jewsft + catholicsft + muslimsft, data = poll,  plot.points = FALSE, auto.key=TRUE)


You can replace the variable names in the legend by passing a list to the auto.key argument. We use the combine function, c(), to supply the labels in the order in which the variables appear in the densityplot statement:

densityplot(~ jewsft + catholicsft + muslimsft, data = poll,  plot.points = FALSE, auto.key=list(text=c("Jews", "Catholics", "Muslims")) )

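The list given to auto.key can carry more than the labels. As a sketch, the example below also sets where the legend sits and how many columns it uses. The data are simulated and the names sim, grp_a, grp_b, and grp_c are made up for illustration.

```r
library(lattice)

# Simulated stand-in data (NOT the course data).
set.seed(3)
sim <- data.frame(
  grp_a = rnorm(300, mean = 50, sd = 15),
  grp_b = rnorm(300, mean = 60, sd = 15),
  grp_c = rnorm(300, mean = 40, sd = 20)
)

p <- densityplot(~ grp_a + grp_b + grp_c, data = sim,
                 plot.points = FALSE,
                 auto.key = list(
                   text    = c("Group A", "Group B", "Group C"),  # same order as the formula
                   space   = "top",   # legend above the plot
                   columns = 3,       # labels laid out in one row
                   lines   = TRUE,    # show line samples in the key
                   points  = FALSE    # no point symbols in the key
                 ),
                 xlab = "simulated score")
print(p)
```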

The plot.points=FALSE option is useful, since otherwise the individual data points are also drawn along the bottom of the plot, and you end up with:

densityplot(~ jewsft + catholicsft + muslimsft, data = poll, auto.key=list(text=c("Jews", "Catholics", "Muslims")) )


To panel responses across levels of a factor, use the | FACTOR syntax, or try groups=, as below:

densityplot(~ jewsft + catholicsft + muslimsft, data = poll, groups = gender,  plot.points = FALSE, auto.key=TRUE)

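For comparison, here is a sketch of the | FACTOR alternative, which draws one panel per level of the factor and overlays the variables within each panel. The data are simulated, and sim, ft_a, ft_b, and gender are hypothetical names.

```r
library(lattice)

# Simulated stand-in data (NOT the course data).
set.seed(4)
n <- 400
sim <- data.frame(
  ft_a   = rnorm(n, mean = 55, sd = 15),
  ft_b   = rnorm(n, mean = 45, sd = 20),
  gender = factor(sample(c("Female", "Male"), n, replace = TRUE),
                  levels = c("Female", "Male"))
)

# One panel per gender; both variables overlaid inside each panel.
p <- densityplot(~ ft_a + ft_b | gender, data = sim,
                 plot.points = FALSE, auto.key = TRUE)
print(p)
```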

nes12$party<-droplevels(nes12$pid_self)
levels(nes12$party)<-c("Democrat", "Republican", "Independent")
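
The two lines above drop unused factor levels and then relabel the ones that remain. Because levels()<- assigns the new labels by position, it is worth checking the level order first. A small sketch on a made-up factor (the level names here are hypothetical, not the actual nes12 coding):

```r
# A hypothetical factor with one level that never occurs in the data.
pid <- factor(
  c("1. Democrat", "2. Republican", "3. Independent", "1. Democrat"),
  levels = c("0. No answer", "1. Democrat", "2. Republican", "3. Independent")
)

# droplevels() removes levels with no observations.
party <- droplevels(pid)
levels(party)   # check the order before relabeling

# New labels are matched to the existing levels by position.
levels(party) <- c("Democrat", "Republican", "Independent")
table(party)
```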

In this example, I combine feelings toward Hillary Clinton across multiple years:

densityplot(~ poll$hclintonft + nes08$hclintonft + nes12$ft_hclinton, plot.points = FALSE, auto.key=list(text=c("H Clinton 2004", "H Clinton 2008", "H Clinton 2012")) )


The options main="", xlab="", and ylab="" work for all of these plotting functions, including bwplot and boxplot.
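
As a quick illustration of the labeling options, the sketch below passes the same kind of labels to lattice's bwplot() and to base R's boxplot(). The data are simulated, and sim, ft, and party are hypothetical names.

```r
library(lattice)

# Simulated stand-in data (NOT the course data).
set.seed(5)
sim <- data.frame(
  ft    = sample(seq(0, 100, by = 5), 300, replace = TRUE),
  party = factor(sample(c("Democrat", "Independent", "Republican"), 300, replace = TRUE))
)

# lattice: the factor goes on the left of ~, so the boxes are horizontal.
p <- bwplot(party ~ ft, data = sim,
            main = "Simulated feelings by party",
            xlab = "feeling thermometer score",
            ylab = "Party identification")
print(p)

# base graphics: the factor goes on the right of ~, so the boxes are vertical.
boxplot(ft ~ party, data = sim,
        main = "Simulated feelings by party",
        xlab = "Party identification",
        ylab = "feeling thermometer score")
```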

# Plots can be structured for multiple comparisons, such as across males and females by party ID
bwplot(factor(partyid) ~ cheneyft | gender, data=poll)


# In this case above, though, the variation across panels is subtle. 

## here the base boxplot() function works better than bwplot():
boxplot(cheneyft ~ partyid, data=poll)


boxplot(cheneyft ~ party, data=poll, main="Feelings toward VP Cheney", xlab="Party Identification") # party is a three-point party identification scale


boxplot(feministsft ~ partyid, data=poll)

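When differences between boxes are hard to judge by eye, the numbers behind them can help. As a sketch, boxplot() with plot=FALSE returns the summary values instead of drawing anything. The data below are simulated, and sim, ft, and party are hypothetical names.

```r
# Simulated stand-in data (NOT the course data).
set.seed(6)
sim <- data.frame(
  ft    = sample(seq(0, 100, by = 5), 300, replace = TRUE),
  party = factor(sample(c("Democrat", "Independent", "Republican"), 300, replace = TRUE),
                 levels = c("Democrat", "Independent", "Republican"))
)

# plot = FALSE skips the drawing and returns the statistics as a list.
b <- boxplot(ft ~ party, data = sim, plot = FALSE)

# Rows: lower whisker, lower hinge, median, upper hinge, upper whisker.
# Columns: one per group, in the order given by b$names.
colnames(b$stats) <- b$names
b$stats
b$n   # number of observations in each group
```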

Now, in the plot produced by the code below, notice how a basic scatterplot drawn with xyplot() overdraws points, one on top of another.

# A scatterplot of Bush's by Cheney's feeling thermometer scores
xyplot(bushft ~ cheneyft, data=poll, ylab="Bush", xlab="Cheney", main="Feeling Thermometer Comparisons") 


Some points are overdrawn, one on top of the other.

Jittering adds random noise to each point so that we can see where the points cluster together.

xyplot(jitter(bushft) ~ jitter(cheneyft), data=poll, ylab="Bush", xlab="Cheney", main="Feeling Thermometer Comparisons") 


# You can increase the amount of jitter with the "factor=" option
xyplot(jitter(bushft, factor=2) ~ jitter(cheneyft, factor=2), data=poll, ylab="Bush", xlab="Cheney", main="Feeling Thermometer Comparisons") 

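To get a feel for how much noise factor= adds, you can compare jittered values with the originals. In this sketch the scores are simulated in steps of 5 (x and y are made-up vectors, not the course data). With jitter()'s default settings the noise is uniform within factor/5 times the smallest gap between distinct values, so here within 1 point for factor=1 and within 2 points for factor=2.

```r
library(lattice)

# Simulated scores in steps of 5 (NOT the course data).
set.seed(7)
x <- rep(seq(0, 100, by = 5), each = 10)
y <- sample(x)

j1 <- jitter(x)              # default factor = 1
j2 <- jitter(x, factor = 2)  # twice as much noise

range(j1 - x)   # stays within -1 and 1
range(j2 - x)   # stays within -2 and 2

# Fixing the seed first, as above, makes a jittered plot reproducible.
# xyplot() can also jitter for you through its panel function's
# jitter.x / jitter.y arguments, instead of wrapping each variable.
p <- xyplot(y ~ x, jitter.x = TRUE, jitter.y = TRUE,
            xlab = "x (simulated)", ylab = "y (simulated)")
print(p)
```

Because jitter draws random noise, the plot changes slightly every time unless you call set.seed() beforehand.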