I am having some issues trying to compete a LOESS regression with a data set. I have been able to properly create the line, but I am unable to get it to plot correctly.
I ran through the data like this.
animals.lo <- loess(X15p5 ~ Period, animals, weights = n.15p5)
animals.lo
summary(animals.lo)
plot(X15p5~ Period, animals)
lines(animals$X15p5, animals.lo, col="red")
At this point I received an error
"Error in xy.coords(x, y) : 'x' and 'y' lengths differ"
I searched around and read that this issue could be due to the points needing to be ordered, so I proceeded.
a <- order(animals$Period)
lines(animals$X15p5[a], animals.lo$Period[a], col="red", lwd=3)
There were no errors at this point, but the LOESS line was still not showing up in the plot. The points were displayed correctly, but not the line.
This is similar to the data set I am using...
structure(list(Site = c("Cat", "Dog", "Bear", "Chicken", "Cow",
"Bird", "Tiger", "Lion", "Leopard", "Wolf", "Puppy", "Kitten",
"Emu", "Ostrich", "Elephant", "Sheep", "Goat", "Fish", "Iguana",
"Monkey", "Gorilla", "Baboon", "Lemming", "Mouse", "Rat", "Hamster",
"Eagle", "Parrot", "Crow", "Dove", "Falcon", "Hawk", "Sparrow",
"Kite", "Chimpanzee", "Giraffe", "Bear", "Donkey", "Mule", "Horse",
"Zebra", "Ox", "Snake", "Cobra", "Iguana", "Lizard", "Fly", "Mosquito",
"Llama", "Butterfly", "Moth", "Worm", "Centipede", "Unicorn",
"Pegasus", "Griffin", "Ogre", "Monster", "Demon", "Witch", "Vampire",
"Mummy", "Ghoul", "Zombie"), Region = c(6L, 4L, 4L, 5L, 7L, 6L,
2L, 4L, 6L, 7L, 7L, 4L, 6L, 4L, 4L, 4L, 4L, 3L, 4L, 4L, 4L, 4L,
4L, 4L, 4L, 4L, 3L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 8L, 4L, 6L, 6L,
4L, 2L, 7L, 4L, 2L, 2L, 7L, 3L, 4L, 7L, 4L, 4L, 4L, 7L, 7L, 4L,
4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 2L, 8L), Period = c(-2715, -3500,
-3500, -4933.333333, -2715, -2715, -2715, -3500, -2715, -4350,
-3500, -3500, -2950, -4350, -3650, -3500, -3500, -2715, -3650,
-4350, -3500, -3500, -3400, -4350, -3500, -3500, -4350, -3900,
-3808.333333, -4233.333333, -3500, -3900, -3958.333333, -3900,
-3500, -3500, -3500, -2715, -3650, -2715, -2715, -2715, -2715,
-3500, -2715, -2715, -3500, -4350, -3650, -3650, -4350, -5400,
-3500, -3958.333333, -3400, -3400, -4350, -3600, -4350, -3650,
-3500, -2715, -5400, -3500), Value = c(0.132625995, 0.163120567,
0.228840125, 0.154931973, 0.110047847, 0.054347826, 0.188679245,
0.245014245, 0.128378378, 0.021428571, 0.226277372, 0.176923077,
0.104938272, 0.17659805, 0.143798024, 0.086956522, 0.0625, 0.160714286,
0, 0.235588972, 0, 0, 0.208333333, 0.202247191, 0.364705882,
0.174757282, 0, 0.4, 0.1, 0.184027778, 0.232876712, 0.160493827,
0.74702381, 0.126984127, 0.080645161, 0.06557377, 0, 0.057692308,
0.285714286, 0.489361702, 0.108695652, 0.377777778, 0, 0.522727273,
0.024390244, 0.097560976, 0.275, 0, 0.0625, 0.255319149, 0.135135135,
0.216216216, 0.222222222, 0.296296296, 0.222222222, 0.146341463,
0.09375, 0.125, 0.041666667, 0.078947368, 0.2, 0.137931034, 0.571428571,
0.142857143), Sample_size = c(188.5, 105.75, 79.75, 70, 52.25,
46, 39.75, 39, 37, 35, 34.25, 32.5, 32.4, 30.76666667, 30.36666667,
28.75, 28, 28, 28, 26.6, 25, 25, 24, 22.25, 21.25, 20.6, 20,
20, 20, 19.2, 18.25, 18, 18, 16.8, 15.5, 15.25, 15, 13, 12.6,
11.75, 11.5, 11.25, 11, 11, 10.25, 10.25, 10, 10, 9.6, 9.4, 9.25,
9.25, 9, 9, 9, 8.2, 8, 8, 8, 7.6, 7.5, 7.25, 7, 7), Sample_sub = c(25,
17.25, 18.25, 10.8452381, 5.75, 2.5, 7.5, 9.555555556, 4.75,
0.75, 7.75, 5.75, 3.4, 5.433333333, 4.366666667, 2.5, 1.75, 4.5,
0, 6.266666667, 0, 0, 5, 4.5, 7.75, 3.6, 0, 8, 2, 3.533333333,
4.25, 2.888888889, 13.44642857, 2.133333333, 1.25, 1, 0, 0.75,
3.6, 5.75, 1.25, 4.25, 0, 5.75, 0.25, 1, 2.75, 0, 0.6, 2.4, 1.25,
2, 2, 2.666666667, 2, 1.2, 0.75, 1, 0.333333333, 0.6, 1.5, 1,
4, 1)), .Names = c("Site", "Region", "Period", "Value", "Sample_size",
"Sample_sub"), class = "data.frame", row.names = c(NA, -64L))
I have been working for this a while and trying to read up as much as I can, but I haven't been able to make any additional headway. Any advice or guidance would be greatly appreciated.
Follow-up on adding confidence interval to plot
I have been trying to add in confidence intervals following another example found on the site on this page How to get the confidence intervals for LOWESS fit using R? .
The example given on that page is:
plot(cars)
plx<-predict(loess(cars$dist ~ cars$speed), se=T)
lines(cars$speed,plx$fit)
lines(cars$speed,plx$fit - qt(0.975,plx$df)*plx$se, lty=2)
lines(cars$speed,plx$fit + qt(0.975,plx$df)*plx$se, lty=2)
I adapted that as this:
plot(X15p5 ~ Period, animals)
animals.lo2<-predict(loess(animals$X15p5 ~ animals$Period), se=T)
a <- order(animals$Period)
lines(animals$Period[a],animals.lo2$fit, col="red", lwd=3)
lines(animals$Period[a],animals.lo2$fit - qt(0.975,animals.lo2$df)*animals.lo2$se, lty=2)
lines(animals$Period[a],animals.lo2$fit + qt(0.975,animals.lo2$df)*animals.lo2$se, lty=2)
Although this does provide confidence intervals, the regression line is all wrong. I'm not sure if it is an issue with the predict
function, or another issue. Thanks again!