Using the SLID income data set from the car package:
library(car)
dat <- SLID[!is.na(SLID$wage),] # Remove missing values
dat$income <- dat$wage*40*50 # "translate" the wages to their full time annual earnings equivalent.
dat$id <- seq_len(nrow(dat))
# Create a data.frame with a person ID and their annual income:
keep <- data.frame(id = dat$id,
                   income = dat$income)
keep <- keep[order(keep$income, decreasing = TRUE),] # Descending ordering according to income
keep$accum <- cumsum(keep$income) # Cumulative sum of the descending incomes
keep$pct <- keep$accum/sum(keep$income)*100 # % of the total income
keep$check <- keep$pct < 80 # TRUE while the cumulative % is still below 80%
threshold <- min(which(keep$check == FALSE)) # First row where the cumulative % reaches at least 80%
border <- threshold/nrow(keep)*100 # Check which percentile that was
border <- round(border, digits = 2)
paste0(border, "% of the people earn 80% of the income")
#[1] "62.41% of the people earn 80% of the income"
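The classic 80-20 rule would lead us to expect "20% of the people earn 80% of the income". As you can see, that rule does not hold here: it takes about 62% of the people to reach 80% of the income.
The reversed argument: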
# The 20% of the people earn X % of total income:
linenr <- ceiling(1/5*nrow(keep))
outcome2 <- round(keep$pct[linenr], digits = 2)
paste0(outcome2, "% of total income is earned by the top 20% of the people")
# [1] "36.07% of total income is earned by the top 20% of the people"
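Both calculations follow the same pattern (sort descending, accumulate, compare against a cut-off), so they can be wrapped in two small functions. This is only a sketch: `top_share()`, `people_needed()` and the `toy` vector are hypothetical names and made-up data, chosen so the result is easy to check by hand; they are not part of car or the SLID data.

```r
# Hypothetical helper: % of total income earned by the top fraction `p` of people.
top_share <- function(income, p) {
  income <- sort(income, decreasing = TRUE)
  n_top <- ceiling(p * length(income))          # number of people in the top group
  100 * sum(income[seq_len(n_top)]) / sum(income)
}

# Hypothetical helper: % of people (richest first) needed to reach
# at least `target` % of total income.
people_needed <- function(income, target) {
  income <- sort(income, decreasing = TRUE)
  pct <- 100 * cumsum(income) / sum(income)     # cumulative % of total income
  100 * which(pct >= target)[1] / length(income)
}

# Made-up incomes for illustration only (they sum to 100):
toy <- c(50, 20, 12, 8, 4, 2, 2, 1, 0.5, 0.5)

top_share(toy, 0.2)     # 70: the top 2 of 10 people hold 50 + 20
people_needed(toy, 80)  # 30: the cumulative share first reaches 80% at the 3rd person
```

With the data above, `top_share(dat$income, 0.2)` and `people_needed(dat$income, 80)` should correspond to the two figures computed step by step.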
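Note that the numbers presented here do not represent the real world :)
In addition, Wikipedia has more information on the Pareto principle, also known as the 80-20 rule. The rule seems to turn up in several settings, such as business, economics and mathematics.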