Perhaps this helps:
df <- data.frame(group=c("a","b","c","a","b","c","a","b","c"),
var1=1:9, var2=c(1,2,3,NA,5,6,7,8,9))
with(df, length(cbind(var1, var2)))
> with(df, length(cbind(var1, var2)))
[1] 18
cbind(var1, var2) returns a matrix, and a matrix is just a vector with dimensions, so length() reports the total number of elements, i.e. prod(nrow(mat), ncol(mat)), where mat is the resulting matrix.
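If you want to check this yourself, here is a small sketch that reuses the example data from above; the name mat is only for illustration:

```r
# Same example data as above
df <- data.frame(group = c("a","b","c","a","b","c","a","b","c"),
                 var1 = 1:9, var2 = c(1,2,3,NA,5,6,7,8,9))

mat <- with(df, cbind(var1, var2))

dim(mat)                       # 9 rows, 2 columns
length(mat)                    # 18, the number of elements in the underlying vector
length(mat) == prod(dim(mat))  # TRUE
```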
Ideally you'd use nrow() instead of length(), but perhaps more widely applicable is the NROW() function, which treats a vector as a 1-column matrix. nrow() won't work for a vector input:
> nrow(1:10)
NULL
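To see the difference side by side, here is a minimal sketch with a made-up vector x and matrix m (both names are just for illustration):

```r
x <- 1:10                        # plain vector, no dim attribute
m <- cbind(a = 1:10, b = 11:20)  # 10 x 2 matrix

nrow(x)    # NULL, because a vector has no dimensions
NROW(x)    # 10, the vector is treated as a 1-column matrix
nrow(m)    # 10
NROW(m)    # 10
length(m)  # 20, rows times columns
```

So NROW() gives the answer you'd expect whether aggregate() hands the function a vector or a matrix.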
E.g. try these:
aggregate(cbind(var1,var2) ~ group, df, NROW)
aggregate(var1 ~ group, df, NROW)
> aggregate(cbind(var1,var2) ~ group, df, NROW)
group var1 var2
1 a 2 2
2 b 3 3
3 c 3 3
> aggregate(var1 ~ group, df, NROW)
group var1
1 a 3
2 b 3
3 c 3
Because var2 contains an NA, note that incomplete cases are removed by default. That is what happened above, and it is why group a reports 2 rows instead of 3. You probably don't want that here, so add na.action = na.pass to the call:
aggregate(cbind(var1,var2) ~ group, df, NROW, na.action = na.pass)
> aggregate(cbind(var1,var2) ~ group, df, NROW, na.action = na.pass)
group var1 var2
1 a 3 3
2 b 3 3
3 c 3 3
The issue is that, in building up the data frame to pass to aggregate.data.frame, the usual model frame generation process takes place, and aggregate.formula has its na.action argument set to na.omit by default - which is standard behaviour in modelling functions that use formula interfaces.
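You can see the effect of na.action by calling model.frame() directly on the same formula. This is only a rough sketch of the kind of thing that happens inside aggregate, not its actual internals, and the names mf_omit and mf_pass are mine:

```r
# Same example data as above
df <- data.frame(group = c("a","b","c","a","b","c","a","b","c"),
                 var1 = 1:9, var2 = c(1,2,3,NA,5,6,7,8,9))

# na.omit drops the whole row that has an NA in var2
mf_omit <- model.frame(cbind(var1, var2) ~ group, data = df,
                       na.action = na.omit)

# na.pass keeps every row, NA included
mf_pass <- model.frame(cbind(var1, var2) ~ group, data = df,
                       na.action = na.pass)

nrow(mf_omit)  # 8
nrow(mf_pass)  # 9
```

The row that goes missing under na.omit belongs to group a, which is why that group was reported with 2 rows earlier.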
If you want to count the number of non-NA values per variable then you need a completely different approach, perhaps using is.na(), as in
foo <- function(x) sum(!is.na(x))
aggregate(cbind(var1,var2) ~ group, df, foo, na.action = na.pass)
> aggregate(cbind(var1,var2) ~ group, df, foo, na.action = na.pass)
group var1 var2
1 a 3 2
2 b 3 3
3 c 3 3
This works because is.na(x) returns TRUE for each NA, ! flips those values so that TRUE marks the non-NA entries, and sum() then coerces TRUE to 1 and FALSE to 0 before adding them up.
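Broken into individual steps on a small made-up vector (x here is just an illustration, not part of the data above), the counting looks like this:

```r
x <- c(1, NA, 3)

is.na(x)                # FALSE  TRUE FALSE
!is.na(x)               #  TRUE FALSE  TRUE
as.integer(!is.na(x))   # 1 0 1, the coercion sum() does implicitly
sum(!is.na(x))          # 2 non-NA values
```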