I have some German data that contains umlauts and, more importantly, seems to have the wrong encoding.

Loading with read.dta

If I try to load the file directly, I get an error:

> t <- read.dta(fileName)
Error in factor(rval[[v]], levels = tt[[ll[v]]], labels = names(tt[[ll[v]]])) : 
  invalid 'labels'; length 4 should be 1 or 3

So instead I load it without converting factors:

> t <- read.dta(fileName, convert.factors = FALSE)
> head(t)
    persnr    betnr idnum    begorig    endorig     begepi     endepi frau gebjahr nation nation_gr famst
1 65170081 51705278    36 2000-01-01 2000-12-31 2000-01-01 2000-12-31    0    1967      0        10    NA
2 65170081 51705278    36 2001-01-01 2001-12-31 2001-01-01 2001-12-31    0    1967      0        10    NA
3 65170081 51705278    36 2002-01-01 2002-12-31 2002-01-01 2002-12-31    0    1967      0        10    NA
4 65170081 51705278    36 2003-01-01 2003-12-31 2003-01-01 2003-12-31    0    1967      0        10    NA
5 65170081 51705278    36 2004-01-01 2004-12-31 2004-01-01 2004-12-31    0    1967      0        10    NA
6 65170081 51705278    36 2005-01-01 2005-12-31 2005-01-01 2005-12-31    0    1967      0        10    NA

Loading with read_dta

Here is the same file loaded with the haven package:

> x <- read_dta(fileName)
> head(x)
Error: `x` and `labels` must be same type
> str(pers)
Classes 'tbl_df', 'tbl' and 'data.frame':   361921 obs. of  45 variables:

I don't understand the error I get from head(). I get the same error when I try to convert the result to a data.table using

data.table(read_dta(fileName))

When I do this, I first see the error, and then R crashes.

Test data:

The data file is contained in this zip file and is called LIAB_lm_9310_v1_pers.dta.
