Context
This question concerns sequence analysis using the TraMineR package. The package offers automatic transformation of temporal sequences (statuses in time) to event sequences (changes between statuses in time). A recurrent issue in my analyses is how to distinguish change events between identical statuses.
Question-specific example
Suppose we have sequences of employment statuses, e.g. work, unemployment, inactivity, retirement. The analysis is focused on career transitions, distinguishing between stable and transitional careers. All kinds of transitions are relevant, from work to unemployment, inactivity to work, but also (and most importantly) from work to work!
Question
For TraMineR, an event takes place when the status in a sequence changes. For instance, the respondent had 3 years of work followed by 1 year of unemployment: Work-Work-Work-Unemployment (assuming an annual interval). This is the STS format, representing statuses in time. However, the SPELL format carries additional information, e.g.:
Status Time1 Time2
Work 1 2
Work 2 3
Work 3 3
Unemployment 3 4
From the table above we can clearly see that two work-to-work transition events have occurred (otherwise there would be just one line: Work from 1 to 3). The question is whether there is any convenient way to extract an event object from the sequence object based on these data.
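To make the target concrete, here is a minimal base R sketch that counts the transitions implied by the toy table above. It does not use TraMineR; the names `spell`, `transitions` and `n.same` are made up for this illustration, and it assumes each row of the table is one spell, in chronological order, for a single respondent.

```r
# Toy SPELL table from the example above (one respondent, annual interval).
spell <- data.frame(
  status = c("Work", "Work", "Work", "Unemployment"),
  time1  = c(1, 2, 3, 3),
  time2  = c(2, 3, 3, 4),
  stringsAsFactors = FALSE
)

# Every pair of consecutive spells marks one transition,
# including transitions between identical statuses.
from <- head(spell$status, -1)
to   <- tail(spell$status, -1)
transitions <- paste(from, to, sep = ">")
print(transitions)

# Number of same-status transitions (here: work-to-work).
n.same <- sum(from == to)
print(n.same)
```

These same-status transitions are exactly what I would like to keep in the event object.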
Data
My data contains work-related respondent statuses in the SPELL format (status, begin & end time), like this:
to.SO <- structure(list(ID = c(10, 11, 11, 12, 13, 13, 13, 13, 14, 14,
14, 14, 14, 14, 14, 14, 14, 15, 15, 15, 15, 15, 15, 15), status = c(1,
1, 1, 1, 1, 1, 1, 1, 2, 3, 1, 2, 3, 2, 3, 1, 1, 1, 3, 1, 3, 3,
1, 3), time1 = c(1, 1, 104, 1, 1, 60, 109, 121, 1, 42, 47, 54,
64, 72, 78, 85, 116, 1, 29, 39, 69, 74, 78, 88), time2 = c(125,
104, 125, 125, 60, 109, 121, 125, 42, 47, 54, 64, 72, 78, 85,
116, 125, 29, 39, 69, 74, 78, 88, 125)), .Names = c("ID", "status",
"time1", "time2"), row.names = 10:33, class = "data.frame")
What I have tried
As per this post, I must first convert SPELL to STS and then define the sequences:
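library(TraMineR)

# Convert SPELL to STS (one column per time unit, up to time 125)
sts.data <- seqformat(data = to.SO, from = "SPELL", to = "STS",
                      id = "ID", begin = "time1", end = "time2", status = "status",
                      limit = 125, process = FALSE)

# Define the state sequence object
sts.seq <- seqdef(sts.data, right = "DEL")

# Label the states 1, 2, 3
state.labels <- c("Work", "Study", "Unemployed")
alphabet(sts.seq) <- state.labels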
The information I require is already lost at this step, but until the bug (see link) is resolved there is no other way. It still shows what I want to achieve:
sts.seqe <- seqecreate(sts.seq) # creating events
sts.seqe
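Why the information is lost in the STS step can be shown without TraMineR. The sketch below uses a hypothetical helper, `to.sts`, which is only a stand-in for the SPELL-to-STS expansion and is not the actual `seqformat` implementation. It expands the spells of respondents 10 and 11 from `to.SO` into one status per time unit.

```r
# Hypothetical helper (not part of TraMineR): expand spells into one status
# per time unit. Later spells overwrite the shared boundary time point.
to.sts <- function(time1, time2, status, limit = 125) {
  sts <- rep(NA_real_, limit)
  for (i in seq_along(status)) {
    sts[time1[i]:time2[i]] <- status[i]
  }
  sts
}

# Respondent 10: a single work spell from 1 to 125.
id10 <- to.sts(time1 = 1, time2 = 125, status = 1)

# Respondent 11: two work spells, 1 to 104 and 104 to 125.
id11 <- to.sts(time1 = c(1, 104), time2 = c(104, 125), status = c(1, 1))

# Once expanded, the two respondents cannot be told apart.
same.sts <- identical(id10, id11)
print(same.sts)
```

Since both vectors are 125 repetitions of status 1, any event sequence derived from them must be identical as well.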
My results
Here, the first four event sequences are identical. Yet the SPELL data (to.SO) clearly show multiple work-to-work transitions for the respondents with ID 11 and 13. In my other article I solved this by assigning different statuses to job-1, job-2 and so forth. This is a less desirable strategy, however, since (1) it explodes the number of statuses, making subsequent dissimilarity analysis difficult, and (2) it is not theoretically important which job in the career it is; the employment status alone should cover it.
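One direction I can think of, sketched here under assumptions, is to bypass the STS step and build a time-stamped event table straight from the spells, with one event per spell boundary. The code uses a subset of `to.SO` (respondents 10, 11, 13 and 14); the names `spells`, `tse` and the `"from>to"` event labels are my own choices for illustration. The final step relies on `seqecreate` accepting `id`, `timestamp` and `event` vectors, and is only run if TraMineR is installed.

```r
# Subset of to.SO: respondents 10, 11, 13 and 14.
spells <- data.frame(
  ID     = c(10, 11, 11, 13, 13, 13, 13, rep(14, 9)),
  status = c(1, 1, 1, 1, 1, 1, 1, 2, 3, 1, 2, 3, 2, 3, 1, 1),
  time1  = c(1, 1, 104, 1, 60, 109, 121, 1, 42, 47, 54, 64, 72, 78, 85, 116),
  time2  = c(125, 104, 125, 60, 109, 121, 125, 42, 47, 54, 64, 72, 78, 85, 116, 125)
)
state.labels <- c("Work", "Study", "Unemployed")

# Sort by respondent and start time, then attach readable labels.
spells <- spells[order(spells$ID, spells$time1), ]
spells$state <- state.labels[spells$status]

# The first spell of each respondent has no predecessor, so no event.
first <- !duplicated(spells$ID)
prev  <- c(NA, head(spells$state, -1))

# One event per spell boundary, labelled "from>to" and stamped with the
# start time of the new spell. Same-status changes are kept.
tse <- data.frame(
  id        = spells$ID[!first],
  timestamp = spells$time1[!first],
  event     = factor(paste(prev, spells$state, sep = ">")[!first])
)
print(tse)

# Assumption: seqecreate() builds event sequences from these three vectors.
if (requireNamespace("TraMineR", quietly = TRUE)) {
  seqe <- TraMineR::seqecreate(id = tse$id, timestamp = tse$timestamp,
                               event = tse$event)
  print(seqe)
}
```

In this sketch the initial status is not recorded as an event, so a respondent with a single spell (ID 10) has no events at all and drops out of the table. Whether that is acceptable, and whether the resulting object behaves well in later TraMineR analyses, I have not verified.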
Thanks
I imagine this goes beyond the existing capabilities of the package, but perhaps I am missing something. Thanks in advance for reading this long post and for any suggestions.