1

I strive to produce a visual dynamic animation of timestamped transactions, where each transaction represents a contribution of a person to an artifact/file. To this end, I am using the R packages networkDynamic, network and ndtv.

The transactions have (in contrast to the examples in the networkDynamic package vignette) "real" timestamps. I want to wrap the rendering process inside a function that

  • starts rendering at the beginning of a "natural time frame" such as a day or a week (which most probably is not the timestamp of the first event)
  • renders "natural" labels to the players timeline instead of integers
  • uses "natural" slices such as a week/month/year based on the input data

I think I have managed to make the first slice start at the beginning of the week of the first event using lubridates floor_date. I have not looked into the last issue yet (labelling), because unfortunately, I have troubles to determine proper slicing parameters for my data set.

Please find below a reproducible example for RStudio. The example includes three lists named slice.par, one that does work, and two that don't. Simply hardcoding a parameter configuration that (only) works with the concrete example is not my goal, firstly because my real data set is much bigger (and therefore 'playing around' with the parameters costs much time) and secondly because I would like to have a function that works with many different data sets.

if (!require("pacman")) install.packages("pacman")
library("pacman")
pacman::p_load(network, networkDynamic, ndtv, lubridate)

UtilNumericAsDate <- function(nuUnixTimestamp) {
    return(as.POSIXct(nuUnixTimestamp, origin = "1970-01-01 00:00.00 UTC", tz = "UTC"))
}

UtilDateAsNumeric <- function(oTimestamp) {
    return(as.numeric(as.POSIXct(oTimestamp)))
}

stTransac <- "
'contributorId', 'artifactId', 'weight', 'instantId'
'A', 'a1', '1', '2003-06-01 23:09:40'
'A', 'a2', '1', '2004-02-27 11:48:41'
'A', 'a1', '2', '2006-06-25 20:36:49'
'A', 'a3', '1', '2007-01-28 00:35:31'
'A', 'a3', '2', '2007-04-25 16:03:57'
'A', 'a3', '3', '2007-07-19 19:43:49'
'B', 'a1', '1', '2008-02-06 12:37:56'
'C', 'a3', '1', '2008-04-07 02:27:36'
'C', 'a2', '1', '2008-06-01 02:15:35'
'C', 'a2', '2', '2008-10-05 02:32:45'
'B', 'a1', '2', '2009-06-22 01:57:45'
'C', 'a4', '1', '2009-09-15 02:56:33'
'C', 'a5', '1', '2010-06-30 19:42:25'
'C', 'a6', '1', '2011-06-12 23:58:17'
'B', 'a3', '1', '2013-08-30 19:34:28'
'C', 'a1', '1', '2014-10-23 20:49:54'
'C', 'a1', '2', '2014-10-24 16:46:07'
'A', 'a2', '2', '2015-09-26 16:58:17'
'A', 'a7', '1', '2015-10-04 17:40:12'
'A', 'a8', '1', '2015-12-02 10:55:47'
"

dfTransac <- read.csv(text = stTransac, sep = "," , quote = '\'' , strip.white = TRUE, stringsAsFactors = FALSE)

dfEdges <- unique(dfTransac[,1:2])
veUniqueContributors   <- unique(dfEdges[[1]])
veUniqueArtifacts      <- unique(dfEdges[[2]])
nuNrUniqueContributors <- length(veUniqueContributors)
nuNrUniqueArtifacts    <- length(veUniqueArtifacts)

net <- network.initialize(0, directed = TRUE, bipartite = length(veUniqueContributors))

add.vertices.networkDynamic(net, nuNrUniqueContributors, vertex.pid = veUniqueContributors)
add.vertices.networkDynamic(net, nuNrUniqueArtifacts, vertex.pid = veUniqueArtifacts)

net %v% "vertex.names" <- c(veUniqueContributors, veUniqueArtifacts)
net %v% "vertex.type"  <- c(rep("p", length(veUniqueContributors)), rep("a", length(veUniqueArtifacts)))
net %v% "vertex.col"   <- c(rep("blue", length(veUniqueContributors)), rep("gray", length(veUniqueArtifacts)))
net %v% "vertex.sides" <- c(rep(8, length(veUniqueContributors)), rep(4, length(veUniqueArtifacts)))
net %v% "vertex.rot"   <- c(rep(0, length(veUniqueContributors)), rep(45, length(veUniqueArtifacts)))
net %v% "vertex.lwd"   <- c(rep(1, length(veUniqueContributors)), rep(0, length(veUniqueArtifacts)))
net %v% "vertex.cex"   <- c(rep(2, length(veUniqueContributors)), rep(1, length(veUniqueArtifacts)))
set.network.attribute(net,'vertex.pid','vertex.names')
set.network.attribute(net,'edge.pid','edge.names')

add.edges.networkDynamic(net,
                         tail = get.vertex.id(net, dfEdges[[1]]),
                         head = get.vertex.id(net, dfEdges[[2]]),
                         edge.pid = paste0(dfEdges[[1]], "->", dfEdges[[2]]))

activate.edges(net,
               e = get.edge.id(net, paste0(dfTransac[["contributorId"]], "->", dfTransac[["artifactId"]])),
               at = UtilDateAsNumeric(dfTransac$instantId))

activate.edge.attribute(net,
                        prefix = "weight",
                        value = dfTransac$weight,
                        e = get.edge.id(net, paste0(dfTransac[["contributorId"]], "->", dfTransac[["artifactId"]])),
                        at = UtilDateAsNumeric(dfTransac$instantId))

reconcile.vertex.activity(net = net, mode = "encompass.edges", edge.active.default = FALSE)

nuStart      <- range(get.change.times(net, ignore.inf = FALSE))[1]
nuEnd        <- range(get.change.times(net, ignore.inf = FALSE))[2]

nuWeekStart  <- UtilDateAsNumeric(floor_date(UtilNumericAsDate(nuStart), "week"))
nuWeekEnd    <- UtilDateAsNumeric(ceiling_date(UtilNumericAsDate(nuEnd), "week"))

# This doesn't work: "Monthly" slices, 5 year aggregation
# Error: Attribute 'vertex.sides' had illegal missing values for vertex.sides or was not present in plot.network.default.
slice.par <- list(start = nuWeekStart,
                          end = nuWeekEnd,
                          interval = 1*60*60*24*7*4.5,
                          aggregate.dur = 1*60*60*24*7*52*5,
                          rule = "any")

# This doesn't work either: "Bimonthly" slices, "Bimonthly" aggregation
# Error: Attribute 'weight' had illegal missing values for edge.lwd or was not present in plot.network.default.
slice.par <- list(start = nuWeekStart,
                          end = nuWeekEnd,
                          interval = 1*60*60*24*7*4.5*2,
                          aggregate.dur = 1*60*60*24*7*4.5*2,
                          rule = "any")

# This works: "Bimonthly" slices, 5 year aggregation
slice.par <- list(start = nuWeekStart,
                          end = nuWeekEnd,
                          interval = 1*60*60*24*7*4.5*2,
                          aggregate.dur = 1*60*60*24*7*52*5,
                          rule = "any")

compute.animation(net, animation.mode = "kamadakawai", slice.par = slice.par, default.dist = 10)

render.d3movie(net,
               slice.par = slice.par,
               displaylabels = TRUE,
               output.mode = "htmlWidget",
               usearrows = TRUE,
               vertex.col = 'vertex.col',
               vertex.sides = 'vertex.sides',
               vertex.cex = 'vertex.cex',
               vertex.rot = 'vertex.rot',
               edge.lwd = 'weight',
               render.par = list(tween.frames = 10, show.time = TRUE))

How can I derive proper slicing parameters from the data set so that the rendering process does not choke on individual slices that miss attributes or edges without simply increasing the aggregation duration?

4

1 回答 1

1

正如您已经确定的那样,该render.d3movie功能存在错误。它正在尝试查找“空”切片(不包含活动顶点的时间范围)的值,请参阅Github 上的错误报告。(我实际上无法使用您上面的代码重现错误,但这绝对是一个错误,感谢您的报告)

如何从数据集中导出适当的切片参数,以使渲染过程不会在缺少属性或边缘的单个切片上阻塞,而无需简单地增加聚合持续时间?

在错误修复之前(希望很快),您可以

a)render.animation改用

b) 选择切片参数以确保网络没有活动顶点的切片。您可以使用该timeline功能查看切片的位置。例如,要显示节点(蓝色)和边缘(紫色)的活动拼写以及切片箱(垂直灰色条):

timeline(net,slice.par=slice.par,main='timeline plot of activity spells')

在此处输入图像描述

c) 最好的解决方案可能是调整顶点活动以确保在每个时间片中始终有一个顶点活动。在这种情况下,有些顶点的持续时间很短,reconcile.vertex.activity因为它们只包含持续时间很短的边。使用不同的规则可能会避免这种情况,或者将顶点设置为一旦出现就始终处于活动状态(如果这对您的数据有意义)。

其他一些注意事项:

您可能还需要将slice.par$rule值设置为,earliest而不是any这样当它weight在边缘上对动态属性进行分箱时遇到多个可能的值时,它会知道选择哪一个。

顺便说一句,我认为可能有一种更紧凑的方法来使用networkDynamic实用程序函数构建网络并传入stTransac,并且在加载大型数据集时可能会更快。

于 2016-09-02T18:38:38.623 回答