This vignette covers valued network modeling in the `ergm`

framework, with an emphasis on count data. Examples pertinent to rank
data can be found in the vignette in the `ergm.rank`

package.

`network`

(Butts (2008))
objects have three types of attributes:

**network attributes**– attributes which pertain to the whole network and include such information as network size, directedness, and multiplicity;**vertex attributes**– attributes which pertain to the individual vertices in the network and include such information as vertex label, as well as group assignment or some other property of the individual being represented;**edge attributes**– attributes which pertain to edges in the network and include such information as edge value.

An edge attribute is only defined for edges that exist in the
network. Thus, in a matter of speaking, to set an edge value, one first
has to *create* an edge and then *set* its attribute.

As with network and vertex attributes, edge attributes that have been
set can be listed with `list.edge.attributes`

. Every network
has at least one edge attribute: `"na"`

, which, if set to
`TRUE`

, marks an edge as missing.

There are several ways to create valued networks for use with
`ergm`

. Here, we will demonstrate two of the most
straightforward approaches.

The first dataset that we’ll be using is the (in)famous Sampson’s
monks. Dataset `samplk`

in package `ergm`

contains
three (binary) networks: `samplk1`

, `samplk2`

, and
`samplk3`

, containing the Monks’ top-tree friendship
nominations at each of the three survey time points. We are going to
construct a valued network that pools these nominations.

**Method 1: From a sociomatrix** In many cases, a valued
sociomatrix is available (or can easily be constructed). In this case,
we’ll build one from the Sampson data.

```
library(ergm.count) # Also loads ergm.
data(samplk)
ls()
## [1] "samplk1" "samplk2" "samplk3"
```

```
as.matrix(samplk1)[1:5, 1:5]
## John Bosco Gregory Basil Peter Bonaventure
## John Bosco 0 0 1 0 1
## Gregory 1 0 0 0 0
## Basil 1 1 0 0 0
## Peter 0 0 0 0 1
## Bonaventure 0 0 0 1 0
```

```
# Create a sociomatrix totaling the nominations.
samplk.tot.m <- as.matrix(samplk1) + as.matrix(samplk2) + as.matrix(samplk3)
samplk.tot.m[1:5, 1:5]
## John Bosco Gregory Basil Peter Bonaventure
## John Bosco 0 1 2 0 2
## Gregory 3 0 0 0 0
## Basil 3 1 0 0 0
## Peter 0 0 0 0 3
## Bonaventure 1 0 0 3 0
```

```
# Create a network where the number of nominations becomes an attribute of an
# edge.
samplk.tot <- as.network(samplk.tot.m, directed = TRUE, matrix.type = "a", ignore.eval = FALSE,
names.eval = "nominations" # Important!
)
# Add vertex attributes. (Note that names were already imported!)
samplk.tot %v% "group" <- samplk1 %v% "group" # Groups identified by Sampson
samplk.tot %v% "group"
## [1] "Turks" "Turks" "Outcasts" "Loyal" "Loyal" "Loyal"
## [7] "Turks" "Waverers" "Loyal" "Waverers" "Loyal" "Turks"
## [13] "Waverers" "Turks" "Turks" "Turks" "Outcasts" "Outcasts"
```

```
# We can view the attribute as a sociomatrix.
as.matrix(samplk.tot, attrname = "nominations")[1:5, 1:5]
## John Bosco Gregory Basil Peter Bonaventure
## John Bosco 0 1 2 0 2
## Gregory 3 0 0 0 0
## Basil 3 1 0 0 0
## Peter 0 0 0 0 3
## Bonaventure 1 0 0 3 0
```

```
# Also, note that samplk.tot now has an edge if i nominated j *at least once*.
as.matrix(samplk.tot)[1:5, 1:5]
## John Bosco Gregory Basil Peter Bonaventure
## John Bosco 0 1 1 0 1
## Gregory 1 0 0 0 0
## Basil 1 1 0 0 0
## Peter 0 0 0 0 1
## Bonaventure 1 0 0 1 0
```

**Method 2: Form an edgelist** Sociomatrices are simple
to work with, but not very convenient for large, sparse networks. In the
latter case, edgelists are often preferred. For our present case,
suppose that instead of a sociomatrix we have an edgelist with
values:

```
samplk.tot.el <- as.matrix(samplk.tot, attrname = "nominations", matrix.type = "edgelist")
samplk.tot.el[1:5, ]
## [,1] [,2] [,3]
## [1,] 2 1 3
## [2,] 3 1 3
## [3,] 5 1 1
## [4,] 6 1 2
## [5,] 7 1 1
```

```
# and an initial empty network.
samplk.tot2 <- samplk1 # Copy samplk1
samplk.tot2[, ] <- 0 # Empty it out
samplk.tot2 #We could also have used network.initialize(18)
## Network attributes:
## vertices = 18
## directed = TRUE
## hyper = FALSE
## loops = FALSE
## multiple = FALSE
## bipartite = FALSE
## total edges= 0
## missing edges= 0
## non-missing edges= 0
##
## Vertex attribute names:
## cloisterville group vertex.names
##
## No edge attributes
```

```
samplk.tot2[samplk.tot.el[, 1:2], names.eval = "nominations", add.edges = TRUE] <- samplk.tot.el[,
3]
as.matrix(samplk.tot2, attrname = "nominations")[1:5, 1:5]
## John Bosco Gregory Basil Peter Bonaventure
## John Bosco 0 1 2 0 2
## Gregory 3 0 0 0 0
## Basil 3 1 0 0 0
## Peter 0 0 0 0 3
## Bonaventure 1 0 0 3 0
```

In general, the construction
`net[i,j, names.eval="attrname", add.edges=TRUE] <- value`

can be used to modify individual edge values for attribute
`"attrname"`

. This way, we can also add more than one edge
attribute to a network. Note that network objects can support an almost
unlimited number of vertex, edge, or network attributes, and that these
attributes can contain any data type. (Not all data types are compatible
with all interface methods; see `?network`

and related
documentation for more information.)

The other dataset we’ll be using is almost as (in)famous Zachary’s Karate Club dataset. We will be employing here a collapsed multiplex network that counts the number of social contexts in which each pair of individuals associated with the Karate Club in question interacted. A total of 8 contexts were considered, but as the contexts themselves were determined by the network process, this limit itself can be argued to be endogenous.

Over the course of the study, the club split into two factions, one
led by the instructor (“Mr. Hi”) and the other led by the Club President
(“John A.”). Zachary also recorded the faction alignment of every
regular attendee in the club. This dataset is included in the
`ergm.count`

package, as `zach`

.

The `network`

’s `plot`

method for
`network`

s can be used to plot a sociogram of a network. When
plotting a valued network, we it is often useful to color the ties
depending on their value. Function `gray`

can be used to
generate a gradient of colors, with `gray(0)`

generating
black and `gray(1)`

generating white. This can then be passed
to the `edge.col`

argument of `plot.network`

.

**Sampson’s Monks** For the monks, let’s pass value data
using a matrix.

```
par(mar = rep(0, 4))
samplk.ecol <- matrix(gray(1 - (as.matrix(samplk.tot, attrname = "nominations")/3)),
nrow = network.size(samplk.tot))
plot(samplk.tot, edge.col = samplk.ecol, usecurve = TRUE, edge.curve = 1e-04, displaylabels = TRUE,
vertex.col = as.factor(samplk.tot %v% "group"))
```

Edge color can also be passed as a vector of colors corresponding to
edges. It’s more efficient, but the ordering in the vector must
correspond to `network`

object’s internal ordering, so it
should be used with care. Note that we can also vary line width and/or
transparency in the same manner:

```
par(mar = rep(0, 4))
valmat <- as.matrix(samplk.tot, attrname = "nominations") #Pull the edge values
samplk.ecol <- matrix(rgb(0, 0, 0, valmat/3), nrow = network.size(samplk.tot))
plot(samplk.tot, edge.col = samplk.ecol, usecurve = TRUE, edge.curve = 1e-04, displaylabels = TRUE,
vertex.col = as.factor(samplk.tot %v% "group"), edge.lwd = valmat^2)
```

`plot.network`

has may display options that can be used to
customize one’s data display; see `?plot.network`

for
more.

**Zachary’s Karate Club** In the following plot, we plot
those strongly aligned with Mr. Hi as red, those with John A. with
purple, those neutral as green, and those weakly aligned with colors in
between.

```
data(zach)
zach.ecol <- gray(1 - (zach %e% "contexts")/8)
zach.vcol <- rainbow(5)[zach %v% "faction.id" + 3]
par(mar = rep(0, 4))
plot(zach, edge.col = zach.ecol, vertex.col = zach.vcol, displaylabels = TRUE)
```

`ergm.count`

Many of the functions in package `ergm`

, including
`ergm`

, `simulate`

, and `summary`

, have
been extended to handle networks with valued relations. They switch into
this “valued” mode when passed the `response`

argument,
specifying the name of the edge attribute to use as the response
variable. For example, a new valued term `sum`

evaluates the
sum of the values of all of the relations: \(\sum_{{{(i,j)}\in\mathbb{Y}}}\boldsymbol{y}_{i,j}\).
So,

`summary(samplk.tot ~ sum)`

`## Error: ERGM term 'sum' function 'InitErgmTerm.sum' not found.`

produces an error (because no such term has been implemented for binary mode), while

```
summary(samplk.tot ~ sum, response = "nominations")
## sum
## 168
```

gives the summary statistics. We will introduce more statistics shortly. First, we need to introduce the notion of valued ERGMs.

For a more in-depth discussion of the following, see (Krivitsky (2012)).

Valued ERGMs differ from standard ERGMs in two related ways. First, the support of a valued ERGM (unlike its unvalued counterpart) is over a set of valued graphs; this is a substantial difference from the unvalued case, as valued graph support sets (even for fixed \(N\)) are often infinite (or even uncountable). Secondly, in defining a valued ERGM one must specify the reference measure (or distribution) with respect to which the model is defined. (In the unvalued case, there is a generic way to do this, which we employ tacitly – that is no longer the case for general valued ERGMs.) We discuss some of these issues further below.

Notationally, a valued ERGM (for discrete variables) looks like this: \[\text{Pr}_{h,\boldsymbol{g}}(\boldsymbol{Y}=\boldsymbol{y};\boldsymbol{\theta})=\frac{h(\boldsymbol{y})\exp\mathchoice{\left({\boldsymbol{\theta}{}}^\top{\boldsymbol{g}(\boldsymbol{y})}\right)}{({\boldsymbol{\theta}{}}^\top{\boldsymbol{g}(\boldsymbol{y})})}{({\boldsymbol{\theta}{}}^\top{\boldsymbol{g}(\boldsymbol{y})})}{({\boldsymbol{\theta}{}}^\top{\boldsymbol{g}(\boldsymbol{y})})}}{\kappa_{h,\boldsymbol{g}}(\boldsymbol{\theta})},\ {\boldsymbol{y}\in\mathcal{Y}},\] where \(\mathcal{Y}\) is the support. The normalizing constant is defined by \[\kappa_{h,\boldsymbol{g}}(\boldsymbol{\theta})=\sum_{\boldsymbol{y}\in\mathcal{Y}}h(\boldsymbol{y})\exp\mathchoice{\left({\boldsymbol{\theta}{}}^\top{\boldsymbol{g}(\boldsymbol{y})}\right)}{({\boldsymbol{\theta}{}}^\top{\boldsymbol{g}(\boldsymbol{y})})}{({\boldsymbol{\theta}{}}^\top{\boldsymbol{g}(\boldsymbol{y})})}{({\boldsymbol{\theta}{}}^\top{\boldsymbol{g}(\boldsymbol{y})})}.\] The similarity with ERGMs in the unvalued case is evident, notwithstanding the above caveats.

**New concept: a reference distribution** With binary
ERGMs, we only concern ourselves with presence and absence of ties among
actors — who is connected with whom? If we want to model values as well,
we need to think about who is connected with whom *and* how
strong or intense these connections are. In particular, we need to think
about how the values for connections we measure are distributed. The
reference distribution (a *reference measure*, for the
mathematically inclined) specifies the model for the data
*before* we add any ERGM terms, and is the first step in modeling
these values. The reference distribution is specified using a one-sided
formula as a `reference`

argument to an `ergm`

or
`simulate`

call. Running

`help("ergm-references")`

will list the choices implemented in the various packages, and are given as a one-sided formula.

Conceptually, it has two ingredients: the sample space and the
baseline distribution (\(h(\boldsymbol{y})\)). An ERGM that
“borrows” these from a distribution \(X\) for which we have a name is called an
*\(X\)-reference ERGM*.

**The sample space** For binary ERGMs, the sample space
(or support) \(\mathcal{Y}\) — the set
of possible networks that can occur — is usually some subset of \(2^\mathbb{Y}\), the set of all possible
ways in which relationships among the actors may occur.

For the sample space of valued ERGMs, we need to define \(\mathbb{S}\), the set of possible values
each relationship may take. For example, for count data, that’s \(\mathbb{S}=\{0,1,\dotsc, s \}\) if the
maximum count is \( s \) and \(\{0,1,\dotsc\}\) if there is no *a
priori* upper bound. Having specified that, \(\mathcal{Y}\) is defined as some subset of
\(\mathbb{S}^\mathbb{Y}\): the set of
possible ways to assign to each relationship a value.

As with binary ERGMs, other constraints like degree distribution may be imposed on \(\mathcal{Y}\).

\(h(\boldsymbol{y})\)**: The
baseline distribution** What difference does it make?

Suppose that we have a sample space with \(\mathbb{S}=\{0,1,2,3\}\) (e.g., number of monk–monk nominations) and let’s have one ERGM term: the sum of values of all relations: \(\sum_{{{(i,j)}\in\mathbb{Y}}}\boldsymbol{y}_{i,j}\): \[\text{Pr}_{h,\boldsymbol{g}}(\boldsymbol{Y}=\boldsymbol{y};\boldsymbol{\theta})\propto h(\boldsymbol{y})\exp\mathchoice{\left(\boldsymbol{\theta}{} \sum_{{{(i,j)}\in\mathbb{Y}}}\boldsymbol{y}_{i,j}\right)}{(\boldsymbol{\theta}{} \sum_{{{(i,j)}\in\mathbb{Y}}}\boldsymbol{y}_{i,j})}{(\boldsymbol{\theta}{} \sum_{{{(i,j)}\in\mathbb{Y}}}\boldsymbol{y}_{i,j})}{(\boldsymbol{\theta}{} \sum_{{{(i,j)}\in\mathbb{Y}}}\boldsymbol{y}_{i,j})}.\] There are two values for \(h(\boldsymbol{y})\) that might be familiar:

- \(h(\boldsymbol{y})=1\) (or any constant) \(\implies\) \(\boldsymbol{Y}\!_{i,j}{\stackrel{\mathrm{i.i.d.}}{\sim}}\, \text{Uniform or truncated geometric}\)
- \(h(\boldsymbol{y})=\binom{m}{\boldsymbol{y}_{i,j}}=\frac{m!}{\boldsymbol{y}_{i,j}!(m-\boldsymbol{y}_{i,j})!}\) \(\implies\) \(\boldsymbol{Y}\!_{i,j}{\stackrel{\mathrm{i.i.d.}}{\sim}}\, \text{Binomial}(m,\text{logit}^{-1}(\boldsymbol{\theta}))\)

What do they look like? Let’s simulate!

```
y <- network.initialize(2, directed = FALSE) # A network with one dyad!
## Discrete Uniform reference 0 coefficient: discrete uniform
sim.du3 <- simulate(y ~ sum, coef = 0, reference = ~DiscUnif(0, 3), response = "w",
output = "stats", nsim = 1000)
# Negative coefficient: truncated geometric skewed to the right
sim.trgeo.m1 <- simulate(y ~ sum, coef = -1, reference = ~DiscUnif(0, 3), response = "w",
output = "stats", nsim = 1000)
# Positive coefficient: truncated geometric skewed to the left
sim.trgeo.p1 <- simulate(y ~ sum, coef = +1, reference = ~DiscUnif(0, 3), response = "w",
output = "stats", nsim = 1000)
# Plot them:
par(mfrow = c(1, 3))
hist(sim.du3, breaks = diff(range(sim.du3)) * 4)
hist(sim.trgeo.m1, breaks = diff(range(sim.trgeo.m1)) * 4)
hist(sim.trgeo.p1, breaks = diff(range(sim.trgeo.p1)) * 4)
```

```
## Binomial reference 0 coefficient: Binomial(3,1/2)
sim.binom3 <- simulate(y ~ sum, coef = 0, reference = ~Binomial(3), response = "w",
output = "stats", nsim = 1000)
# -1 coefficient: Binomial(3, exp(-1)/(1+exp(-1)))
sim.binom3.m1 <- simulate(y ~ sum, coef = -1, reference = ~Binomial(3), response = "w",
output = "stats", nsim = 1000)
# +1 coefficient: Binomial(3, exp(1)/(1+exp(1)))
sim.binom3.p1 <- simulate(y ~ sum, coef = +1, reference = ~Binomial(3), response = "w",
output = "stats", nsim = 1000)
# Plot them:
par(mfrow = c(1, 3))
hist(sim.binom3, breaks = diff(range(sim.binom3)) * 4)
hist(sim.binom3.m1, breaks = diff(range(sim.binom3.m1)) * 4)
hist(sim.binom3.p1, breaks = diff(range(sim.binom3.p1)) * 4)
```

Now, suppose that we don’t have an *a priori* upper bound on
the counts — \(\mathbb{S}=\{0,1,\dotsc\}\) — then there
are two familiar reference distributions:

- \(h(\boldsymbol{y})=1\) (or any constant) \(\implies\) \(\boldsymbol{Y}\!_{i,j}{\stackrel{\mathrm{i.i.d.}}{\sim}}\, \text{Geometric}(p=1-\exp\mathchoice{\left(\boldsymbol{\theta}\right)}{(\boldsymbol{\theta})}{(\boldsymbol{\theta})}{(\boldsymbol{\theta})})\)
- \(h(\boldsymbol{y})=1/\prod_{{{(i,j)}\in\mathbb{Y}}}\boldsymbol{y}_{i,j}!\) \(\implies\) \(\boldsymbol{Y}\!_{i,j}{\stackrel{\mathrm{i.i.d.}}{\sim}}\, \text{Poisson}(\mu=\exp\mathchoice{\left(\boldsymbol{\theta}\right)}{(\boldsymbol{\theta})}{(\boldsymbol{\theta})}{(\boldsymbol{\theta})})\)

```
sim.geom <- simulate(y ~ sum, coef = log(2/3), reference = ~Geometric, response = "w",
output = "stats", nsim = 1000)
mean(sim.geom)
## [1] 1.926
```

```
sim.pois <- simulate(y ~ sum, coef = log(2), reference = ~Poisson, response = "w",
output = "stats", nsim = 1000)
mean(sim.pois)
## [1] 1.983
```

Similar means. But, what do they look like?

```
par(mfrow = c(1, 2))
hist(sim.geom, breaks = diff(range(sim.geom)) * 4)
hist(sim.pois, breaks = diff(range(sim.pois)) * 4)
```