The halk package is a suite of functions for estimating the age of organisms (namely fish) from empirically measured size. One main implementation of this is the hierarchical age-length key, also known as a HALK.
The HALK is a data-borrowing age estimation method used primarily in fisheries ecology. It extends the traditional age-length key (ALK) by borrowing data across time, space, or any other nested level to create nested ALKs, which are then used to estimate the age of fish from empirically measured length. For example, if you have survey data in which length was measured but no age sub-samples were taken, you can still get some information on age by borrowing data from the same lake in different years, or from nearby lakes.
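To make the idea of a traditional ALK concrete, here is a minimal base-R sketch. It assumes made-up age and length values and 1-unit length bins, and it is not the halk package's own implementation: it only shows that an ALK is a table of age proportions within each length bin.

```r
# Toy paired age-length data (made-up values, for illustration only)
laa <- data.frame(
  age    = c(0, 0, 1, 1, 1, 2, 2, 3),
  length = c(1.1, 1.4, 3.2, 3.8, 4.1, 4.6, 5.3, 5.9)
)

# Bin lengths into 1-unit bins, then cross-tabulate length bin by age
laa$bin <- floor(laa$length)
counts <- table(length = laa$bin, age = laa$age)

# Convert counts to row-wise proportions: P(age | length bin)
toy_alk <- prop.table(counts, margin = 1)
toy_alk
```

Each row of `toy_alk` sums to 1, so a row can be read as the age distribution of fish in that length bin.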
A HALK is created by passing paired age-length data to the `make_halk` function. There are two main arguments to this function: `data`, which represents the paired age-length data, and `levels`, which is a character vector of the column names that represent the different nested levels in the HALK. For example, in the following data, you can pass any combination of `spp`, `county`, and `waterbody` as `levels`:
head(wb_spp_laa_data)
#> # A tibble: 6 × 5
#>   spp      county   waterbody   age length
#>   <chr>    <chr>    <chr>     <int>  <dbl>
#> 1 bluegill county_A lake_a        0    1
#> 2 bluegill county_A lake_a        0    1
#> 3 bluegill county_A lake_a        0    0.9
#> 4 bluegill county_A lake_a        0    1
#> 5 bluegill county_A lake_a        0    1.1
#> 6 bluegill county_A lake_a        0    1.1
`make_halk` fits a HALK based on the user-specified levels. Say that we pass `spp`, `county`, and `waterbody` as `levels`. This creates an ALK for each waterbody, an ALK for each county, and a species-wide global ALK.
spp_county_wb_alk <- make_halk(
  wb_spp_laa_data, levels = c("spp", "county", "waterbody")
)
head(spp_county_wb_alk)
#> # A tibble: 6 × 4
#>   spp      county   waterbody alk
#>   <chr>    <chr>    <chr>     <list>
#> 1 bluegill county_A lake_a    <alk [10 × 10]>
#> 2 bluegill county_A lake_b    <alk [12 × 10]>
#> 3 bluegill county_A lake_c    <alk [11 × 10]>
#> 4 bluegill county_A <NA>      <alk [13 × 10]>
#> 5 bluegill county_B lake_a    <alk [11 × 10]>
#> 6 bluegill county_B lake_b    <alk [12 × 10]>
The returned tibble contains a list-column named `alk` that stores an ALK for each level provided to the `levels` argument. Note that the fourth row, the ALK for `county_A`, has an `NA` in the `waterbody` column, indicating that it is a county-wide ALK. Each object in this list-column is simply an ALK created using all data from the level indicated by the non-`NA` columns in that row.
# Bluegill ALK for lake_a in county_A, from row #1 above
head(spp_county_wb_alk$alk[[1]])
#> # A tibble: 6 × 10
#>   length  age0   age1   age2   age3  age4  age5  age6  age7  age8
#>    <dbl> <dbl>  <dbl>  <dbl>  <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1      0     1 0      0      0          0     0     0     0     0
#> 2      1     1 0      0      0          0     0     0     0     0
#> 3      3     0 1      0      0          0     0     0     0     0
#> 4      4     0 0.918  0.0816 0          0     0     0     0     0
#> 5      5     0 0.0870 0.870  0.0435     0     0     0     0     0
#> 6      6     0 0      0.756  0.244      0     0     0     0     0
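The nesting described above, with one ALK per waterbody plus a pooled ALK per county, can be sketched in base R. This is only an illustration under stated assumptions: `build_toy_alk` is a hypothetical helper, the data are made up, and the real `make_halk` may bin lengths and store results differently.

```r
# Hypothetical helper: proportion of each age within each 1-unit length bin
build_toy_alk <- function(d) {
  prop.table(table(length = floor(d$length), age = d$age), margin = 1)
}

# Made-up paired age-length data with two nested levels
laa <- data.frame(
  county    = c("county_A", "county_A", "county_A", "county_A",
                "county_B", "county_B"),
  waterbody = c("lake_a", "lake_a", "lake_b", "lake_b", "lake_a", "lake_a"),
  age       = c(0, 1, 1, 2, 0, 1),
  length    = c(1.2, 3.4, 3.1, 5.0, 1.0, 3.3)
)

# Finest level: one ALK per waterbody within a county
wb_alks <- lapply(
  split(laa, list(laa$county, laa$waterbody), drop = TRUE),
  build_toy_alk
)

# Coarser level: one ALK per county, pooling all of its waterbodies
county_alks <- lapply(split(laa, laa$county), build_toy_alk)

names(wb_alks)
names(county_alks)
```

The county-level ALKs draw on more fish than any single waterbody ALK, which is what makes them a useful fallback when a waterbody has little or no age data.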
The halk package makes it easy to get age assignments from a HALK using the `assign_ages` function. Once you have created a HALK, simply pass it to `assign_ages` along with the length data you wish to have ages estimated for. Make sure that your length data has all of the columns named in the `levels` argument passed to `make_halk`.
est_ages <- assign_ages(wb_spp_length_data, spp_county_wb_alk)
head(est_ages)
#> # A tibble: 6 × 7
#>   spp      county   waterbody length est.age alk       alk.n
#>   <chr>    <chr>    <chr>      <dbl>   <dbl> <chr>     <int>
#> 1 bluegill county_A lake_a       1         0 waterbody   371
#> 2 bluegill county_A lake_a       1         0 waterbody   371
#> 3 bluegill county_A lake_a       0.9       0 waterbody   371
#> 4 bluegill county_A lake_a       1.1       0 waterbody   371
#> 5 bluegill county_A lake_a       1.1       0 waterbody   371
#> 6 bluegill county_A lake_a       1         0 waterbody   371
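One way to turn a row of an ALK into age estimates is to draw ages at random with the row's proportions as probabilities. The sketch below assumes that approach purely for illustration. Whether `assign_ages` samples at random or allocates ages proportionally is not stated here, so treat this as a conceptual example rather than the package's method.

```r
set.seed(1)  # make the random draw reproducible

# Age proportions for the 4-unit length bin, copied from the lake_a ALK
# printed earlier (ages 0 through 8)
ages  <- 0:8
probs <- c(0, 0.918, 0.0816, 0, 0, 0, 0, 0, 0)

# Draw an age for each of 100 hypothetical fish in that length bin;
# sample() rescales probs, so they need not sum to exactly 1
drawn <- sample(ages, size = 100, replace = TRUE, prob = probs)
table(drawn)
```

Because only ages 1 and 2 have non-zero proportions in that row, every drawn age is 1 or 2, with age 1 far more common.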
Notice that the `est_ages` object contains lakes that were not present in the original length-at-age data used to create the `spp_county_wb_alk` object. Ages for fish from these lakes are estimated with the county-wide ALK, as the `alk` column shows.
head(est_ages[est_ages$waterbody == "lake_x", ])
#> # A tibble: 6 × 7
#>   spp      county   waterbody length est.age alk    alk.n
#>   <chr>    <chr>    <chr>      <dbl>   <dbl> <chr>  <int>
#> 1 bluegill county_A lake_x       1         0 county  1088
#> 2 bluegill county_A lake_x       1.3       0 county  1088
#> 3 bluegill county_A lake_x       0.8       0 county  1088
#> 4 bluegill county_A lake_x       1         0 county  1088
#> 5 bluegill county_A lake_x       1.1       0 county  1088
#> 6 bluegill county_A lake_x       1.2       0 county  1088
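The fallback behaviour seen here, where a lake without its own ALK is aged with the county-wide ALK, can be pictured as a lookup that tries the finest level first. The sketch below is a simplified assumption of that logic: `pick_level`, `wb_keys`, and `county_keys` are hypothetical names, not part of the halk package.

```r
# Hypothetical record of which levels have an ALK available
wb_keys     <- c("county_A.lake_a", "county_A.lake_b")
county_keys <- c("county_A", "county_B")

# Return the finest level that has an ALK for a given fish
pick_level <- function(county, waterbody) {
  if (paste(county, waterbody, sep = ".") %in% wb_keys) return("waterbody")
  if (county %in% county_keys) return("county")
  "spp"  # otherwise fall back to the species-wide ALK
}

pick_level("county_A", "lake_a")  # waterbody ALK exists
pick_level("county_A", "lake_x")  # no lake ALK, so use the county ALK
pick_level("county_C", "lake_a")  # unknown county, so use the species ALK
```

Trying levels from finest to coarsest means each fish is aged with the most local data available, and coarser data are borrowed only when needed.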