The IgAScores package provides functions to calculate IgA binding indices from IgA-Seq data sets.

In IgA-Seq, bacteria within a sample are stained with an anti-IgA antibody and sorted into bound (IgA+) and unbound (IgA-) fractions. The taxonomy of these bacteria is then profiled using 16S rRNA gene sequencing of DNA amplified from the sorted fractions. The functions in this package generate scores of relative binding per taxon per sample from the resulting data.

We recommend using the Probability Ratio to score IgA binding but make other scoring approaches available for comparison. See the IgAScores paper for a detailed consideration of IgA scoring methods.

The different scoring approaches require different inputs these can include:

- The abundance of the taxon in the IgA+ fraction -
**IgA+ abundance** - The abundance of the taxon in the IgA- fraction -
**IgA- abundance** - The fraction of all bacteria in the sample that are sorted into the IgA+ fraction -
**IgA+ size** - The fraction of all bacteria in the sample that are sorted into the IgA- fraction -
**IgA- size** - The abundance of the taxon in the sample pre-sorting -
**Presort abundance** - A pseudo count to add to zero values -
**Pseudo**

Here we generate some dummy data to demonstrate the IgAScores functions.

```
#load in IgAScores
library(IgAScores)
#dataframes with counts for the bacterial taxa in the IgA+ and IgA- fractions and presort sample, as would be produced by 16S rRNA appraoches such as DADA2
igapos <- data.frame(Sample1=c(100,0,1,2,10),Sample2=c(110,0,11,42,50),Sample3=c(140,60,10,3,0))
iganeg <- data.frame(Sample1=c(200,0,40,20,4),Sample2=c(10,30,110,2,5),Sample3=c(30,20,0,123,20))
presort <- data.frame(Sample1=c(150,10,50,30,5),Sample2=c(100,30,115,20,10),Sample3=c(30,20,10,100,25))
taxnames <- c("Taxon1","Taxon2","Taxon3","Taxon4","Taxon5")
rownames(igapos) <- taxnames
rownames(iganeg) <- taxnames
rownames(presort) <- taxnames
#convert the counts to relative abundances using the included helper function
igapos <- relabund(igapos)
iganeg <- relabund(iganeg)
presort <- relabund(presort)
#iga+ and iga- fraction sizes per sample (fraction, if a percentage divide by 100)
possize <- c(Sample1=0.04,Sample2=0.05,Sample3=0.03)
negsize <- c(Sample1=0.54,Sample2=0.47,Sample3=0.33)
#set a pseudo count for handling zero values in some scoring methods
#this should be of a similar value of the minimum non-zero observed value (e.g. if minum values is 0.007 use 0.001)
pseudo <- 0.001
```

To calculate IgA scores across all of the taxa and samples in an experiment the *igascores()* function should be used. This enables calculation of all the different scores via the *method* argument, the requirements for each of the scores is shown below.

Score | Method name | Inputs required |
---|---|---|

Probability Ratio | probratio | IgA+ abundance, IgA- abundance, IgA+ size, IgA- size, pseudo |

IgA+ Probability | prob | IgA+ abundance, Presort abundance, IgA+ size |

Palm index | palm | IgA+ abundance, IgA- abundance, pseudo |

Kau index | kau | IgA+ abundance, IgA- abundance, pseudo |

For example, calculating the **Probability Ratio** on the example data:

```
#default method is probratio
prscores <- igascores(posabunds = igapos, negabunds = iganeg,
possizes = possize, negsizes = negsize,
pseudo = pseudo)
print(prscores)
#> Sample1 Sample2 Sample3
#> Taxon1 -0.3505492 -0.02065829 -0.1340168
#> Taxon2 NA -0.65261507 -0.1903192
#> Taxon3 -0.5954181 -0.65482619 0.1272275
#> Taxon4 -0.4632094 0.06382045 -0.7238482
#> Taxon5 -0.1019485 -0.03272343 -0.5154269
```

Note the NA in Sample 1’s estimate for Taxon 2. This is because the taxa was not observed in either of the IgA+ or IgA- fractions. Adding a pseudo count to both would create an artificial estimate that might actually be higher than real observed values, thus IgAScores won’t score values absent in both fractions. This behavior can be controlled using the *nazeros* parameter, but it is recommended to leave this as default. Similarly, the Probability Ratio has a *scaleratio* parameter, this scales the values between -1 and 1 by adjusting for the size of the pseudo count, again it is recommended to leave this on by default.

An example for the *IgA+ Probability*

```
ppscores <- igascores(posabunds = igapos, possizes = possize, presortabunds = presort, method="prob")
print(ppscores)
#> Sample1 Sample2 Sample3
#> Taxon1 0.057817109 0.07100939 0.1215962441
#> Taxon2 0.000000000 0.00000000 0.0781690141
#> Taxon3 0.001734513 0.00617473 0.0260563380
#> Taxon4 0.005781711 0.13556338 0.0007816901
#> Taxon5 0.173451327 0.32276995 0.0000000000
```

The IgA+ Probability is a direct estimate of the probability a bacteria will be bound to IgA and in the IgA+ fraction given that it belongs to the given taxon. Note that the opposite (the IgA- Probability) can be calculated by swapping out the IgA+ abundance and IgA+ size for the IgA- abundance and IgA- size.

Examples for the **Kau** and **Palm** indices:

```
kscores <- igascores(posabunds = igapos, negabunds = iganeg, pseudo=pseudo, method="kau")
print(kscores)
#> Sample1 Sample2 Sample3
#> Taxon1 0.3905989 0.6120783 0.6321241
#> Taxon2 NA -0.6144172 0.2823114
#> Taxon3 -0.4214603 -0.7851551 0.3891377
#> Taxon4 -0.2157185 0.4519001 -0.8066183
#> Taxon5 0.2618282 0.4054535 -0.5074027
pscores <- igascores(posabunds = igapos, negabunds = iganeg, pseudo=pseudo, method="palm")
print(pscores)
#> Sample1 Sample2 Sample3
#> Taxon1 1.16814159 8.10798122 4.22848200
#> Taxon2 NA 0.00000000 2.71830986
#> Taxon3 0.05840708 0.07370892 46.94835681
#> Taxon4 0.23362832 15.47887324 0.02210008
#> Taxon5 5.84070796 7.37089202 0.00000000
```

These methods implement the scores described by Kau et al. and Palm et al. respectively.

For most experimental purposes the *igascores()* function will be more useful, and allows calculation of scores for all taxa across all samples in an experiment. But if a single taxon and sample are to be scored, such as for custom wrapping of the functions in other scripts, individual functions are available for the four methods:

```
igaprobabilityratio(posabund = igapos[1,1], negabund = iganeg[1,1], possize = possize[1], negsize = negsize[1], pseudo = pseudo)
#> Sample1
#> -0.3505492
igaprobability(withinabund = igapos[1,1], presortabund = presort[1,1], gatesize = possize[1])
#> Sample1
#> 0.05781711
kauindex(posabund = igapos[1,1], negabund = iganeg[1,1], pseudo = pseudo)
#> [1] 0.3905989
palmindex(posabund = igapos[1,1], negabund = iganeg[1,1], pseudo = pseudo)
#> [1] 1.168142
```

Simulated IgA-Seq data is used to validate scoring approaches in the IgAScores paper. This can be replicated using the *simulateigaseq()* function. This has several parameters for customising the simulation, which are detailed in the functions documentation. The basic data returned by the simulation are shown below, these can then be used to calculate indices using the functions above.

```
#run the simulation with defaults
simdata <- simulateigaseq()
summary(simdata)
#> Length Class Mode
#> presortcounts 100 -none- numeric
#> presortabunds 10 data.frame list
#> poscounts 10 data.frame list
#> posabunds 10 data.frame list
#> negcounts 10 data.frame list
#> negabunds 10 data.frame list
#> possizes 10 -none- numeric
#> negsizes 10 -none- numeric
#> igabinding 3 data.frame list
#> igavalmeans 10 -none- numeric
#> igavalsds 10 -none- numeric
#> posthresh 1 -none- numeric
#> negthresh 1 -none- numeric
#> expgroup 10 -none- numeric
#> expspecies 0 -none- NULL
```

Full analysis scripts demonstrating the use of IgAScores and how to compare IgA binding scores between different experimental conditions can be found in the GitHub repository containing the analyses from the paper.