Quantifying Population Burden and Effectiveness of Decentralized Surveillance Strategies for Skin-Presenting Neglected Tropical Diseases, Liberia

We evaluated programmatic approaches for skin neglected tropical disease (NTD) surveillance and completed a robust estimation of the burden of skin NTDs endemic to West Africa (Buruli ulcer, leprosy, lymphatic filariasis morbidity, and yaws). In Maryland, Liberia, exhaustive case finding by community health workers of 56,285 persons across 92 clusters identified 3,241 suspected cases. A total of 236 skin NTDs (34.0 [95% CI 29.1–38.9]/10,000 persons) were confirmed by midlevel healthcare workers trained using a tailored program. Cases showed a focal and spatially heterogeneous distribution. This community health worker‒led approach showed a higher skin NTD burden than prevailing surveillance mechanisms, but also showed high (95.1%) and equitable population coverage. Specialized training and task-shifting of diagnoses to midlevel health workers led to reliable identification of skin NTDs, but reliability of individual diagnoses varied. This multifaceted evaluation of skin NTD surveillance strategies quantifies benefits and limitations of key approaches promoted by the 2030 NTD roadmap of the World Health Organization.

We excluded secondary yaws from our clinical case definitions due to its nonspecific presentations.
CHWs were trained to show the photos to all members of the household. After showing the photos, the CHW asked the household whether they or anyone in the household had a skin problem that "looks like any of the photos." If no household members absent at the time of the visit were initially referred by proxy, the household head or primary caregiver was directly prompted to act as the proxy respondent. Individual information was collected at this stage for all suspected skin NTD cases (age, sex, lesion type, and phone number), and each unique patient was provided with a QR-coded patient ID card. Follow-up teams re-captured patient ID cards to ensure accurate patient linkage between survey stages.
All data collection tools were designed on an ODK-based platform (SurveyCTO, Dobility, USA) with data checks and audit steps built into the form to ensure data reliability.
Data collection devices were Android-based smartphones that cost $35 per unit (Tecno Rise 32).
Among the CHW cohort, only 18 of 94 CHWs (19.1%) owned a smartphone, 26 (27.7%) did not own any type of phone, and 54 (57.5%) had not completed secondary school. At every household, CHWs were instructed to scan and distribute household ID cards. We also monitored coverage using GPS coordinates collected at each household. Upon completion of surveys, CHW data were uploaded and building coverage was checked against the most recently available open-source satellite imagery. If low building coverage was observed, CHWs were asked to return to complete the missing areas. This validation step was not possible in Barrobo districts because of a combination of heavy rainfall and poor mobile network coverage. Because power networks were limited in the rural locations of most clusters and screening lasted for 7 days, CHWs were also provided with high-capacity power banks (48,000 mAh) to enable this work.

Midlevel Health Worker Training Program
We recruited clinically trained verifiers for the duration of Maryland survey activities (4 months). All verifiers required physician assistant qualification with clinical experience as a midlevel health worker in Liberia. Our verifier cohort included nurses, physician assistants, community health services supervisors, and officers in charge. Following selection, verification teams were trained in the diagnosis of skin NTDs through a novel integrated 5-day clinical dermatology training workshop led by the Ministry of Health NTD program (ER, TM) and UK-based experts, including a consultant tropical dermatologist (MM, SW, JT). The residential training was based at Ganta Rehab Centre in Nimba County, a national referral hospital for Buruli ulcer and leprosy. Additional patients with lymphatic filariasis were recruited from nearby communities to facilitate practical experience. No symptomatic yaws patients were available for the training program.
Training began with an introductory day on fundamental dermatological concepts and common skin diseases. The common skin diseases covered were chosen on the basis of local epidemiology and focused on superficial fungal infections, impetigo and scabies, scrotal hernia, and ulcers of alternative etiology. Pedagogic elements were followed by a half-day skin clinic in a nearby village to give trainees experience with common presentations and differential diagnoses. The common skin disease module was followed by pedagogic and practical training with skin NTD patients over the remaining 3.5 days. Because validated clinical algorithms were lacking, we did not train verification teams using an algorithm-based approach for diagnosis of skin NTDs. We instead used a global assessment of symptoms and provided job aids listing common clinical symptoms and epidemiologic characteristics of diseases. Pedagogic and practical training elements were aligned with these definitions throughout training, and the job aids provided decision-making support in the field. The training program finished with a clinical assessment using cases recruited in the community and a written exam, with clinical feedback provided by the program's lead dermatologist. Additional training on the use of electronic data collection tools was undertaken over 2 days in Maryland County before initiation of activities (ER, JT, KEH).

Disease Verification
Following CHW screening, all cases were followed up by verification teams trained through the midlevel health worker (MLHW) training program. One member of the verification team was assigned to a health facility and provided with a full line list of suspected cases. Team members were based in the community for 7-10 days to follow up all patients and coordinated activities with CHWs and the community health services supervisor (CHSS) of the facility. Data were captured on electronic devices with custom-made ODK-based surveys that included assistive protocols for diagnostic approaches (skin examinations, swab sampling, rapid diagnostic tests). All laboratory samples were stored in cell lysis solution (Catalog no. 158908; QIAGEN, https://www.qiagen.com), transported in vaccine carriers, and stored daily in facility freezers. Samples were transported to a −20°C central freezer at JJ Dossen Hospital, Harper, after each phase of verification before shipment to the London School of Hygiene and Tropical Medicine, United Kingdom.

Quality Control of Screening Process
We aimed to perform quality control (QC) of the CHW screening process in all survey clusters. QC surveys were performed at household level to assess 1) coverage of CHW screening, 2) sensitivity of photo-based screening by CHWs, and 3) sociodemographic factors systematically associated with exclusion from CHW screening. QC surveys were undertaken by the CHSS of each of the county's 24 health facilities 1-6 days after CHWs completed screening. Training of the CHSS was delivered by members of the verification team who had participated in a 2-day training-of-trainers program led by members of the MoH and LSHTM research team (ER, JT, KEH). Each CHSS was trained one-to-one for a full day by a member of the verification team, and the first day of QC surveys was individually supervised. The CHSS visited each cluster for one full day, resulting in 3-4 days of surveys per facility. Each CHSS was provided with an electronic data capture tool that defined a random start point in the village and a different random walk procedure each day. At each household, the CHSS recorded household-level information to validate CHW information and collected additional sociodemographic data. Household members who were present were assembled and asked to verbally report whether the CHW had visited to show skin NTD photos; if so, the household ID was requested for recapture. Each consenting individual subsequently underwent a full-body skin examination in a private setting with appropriate lighting. Each CHSS was trained to record lesions that looked visually similar to those in the photo-based screening tools used by CHWs. All patients with identified lesions were asked to present their individual ID, which was then re-captured if available to differentiate between new cases and those previously identified during CHW screening.

Quality Control of Verification Process
We estimated the reliability of clinical skin NTD diagnoses made by verification teams using QC surveys. After completion of verification, QC teams visited persons who could be reached from the health facility on the day of the visit, a restriction imposed by logistical constraints. From the patients within this defined area, we targeted all patients diagnosed with skin NTDs by verification teams and a random selection of patients with alternative diagnoses. QC of clinical diagnoses was performed by members of the national case management NTD control program (ER, RG, TM), who visited patients in their own homes. No cases were assessed in either Barrobo district because the area became inaccessible in adverse weather conditions. Clinical diagnoses of yaws were not subject to analysis because cases were treated with azithromycin and follow-up visits could be over 14 days from initial diagnosis. We used inter-rater reliability measures to compare clinical diagnoses, reporting kappa scores for all diagnoses (R, psych v1.9.12). For individual skin NTD outcomes, kappa scores were not appropriate measures because of the high prevalence index introduced by the sampling design (2), and we instead present crude agreement measures.
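The reason kappa becomes unreliable under a high prevalence index can be illustrated with a minimal sketch. The 2x2 counts below are hypothetical and are not survey data; the function name and table layout are assumptions made for illustration only. Both tables have identical crude agreement, yet kappa falls sharply when nearly all sampled patients are positive.

```python
def agreement_stats(a, b, c, d):
    """Agreement measures for a 2x2 table of two raters.

    a = both raters positive, b = rater 1 positive only,
    c = rater 2 positive only, d = both raters negative.
    """
    n = a + b + c + d
    observed = (a + d) / n                      # crude agreement
    p1_pos = (a + b) / n                        # rater 1 positive rate
    p2_pos = (a + c) / n                        # rater 2 positive rate
    expected = p1_pos * p2_pos + (1 - p1_pos) * (1 - p2_pos)
    kappa = (observed - expected) / (1 - expected)
    prevalence_index = abs(a - d) / n           # imbalance of concordant cells
    return {
        "crude_agreement": observed,
        "kappa": kappa,
        "prevalence_index": prevalence_index,
    }

# Hypothetical counts (not from the survey): same crude agreement,
# very different prevalence index.
balanced = agreement_stats(40, 5, 5, 50)
skewed = agreement_stats(85, 5, 5, 5)
print(balanced)
print(skewed)
```

In this illustration, crude agreement is 0.90 in both tables, but kappa is much lower in the skewed table, which is the behavior that motivates reporting crude agreement for individual diseases.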

Statistical Analysis (Survey Analysis and Modeling)
To estimate the population prevalence of skin NTDs, we accounted for a stratified design in which primary sampling units (PSUs) were selected within health facility strata with probability proportional to size. Prevalence estimates and variance were adjusted for both strata population and first-order inclusion probabilities of PSUs. The survey sampling frame used adjusted 2008 census population data to ensure common implementation with the Maryland County health team.
Because of inaccuracies in population census data, some cluster boundaries were not aligned with CHW catchments, and some populations were evidently inaccurate on the basis of CHW survey data coupled with satellite imagery analysis. To account for this, if survey clusters were under minimum sizes, the nearest contiguous cluster(s) from the original sampling frame was also screened by the same CHW, and both were then included as a single survey cluster. If cluster boundaries did not match true CHW catchments, cluster boundaries were re-drawn. To account for these changes during analysis of prevalence estimates, strata cluster numbers and cluster-level inclusion probabilities were re-calculated on the basis of updated boundaries. This resulted in a sample of 92 of 185 total primary sampling units.
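The role of inclusion probabilities in the point estimate can be sketched as follows. This is a simplified illustration under stated assumptions, not the analysis code used for the survey: it uses a weighted ratio estimator with weights equal to the inverse of each PSU's inclusion probability, omits variance estimation and strata population adjustment, and all cluster values and field names are hypothetical.

```python
def weighted_prevalence_per_10k(clusters):
    """Inverse-probability-weighted prevalence per 10,000 persons.

    Each cluster is a dict with hypothetical keys:
      cases   - confirmed cases found in the cluster
      persons - persons screened in the cluster
      pi      - first-order inclusion probability of the cluster
    """
    weighted_cases = sum(c["cases"] / c["pi"] for c in clusters)
    weighted_persons = sum(c["persons"] / c["pi"] for c in clusters)
    return 10_000 * weighted_cases / weighted_persons

def crude_prevalence_per_10k(clusters):
    """Unweighted prevalence, shown only for comparison."""
    total_cases = sum(c["cases"] for c in clusters)
    total_persons = sum(c["persons"] for c in clusters)
    return 10_000 * total_cases / total_persons

# Hypothetical clusters (not survey data).
example_clusters = [
    {"stratum": "A", "cases": 4, "persons": 800, "pi": 0.8},
    {"stratum": "A", "cases": 0, "persons": 400, "pi": 0.4},
    {"stratum": "B", "cases": 1, "persons": 500, "pi": 0.5},
]

weighted = weighted_prevalence_per_10k(example_clusters)
crude = crude_prevalence_per_10k(example_clusters)
print(round(weighted, 1), round(crude, 1))
```

When cluster boundaries are merged or re-drawn, the `pi` values in a sketch like this would be the re-calculated inclusion probabilities rather than those from the original frame.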
To assess the equity of the CHW screening process, we used a matched case-control approach to identify household-level sociodemographic factors that were systematically associated with exclusion from CHW screening. Cases were defined as households not visited by CHWs during screening, confirmed both verbally and by the absence of a QR-coded household ID card. We aimed to randomly select 4 control households per case from within the same cluster. For some households, it was possible to select only 2-3 matched controls because of limited numbers. We built a conditional logistic regression model using sociodemographic data collected by QC teams and analyzed in R (survival version 3.1-12). All independent variables were tested for univariate association using likelihood ratio tests against initial parameter estimates. All variables showing a statistical association below a p-value threshold of 0.20 were included in a final multivariable model. Quantitative variables were assessed using pre-defined categories and included as linear predictors if categories did not improve model fit at a pre-defined threshold of p = 0.05 (likelihood ratio test).
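The univariate screening rule can be sketched for the simplest case of a variable that adds a single parameter to the model. The log-likelihood values below are hypothetical, the function names are assumptions, and the closed-form p-value applies only to 1 degree of freedom; categorical variables with several levels would need a chi-square distribution with more degrees of freedom.

```python
import math

def lrt_p_value_1df(loglik_null, loglik_alt):
    """Likelihood ratio test p-value for one added parameter.

    For a chi-square variable with 1 degree of freedom,
    P(X > x) = erfc(sqrt(x / 2)).
    """
    statistic = max(2.0 * (loglik_alt - loglik_null), 0.0)
    return math.erfc(math.sqrt(statistic / 2.0))

def passes_univariate_screen(loglik_null, loglik_alt, threshold=0.20):
    """True if the variable would be carried into the multivariable model."""
    return lrt_p_value_1df(loglik_null, loglik_alt) < threshold

# Hypothetical log-likelihoods (not from the fitted models).
p_strong = lrt_p_value_1df(-70.0, -68.0)   # statistic = 4.0
p_weak = lrt_p_value_1df(-70.0, -69.5)     # statistic = 1.0
print(round(p_strong, 4), passes_univariate_screen(-70.0, -68.0))
print(round(p_weak, 4), passes_univariate_screen(-70.0, -69.5))
```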
To understand CHW characteristics that could explain observed heterogeneity in referral rates between clusters, we used a mixed-effects generalized linear modeling approach (binomial family distribution). We defined the binary outcome at household level, with a positive outcome defined as the household having at least one individual referred for second-stage verification. We collected additional sociodemographic information from all CHWs during training workshops for screening activities as potential exposures at cluster level affecting referral rates. Additional cluster-level variables were extracted from open-source geographic information system datasets to define clusters as rural or urban and to measure the distance of the cluster to stable night lights (WorldPop, www.worldpop.org). A baseline model was set that included the number of persons and sex distribution at household level and two proxy measures of urbanization. Cluster-level (CHW-level) variables were added for univariate analysis, and all variables were included in a final model with a random intercept assigned to cluster-level covariates. Quantitative variables were assessed using pre-defined categories and included as linear predictors if categories did not improve model fit at a pre-defined threshold of p = 0.05 (likelihood ratio test).

Community Health Worker Screening Results
We quantified the proportion of household members who saw photos during CHW screening, with 34,916 of 56,825 persons recorded as present during screening surveys (61.4%).
The remaining 38.6% were not present during surveys and therefore relied on proxy answers from household members for referral. We present the distributions of the number of household members versus the number present to see photos in Appendix Figure 1.
The CHW screening process identified 3,087 persons who verbally reported the presence of skin NTD symptoms. There was operationally relevant variation in the referral rate for all skin NTDs at both health district level (range 3.1%-7.0%) and cluster level (range 0.5-23.0), as shown in Appendix Table 1 and Appendix Figure 2. Appendix Table 1 also quantifies the high referral rates in districts with large peri-urban centers (Harper and Pleebo). Although epidemiologic and environmental differences may drive natural variation in skin disease at these spatial scales, CHW demographics also varied by health district. For example, Harper and Pleebo contained 74.2% of all CHWs with secondary school qualifications despite only 56 of 92 (60.1%) of CHWs operating in these districts. We therefore aimed to identify possible factors within our CHW cohort associated with different rates of referral through a hierarchical modeling approach (Appendix Table 2). We observed an independent, inverse association between the age of the cluster's CHW and the odds of a household being referred during screening (OR 0.59, 95% CI 0.43-0.81; p = 0.001). We also observed a weak association between CHW education level and odds of referral (secondary incomplete OR 1.64, 95% CI 1.07-2.58; secondary complete OR 1.75, 95% CI 1.05-3.21; p = 0.06). We observed weak evidence that the distance of clusters from developed areas was associated with referral rates (distance to stable night lights; p = 0.06), with higher referral rates 1-10 km from developed areas that diminished beyond 10 km. These findings suggest that sociodemographic characteristics of CHWs, namely age, explain some of the observed variation in referral rates, with absolute location less influential.
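As a reading aid, the relationship between a reported odds ratio, its 95% CI, and the p-value can be sketched by back-calculating an approximate Wald statistic. This assumes the interval is symmetric on the log-odds scale and that a normal approximation applies; the p-values in this appendix may instead come from likelihood ratio tests, so agreement is expected to be approximate only. The function name is hypothetical.

```python
import math

def wald_p_from_or_ci(odds_ratio, lower, upper, z_crit=1.959964):
    """Approximate z statistic and two-sided p-value from an OR and 95% CI."""
    # Standard error of log(OR) recovered from the CI width on the log scale.
    se_log_or = (math.log(upper) - math.log(lower)) / (2.0 * z_crit)
    z = math.log(odds_ratio) / se_log_or
    # Two-sided normal tail probability.
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p

# Reported estimate for CHW age: OR 0.59, 95% CI 0.43-0.81.
z_age, p_age = wald_p_from_or_ci(0.59, 0.43, 0.81)
print(round(z_age, 2), round(p_age, 4))
```

Under these assumptions the back-calculated p-value is close to 0.001, consistent with the value reported for CHW age.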
Cases referred during screening were initially identified by household members selecting photographic case definitions. The pictures were distributed across 12 pages, and the total number of times each type of lesion was identified by individual referrals is presented in Appendix Table 3. Multiple lesion types could be selected by an individual referral, and in total 3,225 skin lesions were identified among 3,087 referrals with available information. The most common lesion referred for verification was an enlarged scrotum (23.6% of lesions). Hypopigmented skin patches (16.0%) and BU-like nodules or BU-like plaques (17.6% combined) were also common. Presentations associated with advanced stages of leprosy, namely deformities of the hands and feet (4.1%) and lepromatous leprosy presentations (7.6%), were the least common reasons for referral.
Health districts are arranged left to right in order of south-to-north geographic location (an approximate proxy for the increasingly rural nature of the county along this axis). The table shows the total households and persons screened and consenting during community screening for skin NTDs alongside referral rates calculated per 100 persons screened by CHWs. District-level referral rates showed statistical evidence of variation after accounting for cluster-level variance (likelihood ratio test p = 0.02).

Disease-Specific Clinical Epidemiology: Buruli Ulcer
During verification of suspected cases, a clinical diagnosis was made for suspected Buruli ulcer (BU), and laboratory samples (swabs or FNA) were collected from active lesions for confirmation by IS2404 PCR. The verification teams diagnosed 55 total cases of clinically suspicious BU, of which 1 (1.8%) was confirmed by PCR. We identified 3 additional cases of BU through PCR whose initial clinical suspicion was yaws (2) or tropical ulcer (1). All 4 PCR-confirmed cases presented with a single ulcerative lesion on the lower limb 2-7 cm in diameter.
The ages of the confirmed cases were 3, 15, 18 and 50 years old with 50% female. The lesions among two of the cases had begun within the past 12 weeks whereas for the other two cases, symptoms had been present for over 1 and 3 years respectively.
Among clinically suspicious BU cases only, the median age was older (44 years) and 42.0% of the cases were female. Most patients with clinically suspicious lesions reported persistence of the disease for extensive durations: 1-3 years (16/55; 29.1%) or over 3 years (26/55; 47.3%). Among these cases, 36.4% (20 of 55) reported limitation of movement as a result of the lesion. Most cases had ulcers (39 cases, 70.9%), with 2 instances each of plaque or nodule (3.6% each). A total of 12 cases were identified with suspected BU osteomyelitis (21.8%). Laboratory confirmation of BU osteomyelitis requires bone collection (3). We were able to collect clinical material from 7 of 12 actively discharging external lesions, all of which were PCR negative. It is plausible that additional BU osteomyelitis cases were within this 12-person cohort. Because PCR confirmation is not consistently acquired for BU cases reported to WHO (4), we include a sensitivity analysis of prevalence estimates inclusive of clinically suspicious BU cases and excluding 3 PCR-confirmed cases. This sensitivity analysis results in a design-adjusted prevalence of 9.8 cases/10,000 persons (95% CI 6.2-13.5) for BU and 43.0 cases/10,000 persons for all skin NTDs (95% CI 36.7-49.1).

Disease-Specific Clinical Epidemiology: Leprosy
We diagnosed 39 cases of leprosy during survey activities (4.4 per 10,000; 95% CI 3.3-5.5). All patients diagnosed with leprosy underwent full-body clinical examination and WHO/ILEP recommended field diagnostic tests (5): patch anesthesia testing and assessment of sensory loss in the hands or feet (Appendix Table 4).

Disease-Specific Clinical Epidemiology: Lymphatic Filariasis-Associated Illness
We diagnosed 111 cases of filarial lymphedema (17.5 cases/10,000 persons, 95% CI 14.1-21.0) and 58 cases of filarial hydrocele (8.5 cases/10,000 persons, 95% CI 4.8-12.3) in Maryland County. All but one case had lymphedema localized to the lower limbs (110 of 111; 99.1%). The remaining case had edema in both the upper arm and lower leg (0.9%). Among all patients, the median age was 48 (range 2-86), with 67.3% of cases in females (Appendix Figure 4). One patient <5 years of age was given a diagnosis of filarial lymphedema, which we acknowledge as a probable misclassification.
Verification teams were trained to grade lymphedema according to WHO guidelines (6).
Most cases were grade I (59.5%) (Appendix Table 5), followed by the most advanced form, grade III (27.9%), with the fewest observations of grade II (12.6%). Most patients reported being affected by lymphedema for >3 years (88.2%). The proportion of patients reporting limitation of movement caused by lymphedema was 25.2% but varied between grades (16.7% grade I; 28.6% grade II; 41.9% grade III). One surprising finding was a high proportion of patients diagnosed with bilateral lymphedema (23.4%) (Appendix Table 5). We did not attempt to differentiate between causes of acute pain associated with filarial pathology (acute filarial lymphangitis [AFL] and acute dermatolymphangioadenitis [ADLA]) but instead refer to all cases of patient-reported pain as ADLA. Nearly all patients reported being affected by ADLA attacks (97.3%), with the majority reporting acute attacks in cycles of approximately 1 month (46.6%) or between 1 and 3 months (32.4%).
For lymphatic filariasis hydrocele patients, the mean age was 43 (range 1-75), although we acknowledge the 1-year-old boy given a diagnosis of lymphatic filariasis hydrocele as a probable misclassification. Data on age were missing for 3 cases of lymphatic filariasis hydrocele. Most patients (77.6%) reported persistence of the condition for >3 years, yet only a few patients reported that the hydrocele limited their movement (13.8%) (Appendix Table 6).
Verification teams probed all hydrocele cases on any pain in the scrotum with 86.2% reporting pain with a typical periodicity of monthly (44.0%) or between 1-3 months (28.0%). Most patients also reported swollen lymph nodes (76.0%) and fever associated with the pain (92.0%).

Disease-Specific Clinical Epidemiology: Yaws
We identified 24 cases of serologically confirmed active yaws in Maryland County (2.6 cases/10,000 persons, 95% CI 1.4-3.9). Verification teams were trained to identify both clinically suspected yaws papillomas and ulcers. Patients were initially tested with a rapid  Table 7). Data on age were missing for 2 case-patients. A 32-year-old man had the only yaws case diagnosed in persons >18 years of age. Among the 24 case-patients, most diagnoses were made in males (18/24, 75.0%). The primary clinical presentation among yaws cases was evenly distributed between ulcerative and papillomatous forms of disease (50.0% each). The duration for which patients reported having active lesions was mostly <1 year (83.3%) but ranged from <8 weeks (37.5%) to >3 years (12.5%).

Spatial Heterogeneity and Coendemicity
To demonstrate the spatial heterogeneity of skin NTDs in Maryland County, we present occurrence maps for all skin NTDs (Figure 3 main text) and disease-specific outcome data at both health district and cluster level (Appendix Table 8 and Appendix Figure 5). In Liberia, health districts do not follow typical WHO definitions of health district based on population sizes. They instead represent sub-districts with population sizes ranging from 8,492 to 51,959.
Appendix Table 8 supports Figure 3 of the main text with estimates for all diseases at health district level. For the primary outcome, estimates ranged from 14.5 cases/10,000 persons (95% CI 8.4-20.5) in Pleebo to 75.7 cases/10,000 persons (95% CI 59.9-91.4) in Harper district.
Although we did not design the survey with the precision to measure differences at health district level, the primary outcome and individual diseases demonstrated overt variation in occurrence and magnitude at these implementation levels. These data also show that at health district level, the predominant pattern is co-endemicity of most skin NTDs, although specific diseases, namely yaws and BU, can remain absent. With the low prevalence of BU and yaws, however, we cannot confidently assert that the diseases are absent at these implementation levels.
Appendix Figure 5 summarizes the prevalence of each disease by cluster; the smallest survey unit of evaluation (population interquartile range 411-793  Table 2 of the main text. Both the disease-specific ICC values and Appendix Figure 5 highlight the predominant pattern of skin NTDs observed at cluster or community level: high prevalence within a limited number of foci, with total absence from most clusters. With the emergence of integrated skin NTD programs, understanding the scale and structure of disease co-endemicity remains essential. Appendix Figure 6 summarizes the variation in co-endemicity of diseases within clusters through intersection plots. Across the 92 survey clusters, we identified 9 unique combinations of skin NTD co-occurrence. The most common community-level outcome was lymphatic filariasis only (35 clusters), and single-disease foci were observed for each disease aside from BU (BU 0 clusters; leprosy 10 clusters; yaws 1 cluster). There were 22 of 92 clusters (23.9%) in which two or more skin NTDs were co-endemic, with leprosy and lymphatic filariasis most commonly found within the same community (24 clusters). Only 1 cluster demonstrated co-occurrence of 3 skin NTDs.
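The tabulation underlying an intersection plot can be sketched as a count of clusters by the exact set of diseases present. The cluster identifiers and case counts below are hypothetical and serve only to show the counting logic; they are not survey data.

```python
from collections import Counter

def cooccurrence_counts(cluster_cases):
    """Count clusters by the exact combination of diseases present.

    cluster_cases maps a cluster ID to a dict of disease -> confirmed cases.
    Returns a Counter keyed by frozenset of diseases with at least 1 case.
    """
    combos = Counter()
    for disease_counts in cluster_cases.values():
        present = frozenset(d for d, n in disease_counts.items() if n > 0)
        combos[present] += 1
    return combos

# Hypothetical clusters (not survey data).
example = {
    "c01": {"BU": 0, "leprosy": 0, "LF": 3, "yaws": 0},
    "c02": {"BU": 0, "leprosy": 1, "LF": 2, "yaws": 0},
    "c03": {"BU": 0, "leprosy": 0, "LF": 0, "yaws": 0},
    "c04": {"BU": 0, "leprosy": 0, "LF": 1, "yaws": 0},
    "c05": {"BU": 1, "leprosy": 2, "LF": 1, "yaws": 0},
}

combos = cooccurrence_counts(example)
# Clusters in which two or more diseases co-occur.
coendemic_clusters = sum(n for combo, n in combos.items() if len(combo) >= 2)
for combo, n in combos.most_common():
    print(sorted(combo) or ["none"], n)
```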
Relative to an alternative approach of vertical survey activities, these data provide strong epidemiologic justification for the efficiency gains made through integration. Under single-disease estimation strategies, most clusters would have zero reported cases, whereas under an integrated model multiple skin NTDs can be identified simultaneously. Although co-endemicity is not the predominant pattern at cluster level, it is not uncommon. Disease co-occurrence appears to become more predominant at health district level, where most skin NTDs coexist.

QC of Screening
During QC of screening we identified a subpopulation of households who reported that they were not visited by CHWs during community screening. To assess the equity of community-based approaches led by CHWs, we attempted to identify socioeconomic indicators that may be associated with nonparticipation (Methods).
Among 1,379 consenting households, we identified 52 households that were not visited by CHWs during screening. We selected 142 matched control households for final analysis.
Univariate analysis indicated that using more expensive cooking fuel (OR 2.86, 95% CI 1.06-8.08), total residents within the household (OR 0.88, 95% CI 0.78-1.01), and crowding (OR 0.46, 95% CI 0.20-1.04) showed some evidence of association with nonparticipation in screening (Appendix Table 9). Independent associations were not evident within the final model (cooking fuel OR 2.33, 95% CI 0.78-6.9, p = 0.13; residents OR 0.94, 95% CI 0.81-1.09, p = 0.40; crowding p = 0.59). These results provide no evidence that socioeconomic status of the household was associated with exclusion from CHW screening. Coupled with high CHW household coverage rates estimated from QC surveys, these findings support the equitable nature of CHW screening for skin NTDs.

Sensitivity Analysis of Prevalence Estimates
QC surveys of CHW screening identified the sensitivity of identifying skin NTD lesions using photo-based screening methods. Using this information, we estimate the effect on survey outcomes through sensitivity analysis by adjusting prevalence rates and their CIs accordingly (7).
Because our evaluation methods did not enable us to understand variation in sensitivity by absolute location or individual skin NTD outcome, sensitivity analyses are not adjusted to account for these factors. By quantifying the new case detection rate from full-body skin examinations during QC surveys (3.6 cases/100 persons examined; 95% CI 3.1-4.2), we estimated that the total number of referable cases among the survey population was as high as 5,137 (vs. 3,087 reported by CHWs: sensitivity 60.1%). Assuming that skin NTDs are diagnosed at the same rate among these cases, this increased the maximum prevalence across all skin NTDs to 56.5 cases/10,000 persons (95% CI 48.4-64.7) from 34.0 cases/10,000 persons. Appendix Table 10 shows how potential reductions in sensitivity may affect final prevalence estimates for each disease outcome.

Appendix Table 10. Sensitivity analysis of total skin NTD cases and prevalence estimates (per 10,000)
Columns: Cases; Survey cases; Estimated prevalence per 10,000 (95% CI)
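The arithmetic behind the sensitivity adjustment in this section can be sketched as follows, using only figures stated in the text. The sketch assumes, as the text does, that skin NTDs are confirmed at the same rate among missed referable cases; the function name is hypothetical, and the simple division by sensitivity does not reproduce the design-adjusted confidence interval.

```python
def sensitivity_adjusted_prevalence(observed_prevalence, reported, referable):
    """Scale an observed prevalence by the screening sensitivity.

    reported  - referable cases actually referred by CHWs
    referable - estimated total referable cases in the survey population
    """
    sensitivity = reported / referable
    adjusted = observed_prevalence / sensitivity
    return sensitivity, adjusted

# Figures from the text: 3,087 referrals reported of an estimated 5,137
# referable cases, and 34.0 cases/10,000 persons observed.
sens, adj_prev = sensitivity_adjusted_prevalence(34.0, 3087, 5137)
print(f"sensitivity = {sens:.1%}, adjusted prevalence = {adj_prev:.1f} per 10,000")
```

The result is approximately 60.1% sensitivity and roughly 56.6 cases/10,000 persons, which matches the reported 56.5 cases/10,000 persons to within rounding of the inputs.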