Introduction

Single-nucleotide polymorphisms (SNPs) represent the most widespread type of sequence variation in genomes and their use in animal population studies is increasing, thanks to innovative methods for SNP identification and automated screening (Van Zyl et al., 2014). SNPs are attractive markers for population genetic inference as they are usually the most abundant type of variation in genomes, and SNP-genotyping assays have low error rate and do not require calibration between laboratories (see Brumfield et al., 2003; Morin et al., 2004). SNP-based assays have proven to be powerful tools for inferring genetic relationships, population history and population structure within and between populations in model animal species (for example, Ramos et al., 2009). However, their use is increasing even in non-model organisms (for example, Gomez-Uchida et al., 2011), especially through their transfer from model (often domestic) to closely related species (for example, VonHoldt et al., 2011; Goedbloed et al., 2013a). Unfortunately, this practise is known to arise an ascertainment bias that could possibly translate into errors when estimating population history and other demographic parameters (for example, Morin et al., 2004) making population samples appear more similar than they truly are (Brumfield et al., 2003). However, this bias usually have a weaker effect on closely related species (VonHoldt et al., 2011) and should not affect within-population comparisons, while it could ask for cautiousness to compare a distant species with the one used for the SNPchip development (Pertoldi et al., 2010).

The wild boar (WB) is an iconic game species of prime interest among European hunters and is therefore highly managed throughout its distribution range. Animals of heterogeneous origin, including alien subspecies and captive-bred individuals, have been often released to restore overhunted populations (Apollonio et al., 2014). Furthermore, in several cases farmed WBs were crosses between wild and domestic pigs (DP) (Scandura et al., 2011a; Goedbloed et al., 2013a; Canu et al., 2014).

Human-induced alterations to WB populations have likely been on-going since pre-Neolithic time, when humans started to manage and introduce them to islands as supplementary food supply (Vigne et al., 2009) and continued during the domestication process and the following co-existence with DP. Before the 16th century, pigs were kept free-ranging, and crossbreeding between human-reared and wild pigs would have been possible throughout Europe (White, 2011). Although these practises have apparently weakly affected the historic phylogeographic pattern of the species (Larson et al., 2005; Scandura et al., 2008; Vilaça et al., 2014), their nowadays occurrence can have an impact on the genetic makeup of modern populations (Scandura et al., 2011a, b; Goedbloed et al., 2013a).

Recent studies have revealed that the WB population inhabiting Sardinia island shows an admixed composition, resulting from a mix of ancient and recent introductions, and by the local hybridisation with DP (Scandura et al., 2011b). As the earliest palaeontological evidence for WB presence on the island dates back to the early Neolithic and coincide with the first human settlements (Wilkens, 2003), it is generally assumed that humans have brought the species to Sardinia, either as wild or very early domesticated form (Albarella et al., 2006). After escaping from human control, being isolated from continental populations, they have diverged from other WBs (Masseti and Mazza, 1996). As a consequence of phenotypic and biogeographic distinctness, Sardinian WBs were classified as a separate subspecies (Sus scrofa meridionalis Major, 1883; Apollonio et al., 1988). But crossbreeding with DP in some areas of Sardinia, where outdoor pig farming is still practiced, and the uncontrolled introduction of continental WBs, have threatened and possibly compromised the genetic identity of the island population.

Although the genetic diversity and population structure of the Sardinian population have been previously described with different classes of molecular markers including allozymes, mitochondrial DNA and microsatellites (Randi et al., 1989; Scandura et al., 2011b), recently developed genome-wide high-density porcine SNP panels (Ramos et al., 2009) represent a valuable source of diagnostic power for the investigation of genomic distinctiveness, phylogeographic patterns and population admixture. The use of the Porcine SNP60 Beadchip technology was very informative in exploring the patterns of population structure and genetic introgression in north-western European WB populations (Goedbloed et al., 2013a), disclosing patterns that were not easily detected by other markers.

In this study we use a genome-wide SNP-genotyping assay to assess the current genomic diversity and distinctiveness of the Sardinian WB population. In doing so, we specifically looked for signs of recent introgression and evaluated the possible impact on the overall genetic diversity and differentiation from continental wild populations and from DP. Considering that an unbalanced sampling can introduce a bias when comparing levels of genomic diversity and differentiation across a panel of populations (Pilot et al., 2014), we investigated the effect of sample size and sample composition on our estimated parameters.

Materials and methods

Sampling and genotyping

WB samples collected throughout Sardinia in the period 2003–2010 (N=99) were combined, for comparison, with WB samples from other six European regions (Central Italy N=15, Iberia N=18, Balkans N=55, South-Western Europe=54, North-Western Europe N=37, Central-Northern Europe N=17), giving a total of 295 individuals from 17 countries. In addition, 105 DP (with a number of individuals per breed from 8 to 10) belonging to five international (Berkshire, Duroc, Large White, Yorkshire, Pietrain) and five Italian breeds (Casertana, Nera Siciliana, Mora Romagnola, Calabrese, Cinta Senese) from the PigBioDiv project (see Megens et al., 2008) were used for comparison in addition to a sample of Sardinian free-ranging pigs (see Supplementary Table S1a, b in Supplementary Information). All individuals were opportunistically sampled and checked for relatedness after genetic analysis, as they all resulted to be unrelated no additional selection was performed. Animals were analysed with the Porcine SNP60 Beadchip (Ramos et al., 2009), according to the manufacturer’s instructions (http://www.illumina.com/products/porcineSNP60_dna_analysis_kit.ilmn). Analyses were performed on the 49 803 SNPs (referred to as 50 K) mapping on autosomes, out of 62 163 total assayed SNPs. All individual samples included in this study had a genotyping rate >0.9, with an average of 0.993.

Statistical analysis

To examine genetic differentiation among the Sardinian population and the other wild and domestic populations, we performed Principal Component Analysis (PCA) using FLASHPCA (Abraham and Inouye, 2014). To avoid the possible confounding effect related to the presence of loci in linkage disequilibrium the calculation was performed on a subset of 30 127 SNPs (referred to as 30 K) obtained by removing SNPs in linkage disequilibrium (r2>0.5) with PLINK 1.09 (Purcell et al., 2007). On the basis of the PCA results and in concordance with previous findings (Scandura et al., 2008), we grouped the populations into four main groups (domestic breeds—DP, Sardinian WB—WSar, Italian WB—WIta and the remaining European wild populations all together—WEur). The 50 K data set was used to assess variability levels of the populations by calculating minor allele frequencies (which indicates the abundance of rare alleles through the genome), as well as expected (HE) and observed (HO) heterozygosity within populations in PLINK. FIS and Hardy–Weinberg equilibrium were tested in the Sardinian population using GENEPOP 4.1 (Rousset, 2008), adjusting significance levels by the sequential Bonferroni procedure (Rice, 1989). Genetic differentiation among populations was estimated by calculating pairwise FST values in GENEPOP using the 30 K subset to avoid a bias owing to physical linkage between loci (Helyar et al., 2011).

With the aim to explore the genetic structure across populations and to assess the source of putative immigrants or introgressed individuals in the Sardinian population, a Bayesian clustering assignment was performed using the 30 K data set in STRUCTURE 2.2 (Pritchard et al., 2000), accessing to the Bioportal cluster at the University of Oslo (Kumar et al., 2009). Ten independent runs of 100 000 iterations, following a burn-in of 80 000 iterations, were performed for each value of K between 1 and 20, neglecting prior population information, assuming independence among loci, and allowing admixture. The Evanno method (Evanno et al., 2005) was applied to infer the most reliable number of genetic clusters (K). Within the identified K-value, among the 10 runs, we selected the one with the highest log-likelihood value and calculated the proportion of membership to each of the K clusters for any single individual (qi).

To evaluate the extent and nature of genetic introgression in the Sardinian WB, we first identified at each value of K, which of the K clusters was associated to the Sardinian sample, that is, the cluster to which the WSar population showed the highest average membership (QSar). Then we studied the variation of QSar at increasing values of K in the cumulative sample including all other populations in the data set, and selected the value of K at which this contribution levelled off (that is, K=12). In doing so, we minimised the possible noise due to an incomplete partitioning of Sardinian vs non-Sardinian populations. We subsequently investigated the distribution of qSar among WSar individuals at the selected K (K=12) and ordered individuals by qSar (from the purest to the most introgressed).

As both diversity measures and detected levels of differentiation can be affected by the number and nature of the individuals under study, we recalculated minor allele frequencies, HE, HO and pairwise FST and repeated the PCA using a subset of 15 individuals (equalling the smallest population) for each group. In so doing, we used three different subsamples of the Sardinian population: a random subsample of 15 individuals (Random), the 15 individuals at the upper extreme (Top) and the 15 at the lower extreme (Bottom) of the distribution of qSar. Similarly, a possible bias in the Bayesian analysis due to unequal sample size among populations was addressed by re-running STRUCTURE with the same parameters described above (K=1–10) on the random subset of 15 individuals per population.

Runs of homozygosity (ROH) were identified across the genome of Sardinian WB and compared with reference populations. Sliding windows of 10 Kbp (20 SNPs), allowing for one heterozygous call for each window, were checked across the genome using PLINK. To avoid overestimation of ROHs owing to rare allele removal, no additional filtering for low allele frequencies was applied. As ROH analysis is sensitive to sample size and within-population heterogeneity, we repeated it with the same parameters considering only homogeneous clusters on the basis of STRUCTURE results at K=12 and an even sample size (equalling N=8, that is, the smallest homogenous cluster in our data set), achieved by random sampling of individuals within clusters. For the WSar population, we also calculated ROH for the eight individuals at the two extremes of the qSar distribution (Top and Bottom). Differences between the normalised distributions of ROH values from the complete WSar data set and the reduced ones were tested using a two-sample Kolmogorov–Smirnov tests in R 3.0.2 (R Core Team, 2012).

To assess population-specific differences in the spatial distribution of ROHs across the genome, ‘ROH hotspots’ were identified as locations of the genome for which the proportion of individuals showing a ROH (fROH) exceeded the 99th percentile of the distribution of fROH in the population (Pemberton et al., 2012). In so doing, we grouped individuals into the four previously used populations (that is, WSar, WIta, WEur and DP). Then we checked the amount of ROH hotspots that were shared by different populations. Finally, for each individual, the number and cumulative size of ROHs in the genome was calculated and results averaged by population. As the chip methodology is likely to underestimate the amount of small ROHs in the genome (Bosse et al., 2012), only ROHs>10 Mbp were considered in this calculation.

Results

The PCA plot of the 400 individuals analysed with the 30 K SNPs clearly separates WB from DP and delineates a structuring of the WB sample into three major groups: WSar, WIta and WEur (Figure 1a). The first principal component (8.34% of variance) sharply distinguished between DP and WB, except for a small number of possible hybrid individuals. The second component (4.99% of variance) reflects differences among the three wild populations. Thus, the subdivision of our data set into four populations was confirmed.

Figure 1
figure 1

Principal component analysis (PCA) for the 30 K SNPs data set: (a) based on all the available samples; (b) based on a random subset of 15 individuals for each population; (c) based on the 15 Sardinian individuals with the highest QSar and the same random subset of 15 individuals for the other populations used in Figure 1b; (d) based on the 15 Sardinian individuals with the lowest QSar and the same random subset of 15 individuals for the other populations used in Figure 1b. WSar—WB Sardinia, WSar_R15—Random WB Sardinia, WSar_T15—WB Sardinia with the highest QSar values, WSar_B15—WB Sardinia with the lowest QSar values, WIta—WB Italy, WEur—WB Europe, DP—domestic pig.

Out of 49 803 SNPs analysed on the whole available sample, a total of 47 287 (95%) were polymorphic across populations, with average minor allele frequencies of 0.206. Within the four populations, the number of polymorphic loci ranged from 47 099 (94.6%) in DP to 28 600 (57.4%) in WIta, while the average minor allele frequencies ranged from 0.251 in DP to 0.141 in WSar and WIta (Table 1a). Overall, HE ranged from 0.190 in the WIta population to 0.387 in the DP (Table 1a), whereas HO was lower than HE, ranging from 0.161 (WSar) to 0.251 (DP). For WSar, the heterozygosity deficit implies a significant deviation from Hardy–Weinberg equilibrium (P<0.001), that could possibly be related to a moderately high-inbreeding coefficient (FIS=0.162), but also to a substantial degree of population substructure (see Scandura et al., 2011b).

Table 1 Indexes of genetic diversity in different WB populations and in domestic pigs, calculated on the complete data set (50 K)

Pairwise FST values, calculated from the 30 K SNPs data set, ranged from 0.091 between WEur and WIta to 0.169 between WSar and DP (Table 2a). FST between WSar and WEur amounted to 0.126, whereas it was 0.138 between WSar and WIta.

Table 2 Pairwise FST values calculated between populations, based on the 30 K data set

Concordantly with PCA, Bayesian ancestry inference using the programme STRUCTURE highlighted the differentiation of the WSar population (Supplementary Figure S1). Already at K=2 WSar was separated from all other populations, whereas further differentiation among the wild populations, on a geographic base, and among DP breeds was observed at higher K values. The Evanno method identified in K=3 the uppermost level of population structure separating the WSar population (QI=0.956) from both the DP (QII=0.842) and the other continental WB populations as a whole (QIII=0.814). WIta resulted intermediate between WEur and WSar until K=12, when it was first recognised as a separate cluster (Figure 2b and Supplementary Figure S1, Supplementary Information). As at this value of K, the mean QSar in non-Sardinian populations started to stabilise (Figure 2a), qSar values at K=12 were adopted to evaluate introgression in the Sardinian sample (Figure 3). Around two-thirds (68%) of individuals in the island had qSar>0.95, but the eight individuals (8%) at the lower extreme of the distribution showed a mean equal to 0.716, suggesting a recent introgression (all values<0.90). The qSar in the 15 most upper individuals of the distribution (Top) averaged 0.9997 (±0.0005 s.d.), whereas average qSar in the 15 lowest individuals was 0.8064 (±0.1512 s.d.).

Figure 2
figure 2

(a) Distribution of average QSar in non-Sardinian individuals at increasing K; (b) plot of individual assignments (qSar) inferred by the software STRUCTURE to the K clusters (K=2–12).

Figure 3
figure 3

Decreasing values of qSar averaged across the 10 runs at K=12 for the WSar population. The dashed lines indicate the limits of the 15 individuals with highest and lowest qSar values. The dotted lines indicate the limits of the eight individuals with highest and lowest qSar values.

In the WB population the autosomes showed an average of 36.24±13.16 (s.d.) ROHs per individual, with an average size of 11.596±3.646 Mbp. ROH size varied from 2.76 to 269.53 Mbp. WSar population had on average 40.58±11.65 ROHs per individual (mean size 11.509±13.947). Both average number and size of ROHs in Sardinia were intermediate with respect to other populations, despite having the highest frequency of short fragments (73% of ROHs<10 Mbp, Figure 4a) and a high number of long runs (0.56% of ROHs>100 Mbp against a mean of 0.31% in the other WB populations). Most ROHs were present in a single individual and only a small number could be observed in several animals. ROHs hotspots (falling above the 99th percentile of fROH) in the Sardinian WB were spread across the genome and 62.07% of them were shared by at least another population. The proportion of shared ROHs hotspots varied from 22.83% in WEur to 92.31% in DP (Supplementary Figure S3 in Supplementary Information). As to the number and cumulative size of medium and large-sized ROHs (>10 Mbp), Sardinian WB genomes showed a very similar distribution to that observed for the other mainland European populations (Supplementary Table S2 in Supplementary Information), with a maximum of 43 ROHs in one individual and a maximum cumulative size of around 1 Gbp. Much higher values were instead obtained for DP samples (up to 1.6 Gbp).

Figure 4
figure 4

Density distribution of runs of homozygosity (ROH) in: (a) the complete WB data set; (b) a subsample of WSar individuals (N=8), the two extremes of the qSar distribution and a random subsample are shown; (c) a random subsample (N=8) of six genetically homogeneous WB populations; (d) a random subsample (N=8) of five genetically homogeneous DP breeds. WSarB8—WB Sardinia with the lowest qSar values, WSarT8—WB Sardinia with the highest qSar values, WSarR8—random WB Sardinian individuals, WBal—WB Balkans, WFra—WB France, WGer—WB Germany, WIbe—WB Iberia, WIta—WB Italy, WSar—WB Sardinia, Cal—Calabrese, Cin—Cinta Senese, Dur—Duroc, LW—Large White, Rom—Mora Romagnola.

Taking an even sample size for the four populations by random selection of 15 individuals, the observed number of polymorphic sites decreased in all populations but more strongly in the Sardinian one (WSar—26%, WEur—20%, DP—4%). As expected, a general—though smaller—reduction was observed in the other indexes of genetic diversity (Table 1b). The genetic diversity of the Sardinian population changed a lot when the two compositionally extreme groups of individuals were taken into account. The use of subsamples of 15 individuals per population did not significantly alter the clustering into four groups in the PCA plot (Figure 1b–d). However, pairwise distances between the WSar population and the other reference populations changed (especially with WEur; Table 2b).

A more substantial change was observed when sample size was reduced to 15 random individuals in the STRUCTURE analysis (Supplementary Figure S2, Supplementary Information). The Evanno method identified K=2 as the most reliable subdivision, separating the DP (QII=0.910) from the WSar and WIta populations, which clustered into one group (QI=0.970). The WEur was intermediate between them (QI=0.575). Increasing the number of K had the immediate effect to distinguish the WSar and WIta populations (from K=3 on) and to split the non-Sardinian populations into more and more clusters roughly corresponding to geographic areas or domestic lines.

As concerns the ROH analysis, among the three WSar subsets, the number of segments per individual increased from admixed individuals (WSar_B8; 35.38±14.99 s.d.) to those at the opposite extreme of the qSar distribution (WSar_T8; 49.88±3.14 s.d., Figure 4b). An opposite trend was observed for ROH size, that was on average larger in the WSar_B8 subsample (16.943±20.234 Mbp) and shorter in the WSar_T8 subsample (11.405±15.346 Mbp). The subset composed by admixed individuals was also the only one with a distribution differing from the initial WSar sample (D=0.429, P=0.042). When sample size was standardized, WSar showed intermediate number and size of ROHs (38.25±8.38 and 11.41±15.35, respectively), where as in the reference populations the number of ROHs ranged between 16.88 (WFra) and 47.63 (WIta) and their size ranged between 9.92 (WIbe) and 19.28 (WGer; Figure 4c). A similar number of homozygous segments was observed in DP populations (observed range between 19.38 for Calabrese and 53.63 for Duroc), whereas the length of stretches was seemingly higher (observed range between 12.95 for Duroc and 34.53 for Mora Romagnola Figure 4d).

Discussion

Genomic distinctiveness and origin of the Sardinian WB

The goal of this study was to assess the genomic differentiation of the Sardinian WB population from its continental counterparts and from DP. Our analyses unambiguously show that, within the genomic variation observed in the European porcine population as a whole, the Sardinian wild population forms a highly divergent cluster.

Although a possible ascertainment bias derived from the ascertainment panel of the Illumina porcine SNP60 genotyping beadchip can lead to an overestimation of the actual divergence between WB and DP, it is considered to have no effect on the inference of the relative divergence among WB populations (see also Goedbloed et al., 2013b). The number of polymorphic sites in the WSar population resulted to be 80.9% of the whole S. scrofa sample and increased to 88.5% when only WB were considered, suggesting a marginal influence of the ascertainment bias in the assessment of genetic variation. Therefore, the position of the Sardinian wild population in the PCA plot and the estimate of pairwise FST values testify to its significant genomic differentiation from all other European WB. These observations were not influenced by the differences of sample size among populations, as a random selection of a uniform number of individuals, yet implying a strong reduction of sample size in the target populations, produced little changes on diversity parameters (except number of polymorphic loci; Figure 1 and Table 1). However, a non-random selection within the WSar population produced remarkable changes in diversity statistics and FST values. It is worth to notice that the strongest deviations (decrease in diversity and increase in genetic divergence) were observed when using the ‘purest’ individuals, suggesting that present levels of admixture could have partially masked the original genetic features of the insular population.

Bayesian clustering resulting from 30K SNPs strongly supported the distinctiveness of the Sardinian population, which was associated to a private cluster already from the lowest values of K up to K=17 (Supplementary Figure S2, Supplementary Information). Furthermore, the Sardinian population appears more differentiated from DP than from other continental wild populations, independently from the sample size of the analysed populations (Table 2). This contrasts with the expectation of extensive hybridisation with local free-ranging DP and could suggest recent introgression from exotic WB, which could have disguised the ‘historical’ pattern of divergence of the island population. Actually, recent introgression from DP was evident only in a limited number of wild individuals that were readily discernible by cluster analyses (Figure 3). Thus, its uniqueness does not seem to be systematically affected by introgression from modern DP. This was true even after the random reduction of sample size, with the WSar population emerging already at K=3. However, the effect of sample size was stronger in other populations, like WIta whose genetic structure was underestimated when the full data set was used (Supplementary Figures S1 and S2 in Supplementary Information).

Distinct features of Sardinian WB were initially detected at polymorphic enzymes (Randi et al., 1989). Scandura et al., (2008; 2011b) using 10 autosomal microsatellites, found the Sardinian WB clustering separately in a Bayesian analysis, with a minor overlap to other sampled populations. Mitochondrial data, in addition, showed a high percentage of private sequences in the island (Randi, 1995; Scandura et al., 2008). Overall, the detected molecular and morphological distinctiveness of the Sardinian WB (Apollonio et al., 1988) is congruent with its classification as a different subspecies (S. scrofa meridionalis). This taxon designation is shared by WBs occurring in Corsica, but the present status of the Corsican population is uncertain as no updated genetic data are currently available.

Of all continental WB populations, SNP data show highest genetic affinity of Sardinian WB with the present-day populations of the Italian peninsula (Figure 1 and Supplementary Figure S1 in Supplementary Information). Previous analyses on mtDNA of ancient and modern samples from mainland Italy and Sardinia revealed the occurrence of a private clade (D4 in Larson et al., 2005; E2 in Scandura et al., 2008; 2011a), which showed different haplotypes in the two populations (Scandura et al., 2008). Considering these data jointly, our study strongly supports the hypothesis that the current island population arose from an ancient colonisation from peninsular Italy, probably mediated by humans. Yet, multiple colonisation events from different source populations might have occurred. Furthermore, this hypothesis is supported by the lack of archaeological records before the early Neolithic (Wilkens, 2003; Albarella et al., 2006), whereas a natural colonisation would have most likely occurred during the sea lowering at the last Pleniglacial. However, there is no clear evidence on the status (wild or domestic) of the introduced individuals, as size differences in early specimens with respect to contemporary mainland populations are controversial (Albarella et al., 2006). A possible founder effect, followed by genetic drift under isolation, could explain the present divergence of the Sardinian WB. But an important role may have been played also by local adaptation, possibly promoted by limited food supply and intra-specific competition, as suggested by an early size reduction during the Neolithic (Albarella et al., 2006).

High genomic variation in the insular population

The observed SNP variation is within the range estimated for SNP data in other wild species, including bisons (Bison spp.—Pertoldi et al., 2010), wolf-like canids (Canis spp.—VonHoldt et al., 2011) and the Alaskan salmon (Oncorhynchus nerkaGomez-Uchida et al., 2011).

As stated above, even though we cannot compare diversity statistics between DP and WB in absolute values, we can confidently compare those among WB populations (see Bosse et al., 2012; Goedbloed et al., 2013b). The fraction of polymorphic SNPs was relatively high for an island population (76.8% of the total amount found across populations). By comparison, in the whole sample of continental WB (excluding Italy) this proportion amounted to 81.3% (see Table 1). The observed variability was still comparable with that reported for the non-isolated WIta population when a random subset of individuals was analysed (Table 1b). We suggest four possible explanations for such an unexpectedly high variation in this island population: (1) Sardinia was colonized by a large number of individuals; (2) repeated introductions took place from multiple sources; (3) since its origin the island population has maintained a relatively high population size; (4) population substructuring caused by landscape features is present. Actually, the Sardinian population has not undergone heavy demographic fluctuations in the last century, and WBs were abundant on the island even when they had almost disappeared across most of the Italian peninsula (Ghigi, 1950).

Patterns of ROHs help to elucidate which factors could have left a major signature in the genome of the island WB. Interestingly, the WSar population showed the highest number of short (<10 Mbp, Figure 4a) and a high number of very long ROHs (>100 Mbp). A random reduction of the sample size did not affect these results; however, levels of autozygosity assessed by ROHs differed when the individuals with lowest qSar were considered, producing a lower number of segments per individual, as expected in presence of recent hybridisation events. Short ROHs may derive from ancient bottlenecks (like in case of a funding event) and can be maintained through time by a low Ne. The evidence that many high-frequency ROHs in WSar were shared by other WB and DP populations might suggest an ancestral origin and a possible signature of positive selection on these homozygous regions (Pemberton et al., 2012), although a role of introgression cannot be excluded. Conversely, long ROH are sensitive to recent population changes (Bosse et al., 2012) and their presence suggests that groups of inbred individuals are likely to be present in the island. In fact, although the fraction of the genome occupied by ROHs was similar to continental populations, a few individuals showed an exceedingly high number and size of ROHs (Supplementary Figure S4 in Supplementary Information). As some of these animals either belong to a previously identified isolated subpopulation or show relatively low qSar values, their ROHs may derive from low Ne in local demes or from the release/escape of introgressed individuals from inbred captive stocks (see also Canu et al., 2014). Overall, this pattern is suggestive of a combination of past demographic events (bottlenecks) and a more recent natural or artificial genetic substructuring in the Sardinian population (see Scandura et al., 2011b).

Concluding, despite a certain level of recent introgression from both wild and domestic populations, the Sardinian WB still shows significant divergence and distinctiveness at both mitochondrial and nuclear loci. Accordingly, genetic data would support its, representing an evolutionarily significant unit, although field studies are needed to test its ecological exchangeability (Crandall et al., 2000). Actually, there is a general lack of data on the biology and ecology of the nowadays Sardinian WB, which strongly limits a full assessment of its conservation value.

Further investigations, applying complete genome sequencing, including ancient Sardinian and Corsican samples, will be useful to address outstanding questions on the origin and evolution of the populations inhabiting these two islands. In addition, further investigations are needed to address the genetic basis and adaptive relevance of their phenotypic distinctiveness, as long with a possible variation due to population substructuring.

Data archiving

The 49 803 autosomal SNP genotypes for 295 WBs and 105 DP (PLINK and STRUCTURE file format) were deposited in the Dryad data repository: doi:10.5061/dryad.8bf48