Introduction

Potato (Solanum tuberosum ssp. tuberosum L.) is an important food crop in temperate climates. Worldwide, more than 19 million ha of potatoes are grown, with a total economic value higher than 31 billion US$ (www.potato2008.org/en/world/index.html). Potato was brought to Europe by Spanish conquistadors in the sixteenth century from its centre of origin, the Andean region of South America, where it was domesticated about 8,000 years ago (Bradshaw and Ramsay 2005). However, with the introduction of potato as a crop, a range of pathogens and pests was introduced as well, causing devastating yield losses, including the Great Irish Famine caused by the oomycete Phytophthora infestans (Bradshaw and Ramsay 2005). Therefore, breeding for resistance is central to disease control in potato.

To defend themselves against pathogens and pests, plants have evolved an innate surveillance system encoded by a large set of resistance genes. Most resistance genes are single dominant genes and confer resistance in a gene-for-gene specific manner (Flor 1971). Most known resistance genes belong to the class of genes encoding nucleotide-binding (NB) and leucine-rich repeat (LRR) domains (reviewed by Tameling and Takken 2008). R genes that belong to this super-family can be subdivided into genes encoding proteins with an N-terminal coiled-coil (CC) domain or a Toll/interleukin-1 receptor (TIR) domain. They confer resistance to completely unrelated taxonomic groups like bacteria, fungi, viruses and nematodes by the activation of a defence response that prevents the pathogen from spreading (Bendahmane et al. 1999; Lawrence et al. 1995; Milligan et al. 1998; Mindrinos et al. 1994; van der Vossen et al. 2000; Vos et al. 1998; Whitham et al. 1994).

To date, thirteen functional NB-LRR resistance genes have been identified in potato i.e. R1 (Ballvora et al. 2002), Rpi-blb2 (van der Vossen et al. 2005), Gro1 (Paal et al. 2004), R3a (Huang et al. 2005), Gpa2 (van der Vossen et al. 2000), Rx1 (Bendahmane et al. 1999), Rx2 (Bendahmane et al. 2000), Rpi-blb3, Rpi-abpt, R2, R2-like, and Rpi-mcd1 (Lokossou et al. 2009), and Rpi-blb1 (van der Vossen et al. 2003). However, the potato genome harbours many more functional NB-LRR genes. In twenty or more regions in the potato genome, one or more resistance traits have been mapped, but none of the underlying genes have been identified yet (Bakari et al. 2006; Caromel et al. 2005; Caromel et al. 2003; Celebi-Toprak et al. 2002; Flis et al. 2005; Gebhardt and Valkonen 2001; Grube et al. 2000; Hein et al. 2009; Marczewski et al. 2001; Marczewski et al. 2002; Marczewski et al. 2006; Sato et al. 2006; Song et al. 2005; Szajko et al. 2008).

To facilitate the cloning and characterisation of resistance genes, various genetic maps have been constructed for the potato genome. The first genetic map was published in 1988 (Bonierbale et al. 1988). In 2001, Gebhardt and Valkonen published a genetic map of potato showing the location of all resistance traits known at that time. From this study, they could conclude that the distribution of R genes and QTLs is not random, but that they often reside in clusters or so-called ‘hotspots of resistance’ in the genome. Later on, an ultra high dense (UHD) genetic map with 10,365 AFLP loci in 1,118 bins was constructed (van Os et al. 2006), providing genome-wide marker saturation that facilitated the construction of a genome-wide physical map of potato (Borm 2008).

In this study, the UHD map (van Os et al. 2006) and the physical map (Borm 2008) of the diploid potato clone RH89-039-16 (RH) were used to construct a genetic map showing the genome-wide distribution of NB-LRR resistance gene homologs in the potato genome. In total, a set of 288 selected BAC clones derived from 47 NB-LRR resistance gene loci has been sequenced, which resulted in the detection of 738 partial and full length RGHs using BLASTp (Gish and States 1993). The RGHs can be subdivided into 448 CC-NB-LRR encoding sequences located at 32 different loci, whereas 280 sequences belong to the TIR-NB-LRR subclass derived from 15 different loci. Positional information of 82 resistance traits previously described was retrieved from literature and included in this study, resulting in a comprehensive integrated genome-wide genetic map of NB-LRR encoding RGHs and disease resistance loci in potato. Potential applications of the results obtained in this study to enhance the identification of genes underlying important disease resistance traits in potato and other solanaceous species will be discussed.

Materials and methods

BAC library screening

A BAC library of the diploid S. tuberosum ssp. tuberosum clone RH89-039-16 (RH) consisting of 78,336 clones and an estimated coverage of five times the diploid genome was available (Borm 2008). The BAC clones were spotted in duplicate on three macro-array filters by RZPD (Berlin, Germany).

Degenerate primers were designed based on P-loop and GLPL motifs of the Nucleotide Binding (NB) domains found in 54 known potato, tomato and pepper NB-LRR resistance gene sequences obtained from GenBank (data not shown). Identical thermal cycling conditions, using the primer combinations in Table 1, were used for all PCR amplifications: 1′ 94°C – (30″ 94°C – 30″ 52°C – 1′ 72°C) 35× – 5′ 72°C.

Table 1 Degenerate primer combinations used for the amplification of NBS sequences from potato

The resulting PCR products were purified directly using the QIAquick PCR purification kit (QIAGEN) according to the manufacturer’s instructions. Purified PCR products were ligated in the pGEM-T Easy vector (Promega) and transformed to DH5α (Invitrogen) cells according to the manufacturer’s instructions. For each primer combination, 96 clones were re-amplified using the original primers and clones representative of each product size were sent for Sanger sequencing (Greenomics, Wageningen, The Netherlands). Probes for screening the BAC library were selected from the resulting sequences as described in the results. Probe construction and filter hybridisation were performed by Greenomics (Wageningen, The Netherlands), using 15 different probe pools based on the 96 NB sequences.

BAC analysis

For BACs selected for sequencing, the insert size was determined using CHEF gel electrophoresis on a 1% agarose gel (Seakem® Gold, FMC, Philadelphia, PA, USA) in 0.5×TBE buffer at 4°C using a BIORAD CHEF DR II system (Bio-Rad laboratories, Hercules, CA, USA) at 200 V with a pulse time of 5–15 s for 18 h.

Sequence analysis

Sequencing to six-fold coverage and assembly of BAC clones to phase 1 was done by Greenomics (Wageningen, The Netherlands). Genomic BAC sequences were translated into six frames using a PERL (www.perl.com) script. BLASTp (Gish and States 1993) was used to match a set of 59 known functional resistance genes, separated into different domains based on literature (Table 2), against the database of translated BACs. BLASTp output took the form of a list of homologous BACs for each R gene domain, which was parsed using R2BAC (Goto et al. 2010) to identify the location of matching R gene domains inside the BACs. BAC sequences have been submitted to GenBank: AC238121, AC238183, AC232102, AC237998, AC238257, AC238084, AC238124, AC238122, AC238261, AC238307, AC237896, AC237892, AC238268, AC237880, AC238396, AC238310, AC238277, AC238378, AC238011, AC238356, AC238396, AC234535, AC237944, AC238154, AC237844, AC238274, AC237846, AC238200, AC2382831, AC237953, AC237983, AC237843, AC238108, AC238211, AC238221, AC238176, AC238227, AC238336, AC238074, AC238299, AC238304, AC238344, AC238372, AC238222, AC237902, AC237930, AC232101, AC237980, AC237842, AC238190, AC232105, AC237826, AC237833, AC237875, AC234558, AC237927, AC238292, AC237906, AC237966, AC238063, AC238128, AC238172, AC238197, AC237841, AC237847, AC237855, AC238340, AC238022, AC237958, AC237973, AC234529, AC237970, AC237837, AC238099, AC237996, AC237865, AC237914, AC238314, AC238386, AC238387, AC237832, AC237840, AC238065, AC237848, AC237853, AC237857, AC237859, AC236625, AC237861, AC237862, AC237866, AC239995, AC238021, AC238132, AC239974, AC238178, AC238279, AC240056, AC239998, AC240052, AC239956, AC239979, AC239982, AC239958, AC240012, AC240037, AC240057, AC238125, AC240042, AC239955, AC238020, AC240025, AC239996, AC240059, AC240062, AC240070, AC240041, AC240072, AC240016, AC239971, AC238036, AC240047, AC239983, AC237905, AC240023, AC240063, AC240065, AC239989, AC240010, AC240048, AC239981, AC239980, AC240000, AC239980, AC240066, AC239987, AC240004, AC237886, AC240014, AC240007, AC240024, AC240040, AC240022, AC240073, AC240011, AC240014, AC240015, AC239967, AC239976, AC240003, AC240064, AC240069, AC239965, AC239975, AC237835, AC239991, AC239992, AC240020, AC240029, AC240044, AC240055, AC240071, AC240005, AC240006, AC232094, AC240028, AC240038, AC238291, AC239966, AC240001, AC240009, AC240013, AC240019, AC232092, AC240124, AC238273, AC237872, AC240032, AC232110, AC232116, AC240050, AC237883, AC237889 and AC23789.
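The six-frame translation step can be illustrated with a short sketch. The PERL script used in this study is not shown, so the Python code below is only an approximation of that step; the function names are hypothetical, the standard genetic code is assumed, and codons containing ambiguous bases are written as X.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code with codons ordered TTT, TTC, TTA, TTG, TCT, ...
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}
COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def translate(seq):
    """Translate a DNA sequence in frame 1; unknown codons become 'X'."""
    seq = seq.upper()
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3)
    )


def six_frame_translation(seq):
    """Return the three forward and three reverse frames, keyed '+1'..'-3'."""
    frames = {}
    for strand, template in (("+", seq), ("-", reverse_complement(seq))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(template[offset:])
    return frames


if __name__ == "__main__":
    for name, protein in six_frame_translation("ATGGCCTGA").items():
        print(name, protein)
```

Each translated frame could then be written to a protein FASTA file and used as the database for a local BLASTp search.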

Table 2 Functional NB-LRR resistance genes from various plant species

SSR analysis

Primers (Supplemental Table 1) for Simple Sequence Repeat (SSR) markers (Goldstein et al. 1995) were based on BAC sequences obtained in this study and were used on BAC DNA and genomic DNA using the following thermal cycling conditions: 5′ 94°C – (30″ 94°C – 30″ 56°C – 30″ 72°C) 25× – 7″ 72°C. SSR markers were visualised using a Li-cor sequencer (Li-cor, Lincoln, NE, USA) according to the manufacturer’s instructions.

CAPS analysis

Cleaved Amplified Polymorphic Sequences (CAPS) analysis (Konieczny and Ausubel 1993) was performed using the following thermal cycling conditions: 5′ 94°C – (30″ 94°C – 30″ Tm – 30″ 72°C) 30× – 7″ 72°C. Primers were either designed on BAC sequences within this study, or on sequence information derived from the PoMaMo database (http://gabi.rzpd.de/projects/Pomamo/; Meyer et al. 2005) or the SOL Genomics Network (http://solgenomics.net/; Mueller et al. 2005), or obtained from literature. Primers, annealing temperature (Tm), the appropriate endonuclease for the detection of polymorphism and the source are listed in Supplemental Table 2. DNA fragments of the CAPS markers were separated on 1% agarose in 1× TAE buffer at 120 V.

NBS profiling

NBS profiling (van der Linden et al. 2004) was performed on the RH × SH mapping population to create a map of NBS specific markers in potato (van der Linden, unpublished results). Markers segregating from RH were expected to (partly) resemble bands derived from RGHs identified in this study. To anchor RGH containing BACs with these NBS specific markers, a similar NBS profiling study was performed on the pooled RH BAC library (Borm 2008) for the direct identification of single NB-LRR containing BACs. For this, NB-site specific primers NB1, NB2, NB5a6, and NB9 (van der Linden et al. 2004) were used to screen these complex BAC pools from the RH BAC library in an NBS profiling assay. NBS profiling was performed essentially as described previously (van der Linden et al. 2004). Adaptors were ligated to the restriction sites of AluI, HaeIII, MseI, RsaI and TaqI. The NB-site specific primers in combination with adaptor primers were used for PCR, and fragments were fractionated using denaturing polyacrylamide gel electrophoresis. Bands derived from RGH containing RH BACs were compared to the NBS profiling markers derived from RH (van der Linden, unpublished results). Co-migrating bands in the complex BAC pools and the genomic DNA of RH that were constructed with the same enzyme/primer combination were assumed to be derived from the same locus. This method was used to anchor RGH containing BACs to the bin signatures of the UHD map (van Os et al. 2006).

Genetic mapping

A mapping population of 136 F1 genotypes from the cross between the diploid potato clones RH × SH83-92-488 (SH) was available (Rouppe van der Voort et al. 1997). For genetic mapping in this study, a subset of 45 offspring genotypes was selected using the software package MapPop (Vision et al. 2000), which identifies the most informative genotypes from a mapping population based on the maximum number of recombination events distributed over the genome. The consensus bin signatures of the UHD map of potato (van Os et al. 2006) were used as input for MapPop. Segregating bands were mapped with the software package BINmap+ (Borm 2008), which uses the bin signatures of the UHD map to match segregation patterns. Genomic DNA from SH, RH and progeny was extracted as described (van Os et al. 2006). BAC DNA was isolated using a high-throughput protocol, adapted from Sambrook et al. (1989), as described (Borm 2008).
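The idea of matching a segregation pattern to bin signatures can be sketched as follows. This is not the BINmap+ algorithm, whose internals are not described here; it is a minimal illustration that assumes each bin signature and each marker is scored as a string of 1 (band present), 0 (band absent) or - (missing) over the same ordered set of offspring genotypes. The function names and example data are hypothetical.

```python
def mismatches(pattern, signature, missing="-"):
    """Count genotypes at which two scored patterns disagree, skipping missing scores."""
    return sum(
        1
        for p, s in zip(pattern, signature)
        if p != s and missing not in (p, s)
    )


def best_bins(pattern, bin_signatures):
    """Return the bins whose signature fits the marker pattern best, and the mismatch count."""
    scores = {name: mismatches(pattern, sig) for name, sig in bin_signatures.items()}
    best = min(scores.values())
    return sorted(name for name, score in scores.items() if score == best), best


if __name__ == "__main__":
    # Hypothetical signatures over five offspring genotypes.
    signatures = {"bin1": "11001", "bin2": "11011", "bin3": "01011"}
    print(best_bins("110-1", signatures))
    print(best_bins("01011", signatures))
```

In this sketch, a marker with a missing score at an informative genotype fits several adjacent bins equally well, so the function returns more than one bin rather than a single position.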

In silico anchoring

To determine overlap between some RGH containing BACs and previously anchored BAC contigs from the physical map of potato (Borm 2008), and thereby provide a genetic anchor for these RGH containing BACs, the BACend tool was used. The BACend tool uses a high-stringency (98% nucleotide identity) BLAST to compare a query sequence to the BAC-end sequences of the RH BAC library (Borm 2008), displaying results in their physical map context. The BACend tool also filters BLAST results to ensure that BLAST hits are structurally sound (ignoring non-full-length BLAST hits in the middle of sequence fragments), non-repetitive (as determined from the BAC-end sequences) and of sufficient length (>100 base pairs).
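The filtering criteria described above can be sketched as a simple predicate on BLAST hits. The code below is not the BACend tool itself; it assumes that a hit is "structurally sound" when the alignment runs to an end of the query or of the BAC-end sequence on both sides, that both sequences are in the same orientation, and that repetitiveness has been flagged beforehand. The class and field names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    identity: float        # percent nucleotide identity
    length: int            # alignment length in base pairs
    q_start: int           # 1-based alignment start on the query
    q_end: int
    q_len: int             # total query length
    s_start: int           # 1-based alignment start on the BAC-end sequence
    s_end: int
    s_len: int             # total BAC-end sequence length
    repetitive: bool = False


def is_structurally_sound(hit, slack=0):
    """True if the alignment stops only where one of the two sequences ends."""
    left_ok = hit.q_start <= 1 + slack or hit.s_start <= 1 + slack
    right_ok = hit.q_end >= hit.q_len - slack or hit.s_end >= hit.s_len - slack
    return left_ok and right_ok


def keep_hit(hit, min_identity=98.0, min_length=100):
    """Apply the identity, length, repeat and structure filters to one hit."""
    return (
        hit.identity >= min_identity
        and hit.length > min_length
        and not hit.repetitive
        and is_structurally_sound(hit)
    )


if __name__ == "__main__":
    overlap = Hit(99.0, 600, 4401, 5000, 5000, 1, 600, 700)
    internal = Hit(99.0, 300, 2000, 2299, 5000, 200, 499, 700)
    print(keep_hit(overlap), keep_hit(internal))
```

The first example hit represents a BAC-end sequence overlapping the end of the query and passes the filter; the second aligns only in the middle of both sequences and is rejected.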

BLASTn (Altschul et al. 1990) was performed using R gene sequences [Mi-1 (Milligan et al. 1998) and Rpi-blb2 (van der Vossen et al. 2005)] or marker sequences of a known genomic location. The marker sequences were derived from SOL Genomics Network (http://solgenomics.net/; Mueller et al. 2005), or kindly provided by G. van der Linden (NBS blast). For marker sequences, a threshold of >98% nucleotide identity over the complete length of the marker sequence was used. For the R gene BLAST, the threshold was lower (>80%), based on the average similarity within an R gene cluster, but matching the complete ORF of the R gene. With these stringent criteria, it was assumed that when a match in an RGH containing BAC is found, such BAC has the same genetic location.

Results

Identification of BACs harbouring NB-LRR Resistance Gene Homologs (RGHs)

To assess the NB-LRR resistance gene super-family in the potato genome, sequence information of the conserved NB domain was used to design probes for the screening of a BAC library derived from the diploid potato genotype RH89-039-16 (RH). In total, eight primer combinations were designed based on the conserved P-loop and the GLPL motifs in the NB domain and used to amplify and sequence 150 fragments from genomic DNA of RH. Using Neighbour-Joining analysis, the 150 sequences could be divided into 32 groups with >85% similarity (data not shown). For each group, three representative probes were selected that ranged in size from 450 to 600 base pairs. The resulting collection of 96 probes was divided into 15 probe pools to screen a BAC library of RH. This resulted in the detection of 1,533 BAC clones potentially harbouring resistance gene homologs (RGHs) encoding NB-LRR proteins.

Selection of BACs for sequencing

RGH sequences are often located in arrays of homologous sequences in so-called R gene clusters in the genome of plants, including potato [e.g. Gro1 (Paal et al. 2004), R1 (Ballvora et al. 2002) and Gpa2 (van der Vossen et al. 2000)]. Therefore, it is likely that members of a single cluster will be present in multiple overlapping BACs, forming a physical map contig. To identify overlap between the positive BACs and to remove redundant BACs, the physical map of RH (Borm 2008) was used to locate 1,402 of the 1,533 positive BAC clones in 502 physical map contigs. Many physical map contigs harboured only one out of several BACs that hybridised with an NBS probe. This was considered to be the result of a false-positive signal. Therefore, 310 BACs were omitted from further analysis. The remaining 192 physical map contigs were selected for further analysis as they contained at least two positive overlapping BACs.

To obtain a genome-wide collection of RGH sequences, one representative BAC from each contig was selected for Sanger sequencing (6× coverage). Of the 131 positive BAC clones that could not be assigned to a physical map contig, but may represent small clusters or even single R genes, 96 were randomly selected for sequencing too. In total, 33 million base pairs of sequence data were obtained for 288 BACs, divided into 2,958 sequence contigs (on average 10.27 sequence contigs per BAC).
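The selection procedure can be summarised in a short sketch. It assumes a lookup from BAC name to physical map contig (absent when the BAC is not in the physical map), keeps contigs with at least two positive BACs and draws a random sample from the unassigned BACs. The criterion for choosing the representative BAC within a contig is not stated, so the sketch simply takes the first name in sorted order; the explicit overlap check is also omitted. All names are hypothetical.

```python
import random
from collections import defaultdict


def select_bacs(positive_bacs, contig_of, n_unassigned=96, seed=0):
    """Pick one BAC per contig with >= 2 positives, plus a random sample of unassigned BACs."""
    by_contig = defaultdict(list)
    unassigned = []
    for bac in positive_bacs:
        contig = contig_of.get(bac)
        if contig is None:
            unassigned.append(bac)
        else:
            by_contig[contig].append(bac)

    # Contigs with a single positive BAC are treated as likely false positives.
    representatives = {
        contig: sorted(bacs)[0]
        for contig, bacs in by_contig.items()
        if len(bacs) >= 2
    }

    rng = random.Random(seed)
    sampled = rng.sample(sorted(unassigned), min(n_unassigned, len(unassigned)))
    return representatives, sampled


if __name__ == "__main__":
    positives = ["B1", "B2", "B3", "B4", "B5"]
    contigs = {"B1": "c1", "B2": "c1", "B3": "c2"}
    print(select_bacs(positives, contigs, n_unassigned=1))
```

With the numbers reported here, 192 contig representatives plus 96 sampled unassigned BACs give the 288 sequenced BACs.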

Identification of resistance gene homologs

To identify the RGHs for each BAC, a high-throughput method was developed to search the complete set of BAC sequences that we obtained. A local BLASTp search was performed on six translated frames of the complete collection of BAC sequences with the sequences of the CC/TIR, NB and LRR domains of a set of 59 known resistance proteins (Table 2). This resulted in 2,184 significant BLAST hits (E value ≤0.05), which are presented in Supplemental Table 3. If a sequenced BAC fragment harbours different R protein domains in the correct order on a DNA sequence stretch of about 5 kb, it is assumed that they represent one single gene. In this way, we could reconstruct a total of 738 partial and full-length RGH sequences derived from 195 BACs, which could be subdivided into 280 RGHs that harbour a TIR domain and 448 that harbour a CC domain (Supplemental Table 3).
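The rule for combining domain hits into a single RGH can be sketched as follows. The code is a simplified illustration, not the pipeline used in this study: it assumes that hit coordinates are given along the coding strand so that the N-terminal domain has the lowest position, it treats 5,000 bp as a hard limit, and it derives the class only from the presence of a TIR or CC hit. All class and function names are hypothetical.

```python
from dataclasses import dataclass, field

DOMAIN_RANK = {"TIR": 0, "CC": 0, "NB": 1, "LRR": 2}  # expected N- to C-terminal order
MAX_GENE_SPAN = 5000  # bp, the "about 5 kb" window


@dataclass
class DomainHit:
    contig: str
    strand: str
    start: int
    end: int
    domain: str


@dataclass
class RGH:
    contig: str
    strand: str
    hits: list = field(default_factory=list)

    @property
    def rgh_class(self):
        domains = {h.domain for h in self.hits}
        if "TIR" in domains:
            return "TNL"
        if "CC" in domains:
            return "CNL"
        return "unclassified"


def _continues_gene(gene, hit):
    last = gene.hits[-1]
    same_place = gene.contig == hit.contig and gene.strand == hit.strand
    in_order = DOMAIN_RANK[hit.domain] >= DOMAIN_RANK[last.domain]
    # A CC hit and a TIR hit are never merged into one gene.
    n_term_clash = DOMAIN_RANK[hit.domain] == 0 and hit.domain != last.domain
    within_span = hit.end - gene.hits[0].start <= MAX_GENE_SPAN
    return same_place and in_order and not n_term_clash and within_span


def reconstruct_rghs(hits):
    """Greedily merge ordered domain hits on one sequence contig into RGHs."""
    genes = []
    for hit in sorted(hits, key=lambda h: (h.contig, h.strand, h.start)):
        if genes and _continues_gene(genes[-1], hit):
            genes[-1].hits.append(hit)
        else:
            genes.append(RGH(hit.contig, hit.strand, [hit]))
    return genes


if __name__ == "__main__":
    example = [
        DomainHit("ctg1", "+", 100, 500, "CC"),
        DomainHit("ctg1", "+", 700, 1500, "NB"),
        DomainHit("ctg1", "+", 1600, 3000, "LRR"),
        DomainHit("ctg1", "+", 9000, 9800, "NB"),
    ]
    for gene in reconstruct_rghs(example):
        print(gene.rgh_class, [h.domain for h in gene.hits])
```

In this sketch a partial RGH without an N-terminal hit stays unclassified; assigning it to the TNL or CNL class would require the identity of the best-matching known R protein.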

Anchoring the RGH containing BACs to the genetic map of potato

To determine the genomic locations of the RGHs, we first compared our set of corresponding BACs to the physical map of potato (Borm 2008) that was constructed from the same BAC library. In the latest version of the physical map, 53 of the RGH containing BACs were genetically anchored or merged to contigs with other RGH containing BACs (de Boer et al., unpublished data). In addition, a variety of anchoring methods were used (Table 3). For SSR, CAPS and NBS profiling, a subset of 45 most informative genotypes of the SH × RH mapping population was selected. This combination of genetic mapping and in silico anchoring enabled us to determine the genetic location of 169 of the 195 RGH containing BACs (87%). Twenty-three BACs could only be anchored to a linkage group of the genetic map and not to a bin (range). Eighteen of them could be added to a locus based on their RGH content. The remaining five BACs are anchored to chromosome X (4) and XII (1). Based on their RGH content, the four BACs on chromosome X could be combined into one locus. A complete overview of the anchoring of the BACs to the linkage groups is depicted in Supplemental Fig. 1. Thirty-eight BACs were mapped by more than one independent method, confirming their map locations. Specific information on the map positions per BAC is provided in Supplemental Table 3.

Table 3 Overview of the methods used to anchor the sequenced BAC clones to the genetic map of potato

Identification of 47 RGH loci in the potato genome

Our results show that the majority of RGH containing BACs map in clusters across the potato genome, except for LGII and LGIII (Supplemental Fig. 1). A total of 151 BACs mapped to a position with at least one other BAC harbouring highly homologous RGHs, suggesting they are derived from complex loci. Eighteen BACs were mapped alone, or did not harbour RGHs with similar BLAST results as those mapping at the same genetic position. Eight of these BACs harbour more than one RGH, suggesting they are representing relatively small clusters, while ten harbour only one RGH that may be a simple locus (Supplemental Table 3).

Sometimes, the BACs mapping to the same genetic position all harbour highly homologous RGHs, like for example the locus at the distal end of the long arm of chromosome V, where all BACs harbour only Rpi-blb2 homologs (Supplemental Fig. 1). However, BACs harbouring different types of RGH are not always genetically separated from each other, although they might have some physical distance. For example, on the short arm of chromosome VI (Supplemental Fig. 1) nine BACs harbour Rpi-blb2 homologs, six BACs harbour Bs4 homologs, and one BAC harbours both and the genetic map positions of the BACs are partially overlapping.

On the basis of these observations, we developed an overall strategy for assigning BACs to discrete RGH loci whereby we assumed that RGHs resulting in similar BLAST hits and present on BAC clones with the same genetic map location are derived from the same locus. Using this definition, 47 RGH loci could be identified (Fig. 1 and Table 4). The RGH locus names are composed of the chromosome number, a letter for each RGH locus on that chromosome and an indication whether they represent TNL or CNL RGHs.
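The locus definition and naming scheme can be illustrated with a small sketch. It is a simplification under several assumptions: each BAC is reduced to a single bin number rather than a bin range, "similar BLAST hits" is reduced to an identical best-matching R gene, and letters are assigned in order of map position, which is not stated in the text. The function name and the BAC names in the example are hypothetical.

```python
from collections import defaultdict
from string import ascii_lowercase


def assign_loci(bacs):
    """Group BACs into RGH loci and name them <chromosome><letter>-<class>.

    Each BAC is a dict with keys 'bac', 'chromosome' (Roman numeral),
    'bin' (int), 'best_hit' (closest known R gene) and 'rgh_class' (TNL or CNL).
    """
    groups = defaultdict(list)
    for b in bacs:
        key = (b["chromosome"], b["bin"], b["best_hit"], b["rgh_class"])
        groups[key].append(b["bac"])

    per_chromosome = defaultdict(list)
    for key in sorted(groups, key=lambda k: (k[0], k[1], k[2])):
        per_chromosome[key[0]].append(key)

    loci = {}
    for chromosome, keys in per_chromosome.items():
        for letter, key in zip(ascii_lowercase, keys):
            loci[f"{chromosome}{letter}-{key[3]}"] = sorted(groups[key])
    return loci


if __name__ == "__main__":
    example = [
        {"bac": "BAC_A", "chromosome": "V", "bin": 5, "best_hit": "R1", "rgh_class": "CNL"},
        {"bac": "BAC_B", "chromosome": "V", "bin": 5, "best_hit": "R1", "rgh_class": "CNL"},
        {"bac": "BAC_C", "chromosome": "V", "bin": 5, "best_hit": "Bs4", "rgh_class": "TNL"},
    ]
    print(assign_loci(example))
```

The example shows that two BACs at the same map position end up in different loci when their RGHs resemble different known R genes, which mirrors the way loci within hotspots are distinguished by sequence rather than by genetic segregation.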

Fig. 1

Schematic representation of the genome-wide integrated genetic map of RGHs and resistance trait loci in potato. Chromosomes or linkage groups are represented by broad vertical bars with the bin signatures (van Os et al. 2006) separated by white horizontal bars. To the right of each linkage group, genetic markers are indicated in black. Previously described R genes which reside at syntenic loci are indicated in red and previously described R genes for which no syntenic locus is shown are indicated in black. Resistance trait loci that map approximately to an RGH locus are indicated in green and otherwise in blue. Horizontal dashed bars indicate that the marker used to map a previously described resistance trait locus is also mapped in the UHD map. To the left of each linkage group, RGH (TNL and CNL) loci are indicated in red if they are syntenic to previously described R genes, in green if they map approximately at the same location as a resistance trait locus and in blue if they do not. Thin vertical bars represent genetic intervals and thin horizontal bars a genetic map position.

Table 4 List of RGH loci, their map position in the UHD map of potato and the sequence homology to the closest known R protein for each domain (based on BLASTp results)

To see if the RGH loci identified in this study co-localise to R gene loci described previously, the approximate genetic map positions of a set of 21 known solanaceous R genes (Table 5) were included in the RGH map. From this, we could conclude that eleven of these RGH loci co-localise with known functional R genes like for instance the Rpi-blb2 locus at the short arm of chromosome VI and the Gpa2/Rx1 locus at the distal end of the short arm of chromosome XII (Fig. 1). However, no RGH loci with homology to the potato R genes R2 (and homologs) and Rpi-blb1 or the tomato genes Nrc1 and Nrg1 were found. Thirty-six RGH loci are not syntenic to existing R genes and considered to be novel RGH loci harbouring TNL (12) and CNL (24) encoding R gene homologs.

Table 5 List of known functional solanaceous R genes, the genomic location in the genotype in which they were identified and the corresponding location in the UHD map of RH

Integrating disease resistance trait loci in the RGH genetic map

For the integration of potato disease resistance trait loci in the RGH map, 68 universal CAPS markers were first mapped in RH (Fig. 1) to create anchor points for each linkage group to facilitate the comparison of genetic maps obtained for different Solanum genotypes. For two resistance QTLs [i.e. Eca1a (Zimnoch-Guzowska et al. 2000) and K31_T1-11 (Oberhagemann et al. 1999)], the CAPS markers could be used directly, because the same marker was used to map the resistance trait in the original mapping population. In most cases, however, the tomato-EXPEN 2000 map (Fulton et al. 2002) on the Sol Genomics Network (Mueller et al. 2005), or the PoMaMo database (Meyer et al. 2005) were used to estimate the genomic position by comparing the location of the marker used to map the resistance trait and the location of the markers mapped in this study. In this way, 58 more trait loci for which no functional R gene has been identified yet could be integrated in the UHD genetic map of RH. The loci that have been mapped were as reviewed (Gebhardt and Valkonen 2001; Hein et al. 2009). In addition, fifteen loci were integrated that were not described in these reviews i.e. Ny-1 (Szajko et al. 2008), Ry-f sto (Flis et al. 2005), Ry sto (Song et al. 2005), Ry chc (Sato et al. 2006), Gm and Rm (Marczewski et al. 2006), Ns (Marczewski et al. 2002), Ny tbr (Celebi-Toprak et al. 2002), PLRV.I (Marczewski et al. 2001), PLRV.4 (Marczewski et al. 2004), Rl adg (Velásquez et al. 2007), MfaXII spl (Bakari et al. 2006), GpaM1 (Caromel et al. 2003), GpaV sspl and GpaXI sspl (Caromel et al. 2005). After integration of this information in the RGH map, we could observe that twenty-one RGH loci may be linked to forty-six disease traits for which no R gene sequence information is available yet. In addition, thirteen remaining RGH loci were still not linked to any disease resistance trait.

Discussion

Here, we present a comprehensive overview of the genome-wide distribution of NB-LRR loci harbouring resistance gene homologs (RGHs) in potato. RGH sequences were retrieved after analysing the sequences of 288 unique BAC clones, which were selected from a BAC library of the diploid potato clone RH89-039-16 (S. tuberosum ssp. tuberosum). This resulted in the identification of 738 partial and full-length NB-LRR encoding genes. Based on homology of the RGH sequences with known resistance proteins, the complete set of RGHs could be subdivided into 280 TIR-NB-LRR (TNL) encoding sequences and 448 sequences encoding CC-NB-LRR (CNL). The RGH containing BACs were anchored to the genetic and physical map of potato, which allowed us to determine the location of R gene loci in the potato genome. A total of 47 NB-LRR loci were detected in the potato genome, including 36 novel loci and 11 loci syntenic with previously identified functional resistance gene loci (Hero, R1, Bs4, Prf, Rpi-blb2/Mi1/CaMi, Gro1, Sw5, Y-1/N, R3a/I2 and Gpa2/Rx1/Bs2). A literature survey was conducted to integrate the resistance trait loci described for potato. This showed that they often co-localise with the TNL and CNL loci obtained in this study. Hence, the integrated genome-wide genetic map of potato presented in this paper provides an excellent template for the development of markers for marker-assisted selection and for candidate gene approaches for the identification of functional R genes.

Leister et al. (1996) produced a genetic map in potato using (amongst others) NB-domain targeting RFLP markers (St markers). As expected, the majority of the 27 St markers map in an area where NB-LRR sequences were detected in this study. In addition, we identified at least 27 loci that do harbour RGHs but which were not detected by an St marker. However, nine St markers present on chromosomes II, III, IV, VIII, X and XII map to a genomic location where no RGHs have been detected. This discrepancy may indicate that our NB-based screening of the RH BAC library was not saturated, which is supported by the observation that we were not able to identify a number of R genes, including R2, R2-like, Rpi-blb3/Rpi-abpt on chromosome IV (Lokossou et al. 2009) and Rpi-blb1 on chromosome VIII (van der Vossen et al. 2003). This bias in our collection can be explained by the lack of sequence homology between the NB regions used to design the probes to screen the BAC library and the corresponding region in the NB domain of this particular set of R genes (data not shown). Because the BAC library covers only 5× the genome, it is also quite possible that not all RGH loci present in the potato genome were represented in the library. It is very unlikely that some RGH loci are not present in RH, because unpublished results for the Rx1/Gpa2 locus and for RGH locus Vg-CNL show that they are present in over 500 genotypes derived from different Solanum species.

The data presented in this study show that the genomic distribution of RGH loci is not random as illustrated by the low number of RGHs present on chromosome I compared to for instance the large number of RGHs identified on chromosome XI that harbours six RGH-loci, corresponding to 6 St markers. This non-random distribution of RGH loci was anticipated by Gebhardt and Valkonen (2001) based on the non-random distribution of resistance trait loci. Therefore, the genome-wide NB-LRR map presented in this study was integrated with resistance trait loci as reviewed by Gebhardt and Valkonen (2001) and Hein et al. (2009). The map has been supplemented with 15 loci that have not been described by these two papers. Thirty-eight resistance trait loci approximately co-localise with TNL and CNL loci (Fig. 1), providing a template for a candidate-gene approach. Interestingly, many of these are quantitative resistance traits, suggesting NB-LRR resistance genes underlie these quantitative loci. At least ten RGH loci do not co-localise with known resistance trait loci, which could potentially facilitate a candidate gene approach for the identification of genes underlying as yet unmapped resistance traits.

In agreement with previously analysed plant species like Arabidopsis (Meyers et al. 2003), poplar (Kohler et al. 2008) and rice (Zhou et al. 2004), this study shows that the potato genome harbours a substantial and highly diverse super-family of NB-LRR genes, which can be divided into a TIR and a CC class of RGHs. The ratio between TNL and CNL as observed in this study is 5:8, which is in the range of the ratio observed in other dicot species like Arabidopsis, lettuce, poplar and grapevine, where the ratio TIR:CC is 2:1, 1:1, 3:7 and 1:2, respectively (Kohler et al. 2008; McHale et al. 2009; Meyers et al. 2003; Yang et al. 2008). Remarkably, in monocot species the TIR RGHs are almost absent. Rice contains only three TIR-NBS sequences, but they lack the LRR domain and have very deviant NBS sequences (Bai et al. 2002; Zhou et al. 2004).
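The 5:8 ratio follows directly from the counts of 280 TNL and 448 CNL sequences, as the following small check shows. The helper function is hypothetical and only reduces a pair of counts by their greatest common divisor.

```python
from math import gcd


def simplest_ratio(a, b):
    """Reduce the ratio a:b to lowest terms."""
    divisor = gcd(a, b)
    return a // divisor, b // divisor


if __name__ == "__main__":
    tnl, cnl = 280, 448  # counts reported in this study
    print(simplest_ratio(tnl, cnl))
```

Expressed as a proportion, TNL sequences make up 280 of the 728 classified RGHs, which is about 38%.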

Most RGHs are present on complex loci (Michelmore and Meyers 1998) and the findings in this study are consistent with this. However, in eleven cases only one single RGH could be allocated to a locus. These may represent simple loci, but they can also be the result of incomplete sequencing. Interestingly, many loci, both complex and simple, reside in hotspots of resistance genes. In these regions (for instance on chromosome V between bins 40 and 50) the distinction between different clusters could not be based on genetic segregation and is purely based on sequence divergence (Supplemental Fig. 1). On the basis of this genetic map, it is impossible to determine whether these RGH clusters are still physically separated or completely intermingled. However, in three cases, two different types of RGH are even present on the same BAC. In the Arabidopsis genome sequence, mixed RGH clusters have been identified (Meyers et al. 2003) and therefore it is possible that the RGH clusters are not physically discrete. If they are, the fact that two types of RGH are present on one BAC proves that they can be physically very close in potato (i.e. less than 100 kb).

Grube et al. (2000) showed that R gene loci in the solanaceous crop genera tomato, potato and pepper are present on corresponding positions and that homologs of previously identified R genes derived from tomato and tobacco could be identified in syntenic positions in pepper. In this study, it was shown that RGH sequences co-localise with resistance gene loci in potato. Synteny of R gene loci between different potato genotypes or even other solanaceous species is illustrated by the fact that NB-LRR sequences homologous to previously identified R genes in potato, tomato, pepper and tobacco have been found at corresponding genomic locations. However, it is also known that resistance gene loci are very dynamic regions (Ballvora et al. 2007; Kuang et al. 2005) and that micro-synteny seldom exists. At a larger scale, synteny can also be partly lost between different genotypes. This is shown on chromosome V, where three functional R genes have previously been identified, namely R1, Prf and Bs4 (Ballvora et al. 2002; Salmeron et al. 1996; Schornack et al. 2004). Although two of these genes (Prf and Bs4) are derived from tomato, a partly sequenced physical map of this region in S. demissum revealed that homologs of all three genes are present in potato (Kuang et al. 2005). Indeed, in RH, the homologs of all three genes have been identified and mapped as well (this study). In S. demissum, the order of the three clusters, including the two markers Gp21 and Gp179 and coming from the telomere, is Gp21, Bs4, R1, Prf, Gp179 (Kuang et al. 2005). Interestingly, in the genetic map of RH the order is Gp21, R1, Gp179, Prf, Bs4 (Fig. 2). The markers in RH are fitted to the BIN signatures of the UHD map, which is the most parsimonious marker order based on 10,000 markers and considered to be very robust (Isidore et al. 2003; van Os et al. 2006). Nevertheless, an incorrect marker order can never be completely ruled out. This lack of synteny, genuine or artificial, can be a drawback for applications.

Fig. 2

Comparative map representing the short arm of chromosome V of S. demissum and S. tuberosum (RH). The vertical bars represent the genetic maps. Genetic positions of markers Gp21 and Gp179 are indicated with a horizontal bar and their comparison between the two maps with a dashed line. Comparison of the RGH loci syntenic to Bs4, R1 and Prf is indicated with lines. The direction of the telomere is indicated with an arrow.

Breeders want to incorporate agronomically interesting resistance traits in their breeding material. Marker-assisted selection is a technique that can facilitate this process. However, for marker-assisted selection, it is essential that markers are available that are diagnostic for the trait of interest (Moloney et al. 2010). It was shown that markers that are genetically very close in a potato mapping population are not always diagnostic in breeding material (Moloney et al. 2010). A specific marker developed on the resistance gene itself has the highest chance of being diagnostic. It was shown that cluster-specific markers can be developed in the LRR region, which is the most variable part of the resistance gene (Bakker et al. 2003; Finkers-Tomczak et al. 2011). It is also possible to develop gene-specific markers for each member of a resistance gene cluster (Finkers-Tomczak et al. 2011). The genome-wide integrated map presented in this study links resistance trait loci with NB-LRR loci, and will provide a template to design cluster-specific markers and closely linked markers for the benefit of marker-assisted selection and candidate gene approaches. The potato genome sequence will be available soon (Visser et al. 2009), and this will allow us to fill the gaps and complete the collection of NB-LRR sequences.