Structural variants in 3000 rice genomes

  1. Nickolai Alexandrov¹
  1. International Rice Research Institute, Laguna 4031, Philippines;
  2. Bioinformatics Group, Wageningen University and Research, 6708 PB Wageningen, the Netherlands;
  3. Systems and Computing Engineering Department, Universidad de Los Andes, Bogotá 111711, Colombia;
  4. Agrobiodiversity Research Area, International Center for Tropical Agriculture (CIAT), Cali 6713, Colombia;
  5. Biology Department, Center for Computational and Integrative Biology, Rutgers University, Camden, New Jersey 08102, USA;
  6. Roche Sequencing Solutions, Belmont, California 94002, USA;
  7. Arizona Genomics Institute, University of Arizona, Tucson, Arizona 85721, USA;
  8. King Abdullah University of Science and Technology, Thuwal 23955, Saudi Arabia;
  9. Department of Biology, University of La Verne, La Verne, California 91750, USA;
  10. Vavilov Institute of General Genetics, Moscow 119333, Russia;
  11. A.A. Kharkevich Institute for Information Transmission Problems, Russian Academy of Sciences, Moscow 127051, Russia;
  12. Laboratory of Forest Genomics, Siberian Federal University, Krasnoyarsk 660041, Russia
  13. These authors contributed equally to this work.

  • Corresponding authors: nickolai.alexandrov@gmail.com, ttatarinova@laverne.edu, ja.duitama@uniandes.edu.co, andrey.grigoriev@rutgers.edu, r.mauleon@irri.org
  • Abstract

    Investigation of large structural variants (SVs) is a challenging yet important task in understanding trait differences in highly repetitive genomes. Combining different bioinformatic approaches for SV detection, we analyzed whole-genome sequencing data from 3000 rice genomes and identified 63 million individual SV calls that grouped into 1.5 million allelic variants. We found enrichment of long SVs in promoters and an excess of shorter variants in 5′ UTRs. Across the rice genomes, we identified regions of high SV frequency enriched in stress response genes. We demonstrated how SVs may help in finding causative variants in genome-wide association analysis. These new insights into rice genome biology are valuable for understanding the effects SVs have on gene function, with the prospect of identifying novel agronomically important alleles that can be utilized to improve cultivated rice.

    Footnotes

    • [Supplemental material is available for this article.]

    • Article published online before print. Article, supplemental material, and publication date are at http://www.genome.org/cgi/doi/10.1101/gr.241240.118.

    • Freely available online through the Genome Research Open Access option.

    • Received June 29, 2018.
    • Accepted March 11, 2019.

    This article, published in Genome Research, is available under a Creative Commons License (Attribution 4.0 International), as described at http://creativecommons.org/licenses/by/4.0/.
