Large-scale analysis of full-length cDNAs from the tomato (Solanum lycopersicum) cultivar Micro-Tom, a reference system for the Solanaceae genomics

Koh Aoki, Kentaro Yano, Ayako Suzuki, Shingo Kawamura, Nozomu Sakurai, Kunihiro Suda, Atsushi Kurabayashi, Tatsuya Suzuki, Taneaki Tsugane, Manabu Watanabe, Kazuhide Ooga, Maiko Torii, Takanori Narita, Tadasu Shin-i, Yuji Kohara, Naoki Yamamoto, Hideki Takahashi, Yuichiro Watanabe, Mayumi Egusa, Motoichiro Kodama, Yuki Ichinose, Mari Kikuchi, Sumire Fukushima, Akiko Okabe, Tsutomu Arie, Yuko Sato, Katsumi Yazawa, Shinobu Satoh, Toshikazu Omura, Hiroshi Ezura, Daisuke Shibata

Research output: Contribution to journal › Article

129 Citations (Scopus)

Abstract

Background: The Solanaceae family includes several economically important vegetable crops. The tomato (Solanum lycopersicum) is regarded as a model plant of the Solanaceae family. Recently, a number of tomato resources have been developed in parallel with the ongoing tomato genome sequencing project. In particular, a miniature cultivar, Micro-Tom, is regarded as a model system in tomato genomics, and a number of genomics resources in the Micro-Tom background, such as ESTs and mutagenized lines, have been established by an international alliance.

Results: To accelerate progress in tomato genomics, we developed a collection of 13,227 fully sequenced Micro-Tom full-length cDNAs. By checking for redundant sequences, coding sequences, and chimeric sequences, a set of 11,502 non-redundant full-length cDNAs (nrFLcDNAs) was generated. Analysis of untranslated regions demonstrated that tomato has longer 5'- and 3'-untranslated regions than most other plants except rice. Classification of functions of proteins predicted from the coding sequences demonstrated that the nrFLcDNAs covered a broad range of functions. A comparison of the nrFLcDNAs with genes of sixteen plants facilitated the identification of tomato genes that are not found in other plants, most of which did not have known protein domains. Mapping of the nrFLcDNAs onto currently available tomato genome sequences facilitated prediction of exon-intron structure. Introns of tomato genes were longer than those of Arabidopsis and rice. According to a comparison of exon sequences between the nrFLcDNAs and the tomato genome sequences, the frequency of nucleotide mismatch in exons between Micro-Tom and the genome-sequencing cultivar (Heinz 1706) was estimated to be 0.061%.

Conclusion: The collection of Micro-Tom nrFLcDNAs generated in this study will serve as a valuable genomic tool for plant biologists to bridge the gap between basic and applied studies. The nrFLcDNA sequences will help annotation of the tomato whole-genome sequence and aid in tomato functional genomics and molecular breeding. Full-length cDNA sequences and their annotations are provided in the database KaFTom http://www.pgb.kazusa.or.jp/kaftom/ via the website of the National Bioresource Project Tomato http://tomato.nbrp.jp.
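
The figures reported in the Results can be checked with a few lines of arithmetic. The sketch below, in Python, uses only the three numbers quoted in the abstract (13,227, 11,502, and 0.061%); the function and variable names are illustrative assumptions and are not taken from the paper or from KaFTom.

```python
# Figures quoted in the abstract; all names below are illustrative only.
TOTAL_SEQUENCED = 13_227   # fully sequenced Micro-Tom full-length cDNAs
NON_REDUNDANT = 11_502     # nrFLcDNAs remaining after filtering
MISMATCH_PERCENT = 0.061   # exon mismatch frequency, Micro-Tom vs Heinz 1706


def summarize(total, kept, mismatch_percent):
    """Derive simple summary values from the reported counts and rate."""
    removed = total - kept                      # sequences filtered out
    retained_pct = 100.0 * kept / total         # share of clones retained
    # Convert a percentage into expected mismatches per 10,000 exon bases.
    mismatches_per_10kb = mismatch_percent / 100.0 * 10_000
    return removed, retained_pct, mismatches_per_10kb


removed, retained_pct, per_10kb = summarize(
    TOTAL_SEQUENCED, NON_REDUNDANT, MISMATCH_PERCENT
)
print(f"removed by filtering: {removed}")
print(f"retained: {retained_pct:.2f}%")
print(f"mismatches per 10 kb of exon: {per_10kb:.1f}")
```

With these inputs, filtering removed 1,725 sequences, and a 0.061% mismatch frequency corresponds to roughly 6 mismatches per 10,000 exon bases.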

Original language: English
Article number: 210
Journal: BMC Genomics
Volume: 11
Issue number: 1
DOIs: https://doi.org/10.1186/1471-2164-11-210
Publication status: Published - Mar 30 2010

Fingerprint

Solanaceae
Lycopersicon esculentum
Genomics
Complementary DNA
Genome
Exons
Introns
DNA Shuffling
Untranslated Regions
Plant Genes
5' Untranslated Regions
Expressed Sequence Tags
3' Untranslated Regions

ASJC Scopus subject areas

  • Biotechnology
  • Genetics

Cite this

Large-scale analysis of full-length cDNAs from the tomato (Solanum lycopersicum) cultivar Micro-Tom, a reference system for the Solanaceae genomics. / Aoki, Koh; Yano, Kentaro; Suzuki, Ayako; Kawamura, Shingo; Sakurai, Nozomu; Suda, Kunihiro; Kurabayashi, Atsushi; Suzuki, Tatsuya; Tsugane, Taneaki; Watanabe, Manabu; Ooga, Kazuhide; Torii, Maiko; Narita, Takanori; Shin-i, Tadasu; Kohara, Yuji; Yamamoto, Naoki; Takahashi, Hideki; Watanabe, Yuichiro; Egusa, Mayumi; Kodama, Motoichiro; Ichinose, Yuki; Kikuchi, Mari; Fukushima, Sumire; Okabe, Akiko; Arie, Tsutomu; Sato, Yuko; Yazawa, Katsumi; Satoh, Shinobu; Omura, Toshikazu; Ezura, Hiroshi; Shibata, Daisuke.

In: BMC Genomics, Vol. 11, No. 1, 210, 30.03.2010.

Aoki, K, Yano, K, Suzuki, A, Kawamura, S, Sakurai, N, Suda, K, Kurabayashi, A, Suzuki, T, Tsugane, T, Watanabe, M, Ooga, K, Torii, M, Narita, T, Shin-i, T, Kohara, Y, Yamamoto, N, Takahashi, H, Watanabe, Y, Egusa, M, Kodama, M, Ichinose, Y, Kikuchi, M, Fukushima, S, Okabe, A, Arie, T, Sato, Y, Yazawa, K, Satoh, S, Omura, T, Ezura, H & Shibata, D 2010, 'Large-scale analysis of full-length cDNAs from the tomato (Solanum lycopersicum) cultivar Micro-Tom, a reference system for the Solanaceae genomics', BMC Genomics, vol. 11, no. 1, 210. https://doi.org/10.1186/1471-2164-11-210
@article{9b808d83a59e4453ae519d867689f144,
title = "Large-scale analysis of full-length cDNAs from the tomato (Solanum lycopersicum) cultivar Micro-Tom, a reference system for the Solanaceae genomics",
author = "Koh Aoki and Kentaro Yano and Ayako Suzuki and Shingo Kawamura and Nozomu Sakurai and Kunihiro Suda and Atsushi Kurabayashi and Tatsuya Suzuki and Taneaki Tsugane and Manabu Watanabe and Kazuhide Ooga and Maiko Torii and Takanori Narita and Tadasu Shin-i and Yuji Kohara and Naoki Yamamoto and Hideki Takahashi and Yuichiro Watanabe and Mayumi Egusa and Motoichiro Kodama and Yuki Ichinose and Mari Kikuchi and Sumire Fukushima and Akiko Okabe and Tsutomu Arie and Yuko Sato and Katsumi Yazawa and Shinobu Satoh and Toshikazu Omura and Hiroshi Ezura and Daisuke Shibata",
year = "2010",
month = "3",
day = "30",
doi = "10.1186/1471-2164-11-210",
language = "English",
volume = "11",
journal = "BMC Genomics",
issn = "1471-2164",
publisher = "BioMed Central",
number = "1",

}

TY - JOUR
T1 - Large-scale analysis of full-length cDNAs from the tomato (Solanum lycopersicum) cultivar Micro-Tom, a reference system for the Solanaceae genomics
AU - Aoki, Koh
AU - Yano, Kentaro
AU - Suzuki, Ayako
AU - Kawamura, Shingo
AU - Sakurai, Nozomu
AU - Suda, Kunihiro
AU - Kurabayashi, Atsushi
AU - Suzuki, Tatsuya
AU - Tsugane, Taneaki
AU - Watanabe, Manabu
AU - Ooga, Kazuhide
AU - Torii, Maiko
AU - Narita, Takanori
AU - Shin-i, Tadasu
AU - Kohara, Yuji
AU - Yamamoto, Naoki
AU - Takahashi, Hideki
AU - Watanabe, Yuichiro
AU - Egusa, Mayumi
AU - Kodama, Motoichiro
AU - Ichinose, Yuki
AU - Kikuchi, Mari
AU - Fukushima, Sumire
AU - Okabe, Akiko
AU - Arie, Tsutomu
AU - Sato, Yuko
AU - Yazawa, Katsumi
AU - Satoh, Shinobu
AU - Omura, Toshikazu
AU - Ezura, Hiroshi
AU - Shibata, Daisuke
PY - 2010/3/30
Y1 - 2010/3/30
UR - http://www.scopus.com/inward/record.url?scp=77950400977&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=77950400977&partnerID=8YFLogxK
U2 - 10.1186/1471-2164-11-210
DO - 10.1186/1471-2164-11-210
M3 - Article
C2 - 20350329
AN - SCOPUS:77950400977
VL - 11
JO - BMC Genomics
JF - BMC Genomics
SN - 1471-2164
IS - 1
M1 - 210
ER -