Integrative Genomics

Research Activities

With the aim of advancing personalized medicine and personalized prevention, we conduct genome analyses of biological specimens provided by cohort studies using next-generation sequencers. We also conduct omics analyses of proteins and low-molecular-weight metabolites using NMR spectrometers, mass spectrometers, and other instruments. These data are processed using a supercomputer and other technologies.
In parallel with these analyses, we are developing technologies to analyze biological specimens, as well as software and information analysis methodologies for processing large quantities of data, including genome data.
We completed whole-genome sequence analysis of 69,000 Japanese individuals in 2023. This is one of the largest whole-genome sequence analyses of a general population in the world.
We have also developed the Japonica Array, a tool for analyzing the genomes of Japanese people that enables the analysis of quasi-whole-genome sequences.

Keywords: Whole genome sequence, Population genetics, Genome reference panel, Japanese reference sequence, Metabolomics, Bioinformatics, High performance computing

Publications

- The structural origin of metabolic quantitative diversity
  Seizo Koshiba, Ikuko Motoike, Kaname Kojima, Takanori Hasegawa, Matsuyuki Shirota, Tomo Saito, Daisuke Saigusa, Inaho Danjoh, Fumiki Katsuoka, Soichi Ogishima, Yosuke Kawai, Yumi Yamaguchi-Kabata, Miyuki Sakurai, Sachiko Hirano, Junichi Nakata, Hozumi Motohashi, Atsushi Hozawa, Shinichi Kuriyama, Naoko Minegishi, Masao Nagasaki, Takako Takai-Igarashi, Nobuo Fuse, Hideyasu Kiyomoto, Junichi Sugawara, Yoichi Suzuki, Shigeo Kure, Nobuo Yaegashi, Osamu Tanabe, Kengo Kinoshita, Jun Yasuda & Masayuki Yamamoto
  Scientific Reports 6, Article number: 31463 (2016); doi:10.1038/srep31463
- 3.5KJPNv2: an allele frequency panel of 3552 Japanese individuals including the X chromosome
  Shu Tadaka, Fumiki Katsuoka, Masao Ueki, Kaname Kojima, Satoshi Makino, Sakae Saito, Akihito Otsuki, Chinatsu Gocho, Mika Sakurai-Yageta, Inaho Danjoh, Ikuko N. Motoike, Yumi Yamaguchi-Kabata, Matsuyuki Shirota, Seizo Koshiba, Masao Nagasaki, Naoko Minegishi, Atsushi Hozawa, Shinichi Kuriyama, Atsushi Shimizu, Jun Yasuda, Nobuo Fuse, the Tohoku Medical Megabank Project Study Group, Gen Tamiya, Masayuki Yamamoto & Kengo Kinoshita
  Human Genome Variation volume 6, Article number: 28 (2019); doi:10.1038/s41439-019-0059-5
- Estimating carrier frequencies of newborn screening disorders using a whole-genome reference panel of 3552 Japanese individuals
  Yumi Yamaguchi-Kabata, Jun Yasuda, Akira Uruno, Kazuro Shimokawa, Seizo Koshiba, Yoichi Suzuki, Nobuo Fuse, Hiroshi Kawame, Shu Tadaka, Masao Nagasaki, Kaname Kojima, Fumiki Katsuoka, Kazuki Kumada, Osamu Tanabe, Gen Tamiya, Nobuo Yaegashi, the Tohoku Medical Megabank Project Study Group, Kengo Kinoshita, Masayuki Yamamoto, Shigeo Kure
  Human Genetics, April 2019, Volume 138, Issue 4, pp 389–409; doi:10.1007/s00439-019-01998-7
- Omics research project on prospective cohort studies from the Tohoku Medical Megabank Project
  Seizo Koshiba, Ikuko Motoike, Daisuke Saigusa, Jin Inoue, Matsuyuki Shirota, Yasutake Katoh, Fumiki Katsuoka, Inaho Danjoh, Atsushi Hozawa, Shinichi Kuriyama, Naoko Minegishi, Masao Nagasaki, Takako Takai‐Igarashi, Soichi Ogishima, Nobuo Fuse, Shigeo Kure, Gen Tamiya, Osamu Tanabe, Jun Yasuda, Kengo Kinoshita, Masayuki Yamamoto
  Genes to Cells, April 2018; 23 (6): 406-417; doi:10.1111/gtc.12588

Other Major Publications (genome and omics analyses)

Database

Japanese Multi Omics Reference Panel (jMorp)

“jMorp” is a database consisting of genome data and of metabolome and proteome data from plasma. Multi-omics analysis data obtained by ToMMo are integrated into “jMorp” and made available to researchers online.

Details of jMorp datasets

jMorp User Guide

Genome Variation

Based on our study, the Japanese Whole Genome Reference Panel, referred to as 54KJPN, has been published in the jMorp database. It consists of allele and genotype frequency panels from approximately 54,000 Japanese individuals who are estimated to be unrelated to one another, selected from the 69,000 individuals whose whole genome analyses have been completed.
SNVs of interest can be searched for on our platform by reference SNP ID number (rsID), gene symbol, or position on the international human genome reference sequence. The frequencies in our reference panel of all SNVs found are displayed and can be compared with those in gnomAD. All frequencies and position information can be downloaded as a single file per chromosome. In addition, enhancements to jMorp include X chromosome data, mitochondrial data, copy number variation (CNV) data, and a variant-structure mapping visualization tool.
In addition, "JSV1", which covers large-scale base insertions and deletions called structural variants, has also been released on jMorp. Structural variants from 111 trios have been analyzed and applied in research on SNVs. To date, there has been little research on structural variants worldwide. The release of JSV1 is expected to facilitate the understanding of genomic structural variants.

Research Product Achieved with the Whole Genome Reference Panel: Japonica Array
The “Japonica Array”, a DNA microarray designed for genome analysis of the Japanese population, was developed by ToMMo in 2014 and has been used for our genome analyses of tens of thousands of cohort participants. ToMMo further designed a novel SNP array, known as the “Japonica Array NEO”, equipped with more than 28,000 SNPs unique to the Japanese population, including disease-related SNPs. Additionally, its tag SNPs are designed based on the Japanese whole-genome reference panel 3.5KJPNv2, so whole-genome sequences can be imputed very precisely with the Japonica Array NEO.

Sakurai-Yageta M, Kumada K, Gocho C, et al. Japonica Array NEO with increased genome-wide coverage and abundant disease risk SNPs. The Journal of Biochemistry. 2021; 170 (3): 399-410.

Genome Sequence

In 2019, ToMMo released the Japanese reference genome JG1, constructed by integrating three de novo assembled genomes of Japanese male individuals. In 2020, we released the updated version JG2, constructed by integrating six de novo assembled genomes from three Japanese male individuals. In January 2022, JG2.1.0 succeeded JG2.0.0 beta; in the successor, undetermined regions have been patched with GRCh38-derived sequences.
In June 2024, JG3, the first Japanese near telomere-to-telomere (T2T) assembly, was released. The Japanese reference genome is available online.

Metabolome and Proteome

On “jMorp”, the results of global metabolome analysis of metabolites in plasma and of proteome analysis are available online. You can find distribution and frequency information on major metabolites from more than 60,000 individuals and on proteins from several hundred individuals.
Metabolome data were measured by proton NMR and LC-MS in plasma obtained from participants in the population-based cohort of the Tohoku Medical Megabank Project. Proteome data were obtained by nanoLC-MS. Metabolome and proteome data were obtained from more than 60,000 adults. We also performed NMR metabolome analysis of samples from around 4,500 volunteers at a repeat assessment survey. To date, we have measured several thousand metabolites, including uncharacterized ones, and several hundred proteins. The data will be released in order after each metabolite and protein has been carefully checked.

Transcriptome

Transcriptome analysis data from approximately 500 Japanese whole blood samples are available on jMorp. In addition, transcriptome data from lymphoblastoid cell lines (LCLs) obtained using long-read sequencing technology (PacBio Isoform Sequencing; Iso-Seq) are readily accessible on jMorp or through our Genome Browser.

Methylome

Since 2020, methylome data from the iMethyl database have been integrated into jMorp. The Iwate Tohoku Medical Megabank Organization (IMM) analyzed methylation and gene expression in three types of cells (monocytes, CD4+ T cells, and neutrophils) from more than 100 participants.

Phenome (PGx, Metagenome)

The phenome data on jMorp incorporate both pharmacogenomics (PGx) and metagenome data. For the metagenome, jMorp provides relative abundances of microbial taxa identified by 16S rRNA V3-V4 region amplicon sequencing in saliva and dental plaque obtained from 1,388 volunteers, as well as microbiome analysis data from fecal samples obtained from 315 volunteers. For PGx, it provides analyses of changes in enzyme activity for 382 enzymes with genetic polymorphisms involving amino acid substitutions related to drug sensitivity.

ToMMo has adopted a method called “data-visiting”, which means that researchers can reach and directly access the data themselves. You can check how we are promoting data utilization. More details are HERE