Nature Genetics (2023)
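The kidneys operate at the interface of plasma and urine by clearing molecular waste products while retaining valuable solutes. Genetic studies of paired plasma and urine metabolomes may identify the underlying processes. We conducted genome-wide studies of 1,916 plasma and urine metabolites and detected 1,299 significant associations. Associations with 40% of the implicated metabolites would have been missed by studying plasma alone. We detected urine-specific findings that provide information about metabolite reabsorption in the kidney, such as aquaporin (AQP)-7-mediated glycerol transport, and different metabolomic footprints of kidney-expressed proteins in plasma and urine that are consistent with their localization and function, including the transporters NaDC3 (SLC13A3) and ASBT (SLC10A2). Shared genetic determinants of 7,073 metabolite-disease combinations represent a resource to better understand metabolic diseases and revealed connections of dipeptidase 1 with circulating digestive enzymes and with hypertension. Extending genetic studies of the metabolome beyond plasma yields unique insights into processes at the interface of body compartments.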
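The human kidneys clear small-molecule waste products from plasma while retaining valuable solutes such as amino acids to maintain metabolic homeostasis. After glomerular filtration of plasma into the primary urinary ultrafiltrate, its composition is modified through highly coordinated processes along the nephron. Hundreds of highly specialized transport proteins move solutes across the membranes of the cells lining the nephron, reabsorbing vital molecules while actively excreting toxic or unnecessary ones (ref. 1). Many of these transport proteins, as well as the enzymes that generate or break down the transported metabolites, were identified through studies of human monogenic diseases. They represent attractive drug targets for treating not only kidney diseases but also metabolic diseases, as with inhibitors of the transporters SGLT2 and URAT1 (refs. 2,3). However, many transporters and enzymes, as well as their in vivo substrates and products, remain to be characterized. We hypothesized that linking information from human genetic studies to the plasma and urine metabolomes would provide new insights into the roles of these proteins in health and disease.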
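Genetic effects on metabolite levels in urine may reflect systemic processes, such as genotype-dependent intestinal metabolite absorption or hepatic biotransformation reactions, which are detectable in urine because the respective metabolite is filtered from plasma. They may also reflect kidney-specific processes, for example the active production, reabsorption or secretion of small molecules by the cells lining the nephron. Studies of paired plasma and urine metabolite measurements have the potential to distinguish between these processes.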
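Here we test the hypothesis that plasma and urine provide complementary information by studying the differences and similarities in genetic effects on the metabolomes derived from these two 'matrices'. By systematically integrating paired plasma and urine metabolite measurements with genome-wide genetic information from 5,023 participants of the German Chronic Kidney Disease (GCKD) study, we highlight underlying systemic and kidney-specific processes. We detect 1,299 genome-wide significant associations and show that studying plasma alone would have missed associations with almost 40% of the metabolites. We highlight the footprints that kidney-expressed transporters leave in the plasma and urine metabolomes, as well as examples of urine-specific associations concerning a previously undescribed systemic role of a kidney-enriched enzyme. This study generates a rich resource for future experimental validation of as yet uncharacterized enzymes and transport processes that may represent molecular links between genetic variation and human traits and diseases.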
[…] 0.8). In summary, discovery GWAS of the plasma and urine metabolomes identified a wealth of significantly associated loci, the basis for subsequent characterizations.
[…] > 0.8), gray labels indicate genetic regions identified in both plasma and urine without intermatrix colocalization, and red or blue labels indicate genetic regions exclusively identified in plasma or urine, respectively. The number of plasma and urine mQTLs annotated to a gene is given in parentheses (plasma, urine). The pie chart reflects the proportions of the 282 unique genes that were annotated as enzymes and transporters. Official gene symbols for PYCRL and ERO1L are PYCR3 and ERO1A, respectively.
[…] >5 colocalizing regions are color coded and labeled. For the three other groups, all genes assigned to >50 colocalizing metabolite regions are color coded and labeled.
[…] > 0.8; Methods) involving 1,162 mQTLs. Colocalizing associations were divided into four groups (Supplementary Table 10): those where the same genetic signal affected different metabolites in the same matrix ((1) ‘intraplasma’, n = 3,189; (2) ‘intraurine’, n = 3,155), the same metabolite in both plasma and urine ((3) ‘intermatrix, same metabolite’, n = 204) and different metabolites in plasma and urine ((4) ‘intermatrix, different metabolite’, n = 4,048).
[…] >50% of the 3,155 intraurine colocalizations (Fig. 3c). This is consistent with FADS1 encoding a central enzyme in polyunsaturated fatty acid metabolism (ref. 17) and the predominance of these lipid metabolites in plasma and with NAT8 encoding an N-acetyltransferase highly expressed in the kidney that generates water-soluble molecules for excretion (ref. 18) and the abundance of N-acetylated metabolites in urine.
Similarly, the organic anion transporter encoded by SLCO1B1 and the solute transporters encoded by the SLC17A family show high and specific expression in liver and kidney, respectively, where they transport dozens of physiological and pharmacological substrates (refs. 19,20).
[…] > 0.8) with rs601338, at which the minor A allele encodes the stop-gain variant p.Trp154Ter (NP_000502.4) that was associated with higher levels of only these two urine metabolites. The encoded fucosyltransferase 2 is a ubiquitously expressed enzyme that mediates the inclusion of fucose into glycans on a variety of glycolipids and glycoproteins. Individuals homozygous for p.Trp154Ter have lower risk of several infectious diseases during childhood (refs. 25,26), a selective advantage. Indeed, we detected positive selection at this and other loci, including positive controls such as the LCT locus (Methods and Supplementary Table 21). The extended homozygosity of the haplotype carrying the minor, derived allele at the galactosylglycerol mQTL further supported positive selection (Fig. 5b).
[…] >64-fold higher urine but not plasma glycerol levels (Fig. 5g), thereby confirming a single case report through evidence from population studies.
[…] > 0.8), with color coding representing the phenotype category. Effect directions are indicated by the line type (solid, positive association; dashed, inverse association). CNS, central nervous system; NOS, not otherwise specified.
[…] >50% of the observed metabolite variance. Although this translates into much smaller effects on complex diseases such as hypertension, arthropathies or gallstone disease, colocalization can nominate shared pathophysiological mechanisms and inform about potential therapeutic targets, repurposing opportunities and potential side effects of approved drugs.
Our study includes numerous such examples, supported by the recent launch of new drugs such as evinacumab, a monoclonal antibody targeting angiopoietin-like 3 (ANGPTL3) to treat dyslipidemia, or the SLC10A2 inhibitor odevixibat to treat cholestasis. Even if a target implicated by metabolites in our study is not desirable or amenable for therapeutic modulation, disease-associated metabolites may represent valuable intermediate biomarkers for risk prediction or response to treatment.
[…] >60 ml min⁻¹ per 1.73 m² with UACR > 300 mg per g (or urinary protein/creatinine ratio > 500 mg per g) were included (ref. 53). This study used biomaterials collected at the baseline visit, shipped frozen to a central biobank and stored at −80 °C (ref. 54). A more detailed description of the study design, standard operating procedures and the recruited study population has been published (refs. 53,55). The GCKD study was registered in the national registry for clinical studies (DRKS 00003971) and approved by the local ethics committees of the participating institutions (universities or medical faculties of Aachen, Berlin, Erlangen, Freiburg, Hannover, Heidelberg, Jena, München and Würzburg) (ref. 53). All participants provided written informed consent. For this project, metabolites were quantified from stored EDTA plasma and spot urine. Information on genome-wide genotypes, covariates and metabolites was available for 4,960 (plasma) and 4,912 (urine) persons.
[…] >4,500 purified standards) was used for metabolite identification. Known metabolites reported in this study conformed to confidence level 1 (the highest confidence level of identification) of the Metabolomics Standards Initiative (refs. 58,59), unless otherwise denoted with an asterisk. Additional mass spectral entries have been created for compounds of unknown structural identity (unnamed biochemicals; >2,750 in the Metabolon library), which have been identified by virtue of their recurrent nature (both chromatographic and mass spectral).
Peaks were quantified using the area under the curve and normalized by the median value for each run day to correct for variation resulting from instrument interday tuning differences. Likewise, metabolites in the ARIC replication sample were also quantified with the Metabolon HD4 platform.
[…] >50% missing data. A total of 130 plasma and 131 urine metabolites were removed, as fewer than 300 genotyped samples were available.
[…] >5% of samples outlying >5 s.d.). Three plasma samples and one urine sample represented an outlier >5 s.d. along one of the first 15 principal components based on metabolites with complete information. The final dataset consisted of 1,296 plasma and 1,401 urine log2-transformed traits for subsequent GWAS. Supplementary Table 2 provides detailed annotation of the metabolites, including heritability estimates for metabolites with at least one genetic association. Over the course of this project, two formerly different urine metabolites were merged because they represented the same molecule: X-12739 and X-24527 to the glutamine conjugate of C6H10O2 (1)* and X-23667 and X-24759 to (2-butoxyethoxy)acetic acid.
[…] > 0.8) within a window of ±500 kb around the index SNP based on genetic data from the 1000 Genome Project phase 3 version 5 of European ancestry using https://snipa.helmholtz-muenchen.de/snipa/?task=proxy_search. For each study, the best available proxy SNP in terms of maximal LD and minimal distance was selected.
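The proxy selection rule described above (maximal LD, then minimal distance, among SNPs available in a given study) can be sketched as follows. This is a minimal illustration and not the authors' pipeline: the record layout, the function name `select_proxy` and the SNP identifiers are hypothetical, and the LD cutoff is assumed to be r² > 0.8 because the comparison is truncated in the text.

```python
def select_proxy(index_pos, candidates, available_ids,
                 min_r2=0.8, window=500_000):
    """Pick the best proxy SNP for an index SNP.

    candidates    : list of dicts with keys "id", "pos" and "r2"
                    (hypothetical layout; r2 is LD with the index SNP)
    available_ids : set of SNP ids present in the comparison study
    min_r2        : assumed LD cutoff (r2 > 0.8)
    window        : half-width of the search window in base pairs (±500 kb)
    Returns the chosen candidate dict, or None if nothing qualifies.
    """
    eligible = [
        c for c in candidates
        if c["id"] in available_ids
        and c["r2"] > min_r2
        and abs(c["pos"] - index_pos) <= window
    ]
    if not eligible:
        return None
    # Highest LD first; ties are broken by the smallest distance to the index SNP.
    return min(eligible, key=lambda c: (-c["r2"], abs(c["pos"] - index_pos)))


# Hypothetical example data, for illustration only.
candidates = [
    {"id": "rs_p1", "pos": 1_000_500, "r2": 0.95},
    {"id": "rs_p2", "pos": 1_000_100, "r2": 0.95},
    {"id": "rs_p3", "pos": 1_200_000, "r2": 0.99},  # not available in the study
    {"id": "rs_p4", "pos": 1_000_050, "r2": 0.70},  # fails the LD cutoff
]
best = select_proxy(1_000_000, candidates, {"rs_p1", "rs_p2", "rs_p4"})
```

Ranking by LD before distance is one reading of "maximal LD and minimal distance"; a study that weighted the two criteria differently would need a different sort key.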
Summary statistics were downloaded from https://metabolomics.helmholtz-muenchen.de/gwas/index.php?task=download (Shin et al., ref. 6), http://www.hli-opendata.com/Metabolome (Long et al., ref. 7; only summary statistics with P value < 10⁻⁵), https://omicscience.org/apps/crossplatform/ (Lotta et al., ref. 8), https://pheweb.org/metsim-metab/ (Yin et al., ref. 10), https://omicscience.org/apps/mgwas/mgwas.table.php (Surendran et al., ref. 11) and http://ftp.ebi.ac.uk/pub/databases/gwas/summary_statistics/; accession numbers for European GWAS are GCST90199621–GCST90201020 (Chen et al., ref. 12). Hysi et al. (ref. 9) shared their summary statistics upon request.
[…] > 0.6 using GCTA-GRM (ref. 71). GCTA-GREML (ref. 72) was then used to estimate the proportion of variation in log2-transformed and, in the case of urine, pq-normalized metabolite levels that can be explained by the SNPs for all metabolites that gave rise to an mQTL.
[…] > 0.8). For each mQTL, the GCTA-COJO Slct algorithm version 1.91.6 (ref. 73) was used to identify independent genome-wide significant SNPs (Pconditional < 3.9 × 10⁻¹¹), using a collinearity cutoff of 0.1. For mQTL with multiple independent SNPs, approximate conditional analyses were carried out conditioning on the other independent SNPs in the region using the GCTA-COJO Cond algorithm to estimate conditional effect sizes. Statistical fine mapping was performed for all independent SNPs per mQTL. In loci with a single independent SNP, approximate Bayes factors (ABFs) were calculated from the original GWAS effect estimates using Wakefield's formula (ref. 74) with a standard deviation prior of 1.33. For mQTL with multiple independent SNPs, ABFs were derived from the conditional effect estimates. The SNP's ABF was used to calculate the posterior probability for the variant driving the association signal (PPA, ‘causal variant’). Credible sets were calculated by summing the PPA across PPA-ranked variants until the cumulative PPA was >99%.
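The fine-mapping steps above can be illustrated with a short sketch. It uses the commonly cited form of Wakefield's approximate Bayes factor, expressed in favor of association, and assumes one causal variant per signal with equal prior probability for every variant in the region; those assumptions, the function names and the example effect sizes are illustrative and are not taken from the study. Only the prior standard deviation of 1.33 and the 99% coverage come from the text.

```python
import math

PRIOR_SD = 1.33  # prior standard deviation stated in the text


def log_abf(beta, se, prior_sd=PRIOR_SD):
    """Log approximate Bayes factor in favor of association.

    With V = se^2, W = prior_sd^2 and z = beta / se:
        ABF = sqrt(V / (V + W)) * exp(z^2 / 2 * W / (V + W))
    """
    v = se ** 2
    w = prior_sd ** 2
    z = beta / se
    shrink = w / (v + w)
    return 0.5 * math.log(1.0 - shrink) + 0.5 * z * z * shrink


def posterior_probabilities(log_abfs):
    """Convert log ABFs to PPAs, assuming equal priors across variants."""
    m = max(log_abfs)  # subtract the maximum for numerical stability
    weights = [math.exp(x - m) for x in log_abfs]
    total = sum(weights)
    return [w / total for w in weights]


def credible_set(snp_ids, ppas, coverage=0.99):
    """Add PPA-ranked variants until the cumulative PPA exceeds coverage."""
    ranked = sorted(zip(snp_ids, ppas), key=lambda t: t[1], reverse=True)
    chosen, cumulative = [], 0.0
    for snp, ppa in ranked:
        chosen.append(snp)
        cumulative += ppa
        if cumulative > coverage:
            break
    return chosen


# Hypothetical effect estimates (beta, standard error) for three variants.
snps = ["rs_a", "rs_b", "rs_c"]
estimates = [(0.50, 0.05), (0.10, 0.05), (0.05, 0.05)]
ppas = posterior_probabilities([log_abf(b, s) for b, s in estimates])
cs99 = credible_set(snps, ppas)
```

For signals with several independent SNPs, the same functions would be applied to the conditional effect estimates rather than the original GWAS estimates, as the text describes.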
log2-transformed credible set sizes were regressed on the MAFs of independent index SNPs.
We also performed colocalization analyses of mQTLs with disease outcomes and biomarker measurements in the UK Biobank, with two representative kidney function traits and with trans pQTLs using the precomputed pQTL data from Sun et al. (ref. 79) to gain insights into clinical consequences and potential molecular mediators of mQTLs. Association summary statistics between SNPs and 30 biomarkers from the UK Biobank baseline examination, including the liver function markers AST, ALT, GGT, bilirubin and albumin, were computed using BOLT-LMM (ref. 80) (application no. 20272) in the same subset of European-ancestry participants as previous studies (ref. 81). Precomputed GWAS summary statistics of diseases as ascertained in the UK Biobank and analyzed using phecodes were obtained from https://www.leelabsg.org/resources (1,403 binary traits) and from https://yanglab.westlake.edu.cn/data/ukb_fastgwa/imp_binary/ (2,325 of 2,989 binary traits, ref. 82; traits containing job-coding terms were excluded from the analysis). There were 816 phecodes analyzed in both, but only unique phecodes were counted for positive colocalizations. We used GWAS summary statistics of creatinine-based and cystatin C-based eGFR (eGFRcrea and eGFRcys) from Stanzick et al. (ref. 83), who meta-analyzed kidney function GWAS from the CKDGen Consortium and the UK Biobank. The GWAS summaries were downloaded from the CKDGen data website at https://ckdgen.imbi.uni-freiburg.de. Colocalization testing between mQTL and trans pQTL was performed within a window of ±500 kb around the mQTL's index SNP when at least one trans pQTL association with P < 0.05 ÷ 409 ÷ 3,000 for plasma and P < 0.05 ÷ 410 ÷ 3,000 for urine was present within a window of ±100 kb around the index SNP.
Similarly, colocalization analysis between mQTL and biomarkers, diseases and kidney function traits was performed within ±500 kb of the index SNP when there were one or more associated variants with MAF > 0.01 and P < 0.05 ÷ 409 or P < 0.05 ÷ 410, respectively, within ±100 kb of the index SNP, using the method outlined above.
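The prefilter that decides whether a colocalization test is run at all can be written out as a small sketch. The divisors 409, 410 and 3,000 are taken directly from the thresholds quoted above; what each divisor counts is not stated in this excerpt, so the code treats them as plain constants. The variant record layout and the function names are hypothetical.

```python
def bonferroni_threshold(alpha, *divisors):
    """Divide alpha successively by each divisor, e.g. 0.05 / 409 / 3000."""
    threshold = alpha
    for d in divisors:
        threshold /= d
    return threshold


# Thresholds quoted in the text.
PLASMA_PQTL = bonferroni_threshold(0.05, 409, 3000)
URINE_PQTL = bonferroni_threshold(0.05, 410, 3000)
PLASMA_TRAIT = bonferroni_threshold(0.05, 409)
URINE_TRAIT = bonferroni_threshold(0.05, 410)


def has_qualifying_variant(variants, index_pos, p_threshold,
                           window=100_000, min_maf=None):
    """Return True if any variant within ±window of the index SNP passes.

    variants : list of dicts with keys "pos", "p" and "maf" (hypothetical layout)
    min_maf  : optional MAF filter; the text applies MAF > 0.01 for
               biomarkers, diseases and kidney function traits
    """
    for v in variants:
        if abs(v["pos"] - index_pos) > window:
            continue
        if min_maf is not None and v["maf"] <= min_maf:
            continue
        if v["p"] < p_threshold:
            return True
    return False


# Hypothetical trait associations near an mQTL index SNP at position 5,000,000.
trait_variants = [
    {"pos": 5_050_000, "p": 1e-5, "maf": 0.20},   # inside ±100 kb, common, significant
    {"pos": 5_300_000, "p": 1e-12, "maf": 0.30},  # significant but outside ±100 kb
]
run_coloc = has_qualifying_variant(trait_variants, 5_000_000,
                                   PLASMA_TRAIT, min_maf=0.01)
```

When the prefilter returns True, the colocalization itself would then be evaluated over the wider ±500 kb window described in the text.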