Letter
Genome Research. 2004 Oct;14(10a):1902–1910. doi: 10.1101/gr.2722704

Identification of Mammalian microRNA Host Genes and Transcription Units

Antony Rodriguez, Sam Griffiths-Jones, Jennifer L Ashurst, Allan Bradley
PMCID: PMC524413  PMID: 15364901

Abstract

To derive a global perspective on the transcription of microRNAs (miRNAs) in mammals, we annotated the genomic position and context of this class of noncoding RNAs (ncRNAs) in the human and mouse genomes. Of the 232 known mammalian miRNAs, we found that 161 overlap with 123 defined transcription units (TUs). We identified miRNAs within introns of 90 protein-coding genes with a broad spectrum of molecular functions, and in both introns and exons of 66 mRNA-like noncoding RNAs (mlncRNAs). In addition, novel families of miRNAs based on host gene identity were identified. The transcription patterns of all miRNA host genes were curated from a variety of sources illustrating spatial, temporal, and physiological regulation of miRNA expression. These findings strongly suggest that miRNAs are transcribed in parallel with their host transcripts, and that the two different transcription classes of miRNAs (`exonic' and `intronic') identified here may require slightly different mechanisms of biogenesis.


MicroRNAs (miRNAs) are an evolutionarily conserved large class of noncoding RNAs (ncRNAs) 18-22 nucleotides long that mediate posttranscriptional silencing of genes. miRNAs were first discovered in Caenorhabditis elegans with the identification of the lin-4 and let-7 miRNA genes, which act as posttranscriptional repressors of target genes by antisense binding to their 3′ untranslated regions (UTRs; for review, see Ambros 2000). Shortly thereafter, hundreds of other miRNAs were found in worms as well as in flies, plants, and vertebrates (for review, see Carrington and Ambros 2003; Bartel 2004). Rapid progress has begun to unravel the genetic roles of miRNAs in development and other biological processes. For example, in C. elegans, let-7 and lin-4 miRNAs function as heterochronic genes, and mutations in either disrupt proper specification of cell fates (Ambros 2000). In Drosophila, a mutation in miR-14 leads to a disruption in normal patterns of cell death and also defects in fat metabolism (Xu et al. 2003). In mammals, ∼230 miRNAs have been identified from a vast array of tissues and cell types (Lagos-Quintana et al. 2001, 2002, 2003; Mourelatos et al. 2002; Dostie et al. 2003; Houbaviy et al. 2003; Lim et al. 2003; Michael et al. 2003; Kim et al. 2004).

Current models for miRNA biogenesis and maturation suggest that compartmentalized stepwise processing of miRNAs takes place first in the nucleus and then in the cytoplasm. The prevailing view is that primary transcripts of miRNAs (pri-miRNAs) are processed in the nucleus by the RNase III enzyme, Drosha, to stem-loop intermediates known as pre-miRNAs (Lee et al. 2003). These pre-miRNAs are then transported to the cytoplasm for cleavage by Dicer and maturation to their active forms (Lee et al. 2003). Although RNAs of 22 and 75 nt, as well as longer pri-miRNA transcripts, have been detected by Northern blot analyses for a handful of miRNAs, the transcription units (TUs) or host genes that give rise to the vast majority of miRNAs have not been examined in detail.

In the present study, we set out to identify the modes of transcription for mammalian miRNAs by annotating their positions in the human and mouse genomes. We found that more than half of all known mammalian miRNAs are within introns of either protein-coding or noncoding TUs, whereas ∼10% are encoded by exons of long nonprotein-coding transcripts, also known as mRNA-like noncoding RNAs (mlncRNAs; Erdmann et al. 2000). Our annotation illuminates the modes of transcription for miRNAs and has allowed the identification of `intronic' as well as `exonic' transcription classes of miRNAs. We curated the expression patterns of well defined microRNA host genes to illustrate likely expression patterns for microRNAs in wide-ranging biological contexts.

RESULTS AND DISCUSSION

The miRNA Registry contains 232 mammalian miRNAs (http://www.sanger.ac.uk/Software/Rfam/mirna/; Griffiths-Jones 2004). Of these we identified 117 miRNAs located in introns of protein-coding genes or long ncRNA transcripts (Table 1, Supplemental Tables A,B). Approximately 40% (90) of all miRNAs are found within introns of protein-coding genes, whereas ∼10% (27) are located within introns of long ncRNA transcripts (Table 1). Interestingly, 30 miRNAs overlap with exons of ncRNAs (Supplemental Table C). In some cases (14), miRNAs are located in either an exon or an intron (`mixed') depending on alternative splicing of the host transcript (Table 1, Supplemental Table D). Where clusters of miRNAs overlap with a single host transcript, the vast majority of miRNAs are located in the same intron or exon (Supplemental Tables A-D). Additionally, 32 miRNAs overlap with two or more TUs transcribed on opposite DNA strands (Supplemental Tables A-D). This observation indicates that miRNAs are commonly associated with complex transcriptional loci. The remaining 70 miRNAs are of uncertain transcriptional origin and were not analyzed further. To sum up, a total of 161 miRNAs are linked to the transcription of mRNAs or ncRNAs. Based on these observations, we propose that miRNAs be classified as exonic (exon-derived) miRNAs or intronic (intron-derived) miRNAs.
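As a quick consistency check, the counts quoted above can be tallied in a few lines of Python. This is only a bookkeeping sketch: the variable names are illustrative, and the figure of nine `mixed' miRNAs with mlncRNA-only hosts is taken from the Table 1 legend. Reading the 66 miRNAs in noncoding TUs as the sum 27 + 30 + 9 is an assumption, not something the text states explicitly.

```python
# Counts quoted in the text (miRNA Registry total and per-class tallies).
TOTAL_MIRNAS = 232
INTRONIC_PROTEIN_CODING = 90   # in introns of protein-coding genes
INTRONIC_NCRNA = 27            # in introns of long ncRNA transcripts
EXONIC_NCRNA = 30              # overlapping exons of ncRNAs
MIXED = 14                     # exon or intron, depending on splicing
MIXED_MLNCRNA_ONLY = 9         # from the Table 1 legend

intronic = INTRONIC_PROTEIN_CODING + INTRONIC_NCRNA
linked = intronic + EXONIC_NCRNA + MIXED
# Assumed composition of the miRNAs in noncoding TUs (see lead-in).
in_noncoding_tus = INTRONIC_NCRNA + EXONIC_NCRNA + MIXED_MLNCRNA_ONLY
remainder = TOTAL_MIRNAS - linked


def percent(part, whole=TOTAL_MIRNAS):
    """Share of `whole` as a percentage, rounded to one decimal place."""
    return round(100 * part / whole, 1)


print("intronic:", intronic, f"({percent(intronic)}%)")
print("linked to a host TU:", linked)
print("in noncoding TUs:", in_noncoding_tus)
print("remainder:", remainder)
print("protein-coding introns:", f"{percent(INTRONIC_PROTEIN_CODING)}%")
print("ncRNA introns:", f"{percent(INTRONIC_NCRNA)}%")
```

With these inputs the tally gives 117 intronic and 161 host-linked miRNAs, matching the text, and 38.8% and 11.6% for the two intronic subclasses. The remainder computes to 71, one more than the 70 quoted above; the sketch does not attempt to explain that difference.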

Table 1.

Distribution of Mammalian miRNAs Found in Introns and Exons of Host Transcripts

The numbers given represent total numbers of miRNAs overlapping with transcripts in the human or mouse genome (see Methods). For comparison purposes, miRNAs in introns of transcripts are separated according to the nature of the host gene (protein coding vs. ncRNA). Nine of the `mixed' miRNAs are transcribed in introns or exons of alternatively spliced mlncRNA hosts; the rest are associated with protein-coding host genes as well as mlncRNAs of alternatively spliced transcripts. Asterisk (*) denotes the number of mixed miRNAs derived from mlncRNA transcripts only.

A partial listing of protein-coding miRNA host genes is shown in Table 2. These miRNA host genes encode proteins with a broad spectrum of biological roles ranging from embryonic development to the cell cycle and physiology (Supplemental Tables A,D). To derive a perspective on the classes of protein-coding genes possibly utilized by miRNAs for their transcription, we surveyed the Gene Ontology (GO) classifications of all miRNA host genes (Gene Ontology Consortium 2001; http://www.geneontology.org/). The two most commonly identified `biological process' classifications for GO-annotated miRNA host genes are `metabolism' (GO:0008152; 19 of 90; 21%) and `cellular physiological process' (GO:0050875; 14 of 90; 16%), and the most common GO `molecular function' classifications are `purine nucleotide binding' (GO:0017076; 8 of 90; 9%) and `DNA binding' (GO:0003677; 7 of 90; 8%). Comparison of the most commonly identified GO classifications for miRNA host genes and the entire collection of GO-annotated mammalian genes does not reveal disproportionate representation of these classes of genes (data not shown). We also note that several genes involved in human disease are hosts to miRNAs. For instance, nonsense, splicing, frameshift, or deletion mutations in the Chloride Channel Protein 5 (CLCN5) gene cause Dent disease and nephrolithiasis, an X-linked recessive disorder in human patients (OMIM: 300008; Yamamoto et al. 2000). A causative role for the CLCN5-encoded miR-188 has not been explored in this disease.
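The GO percentages above, and the kind of over-representation comparison described, can be sketched as follows. The host-gene counts come from the text; the background figures (20,000 annotated genes, 4,200 of them carrying a term) are purely hypothetical placeholders, and the hypergeometric upper tail is a generic choice of test, not necessarily the statistic used in the study.

```python
from math import comb

HOST_GENES = 90  # protein-coding miRNA host genes surveyed in the text

# (GO term, number of host genes annotated with it), as quoted in the text.
go_counts = [
    ("metabolism", 19),
    ("cellular physiological process", 14),
    ("purine nucleotide binding", 8),
    ("DNA binding", 7),
]


def share(count, total=HOST_GENES):
    """Percentage of host genes carrying a term, rounded to a whole number."""
    return round(100 * count / total)


def hypergeom_upper_tail(k, n, K, N):
    """P(X >= k) when n genes are drawn without replacement from N genes,
    K of which carry the term of interest."""
    upper = min(n, K)
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return hits / comb(N, n)


for term, count in go_counts:
    print(f"{term}: {count} of {HOST_GENES} ({share(count)}%)")

# Hypothetical background: 4,200 of 20,000 annotated genes carry the term.
BACKGROUND_GENES = 20_000
BACKGROUND_WITH_TERM = 4_200
p = hypergeom_upper_tail(19, HOST_GENES, BACKGROUND_WITH_TERM, BACKGROUND_GENES)
print("illustrative upper-tail probability:", p)
```

A large tail probability under a realistic background would be consistent with the statement that these GO classes are not disproportionately represented among host genes; the number printed here reflects only the placeholder background.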

Table 2.

miRNAs in Introns of Protein-Coding Genes

microRNA (a) | Mouse host gene ID (b) | Human host gene ID (b) | Host gene protein description (c) | miRNA position (d)
miR-7-3 (miR-7b) | | ENSG00000176840 | PITUITARY GLAND SPECIFIC FACTOR 1A. [Source: RefSeq; Acc: NM_174947] | Intron 2
miR-10b | ENSMUSG00000038692 | ENSG00000170166 | HOMEOBOX PROTEIN HOX-D4 (HOX-4B) (HOX-5.1) (HHO.C13). [Source: SWISS-PROT (P09016)] | Intron 4
miR-15b, -16-2 | ENSMUSG00000034349 | ENSG00000113810 | STRUCTURAL MAINTENANCE OF CHROMOSOMES 4-LIKE 1 PROTEIN (CHROMOSOME-ASSOCIATED POLYPEPTIDE C) (HCAP-C) (XCAP-C HOMOLOG). SMC4L1 [Source: SWISS-PROT; Acc: Q9NTJ3] | Intron 4
miR-25, -93, -106b | ENSMUSG00000029730 | OTTHUMG00000023308/ENSG00000166508 | DNA REPLICATION LICENSING FACTOR MCM7 (CDC47 HOMOLOG) (P1.1-MCM3). [Source: SWISS-PROT; Acc: P33993] | Intron 13
miR-30c-1 and -30e | ENSMUSG00000032897 | ENSG00000066136 | NUCLEAR TRANSCRIPTION FACTOR Y SUBUNIT GAMMA (NF-Y PROTEIN CHAIN C) (NUCLEAR FACTOR YC) (NF-YC) (CCAAT-BINDING TRANSCRIPTION FACTOR SUBUNIT C) (CBF-C) (TRANSACTIVATOR HSM-1/2). NFYC [Source: SWISS-PROT; Acc: Q13952] | Intron 5
miR-33 | ENSMUSG00000022463 | OTTHUMG00000030492/ENSG00000100152 | STEROL REGULATORY ELEMENT BINDING PROTEIN-2 (SREBP-2) (STEROL REGULATORY ELEMENT-BINDING TRANSCRIPTION FACTOR 2). SREBF2 [Source: SWISS-PROT; Acc: Q12772] | Intron 15 (16)
miR-101-2 (miR-101b) | ENSMUSG00000024785 | OTTHUMG00000019474/ENSG00000120158 | RNA 3′-TERMINAL PHOSPHATE CYCLASE-LIKE PROTEIN (HSPC338), RCL1_HUMAN [Source: SWISS-PROT; Acc: Q9Y2P8] | Intron 8
miR-126, 126* | ENSMUSG00000026921 | OTTHUMG00000020938/ENSG00000172889 | EGF-LIKE DOMAIN 7; NEU1 PROTEIN, NOTCH4-LIKE PROTEIN (VASCULAR ENDOTHELIAL ZINC FINGER 1) [Source: RefSeq; Acc: NM_178444] | Intron 6 (7)
miR-128b | ENSMUSG0000032503 | ENSG00000076062 | CAMP-REGULATED PHOSPHOPROTEIN 21 (ARPP-21) (SWISS-PROT: AP21_HUMAN) | Intron 11 (17)
miR-139 | ENSMUSG00000030653 | ENSG00000186642 | CGMP-DEPENDENT 3′,5′-CYCLIC PHOSPHODIESTERASE (EC 3.1.4.17) (CYCLIC GMP STIMULATED PHOSPHODIESTERASE) (CGS-PDE) (CGSPDE), PDE2A [Source: SWISS-PROT; Acc: O00408] | Intron 2 (1)
miR-140 | ENSMUSG00000031930 | ENSG00000088481 | NEDD-4-LIKE UBIQUITIN-PROTEIN LIGASE WWP2 (EC 6.3.2.-) (WW DOMAIN-CONTAINING PROTEIN 2) (ATROPIN-1 INTERACTING PROTEIN 2) (AIP2). WWP2_HUMAN [Source: SWISS-PROT; Acc: O00308] | Intron 15 (16)
miR-149 | ENSMUSG00000034220 | ENSG00000063660 | GLYPICAN-1 PRECURSOR, GPC1 [Source: SWISS-PROT; Acc: P35052] | Intron 1
miR-151 | ENSMUSG00000022607 | ENSG00000169398 | FOCAL ADHESION KINASE 1 (EC 2.7.1.112) (FADK 1) (PP125FAK) (PROTEIN-TYROSINE KINASE 2). FAK1 [Source: SWISS-PROT (Q05397)] | Intron 19 (25)
miR-188 | ENSMUSG00000004317 | ENSG00000171365 | CHLORIDE CHANNEL PROTEIN 5 (CLC-5), CLCN5 [Source: SWISS-PROT; Acc: P51795] | Intron 3 (2)
miR-190 | ENSMUSG00000035702 | ENSG0000017194 | TALIN 2. TLN 2 [Source: SWISS-PROT; Acc: Q9Y4G6] | Intron 51 (27)
miR-207 (human miR-207 not found) | ENSMUSG00000028410 | | DNAJ HOMOLOG SUBFAMILY A MEMBER 1 (HEAT SHOCK 40 KDA PROTEIN 4) (DNAJ PROTEIN HOMOLOG 2) (HSJ-2), DJA1_MOUSE [Source: SWISS-PROT; Acc: P54102] | Intron 1
miR-208 | ENSMUSG00000040752 | OTTHUMG00000028753/ENSG00000166094 | MYOSIN HEAVY CHAIN, CARDIAC MUSCLE ALPHA ISOFORM (MYHC-ALPHA), MYH6 [Source: SWISS-PROT; Acc: P13533] | Intron 28 (29)
miR-335 | ENSMUSG00000051855 | ENSG00000106484 | MESODERM SPECIFIC TRANSCRIPT ISOFORM A; PATERNALLY EXPRESSED GENE 1. MEST [Source: RefSeq (NM_002402)] | Intron 2 (1)
miR-338 | ENSMUSG00000025375 | ENSG00000181409 | APOPTOSIS-ASSOCIATED TYROSINE KINASE. AATK [Source: RefSeq; Acc: NM_007377] | Intron 5 (7)
miR-339 | ENSMUSG00000029533 | | ARSENITE INDUCIBLE RNA ASSOCIATED PROTEIN. (AIRAP) AA407930 [Source: RefSeq; Acc: NM_133349] | Intron 1
miR-340 | ENSMUSG00000020376 | ENSG00000113269 | RING FINGER PROTEIN 130; GOLIATH PROTEIN; G1-RELATED ZINC FINGER PROTEIN, RNF130 [Source: RefSeq (NM_018434)] | Intron 2 (2)
miR-342 | ENSMUSG00000021262 | OTTHUMG00000029003/ENSG00000089465 | ENA/VASODILATOR STIMULATED PHOSPHOPROTEIN-LIKE PROTEIN (ENA/VASP-LIKE PROTEIN). EVL [Source: SWISS-PROT (Q9UI08)] | Intron 3
miR-346 | 14803 NCBI | OTTHUMG00000018650/ENSG00000182771 | SIMILAR TO GLUTAMATE RECEPTOR, IONOTROPIC, DELTA 1. GRID1 [Source: SPTREMBL (Q8IXT3)] | Intron 2 (1)

a miRNA names listed in parentheses refer to the mouse name if different from the human.

b Gene IDs are from ENSEMBL, Vega, or NCBI.

c Protein descriptions are for the human gene.

d Intron positions for miRNAs in gene are listed. If the human/mouse positions are different, the mouse location is presented in parentheses.

A significant number of intron-based miRNAs reside within related families of protein-coding genes (Table 3, Supplemental Tables A,D,E). This allowed us to group specific sets of miRNAs into novel families based on host gene identity. In some cases, host gene families aid in the identification of related sets of intronic miRNAs (Table 3). For instance, miR-148b and miR-152 are sequence-related and located in two genes encoding subunits of the coatomer transporter (see Table 3, Supplemental Table E). In other examples, we note sequence-unrelated miRNAs that are found within related families of proteins. For example, miR-105 and miR-224 are located in two genes encoding subunits of the ligand-gated ion channel γ-aminobutyric-acid receptor-α-3 and -ε respectively, although the mature miRNAs are unrelated by sequence (Table 3).

Table 3.

Families of miRNAs Found in Introns of Related Protein-Coding Genes

Family | Mouse host gene ID (a) | Human host gene ID (a) | Host gene description (b) | microRNA | Related miRNAs (c)
Coatomer | ENSMUSG00000035994 MBuild30 | ENSG00000111481 | COATOMER ZETA-1 SUBUNIT (ZETA-1 COAT PROTEIN) (ZETA-1 COP) (CGI-120) (HSPC181), COPZ1 [Source: SWISS-PROT (Q9Y3C3)] | miR-148b | +
Coatomer | ENSMUSG00000018672 | ENSG00000005243 | COATOMER ZETA-2 SUBUNIT (ZETA-2 COAT PROTEIN) (ZETA-2 COP). COPZ2 [Source: SWISS-PROT; Acc: Q9P299] | miR-152 | +
GABA | ENSMUSG00000031343 | ENSG00000011677 | GAMMA-AMINOBUTYRIC-ACID RECEPTOR ALPHA-3 SUBUNIT PRECURSOR (GABA(A) RECEPTOR). GABRA3 [Source: SWISS-PROT; Acc: P34903] | miR-105-1, -105-2 |
GABA | ENSMUSG00000031340 | ENSG00000102287 | GAMMA-AMINOBUTYRIC-ACID RECEPTOR EPSILON SUBUNIT PRECURSOR (GABA(A) RECEPTOR). GABRE [Source: SWISS-PROT; Acc: P78334] | miR-224 |
NIF | ENSMUSG00000038995 | ENSG00000144677 | NUCLEAR LIM INTERACTOR-INTERACTING FACTOR 1 (NLI-INTERACTING FACTOR 1) (NIF-LIKE PROTEIN) (YA22 PROTEIN) (HYA22), NIF1_HUMAN (C3orf8) [Source: SWISS-PROT; Acc: O15194] | miR-26a-1 | +
NIF | ENSMUSG00000040540 | ENSG00000175215 | NUCLEAR LIM INTERACTOR-INTERACTING FACTOR 2 (NLI-INTERACTING FACTOR 2) (PROTEIN OS-4). NIF2_HUMAN [Source: SWISS-PROT; Acc: O14595] | miR-26a-2 | +
NIF | ENSMUSG00000026176 | ENSG00000144579 | NUCLEAR LIM INTERACTOR-INTERACTING FACTOR 3 (NLI-INTERACTING FACTOR 3) (NLI-IF). NIF3_HUMAN [Source: SWISS-PROT; Acc: Q9GZU7] | miR-26b | +
LIM | ENSMUSG00000033306 | ENSG00000145012 | LIM DOMAIN CONTAINING PREFFERED TRANSLOCATION PARTNER IN LIPOMA; LIM DOMAIN-CONTAINING PREFERRED TRANSLOCATION PARTNER IN LIPOMA; LIPOMA-PREFFERRED-PARTNER GENE. LPP [Source: RefSeq; Acc: NM_005578] | miR-28 |
LIM | | ENSG00000163995 | ACTIN BINDING LIM PROTEIN 2. ABLIM2 [Source: RefSeq; Acc: NM_032432] | miR-95 |
PANK | ENSMUSG00000033610 | OTTHUMG00000018718/ENSG00000152782 | PANTOTHENATE KINASE 1 (EC 2.7.1.33) (PANTOTHENIC ACID KINASE 1) (HPANK1) (HPANK). [Source: SWISS-PROT; Acc: Q8TE04] | miR-107 | +
PANK | ENSMUSG00000034220 | OTTHUMG00000031768/ENSG00000125779 | PANTOTHENATE KINASE 2, MITOCHONDRIAL PRECURSOR (EC 2.7.1.33) (PANTOTHENIC ACID KINASE 2) (HPANK2). [Source: SWISS-PROT; Acc: Q9BZ23] | miR-103-2 | +
PANK | ENSMUSG00000018846 | ENSG00000120137 | PANTOTHENATE KINASE 3 (EC 2.7.1.33) (PANTOTHENIC ACID KINASE 3) (HPANK3). [Source: SWISS-PROT; Acc: Q9H999] | miR-103-1 | +
SLIT | 20563 NCBI | ENSG00000145147 | SLIT HOMOLOG 2 PROTEIN PRECURSOR (H-SLIT-2). SLIT2 [Source: SWISS-PROT; Acc: O94813] | miR-218-1 | +
SLIT | 20564 NCBI | ENSG00000184347 | SLIT HOMOLOG 3; SLIT3 [Source: RefSeq; Acc: NM_003062] | miR-218-2 | +
PTPRN | | ENSG00000054356 | PROTEIN-TYROSINE PHOSPHATASE-LIKE N PRECURSOR (R-PTP-N) (PTP IA-2) (ISLET CELL ANTIGEN 512) (ICA 512) (ISLET CELL AUTOANTIGEN 3), PTPRN [Source: SWISS-PROT; Acc: Q16849] | miR-153-1 | +
PTPRN | ENSMUSG00000054701 | ENSG00000155093 | RECEPTOR-TYPE PROTEIN-TYROSINE PHOSPHATASE N2 PRECURSOR (EC 3.1.3.48) (R-PTP-N2) (ISLET CELL AUTOANTIGEN RELATED PROTEIN) (ICAAR) (IAR) (PHOGRIN). PTPRN2 [Source: SWISS-PROT; Acc: Q92932] | miR-153-2 | +
TRPM | ENSMUSG00000030523 | ENSG00000134160 | TRANSIENT RECEPTOR POTENTIAL CATION CHANNEL, SUBFAMILY M, MEMBER 1; MELASTATIN 1. TRPM1 [Source: RefSeq; Acc: NM_002420] | miR-211 | +
TRPM | ENSMUSG00000024763 | ENSG00000083067 | LONG TRANSIENT RECEPTOR POTENTIAL CHANNEL 3 (LTRPC3) (FRAGMENT), TRPM3 [Source: SWISS-PROT; Acc: Q9HCF6] | miR-204 | +

a Gene IDs are from ENSEMBL, NCBI, or Acembly AceView.

b Protein descriptions are for the human gene.

c miRNAs with related nucleotide sequences are marked with plus signs and those that are not related with a minus sign.

In addition to the miRNAs in protein-coding genes, we also noted 66 miRNAs within TUs that lack significant protein-coding potential (Table 1). More detailed analysis of these miRNA host TUs revealed a number of previously classified long ncRNAs (see Tables 4,5 and Supplemental Tables B-D). These include BIC (Tam 2001), Deleted in Leukemia 2 (DLEU2; Migliazza et al. 2001), Noncoding RNA in Rhabdomyosarcoma (NCRMS; Chan et al. 2002), Synapse-Specific-7H4 (7H4; Velleca et al. 1994), FLJ34037 (Ota et al. 2004), and numerous FANTOM2 long ncRNAs (Okazaki et al. 2002; Numata et al. 2003). Only the BIC RNA has previously been noted as a carrier for a miRNA (Lagos-Quintana et al. 2002). These types of ncRNA transcripts are sometimes referred to as mRNA-like ncRNAs (mlncRNAs; Erdmann et al. 2000) because they share properties with mRNAs such as splicing (e.g., BIC: Tam 2001), polyadenylation (e.g., Air: Sleutels et al. 2002), long transcription units (e.g., Xist: Hong et al. 1999), and possibly also spatially or temporally restricted expression (e.g., Ntab: French et al. 2001). The molecular and genetic functions of most mlncRNAs are unclear, but some, such as Xist and Air, have been studied in greater detail.

Table 4.

miRNAs in Introns of Noncoding Transcription Units

microRNA | Chrom (a) | Human host TU ID (b) | Mouse host TU ID (b) | miRNA location (c) | Host transcript description
let-7a-2, miR-100, -125b-1 | 11 (9) | ENSESTG00000015836 | ENSMUSESTG00000011922 | Intron 2 (4) | Novel mlncRNA
let-7d | 9 (13) | OTTHUMG00000020259 | | Intron 3 | Vega mlncRNA
miR-7-2 | 15 (7) | | ENSMUSESTG00000007393 | Intron 1 | Novel mlncRNA
miR-15a, -16-1 | 13 (14) | OTTHUMG00000016927 | 1810047A16Rik | Intron 4 | DLEU2 mlncRNA
miR-30a, -30a*, -30c-2 | 6 (1) | ENSESTG00000011458 | | Intron 3 | Novel mlncRNA
miR-31 | 9 (4) | OTTHUMG00000019681 | | Intron 1 | Novel mlncRNA
miR-129-2 | 11 (2) | 11_43541975 AceView | | Intron 1 | Novel mlncRNA
miR-132, -212 | 17 (11) | | ENSMUSESTG00000011295 MBuild30 | Intron 1 | Novel mlncRNA
miR-135a-1 | 3 (9) | 3_52291509 AceView | | | Novel mlncRNA
miR-135a-2 | 12 (10) | 196475 NCBI | ENSMUSESTG00000002435 | Intron 9 (1) | NCRMS mlncRNA
miR-135b | 1 (1) | ENSESTG00000002491 | ENSMUSESTG00000018175 | Intron 1 | Novel mlncRNA
miR-141, -200c | 12 (6) | | ENSMUSESTG00000015291 MBuild30 | Intron 1 | Novel mlncRNA
miR-154 | 14 (12) | | ENSMUSESTG00000016217 | Intron 2 | Novel mlncRNA
miR-181a, -181b-2 | 9 (2) | OTTHUMG00000020657 | | Intron 1 | Vega mlncRNA
miR-181b-1, -213 | 1 (1) | | ENSMUSESTG00000012306 | Intron 2 | Novel mlncRNA
miR-194-1 | 1 (1) | | 2010103J01Rik | Intron 1 | Riken mlncRNA
miR-302 | 4 (3) | 4_114030028 AceView | | Intron 2 | Novel mlncRNA
miR-325 | —(X) | | ENSMUSESTG00000006551 | Intron 1 | Novel mlncRNA

a The human chromosome is listed first followed by the mouse chromosome in parentheses.

b Transcript IDs are from ENSEMBL, Vega, FANTOM2, NCBI, or Acembly AceView.

c Intron positions for miRNAs in transcript are listed. If the human/mouse positions are different, the mouse location is presented in parentheses.

Table 5.

miRNAs Found in Exons of Nonprotein-Coding Transcription Units

microRNA (a) | Chrom (b) | Human host TU ID (c) | Mouse host TU ID (c) | miRNA location (d) | Host transcript description
let-7b, let-7c-2 | 22 (15) | OTTHUMG00000030111 | | Exon 1 | Vega mlncRNA
let-7i | 12 (10) | | D630033A02Rik | Exon 1 | Riken mlncRNA
miR-22 | 17 (11) | ENSESTG00000028158 | 2210403K04Rik | Exon 2 | Riken mlncRNA
miR-29b-2 | 1 (1) | 1_205058566 AceView | C030002C11Rik | Exon 1 | Riken mlncRNA
miR-34b, -34c | 11 (9) | ENSESTG00000013662 | | Exon 2 | Novel mlncRNA
miR-101-1 (miR-101) | 1 (4) | | ENSMUSESTG00000022081 | Exon 2 | Novel mlncRNA
miR-122a | 18 (18) | 18_54266208 AceView | | Exon 2 | Novel mlncRNA
miR-133b | 6 (1) | 6_52060506 AceView | | Exon 1 | Novel mlncRNA
miR-137 | 1 (1) | 1_97978084 AceView | ENSMUSESTG00000012545 | Exon 2 (3) | Novel mlncRNA
miR-143, -145 | 5 (18) | 5_148836826 AceView | | Exon 1 | Novel mlncRNA
miR-146 | 5 (11) | ENSESTG00000012459 | 5830402E17Rik | Exon 2 (1) | Riken mlncRNA
miR-155 | 21 (16) | 114614 NCBI | Bic NCBI | Exon 3 (2) | BIC mlncRNA
miR-192 | 11 (19) | 11_64434198 AceView | | Exon 1 | Novel mlncRNA
miR-195 | 17 (11) | LOC284112 AceView | | Exon 1 | Novel mlncRNA
miR-196b | 7 (6) | 7_26962152 AceView | | Exon 1 | Novel mlncRNA
miR-199a-2, -199a*-2 | 1 (1) | 1_169353084 AceView | | Exon 1 | Novel mlncRNA
miR-202 | —(7) | | G630041I22Rik | Exon 2 | Riken mlncRNA
miR-206 | 6 (1) | | 4831414J02Rik | Exon 1 | 7H4 mlncRNA
miR-214 | 1 (1) | 1_169353084 AceView | D230012O04Rik | Exon 2 (1) | Riken mlncRNA
miR-215 | 1 (1) | | 2010103J01Rik | Exon 2 | Riken mlncRNA
miR-221 | X (X) | X_44651888 AceView | | Exon 1 | Novel mlncRNA
miR-223 | X (X) | X_64102086 AceView | 9830169G21Rik | Exon 3 (2) | Riken mlncRNA
miR-296 | 20 (2) | 20_58078195 AceView | D330038P10Rik | Exon 1 | Riken mlncRNA
miR-324-5p, -324-3p | 17 (11) | 17_7329130 AceView | | Exon 3 | Novel mlncRNA
miR-331 | 12 (10) | | A630071D13Rik | Exon 1 | Riken mlncRNA

a miRNA names listed in parentheses refer to the mouse name if different from the human.

b The human chromosome is listed first followed by the mouse chromosome in parentheses.

c Transcript IDs are from ENSEMBL, Vega, FANTOM2, NCBI, or Acembly AceView.

d Exon positions for miRNAs in transcript are listed. If the human/mouse positions are different, the mouse location is presented in parentheses.

As well as the previously defined ncRNA TUs noted above, we identified dozens of novel TU hosts for miRNAs with little protein-coding potential (see Methods, Tables 4,5, and Supplemental Tables B-D). For example, EST and cDNA data in ENSEMBL allowed the identification of an mlncRNA TU host containing miR-100, let-7a-2, and miR-125b-1 in an intron (Table 4, Fig. 1A). Sequence alignments of the human and mouse host transcript 1 (ENSESTT00000036713/ENSMUSESTT00000030830; data not shown) and transcript 2 (ENSESTT00000036715/2610203C20Rik; Fig. 1B) do not reveal significant homology. Nevertheless, the miR-100, let-7a-2, and miR-125b-1 transcription locus is remarkably similar in human and mouse in both arrangement and structure (Fig. 1A). The possibility exists that some miRNA hosts of this type may be so-called `inside out' genes analogous to some small nucleolar RNA (snoRNA) hosts (Tycowski et al. 1996), in which transcription of the host gene serves purely to yield the intron-encoded RNAs. It is also plausible that some mlncRNA hosts containing miRNAs in their introns serve as functional RNAs.

Figure 1.

Genomic organization of the mammalian miR-100, let-7a-2, miR-125b-1 locus and sequence alignment of human and mouse miR-125b-1 host transcripts. (A) Identified transcripts are shown with exons represented as vertical lines and labeled above in numerical order. Dotted lines denote splicing of transcripts. Positions of miRNA precursors are denoted with a triangle. Exons are not to scale. Human/Mouse Transcript 1 corresponds to ENSESTT00000036713 and ENSMUSESTT00000030830, respectively. Transcript 2 corresponds to human ENSESTT00000036715 and mouse 2610203C20Rik transcripts. (B) A DNA sequence alignment for Transcript 2 is shown. Identical bases are denoted in black. Consensus polyadenylation sites are boxed in. Note that nucleotides 597-955 of the human Transcript 2 are not shown.

A significant number of miRNAs overlap exons of novel noncoding transcripts (Tables 1,5; Supplemental Tables C,D). For example, the predicted miR-22 hairpin precursor is contained entirely within exon 2 of a noncoding TU, and the splicing pattern is generally conserved in human and mouse, despite the lack of protein-coding potential (Fig. 2A,B). Significant sequence conservation between the mouse and human miR-22 TU is observed within the 5′ end and within the predicted miR-22 hairpin precursor (Fig. 2B). Such high overall conservation in the mammalian miR-22 TU may indicate an important role in biogenesis or localization. Although high sequence conservation within the predicted hairpin precursors of mammalian miRNAs is a common feature (Lim et al. 2003), it is unclear whether the extended sequence similarity observed in the miR-22 TU is common among exonic miRNAs. An alignment of the mouse and human miR-155/BIC does not show such extended conservation beyond the predicted hairpin precursor (data not shown). In addition to miR-22, a further 29 miRNAs are contained within exons of transcripts (Table 5). In the majority of cases, exonic miRNAs were identified within mlncRNA transcripts, but in two cases we noted that the miRNA overlapped with the 3′ UTR of a protein-coding transcript (Supplemental Table D).

Figure 2.

Genomic organization of the mammalian miR-22 locus and sequence alignment of human and mouse miR-22 host transcripts. (A) Exons are shown as white rectangles and labeled above. The human/mouse miR-22 host transcript corresponds to ENSESTT00000065902 and 2210403K04Rik, respectively. (B) DNA sequence alignments of the miR-22 human/mouse host transcripts are shown. Identical bases are black. The mature miR-22 sequence is highlighted in gray. The conserved splice site junction between exons 1 and 2 is also highlighted in gray. Note that nucleotides 688-1658 of the mouse transcript are not shown.

The prevalence of miRNAs within introns or exons of TUs prompted us to ask whether it might be possible to derive reliable miRNA expression patterns from host gene expression data. By analogy to mammalian snoRNAs, intronic miRNAs may be cotranscribed with the host gene by inclusion in introns of their pre-mRNAs and excision from debranched introns by an exonuclease (Tycowski et al. 1993). Similarly to yeast snoRNAs, mammalian miRNAs of exonic transcription origin could be derived from ncRNA transcripts by endonucleolytic cleavage within duplex regions (Chanfreau et al. 1998). If parallel transcription of host genes and miRNAs takes place, similar expression patterns for host transcript and miRNAs would be observed. To test this possibility, we analyzed the expression of five predicted miRNA host transcripts by RT-PCR from several organs in adult mice and compared their expression pattern to that of their corresponding exonic or intronic miRNA (see Methods and Table 6). In four out of five cases, we note an exact correlation between the host gene/transcript expression (Table 6) and the previously reported miRNA expression in mouse adult organs (Sempere et al. 2004). This result suggests that miRNA expression may be reliably derived from host transcript expression information, and strongly suggests that overlapping transcripts in the genome are likely hosts for miRNAs.

Table 6.

Comparison of miRNA Expression and Host Transcript Expression

a miRNA expression is derived from Sempere et al. (2004).

b An ethidium bromide-stained gel picture shows RT-PCR results from mouse adult brain (B), liver (Li), heart (H), lung (Lu), kidney (K), and spleen (S).

The above result prompted us to derive likely miRNA expression patterns from all host genes identified here. We curated host gene expression data from a number of publicly available sources, including the primary literature and public gene expression pattern databases (see Methods). This analysis identified miRNA hosts that are physiologically regulated, involved in signal transduction, or transcribed in a tissue- or organ-specific manner (see Supplemental Tables A-D for a full description of expression patterns of miRNA hosts). For example, miR-33 is in an intron of SREBP2 (Table 2), an important gene in cholesterol homeostasis that is tightly regulated by sterols at the transcriptional level (Sato et al. 1996). In addition, we identified a cluster of three intronic miRNAs (miR-25, -93, and -106b) that are likely linked to the transcription of DNA Replication Licensing Factor MCM7 (Table 2), a gene that is quiescent in nondividing cells and is upregulated just prior to S phase of the cell cycle (Fujita et al. 1996). Interestingly, human miR-7-3 is contained within intron 2 of Pituitary Gland Specific Factor 1 (Table 2), a gene that is expressed in the pituitary gland (Tanaka et al. 2002). This mature miRNA has two other identical copies in the genome, one of which (miR-7-1) is possibly linked to the expression of a ubiquitously expressed gene, HNRPK (Supplemental Table E). This illustrates that expression pattern information for identical miRNAs may be confounded by distinct but overlapping patterns of transcription. Where multiple paralogous miRNAs exist in the genome, it may prove useful to distinguish the expression of individual miRNA copies by analyzing the expression of their respective host genes (see Supplemental Table E for a list of similar/paralogous microRNAs contained in different host genes).

The current dogma in the miRNA field is that compartmentalized processing of all miRNAs to mature miRs occurs with Drosha in the nucleus and Dicer in the cytoplasm (for review, see Bartel 2004; He and Hannon 2004). In two seminal studies, it was demonstrated that miRNAs are dependent on Drosha for processing in HeLa cells (Lee et al. 2003) and that enrichment of a particular pri-/pre-miRNA occurs in the nucleus while the mature miR is found in the cytoplasm (Lee et al. 2002). However, we note that the miRNAs tested in these studies are intronic class miRNAs (i.e., miRs-15a, -16-1, -20, -30a; Supplemental Tables A,B), a `mixed' miRNA (miR-21) (Supplemental Table D), and several undefined transcription class miRNAs (data not shown). In light of the two transcription classes of mammalian miRNAs identified in this study, it will be very interesting to revisit these experiments in the context of experimentally verified exonic transcription-class miRNAs. It seems likely that commonalities in processing and maturation by Drosha and Dicer occur for all transcription classes of miRNAs; however, the existence of exonic miRNAs in mammals implies that differences in biogenesis may also exist. In this regard, it is noteworthy that we found miR-206 within an exon of the synapse-associated mlncRNA, 7H4 (Table 5, Supplemental Table C; Velleca et al. 1994; Numata et al. 2003). 7H4 was originally identified by Velleca et al. (1994) as an RNA selectively found in the motor endplate of the rat skeletal neuromuscular junction. Detection of 7H4 at the neuromuscular junction raises the intriguing possibility that the unprocessed form of miR-206 may be transported outside of the nucleus as a pri-mir-206/7H4 RNA. Future studies will be required to determine whether some miRNAs can be actively transported to subcellular compartments outside of the nucleus embedded within their host transcripts.

Conclusion

The findings presented here show that the expression of a large subset of mammalian miRNAs may be transcriptionally linked to the expression of other genes, coding for both proteins and ncRNAs. The identification of two distinct classes of miRNAs, those overlapping introns of other transcripts and those encoded in exons, suggests that the maturation of miRNAs may be more complex than previously thought. Mammalian snoRNAs, responsible for modification of ribosomal and spliceosomal RNAs, were shown to be excised from introns of protein-coding (often ribosomal protein) genes (Qu et al. 1995). The suggestion that approximately half of mammalian miRNAs are expressed by a similar mechanism has important and far-reaching implications for our understanding of gene expression, regulation, and the basic definition of what constitutes a gene. Previous studies have demonstrated unexplained levels of conservation in intronic and other apparent noncoding sequence in mammals (Bejerano et al. 2004). There has also been recent discussion about the regulatory possibilities of parallel output arising from exonic and intronic sequences in genes (Mattick 2003). Although we are far from categorical evidence of parallel transcription on a global scale, and the data presented here account for only a small percentage of noncoding conservation, the presence of miRNAs in host genes, together with the reported roles of intronic snoRNAs, provides a tantalizing suggestion of this phenomenon on a larger scale.

METHODS

miR Databases

miRNA hairpin sequences were retrieved from the microRNA Registry release 3.1 (http://www.sanger.ac.uk/Software/Rfam/mirna/; Griffiths-Jones 2004). Some homologs were identified using pairwise similarity of the hairpin sequence and by matching of mature miRNAs. The ability of flanking regions to adopt a hairpin structure was determined using the mfold program (http://www.bioinfo.rpi.edu/applications/mfold/).

Genome Analysis and Curation

miRNAs were mapped to the human (NCBI Build 34) and mouse (NCBI Build 32) genome assemblies using WUBLASTN. To guard against mapping errors in the mouse NCBI Build 32 assembly, we also made use of the mouse NCBI Build 30. ENSEMBL IDs from mouse Build 30 are provided in several cases in which errors in the newer assembly were noted. For ease of manual annotation, the results of these searches were made accessible via the ENSEMBL genome graphical interface (http://www.ensembl.org/). Annotation of mouse and human miRNAs was aided by using a combination of genome-annotated TUs, including Vega (http://vega.sanger.ac.uk/Homo_sapiens/), ENSEMBL (http://www.ensembl.org/), FANTOM2.0 (Okazaki et al. 2002; Numata et al. 2003), NCBI Acembly/AceView (http://www.ncbi.nlm.nih.gov/IEB/Research/Acembly/), NCBI (http://www.ncbi.nlm.nih.gov/mapview/), and FLJ transcripts (Ota et al. 2004). Genome coordinate overlaps between miRNAs and annotated TUs were classified according to the exonic or intronic nature of the miRNA. The FatiGO data mining tool (http://fatigo.bioinfo.cnio.es/; Al-Shahrour et al. 2004) was used to determine whether any GO terms are significantly overrepresented in microRNA host genes.
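The coordinate-overlap classification described above can be illustrated with a minimal sketch. It is not the annotation pipeline used in this study: the function names, the 1-based inclusive interval convention, the `intergenic` label, and the rule that any overlap with an exon counts as exonic are all assumptions made for illustration, and strand and alternative splicing (which underlie the `mixed' class) are not modeled.

```python
from typing import List, Tuple

# Assumed convention: 1-based, inclusive (start, end) on a single chromosome.
Interval = Tuple[int, int]


def overlaps(a: Interval, b: Interval) -> bool:
    """Return True when two inclusive intervals share at least one base."""
    return a[0] <= b[1] and b[0] <= a[1]


def classify_mirna(mirna: Interval, exons: List[Interval]) -> str:
    """Label a miRNA hairpin relative to one transcription unit (TU).

    The TU span is taken from the first exon start to the last exon end.
    Returns 'exonic', 'intronic', or 'intergenic' (no overlap with the TU).
    """
    if not exons:
        return "intergenic"
    exons = sorted(exons)
    tu_span = (exons[0][0], max(end for _, end in exons))
    if not overlaps(mirna, tu_span):
        return "intergenic"
    # Assumption: a hairpin touching any exon is called exonic.
    if any(overlaps(mirna, exon) for exon in exons):
        return "exonic"
    return "intronic"


if __name__ == "__main__":
    # Hypothetical coordinates, for illustration only.
    toy_exons = [(100, 200), (500, 600), (900, 1000)]
    for hairpin in [(300, 380), (550, 620), (1200, 1280)]:
        print(hairpin, classify_mirna(hairpin, toy_exons))
```

In practice each miRNA would be compared against every annotated TU on the same chromosome, and conflicting calls across alternative transcripts would need manual review.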

mRNA-Like Noncoding RNA Databases

A total of ∼10,000 mlncRNA sequences were collected from the FANTOM Functional Annotation Database (http://fantom2.gsc.riken.go.jp/; Numata et al. 2003), the Noncoding RNAs Database (http://biobases.ibch.poznan.pl/ncRNA/; Erdmann et al. 2000), and the Full-Length Japan database (http://www.nedo.go.jp/bio-e/; Ota et al. 2004), and mapped to the genome using SSAHA. The positions of these ncRNAs were made accessible in the ENSEMBL genome graphical interface via the DAS protocol and analyzed for potential overlap with miRNAs in the human or mouse genome.

Identification of mRNA-Like Noncoding RNAs

miRNA host transcripts were designated as mlncRNAs in a process similar to that employed by the FANTOM Genome Exploration Research Group (Okazaki et al. 2002). To identify candidate mlncRNA hosts, we took the ENSEMBL or NCBI Acembly/AceView transcripts that overlapped miRNAs and eliminated those with similarity to known protein sequences by BLASTX (E-value < 10⁻⁵) and those showing homology to known motifs or domains in the Pfam database (Bateman et al. 2004).
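The elimination step above amounts to a two-condition filter, sketched below. This is an illustrative reading of the text, not the code used in this study: the `HostCandidate` record, its field names, and the transcript IDs in the usage example are hypothetical, and the BLASTX and Pfam results are assumed to have been computed beforehand.

```python
from dataclasses import dataclass
from typing import List, Optional

EVALUE_CUTOFF = 1e-5  # BLASTX threshold quoted in the text


@dataclass
class HostCandidate:
    """Hypothetical summary record for one miRNA-overlapping transcript."""
    transcript_id: str
    best_blastx_evalue: Optional[float]  # None means no BLASTX hit was reported
    has_pfam_hit: bool


def is_mlncrna_candidate(candidate: HostCandidate) -> bool:
    """Keep a transcript only if it shows no evidence of coding potential."""
    protein_like = (
        candidate.best_blastx_evalue is not None
        and candidate.best_blastx_evalue < EVALUE_CUTOFF
    )
    return not protein_like and not candidate.has_pfam_hit


def select_mlncrna_candidates(candidates: List[HostCandidate]) -> List[str]:
    """Return the IDs of transcripts that pass both filters."""
    return [c.transcript_id for c in candidates if is_mlncrna_candidate(c)]


if __name__ == "__main__":
    # Made-up records, for illustration only.
    toy = [
        HostCandidate("tx_A", 1e-30, False),  # strong protein similarity
        HostCandidate("tx_B", None, True),    # Pfam domain match
        HostCandidate("tx_C", 0.2, False),    # weak hit only
        HostCandidate("tx_D", None, False),   # no evidence of coding potential
    ]
    print(select_mlncrna_candidates(toy))
```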

miRNA Host Gene RT-PCR

Total RNA was prepared from 2-3-wk-old mice using the RNA RT-PCR Miniprep Kit (Stratagene). To safeguard against RT amplification of hnRNA (nuclear precursor RNA), an oligo(dT) primer was used in the first-strand cDNA synthesis step instead of gene-specific primers. To unambiguously distinguish spliced cDNA from genomic DNA contamination, specific primers were designed to amplify across introns of the predicted microRNA host transcript, and -RT control reactions were performed. The primers used to detect each of the five microRNA transcription units by PCR were as follows: miR-9-3 host transcript, ENSMUSESTG00000007408, 5′-GTGGGCCAGGAGAAAAGAGAAAAG-3′ (forward) and 5′-CTCTGTGTTCATCCTGTGGCTTTG-3′ (reverse); miR-22 carrier transcript, 2210403K04Rik, 5′-CAGTGATTTTGCTCCTCTGTCCAC-3′ (forward) and 5′-CAACTGAGCTACAACCCCAGTCAT-3′ (reverse); miR-137 carrier transcript, ENSMUESTG00000012545, 5′-CAGAGTCCGCATGAAGCAGAGCAA-3′ (forward) and 5′-GTCCTTGGCAACCGGGAGCTTTTA-3′ (reverse); miR-153 host transcript, ENSMUSG00000054701, 5′-CTGTGCTGACCTATGACCACTC-3′ (forward) and 5′-CAATCAGGACGTAGGTTCCACT-3′ (reverse); miR-219 host transcript, ENSMUESTG00000007085MBuild30, 5′-AGAGGCGCCCGCTTGGGTTCAG-3′ (forward) and 5′-TCATTCCCCAGCTTCTGGGTTT-3′ (reverse).
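Readers transcribing the primer sequences above may find a quick sanity check useful. The sketch below is not part of the methods of this study: `primer_stats` is a hypothetical helper that strips whitespace, confirms that only A, C, G, and T remain, and reports length and GC fraction.

```python
from typing import Tuple

VALID_BASES = set("ACGT")


def primer_stats(primer: str) -> Tuple[int, float]:
    """Return (length, GC fraction) for a DNA primer written 5' to 3'.

    Whitespace is removed first, so sequences broken across lines are tolerated.
    Raises ValueError if the sequence is empty or contains non-ACGT characters.
    """
    seq = "".join(primer.split()).upper()
    if not seq:
        raise ValueError("empty primer sequence")
    unexpected = set(seq) - VALID_BASES
    if unexpected:
        raise ValueError(f"unexpected characters: {sorted(unexpected)}")
    gc_count = seq.count("G") + seq.count("C")
    return len(seq), gc_count / len(seq)


if __name__ == "__main__":
    # miR-9-3 host transcript primers as listed in the text.
    for name, seq in [
        ("miR-9-3 forward", "GTGGGCCAGGAGAAAAGAGAAAAG"),
        ("miR-9-3 reverse", "CTCTGTGTTCATCCTGTGGCTTTG"),
    ]:
        length, gc = primer_stats(seq)
        print(f"{name}: {length} nt, GC fraction {gc:.2f}")
```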

miRNA Host Gene Expression Pattern Curation

Gene expression pattern information was obtained from the primary literature using NCBI PubMed (http://www.ncbi.nlm.nih.gov:80/) as a source for articles. In cases where the host gene/TU had no primary literature citations, expression data were instead obtained from the RIKEN Expression Array Database (http://read.gsc.riken.go.jp/; Bono et al. 2003), the Gene Expression Atlas Database (http://expression.gnf.org/cgi-bin/index.cgi; Su et al. 2002), the NCBI Acembly/AceView Web site, or from SWISS-PROT (http://us.expasy.org/sprot/).

Acknowledgments

We thank Tony Cox and James Stalker for technical assistance in generating ENSEMBL DAS sources for genome annotation, Dr. Russell Grocock for useful discussions and technical help, and Drs. Jos Jonkers and Madhuri Warren for critical reading of the manuscript. A.R. is an NIH Postdoctoral Fellow. This work was funded by The Wellcome Trust.

Footnotes

[Supplemental material is available online at www.genome.org.]

Article and publication are at http://www.genome.org/cgi/doi/10.1101/gr.2722704. Article published online before print in September 2004.

References

  1. Al-Shahrour, F., Diaz-Uriarte, R., and Dopazo, J. 2004. FatiGO: A web tool for finding significant associations of gene ontology terms with groups of genes. Bioinformatics 20: 578-580.
  2. Ambros, V. 2000. Control of developmental timing in Caenorhabditis elegans. Curr. Opin. Genet. Dev. 10: 428-433.
  3. Bartel, D.P. 2004. MicroRNAs: Genomics, biogenesis, mechanism, and function. Cell 116: 281-297.
  4. Bateman, A., Coin, L., Durbin, R., Finn, R.D., Hollich, V., Griffiths-Jones, S., Khanna, A., Marshall, M., Moxon, S., Sonnhammer, E.L., et al. 2004. The Pfam protein families database. Nucleic Acids Res. 32: D138-141.
  5. Bejerano, G., Pheasant, M., Makunin, I., Stephen, S., Kent, W.J., Mattick, J.S., and Haussler, D. 2004. Ultraconserved elements in the human genome. Science 304: 1321-1325.
  6. Bono, H., Yagi, K., Kasukawa, T., Nikaido, I., Tominaga, N., Miki, R., Mizuno, Y., Tomaru, Y., Goto, H., Nitanda, H., et al. 2003. Systematic expression profiling of the mouse transcriptome using RIKEN cDNA microarrays. Genome Res. 13: 1318-1323.
  7. Carrington, J.C. and Ambros, V. 2003. Role of microRNAs in plant and animal development. Science 301: 336-338.
  8. Chan, A.S., Thorner, P.S., Squire, J.A., and Zielenska, M. 2002. Identification of a novel gene NCRMS on chromosome 12q21 with differential expression between rhabdomyosarcoma subtypes. Oncogene 21: 3029-3037.
  9. Chanfreau, G., Legrain, P., and Jacquier, A. 1998. Yeast RNase III as a key processing enzyme in small nucleolar RNAs metabolism. J. Mol. Biol. 284: 975-988.
  10. Dostie, J., Mourelatos, Z., Yang, M., Sharma, A., and Dreyfuss, G. 2003. Numerous microRNPs in neuronal cells containing novel microRNAs. RNA 9: 180-186.
  11. Erdmann, V.A., Szymanski, M., Hochberg, A., Groot, N., and Barciszewski, J. 2000. Non-coding, mRNA-like RNAs database Y2K. Nucleic Acids Res. 28: 197-200.
  12. French, P.J., Bliss, T.V., and O'Connor, V. 2001. Ntab, a novel non-coding RNA abundantly expressed in rat brain. Neuroscience 108: 207-215.
  13. Fujita, M., Kiyono, T., Hayashi, Y., and Ishibashi, M. 1996. hCDC47, a human member of the MCM family. Dissociation of the nucleus-bound form during S phase. J. Biol. Chem. 271: 4349-4354.
  14. Gene Ontology Consortium. 2001. Creating the gene ontology resource: Design and implementation. Genome Res. 11: 1425-1433.
  15. Griffiths-Jones, S. 2004. The microRNA Registry. Nucleic Acids Res. 32: D109-111.
  16. He, L. and Hannon, G.J. 2004. MicroRNAs: Small RNAs with a big role in gene regulation. Nat. Rev. Genet. 5: 522-531.
  17. Hong, Y.K., Ontiveros, S.D., Chen, C., and Strauss, W.M. 1999. A new structure for the murine Xist gene and its relationship to chromosome choice/counting during X-chromosome inactivation. Proc. Natl. Acad. Sci. 96: 6829-6834.
  18. Houbaviy, H.B., Murray, M.F., and Sharp, P.A. 2003. Embryonic stem cell-specific MicroRNAs. Dev. Cell 5: 351-358.
  19. Kim, J., Krichevsky, A., Grad, Y., Hayes, G.D., Kosik, K.S., Church, G.M., and Ruvkun, G. 2004. Identification of many microRNAs that copurify with polyribosomes in mammalian neurons. Proc. Natl. Acad. Sci. 101: 360-365.
  20. Lagos-Quintana, M., Rauhut, R., Lendeckel, W., and Tuschl, T. 2001. Identification of novel genes coding for small expressed RNAs. Science 294: 853-858.
  21. Lagos-Quintana, M., Rauhut, R., Yalcin, A., Meyer, J., Lendeckel, W., and Tuschl, T. 2002. Identification of tissue-specific microRNAs from mouse. Curr. Biol. 12: 735-739.
  22. Lagos-Quintana, M., Rauhut, R., Meyer, J., Borkhardt, A., and Tuschl, T. 2003. New microRNAs from mouse and human. RNA 9: 175-179.
  23. Lee, Y., Jeon, K., Lee, J.T., Kim, S., and Kim, V.N. 2002. MicroRNA maturation: Stepwise processing and subcellular localization. EMBO J. 21: 4663-4670.
  24. Lee, Y., Ahn, C., Han, J., Choi, H., Kim, J., Yim, J., Lee, J., Provost, P., Radmark, O., Kim, S., et al. 2003. The nuclear RNase III Drosha initiates microRNA processing. Nature 425: 415-419.
  25. Lim, L.P., Glasner, M.E., Yekta, S., Burge, C.B., and Bartel, D.P. 2003. Vertebrate microRNA genes. Science 299: 1540.
  26. Mattick, J.S. 2003. Challenging the dogma: The hidden layer of non-protein-coding RNAs in complex organisms. Bioessays 25: 930-939.
  27. Michael, M.Z., O'Connor, S.M., van Holst Pellekaan, N.G., Young, G.P., and James, R.J. 2003. Reduced accumulation of specific microRNAs in colorectal neoplasia. Mol. Cancer Res. 1: 882-891.
  28. Migliazza, A., Bosch, F., Komatsu, H., Cayanis, E., Martinotti, S., Toniato, E., Guccione, E., Qu, X., Chien, M., Murty, V.V., et al. 2001. Nucleotide sequence, transcription map, and mutation analysis of the 13q14 chromosomal region deleted in B-cell chronic lymphocytic leukemia. Blood 97: 2098-2104.
  29. Mourelatos, Z., Dostie, J., Paushkin, S., Sharma, A., Charroux, B., Abel, L., Rappsilber, J., Mann, M., and Dreyfuss, G. 2002. miRNPs: A novel class of ribonucleoproteins containing numerous microRNAs. Genes & Dev. 16: 720-728.
  30. Numata, K., Kanai, A., Saito, R., Kondo, S., Adachi, J., Wilming, L.G., Hume, D.A., Hayashizaki, Y., and Tomita, M. 2003. Identification of putative noncoding RNAs among the RIKEN mouse full-length cDNA collection. Genome Res. 13: 1301-1306.
  31. Okazaki, Y.M., Furuno, T., Kasukawa, J., Adachi, H., Bono, S., Kondo, I., Nikaido, N., Osato, R., Saito, H., Suzuki, I., et al. 2002. Analysis of the mouse transcriptome based on functional annotation of 60,770 full-length cDNAs. Nature 420: 563-573.
  32. Ota, T.Y., Suzuki, T., Nishikawa, T., Otsuki, T., Sugiyama, R., Irie, A., Wakamatsu, K., Hayashi, H., Sato, K., Nagai, K., et al. 2004. Complete sequencing and characterization of 21,243 full-length human cDNAs. Nat. Genet. 36: 40-45.
  33. Qu, L.H., Henry, Y., Nicoloso, M., Michot, B., Azum, M.C., Renalier, M.H., Caizergues-Ferrer, M., and Bachellerie, J.P. 1995. U24, a novel intron-encoded small nucleolar RNA with two 12 nt long, phylogenetically conserved complementarities to 28S rRNA. Nucleic Acids Res. 23: 2669-2676.
  34. Sato, R., Inoue, J., Kawabe, Y., Kodama, T., Takano, T., and Maeda, M. 1996. Sterol-dependent transcriptional regulation of sterol regulatory element-binding protein-2. J. Biol. Chem. 271: 26461-26464.
  35. Sempere, L.F., Freemantle, S., Pitha-Rowe, I., Moss, E., Dmitrovsky, E., and Ambros, V. 2004. Expression profiling of mammalian microRNAs uncovers a subset of brain-expressed microRNAs with possible roles in murine and human neuronal differentiation. Genome Biol. 5: R13.
  36. Sleutels, F., Zwart, R., and Barlow, D.P. 2002. The non-coding Air RNA is required for silencing autosomal imprinted genes. Nature 415: 810-813.
  37. Su, A.I., Cooke, M.P., Ching, K.A., Hakak, Y., Walker, J.R., Wiltshire, T., Orth, A.P., Vega, R.G., Sapinoso, L.M., Moqrich, A., et al. 2002. Large-scale analysis of the human and mouse transcriptomes. Proc. Natl. Acad. Sci. 99: 4465-4470.
  38. Tam, W. 2001. Identification and characterization of human BIC, a gene on chromosome 21 that encodes a noncoding RNA. Gene 274: 157-167.
  39. Tanaka, S., Tatsumi, K., Okubo, K., Itoh, K., Kawamoto, S., Matsubara, K., and Amino, N. 2002. Expression profile of active genes in the human pituitary gland. J. Mol. Endocrinol. 28: 33-44.
  40. Tycowski, K.T., Shu, M.D., and Steitz, J.A. 1993. A small nucleolar RNA is processed from an intron of the human gene encoding ribosomal protein S3. Genes & Dev. 7: 1176-1190.
  41. ____. 1996. A mammalian gene with introns instead of exons generating stable RNA products. Nature 379: 464-466.
  42. Velleca, M.A., Wallace, M.C., and Merlie, J.P. 1994. A novel synapse-associated noncoding RNA. Mol. Cell Biol. 14: 7095-7104.
  43. Xu, P., Vernooy, S.Y., Guo, M., and Hay, B.A. 2003. The Drosophila microRNA Mir-14 suppresses cell death and is required for normal fat metabolism. Curr. Biol. 13: 790-795.
  44. Yamamoto, K., Cox, J.P., Friedrich, T., Christie, P.T., Bald, M., Houtman, P.N., Lapsley, M.J., Patzer, L., Tsimaratos, M., Van, T.H.W.G., et al. 2000. Characterization of renal chloride channel (CLCN5) mutations in Dent's disease. J. Am. Soc. Nephrol. 11: 1460-1468.

WEB SITE REFERENCES

  1. http://biobases.ibch.poznan.pl/ncRNA/; Noncoding RNAs database.
  2. http://www.bioinfo.rpi.edu/applications/mfold/; mfold RNA program.
  3. http://www.ensembl.org; ENSEMBL genome viewer.
  4. http://expression.gnf.org/cgi-bin/index.cgi; Gene expression atlas database.
  5. http://fantom2.gsc.riken.go.jp/; FANTOM functional annotation database.
  6. http://fatigo.bioinfo.cnio.es/; FatiGO Data Mining with Gene Ontology Terms.
  7. http://www.ncbi.nlm.nih.gov/mapview/; NCBI genome viewer.
  8. http://www.ncbi.nlm.nih.gov/IEB/Research/Acembly/; NCBI Acembly AceView gene collection.
  9. http://www.nedo.go.jp/bio-e/; Full-Length Japan database.
  10. http://read.gsc.riken.go.jp/; RIKEN expression array database.
  11. http://www.sanger.ac.uk/Software/Rfam/mirna/; miRNA Registry.
  12. http://us.expasy.org/sprot/; SWISS-PROT.
  13. http://vega.sanger.ac.uk/Homo_sapiens/; Vega human annotation browser.
  14. http://www.geneontology.org/; Gene Ontology.

Articles from Genome Research are provided here courtesy of Cold Spring Harbor Laboratory Press
