Fig. 4. Functional effects of vertebrate WGD and gene loss in vertebrates.
a, Key neural-crest-related gene families with members classified according to their functional role (colour) and paralogy status relative to 1RV and 2RJV. The involvement of paralogues derived from both copies of the 1RV in NCC-related function, in both gnathostomes and lampreys, supports the hypothesis that NCCs predate 1RV. b, Enrichment of functional annotation terms (gene ontology) in sets of genes showing a specific pattern of retention after vertebrate WGDs. Each column corresponds to a set of paralogous genes with a specific pattern of post-duplication retention in a given species. We distinguished cases in which both paralogues can be assigned to a specific duplication and are retained, cases in which at least one of the paralogues is retained and cases in which at least one of the two copies is lost. CNS, central nervous system. c, Distribution of the difference in the number of positive organ-specific expression domains between selected vertebrate species and the amphioxus outgroup for ohnologue gene families (ref. 59). A shift to the left in the distribution (as seen for the gar) indicates extensive subfunctionalization through the restriction of gene-expression domains in vertebrates. d, Gene-family loss in deuterostomes, highlighting the severe loss in the hagfish lineage relative to that seen in other vertebrates and deuterostomes (grey). Species abbreviations are provided in Supplementary Table 8. e, Functional enrichment (gene ontology) for gene families lost in the hagfish lineage, highlighting a simplification of visual and hormonal systems (labels in orange). f, Structure of the two clusters of α-keratin genes on chromosomes 14 and 4, and their expression in the slime gland and the skin shown as a heat map (expression values in fragments per kilobase of transcript per million mapped reads (FPKM)). Unchar is the prefix used for genes that were not assigned a name by homology search. Genes are shown in the same order in the heat map as they are located in the two clusters. Stars indicate the two genes that are expressed preferentially in the skin (Extended Data Fig. 10).