Community resources

Evolutionary divergence in embryo and seed coat development of U’s Triangle Brassica species illustrated by a spatiotemporal transcriptome atlas

Peng Gao1*, Teagen D. Quilichini2*, Hui Yang2, Qiang Li3, Kirby T. Nilsen4, Li Qin5, Vivijan Babic2, Li Liu1, Dustin Cram2, Asher Pasha6, Eddi Esteban6, Janet Condie2, Christine Sidebottom2, Yan Zhang7, Yi Huang8, Wentao Zhang2, Pankaj Bhowmik2, Leon V. Kochian1, David Konkin2, Yangdou Wei5, Nicholas J. Provart6, Sateesh Kagale2, Mark Smith7, Nii Patterson2, C. Stewart Gillmor9, Raju Datla1 and Daoquan Xiang2

1Global Institute for Food Security, University of Saskatchewan, Saskatoon, SK S7N 4L8, Canada; 2Aquatic and Crop Resource Development, National Research Council Canada, 110 Gymnasium Place, Saskatoon, SK S7N 0W9, Canada; 3National Key Laboratory of Crop Genetic Improvement, Huazhong Agricultural University, Wuhan 430070, China; 4Brandon Research and Development Centre, Agriculture and Agri-Food Canada, 2701 Grand Valley Road, Brandon, MB R7C 1A1, Canada; 5College of Art & Science, University of Saskatchewan, 9 Campus Dr, Saskatoon, SK S7N 5A5, Canada; 6Department of Cell & Systems Biology, University of Toronto, 25 Willcocks St., Toronto, ON M5S 3B2, Canada; 7Saskatoon Research and Development Centre, Agriculture and Agri-Food Canada, 107 Science Place, Saskatoon, SK S7N 0X2, Canada; 8Key Laboratory of Biology and Genetic Improvement of Oil Crops, Ministry of Agriculture and Rural Affairs, Oil Crops Research Institute, Chinese Academy of Agricultural Sciences, Wuhan 430062, China; 9Laboratorio Nacional de Genómica para la Biodiversidad (Langebio), Unidad de Genómica Avanzada, Centro de Investigación y Estudios Avanzados del IPN (CINVESTAV-IPN), Irapuato, Guanajuato 36821, México

*These authors contributed equally to this work.

Authors for correspondence: Raju Datla (Email: raju.datla@gifs.ca); Daoquan Xiang (Email: daoquan.xiang@nrc-cnrc.gc.ca)

Received: 16 March 2021; Accepted: 19 July 2021

New Phytologist (2022) 233: 30–51, doi: 10.1111/nph.17759

© 2021 The Authors. New Phytologist © 2021 New Phytologist Foundation. This is an open access article under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits use, distribution and reproduction in any medium, provided the original work is properly cited.

Key words: Brassica, embryogenesis, evolution, oilseed, polyploid, seed coat, transcriptome.

Contents
Summary
I. Introduction
II. Results
III. Discussion
IV. Materials and Methods
Acknowledgements
Data availability
References

Summary

The economically valuable Brassica species include the six related members of U’s Triangle. Despite the agronomic and economic importance of these Brassicas, the impacts of evolution and relatively recent domestication events on the genetic landscape of seed development have not been comprehensively examined in these species. Here we present a 3D transcriptome atlas for the six species of U’s Triangle, producing a unique resource that captures gene expression data for the major subcompartments of the seed, from the unfertilized ovule to the mature embryo and seed coat. This comprehensive dataset for seed development in tetraploid and ancestral diploid Brassicas provides new insights into evolutionary divergence and expression bias at the gene and subgenome levels during the domestication of these valued crop species.
Comparisons of gene expression associated with regulatory networks and metabolic pathways operating in the embryo and seed coat during seed development reveal differences in storage reserve accumulation and fatty acid metabolism among the six Brassica species. This study illustrates the genetic underpinnings of seed traits and the selective pressures placed on seed production, providing an immense resource for continued investigation of Brassica polyploid biology, genomics and evolution.

I. Introduction

Brassicaceae (also known as Cruciferae) is a morphologically distinct and economically valued family used in a plethora of vegetable and seed products including vegetable oil, condiments, fodder and biofuels (Al-Shehbaz et al., 2006). Owing to the diverse uses and commercial value of Brassica crops, improving their productivity presents an opportunity to address the demands of a growing population (Waminal et al., 2016; Song et al., 2020). Brassica breeding programs have targeted seeds for increased and modified oil content and meal quality (Chalhoub et al., 2014; Song et al., 2020), yielding 21–46% oil by seed mass and 20–40% protein-rich seed meal. At maturity, the multilayered seed coat is tightly bound to the embryo, separated by a thin, degenerating endosperm, with the bulk of oil and protein biosynthesis initiated by, and in coordination with, embryo morphogenesis (Wanasundara, 2011; Sharafi et al., 2015; Ziegler et al., 2019).

In the 1930s, Nagaharu U formulated the U’s Triangle hypothesis to explain the evolutionary relationships among six species of Brassica (Nagaharu & Nagaharu, 1935). Through hybridization and polyploidization, the diploid progenitors B. rapa (AA) and B. oleracea (CC) putatively gave rise to the allotetraploid B. napus (AACC); B. rapa and B. nigra (BB) putatively gave rise to B. juncea (AABB); and B. nigra and B. oleracea putatively gave rise to B. carinata (BBCC) (Chen et al., 2011; Kim et al., 2018; Supporting Information Fig. S1).
Although diploid and tetraploid Brassicas share many features, they differ in their genetic structure and ploidy levels (Jiang et al., 2015; Clarke et al., 2016). Germplasm carrying new genes or adapted alleles can facilitate superior trait development, dramatically increase productivity, and create new species through genomic bridges and interspecific U’s Triangle crosses (Chen et al., 2011; Zhang et al., 2016). The selection of high-value traits has shaped the cultivated Brassicas relative to their ancestral species (Chalhoub et al., 2014; Cheng et al., 2016), including, for example, the preferential selection of traits favoring vegetable production in B. oleracea, or the selection of oil yield and quality traits in B. napus. Despite the importance of understanding the evolutionary histories of Brassicas belonging to U’s Triangle, the patterns of gene expression and their contributions during embryo and seed coat development and transcriptional divergence between ancestral and cultivated species are not known.

Comparative genomics and transcriptomics among progenitor and cultivated crop species provide an opportunity to characterize the genetic changes that broaden diversity and contribute to speciation (Davidson et al., 2012; Ramírez-González et al., 2018; Shahid et al., 2019; Visger et al., 2019; K. Wang et al., 2019; Xiang et al., 2019). Uncovering gene expression differences and the evolutionary context of these changes has the potential to explain phenotypic variation, the regulatory switches affecting gene expression, and developmental programs operating across embryogenesis and seed development. For Brassicas, capturing and characterizing gene expression profiles for developing seeds provides a valuable resource and insightful framework to guide the genetic improvement of germplasm (Johnston et al., 2005; Liu et al., 2014).
Further, the study of gene expression in Brassica relatives with expanding ploidy levels facilitates homolog and homoeolog identification, providing an opportunity to gain insights into the functional divergence of Brassica species subjected to agricultural selective pressures (Cheng et al., 2016; Yang et al., 2016).

Seed development in angiosperms produces three genetically distinct compartments: the embryo, endosperm and seed coat (or testa) (Downie et al., 2003; Garcia et al., 2005; Xiang et al., 2011; Millar et al., 2015). The embryo and endosperm are products of fertilization, whereas the seed coat is derived from maternal tissue (Haughn & Chaudhury, 2005; Dean et al., 2011; Figueiredo et al., 2016; Coen et al., 2017). The embryo transmits genetic information to the next generation, whereas the endosperm is a transient nutritive tissue that is almost absent in mature Brassica seeds. Characterizing the transcriptional programs operating in the embryo and seed coat, and evaluating these programs in multiple Brassica relatives, holds the potential to uncover the impact of hybridization, polyploidization and breeding selection on Brassica seed development. Here, we capture comprehensive data across species, stages and tissues to generate a 3D transcriptome atlas for the embryo and seed coat of six interrelated Brassica species, incorporating global gene expression data across key stages of seed development. By comparing and contrasting transcriptome data for ancestral diploid and their derivative tetraploid Brassica species, this study uniquely provides evolutionary context for gene expression dynamics, divergence and expression bias at the subgenome level, for the two principal subcompartments of Brassica seeds.

II. Results

1. A gene expression atlas for the embryo and seed coat of related Brassica species

Inspection of embryo morphology revealed a uniform developmental progression across the six Brassica species of U’s Triangle, which was highly similar to Arabidopsis thaliana (Jürgens, 2001; Xiang et al., 2011; Belmonte et al., 2013; Gao et al., 2019; Hofmann et al., 2019). Thus, our previous nomenclature was used to describe seed development (Xiang et al., 2011). We focused our study on the two principal seed subcompartment tissues, the embryo and the seed coat (defined herein as the testa along with small amounts of remnant endosperm), and on the gene expression landscape for these tissues throughout seed development. A detailed microscopic analysis of isolated embryos and corresponding seed coats for developing B. napus (DH12075) seeds is presented (Fig. 1). To obtain transcriptomes for seed tissue development for U’s Triangle Brassica species, we used an Illumina HiSeq platform (Illumina, Inc., San Diego, CA, USA) for RNA-seq analysis of unfertilized ovule (UO), developing embryo (E) and seed coat (S) samples.
The 15 samples obtained from each of the six Brassica species contained one pre-fertilization stage (unfertilized ovule) and seven stages of development for each of the two seed subcompartments: zygote (E1, S1); quadrant (E2, S2); globular (E3, S3); heart (E4, S4); torpedo (E5, S5); bent (E6, S6) and mature (E7, S7) stages (Fig. 1; Table S1). The six Brassica species included three diploids, B. rapa (RpAA), B. oleracea (OlCC) and B. nigra (NgBB), and three tetraploids, B. napus (NpAACC), B. juncea (JnAABB) and B. carinata (CrBBCC) (Table S1), comprising a total of 270 RNA-seq samples and yielding over 10.3 billion paired-end 125-bp reads. A total of 261 726 expressed genes were identified. These datasets were used to develop an eFP browser at https://bar.utoronto.ca for embryo and seed coat tissues in tetraploid Brassicas and their putative diploid ancestors.

2. Relationship of the transcriptomes in U’s Triangle of Brassica species

For a global view of gene expression between Brassica seed tissues over developmental time, multiple unsupervised and supervised clustering approaches were applied, including principal component analysis (PCA; Fig. 2a), hierarchical clustering (Fig. 2b), partial least squares discriminant analysis (PLS-DA) and orthogonal PLS-DA (OPLS-DA) (Fig. S2). PCA was performed using the log-transformed transcripts per million (TPM) values of 7851 homoeologs (as defined in Section IV.3) over seed development (see Section IV; Dataset S1; Fig. 2a). Together, PC1 to PC4 explained 48.8% of the total variance (18.9%, 12.5%, 10.5% and 6.9%, respectively). Bi-plots between pairs of principal components were generated (Fig. 2a). PC1 and PC2 separated the 135 samples into seven clusters, based on tissue type and developmental stage (Fig. 2a).
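The sample counts quoted above can be reproduced with a short enumeration. The sketch below is illustrative only and is not part of the authors' pipeline: the stage labels and the stage-to-phase grouping follow the text and the Fig. 2 legend, while the dictionary and function names are hypothetical. The final comment about three libraries per combination is simple arithmetic on the reported totals, not a statement taken from the Methods.

```python
from collections import Counter

# One unfertilized ovule (UO) sample plus seven embryo (E1-E7) and seven
# seed coat (S1-S7) stages, as listed in the text.
STAGES = ["UO"] + [f"E{i}" for i in range(1, 8)] + [f"S{i}" for i in range(1, 8)]

# Subgenomes carried by each species: diploids have one, tetraploids two.
SUBGENOMES = {
    "B. rapa": ["A"], "B. nigra": ["B"], "B. oleracea": ["C"],
    "B. juncea": ["A", "B"], "B. napus": ["A", "C"], "B. carinata": ["B", "C"],
}

def phase(stage):
    """Map a stage label to one of the seven tissue/phase groups."""
    if stage in ("UO", "S1"):
        return "pre-seed coat"
    number = int(stage[1])
    if stage[0] == "E":
        label = "early" if number <= 2 else "middle" if number <= 5 else "late"
        return f"{label} embryo"
    label = "early" if number == 2 else "middle" if number <= 5 else "late"
    return f"{label} seed coat"

# One "individual" per species, subgenome and stage.
individuals = [
    (species, subgenome, stage)
    for species, subgenomes in SUBGENOMES.items()
    for subgenome in subgenomes
    for stage in STAGES
]
cluster_sizes = Counter(phase(stage) for _, _, stage in individuals)

print(len(STAGES))               # samples per species
print(len(individuals))          # subgenome-level individuals
print(len(cluster_sizes))        # tissue/phase groups
# 270 RNA-seq samples over 6 species x 15 stages implies 3 libraries each.
print(270 // (len(SUBGENOMES) * len(STAGES)))
```

Counting nine subgenome sets (three from the diploids, six from the tetraploids) across 15 stages gives the 135 individuals used for the PCA and the expression tree.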
The seven clusters consisted of early (E1 and E2), middle (E3–E5) and late (E6 and E7) phases of embryo development, and pre-seed coat (UO and S1), early (S2), middle (S3–S5) and late (S6 and S7) phases of seed coat development. The PC3 and PC4 bi-plot showed partial separation of the three subgenomes (Fig. 2a), with the close relationship between A (circles) and C (squares) subgenomes easily distinguished from the more distant B subgenome (triangles).

Using an unsupervised hierarchical clustering approach, we expanded our clustering analysis by constructing a phylogenetic tree for homoeolog expression, to provide evolutionary context for the divergence of gene expression in the six Brassica species (Fig. 2b). As in the PCA, homoeologous gene expression separated into two clusters based on tissue type, as highlighted by the embryonic (gray) and seed coat (light green) sample brackets (Fig. 2b, inner circle), and into seven phases of embryo or seed coat development (pre-seed coat, early, middle and late) (Fig. 2b, outer circle). In each subclade of the unsupervised hierarchical clustering tree, the distances between subgenomes A and C were consistently shorter than between subgenomes A and B, or B and C, substantiating the B subgenome expression divergence observed in the PCA. These unsupervised clustering analyses reveal close similarity amongst the transcriptome datasets and support substantial conservation of gene expression during seed development for the six Brassica species.

In order to explain a wider proportion of the transcript sample variance amongst the six Brassica species, PLS-DA and OPLS-DA supervised clustering analyses were used to force the separation of samples in a select dimension and complement the unsupervised clustering analyses. PLS-DA using forced separation of the transcriptomes by species produced score-plots similar to the unsupervised PC1 vs PC2 bi-plot and hierarchical clustering analyses (Fig. 2), in which subgenome B showed clear separation from the tightly overlapping A and C subgenomes (Fig. S2a,b). Imposing separation of samples by OPLS-DA based on tissue type supported a strong distinction between embryo and seed coat expression patterns, indicating that tissue differences contribute a major source of variation among samples (Fig. S2c,d). Clustering of samples based on stage type (colored text within colored ovals) emerged from the PLS-DA with forced separation of samples based on three seed developmental phases (early, middle and late). In this PLS-DA plot based on stage type, the separation by ploidy was also clear, with samples from the tetraploid Brassica species (enclosed within the shaded pink oval) appearing distant from the diploid species (within gray-shaded crescent) (Fig. S2e,f).

In order to compare evolutionary divergence at the subgenome level, pairwise Spearman’s correlation coefficients (rho) at each stage and tissue were calculated for 7851 homoeologs (Dataset S1; Fig. S3). Ternary plots were used to assess correlation distances (Fig. S4). The results support conclusions obtained from the supervised and unsupervised analyses, in which the tissues and stages contributed more to the sample variations than the subgenomes and species. Furthermore, ternary plots revealed a closer evolutionary relationship between the A and C subgenomes than between the B and C subgenomes at the transcript level, and greater divergence of the B subgenome in comparison with the A and C subgenomes. The C subgenome of B. napus and B. carinata exhibited the smallest distance in late stages of embryo and seed coat development, supporting a close evolutionary relationship between seed expression programs in these species (Fig. S4d). Considering that oil accumulation occurs in late stages of seed development, a close evolutionary relationship between B. napus and B. carinata may reflect long-term breeding selection for common seed traits.
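As a minimal sketch of the pairwise comparison described above, Spearman's rho can be computed as the Pearson correlation of rank-transformed expression values. The function names and the five-gene toy vectors are hypothetical and chosen only for illustration; the study itself used 7851 homoeologs per stage and tissue, and its exact software is described in Section IV, not here.

```python
from math import sqrt

def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation computed on ranks."""
    return pearson(average_ranks(x), average_ranks(y))

# Hypothetical expression of five homoeologs in one stage and tissue.
subgenome_a = [5.0, 1.0, 3.0, 8.0, 2.0]
subgenome_c = [4.5, 1.2, 2.9, 9.0, 2.5]   # same rank order as A
subgenome_b = [1.0, 5.0, 3.0, 2.0, 8.0]   # largely reversed rank order

print(spearman_rho(subgenome_a, subgenome_c))
print(spearman_rho(subgenome_a, subgenome_b))
```

Because rho depends only on rank order, it is insensitive to differences in overall expression scale between subgenomes, which is one plausible reason for preferring it over a Pearson coefficient in this kind of comparison.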
Overall, PCA, hierarchical clustering, PLS-DA and evolutionary distance analyses of expressed homoeologous genes (Figs 2, S2) suggest that the degree of impact on the evolutionary divergence of Brassica species during seed development can be categorized as follows: seed coat > embryo; and pre > late > early > middle stage.

Fig. 1 Brassica napus embryo and seed coat development. (a–h) Representative micrographs depict the developmental progression of the B. napus embryo (E) from before fertilization to maturity. (a) Unfertilized ovule (UO; no embryo); (b) 1-cell (left) and 2-cell (right) zygote (E1); (c) 4-cell (left) and 8-cell (right) (E2); (d) 16-cell (left), 32-cell (middle) and 64-cell (right) (E3); (e) heart (E4); (f) torpedo (E5); (g) bent (E6); and (h) mature (E7) embryo. Schematic of the embryo (i) and seed coat (S) (j) across the seven stages of seed development used for RNA-seq analysis, from the unfertilized ovule stage (UO, j), and progressing from immature (left, E1 and S1, respectively) to mature (right, E7 and S7, respectively). Basal cell and suspensor illustrations are included for E1–E4 only (i). Illustrations are not drawn to scale. Gray-shaded graphics represent tissue excluded from samples (j). (k–r) Representative Toluidine blue-stained light micrographs depict the developmental progression of the B. napus seed coat, from ovule integuments into a differentiated seed coat with specialized cell types. (k) Unfertilized ovule integuments (UO; no embryo); (l) 1- to 2-cell zygote stage (S1); (m) 4- to 8-cell stage (S2, 8-cell stage shown); (n) 16- to 64-cell stage (S3, globular stage shown); (o) heart stage (S4); (p) torpedo stage (S5); (q) bent stage (S6); and (r) mature (S7) stage of seed formation. AC, apical cell; Al, endosperm aleurone; BC, basal cell; Cot, cotyledon; E, embryo; En, endosperm; Ep, epidermis; ISC, inner seed coat; OSC, outer seed coat; Pal, palisade layer; Par, parenchyma layer; Pi, pigmented layer; RAM, root apical meristem; S, seed coat; SE, subepidermis; Sus, suspensor; * indicates early shoot apical meristem. Bars: (b) 5 µm; (c, d) 10 µm; (a, e, k–n) 20 µm; (f, o–r) 50 µm; (g, h) 100 µm.

Fig. 2 Relationship of the transcriptomes in six Brassica species from different stages, tissues and subgenomes during seed development. (a) Principal component analysis (PCA) using 7851 common homoeologous genes in 135 individuals from three subgenomes of six Brassica species. Left, 2D-plot of PC1 and PC2. Right, 2D-plot of PC3 and PC4. Proportion of variance for each principal component is indicated in brackets of axis titles. The three phases of embryo development (Early, E1 and E2; Middle, E3, E4 and E5; Late, E6 and E7) and four phases of seed coat development (Pre, UO and S1; Early, S2; Middle, S3, S4 and S5; Late, S6 and S7) are labeled with different colors (see Tissue Stage inset). Transparent ovals are used to group individuals in the same tissues (a) or same subgenomes (b). Each subgenome is labeled with a different shape, with A, B and C subgenomes represented by a circle, triangle or square, respectively (see Subgenome inset). (b) Homoeolog expression phylogenetic tree of embryo and seed coat development in six species: R, B. rapa; N, B. nigra; O, B. oleracea; JA, A subgenome of B. juncea; JB, B subgenome of B. juncea; PA, A subgenome of B. napus; PC, C subgenome of B. napus; CB, B subgenome of B. carinata; CC, C subgenome of B. carinata. Color strips in the outermost ring indicate the embryo and seed coat developmental stages as described in (a). Inner color strips indicate two tissues: gray, embryo; green, seed coat. Embryo and seed coat illustrations in different stages were placed in corresponding positions. Branch line colors differentiate the three subgenomes: orange, A subgenome; green, B subgenome; blue, C subgenome, as shown in the left branch color inset. Detailed information for the 135 individuals used for PCA and building of the expression phylogenetic tree is shown in Supporting Information Table S3.

3. Biased expression of subgenome homoeologs in tetraploid Brassica species

In order to systematically investigate homoeolog expression divergence and bias in the six Brassica species, the relative expression of homoeologs across all seed developmental stages and tissues was normalized (see Section IV, deposited in eFP browser) and diads and triads (see Section IV for definition) were identified. Furthermore, a large set of differentially expressed genes (DEGs) among tissues, triads, diads, subgenomes and species was identified using multiple analysis methods as shown in Dataset S2. A detailed DEG list and accompanying gene ontology (GO) enrichment analyses for these DEGs were compiled (Figs S5–S11; Dataset S3–S7). These DEGs provide a data resource for investigating the evolution of gene expression over seed development during polyploidization of Brassica species.

The expression bias for all diads was calculated as in Wu et al. (2018). DEG analysis based on comparisons of the expression of homoeologs from tetraploid Brassica species and their putative diploid ancestors found nine groups (Fig. 3a, I–IX). These nine groups formed three categories based on balanced or biased expression between subgenomes, which were termed the ‘parental condition’, ‘no bias in tetraploid species’ and ‘novel bias in tetraploid species’ (comprising A subgenome-bias, B subgenome-bias and C subgenome-bias). As shown in Dataset S8, the majority of diads (60.9–81.4%) maintained the expression pattern of the diploid parental homoeologs in tetraploid species. Biased diads showed more B-biased than A-biased expression in B. juncea, more C-biased than A-biased expression in B. napus, and more C-biased than B-biased expression in B. carinata in each stage of embryo and seed coat development (Fig. 3b; Dataset S8).
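The grouping of diads into these three categories can be sketched as a two-step comparison: call the bias between the two diploid parents, call the bias between the two homoeologs in the tetraploid, then compare the two calls. Everything specific in the sketch below is an assumption made for illustration: the study called bias from DEG analysis following Wu et al. (2018), whereas this sketch substitutes a simple log2 fold-change threshold, and the mapping of the nine combinations onto the three categories is one plausible reading rather than the definition of groups I–IX in Fig. 3a.

```python
from itertools import product
from math import log2

def call_bias(first, second, min_log2_fc=1.0, pseudocount=1.0):
    """Return 'first', 'second' or 'balanced' for a pair of expression values.

    The threshold and pseudocount are hypothetical stand-ins for a DEG test.
    """
    log_ratio = log2((first + pseudocount) / (second + pseudocount))
    if log_ratio >= min_log2_fc:
        return "first"
    if log_ratio <= -min_log2_fc:
        return "second"
    return "balanced"

def categorize(parental_call, tetraploid_call):
    """Assumed mapping of a (parental, tetraploid) call pair to a category."""
    if tetraploid_call == parental_call:
        return "parental condition"
    if tetraploid_call == "balanced":
        return "no bias in tetraploid"
    return f"novel bias in tetraploid ({tetraploid_call} subgenome)"

def classify_diad(parent_1, parent_2, homoeolog_1, homoeolog_2):
    """Classify one diad from diploid and tetraploid expression values."""
    return categorize(call_bias(parent_1, parent_2),
                      call_bias(homoeolog_1, homoeolog_2))

# Three parental states x three tetraploid states give nine combinations.
STATES = ("first", "balanced", "second")
combinations = {pair: categorize(*pair) for pair in product(STATES, STATES)}

print(len(combinations))
print(classify_diad(40, 10, 12, 10))   # parental bias lost in the tetraploid
print(classify_diad(10, 10, 5, 40))    # bias arising only in the tetraploid
```

For B. juncea, for example, "first" and "second" would stand for the A and B subgenomes, with B. rapa and B. nigra as the respective diploid parents.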
Genes that display biased expression in only one tissue (embryo or seed coat), but maintain the parental expression pattern in the other tissue across three tetraploid Brassica species were identified (Dataset S8). Interestingly, comparison of all three tetraploid species revealed relatively more genes with biased expression in the seed coat than in the embryo, suggesting that after polyploidization, the regulation of subgenome-biased expression may have played a more prominent role in the seed coat than in the embryo.

4. Additive and nonadditive expression features of subgenome homoeologs in U’s Triangle

Tetraploid Brassica species arose through hybridization and polyploidization of parental diploid genomes. The extra heterozygosity and multiple gene copies available to these polyploids provided new avenues for generating modulated gene expression and diversity, but carried the cost of coordinating divergent subgenome co-expression. To understand how subgenome homoeologs are coordinately expressed during Brassica seed development, we identified scenarios in which comparisons between the expression of the diad in a tetraploid species and in its two ancestors could be made, including additive features, and nonadditive features consisting of expression dominance and transgressive expression (Fig. 3c). Based on the DEG analysis comparing the total expression levels of the two homoeologs of a tetraploid species with the expression of corresponding homoeologs from two diploid ancestors, the additive, dominant or transgressive features were defined for each diad, for the embryo and seed coat across the seven developmental stages in the three tetraploid Brassica species (Fig. 3d; Dataset S9). Across stages, tissues and species, additive expression was the most common feature (61.1–92.8%), followed by dominant expression (12.8–34.3%) and transgressive expression (2–7.2%).
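A minimal sketch of this three-way classification is given below, assuming the usual reading of these terms in the polyploid literature: transgressive when the tetraploid total lies outside the range of both diploid parents, dominant when it matches one parent but not the other, and additive otherwise. The relative tolerance used to decide whether two levels are "the same" is a hypothetical stand-in for the DEG test used in the study, and the function names are illustrative.

```python
def same_level(x, y, rel_tol=0.25):
    """Treat two expression values as equal if they differ by at most
    rel_tol of the larger value (hypothetical stand-in for a DEG test)."""
    return abs(x - y) <= rel_tol * max(x, y, 1e-9)

def classify_feature(parent_1, parent_2, homoeolog_1, homoeolog_2):
    """Compare total tetraploid expression of a diad with each diploid parent."""
    total = homoeolog_1 + homoeolog_2
    matches_1 = same_level(total, parent_1)
    matches_2 = same_level(total, parent_2)
    if matches_1 and not matches_2:
        return "dominant (parent 1)"
    if matches_2 and not matches_1:
        return "dominant (parent 2)"
    if not matches_1 and not matches_2:
        if total > max(parent_1, parent_2):
            return "transgressive-upregulated"
        if total < min(parent_1, parent_2):
            return "transgressive-downregulated"
    # Total lies between the parents, or is indistinguishable from both.
    return "additive"

print(classify_feature(10, 30, 9, 11))    # total 20, between the parents
print(classify_feature(10, 30, 14, 15))   # total 29, matches parent 2 only
print(classify_feature(10, 30, 20, 30))   # total 50, above both parents
print(classify_feature(10, 30, 2, 2))     # total 4, below both parents
```

In B. juncea, for instance, parent 1 and parent 2 would be the B. rapa and B. nigra homoeologs, and the total would be the summed expression of the A and B subgenome copies.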
Unlike the expression bias pattern at the subgenome level (subgenome C > B > A), we did not find a significantly elevated number of dominant features in any subgenome in the three allotetraploid species (Fig. 3d), suggesting a difference in transcriptional regulation of biased and dominant expression. Similar to the expression bias analysis, we identified genes that display dominant expression features in only one tissue (embryo or seed coat), but maintain the additive expression feature in the other tissue across three tetraploid Brassica species (Dataset S9). As in the biased expression analysis, tissue comparison of the three tetraploid species revealed more homoeologs displaying dominant expression features in the seed coat relative to the embryo. Furthermore, as these analyses are based on domesticated tetraploid Brassica species rather than newly synthesized tetraploid lines, select regulatory features from the domesticated lines may reflect the genetic impact of improvements targeting oil content and seed coat composition. After extracting oil content-related genes with nonadditive expression features from the three investigated tetraploid species (Dataset S9), we found one FAD5 homolog in B. juncea that was significantly upregulated (transgressive-upregulated) and one CAC1-B homolog in B. carinata that was significantly downregulated (transgressive-downregulated), relative to the respective diploid ancestors, putatively resulting from the selective pressures placed on these genes during domestication. These results indicate that after polyploidization, homoeolog diversity produces alterations in spatiotemporal patterns of gene expression, and that the novel regulation processes (biased and dominant expression in tetraploid compared to diploid species) occur more frequently in the seed coat than in the embryo.

Fig. 3 Homoeolog expression bias in tetraploid Brassica species during seed development. (a) Schematic illustration of nine models (I–IX) of homoeologous gene expression effect after polyploidization. The relative expression levels of homoeologs in different species are indicated by circle size (diploid) or proportions of a circle (tetraploid). The polyploidization event of B. juncea is used as a representative; similar situations exist in B. napus and B. carinata. (b) Boxplot of the percentage of novel expression bias in three tetraploid species, including A-bias, B-bias and C-bias groups, in seven embryo (dots) and seed coat (triangles) developmental stages, respectively. (c) Schematic illustration of additive and nonadditive (dominance and transgressive) features of tetraploid species after polyploidization. The relative expression levels of homoeologs in different species are indicated by relative height of squares. Brassica juncea is used as a representative; similar situations exist in B. napus and B. carinata. Green, expression of homoeolog in B. rapa; yellow, expression of homoeolog in B. nigra; orange, total expression of two homoeologs from A and B subgenomes in B. juncea. (d) Boxplot of the percentage of dominant expressed homoeologs in three tetraploid species, including A dominance, B dominance and C dominance groups, in seven embryo (dot) and seed coat (triangle) developmental stages, respectively. For the boxplots in (b) and (d), individual values are indicated by the dots with colors and shapes defined by the legend on the right panel. Boxplot horizontal lines indicate the median values; the whiskers represent the ranges for the bottom 25% and the top 25% of the data values, excluding outliers; the outliers are considered to be any data outside the whiskers. *, 0.05 > P > 0.01; **, 0.01 > P > 0.001; ***, P < 0.001.

5. Specific and conserved gene expression in Brassica species

The identification of stage-specific and tissue-specific gene expression patterns for each of U’s Triangle of Brassica species can guide gene product functional characterizations, aiding the identification of putative divergence among species and informing mechanisms of embryo and seed coat crosstalk during seed development. The Tau Index (Kryuchkova-Mostacci & Robinson-Rechavi, 2017), a widely used method for specific expression analysis, was calculated for each gene to determine whether it is specifically (defined as Tau > 0.8 in one stage or tissue) or broadly expressed. RNA-seq data from the embryo and seed coat of six Brassica species over seed development were used to generate comprehensive stage-specific and tissue-specific gene expression catalogs (Dataset S10), heatmaps and GO enrichment analyses (Figs 4, S12).
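For readers unfamiliar with the Tau Index, it is commonly written as Tau = Σ(1 − x_i / x_max) / (n − 1), where x_i is the expression of a gene in condition i and n is the number of conditions; values near 0 indicate broad expression and values near 1 indicate expression restricted to one condition. The sketch below applies that formula with the Tau > 0.8 cut-off quoted above. The function names and toy profiles are hypothetical, and any normalization or log transformation applied before the calculation in the study is not reproduced here.

```python
def tau_index(profile):
    """Tau specificity index for one gene across n stages or tissues.

    profile: non-negative expression values, one per stage or tissue.
    Returns 0.0 for uniform expression and 1.0 for expression in one
    condition only.
    """
    n = len(profile)
    if n < 2:
        raise ValueError("Tau needs at least two stages or tissues")
    peak = max(profile)
    if peak <= 0:
        return 0.0  # gene not expressed anywhere; treat as non-specific
    return sum(1 - value / peak for value in profile) / (n - 1)

def is_specific(profile, threshold=0.8):
    """Apply the Tau > 0.8 criterion used in the text."""
    return tau_index(profile) > threshold

# Hypothetical profiles across the seven embryo stages E1-E7.
late_only = [0, 0, 0, 0, 0, 2, 40]
broad = [10, 12, 11, 9, 10, 12, 11]

print(tau_index(late_only), is_specific(late_only))
print(tau_index(broad), is_specific(broad))
```

A gene passing the threshold would then be assigned to the stage or tissue in which its expression peaks.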
All Brassica species analyzed exhibited similar distribution of stage-specific gene ratios in embryo and seed coat tissues (Fig. 4a,b), with enrichment of stage-specific genes in early and late stages of seed development, suggesting genome activation and reprogramming in early embryogenesis, and specific regulation to complete seed development. The majority of conserved stage-specific homoeologs identified among the six species were expressed in late stages of embryo and seed coat development (Fig. 4a,b), supporting the selection of storage reserve traits in Brassica seed formation during domestication. In order to investigate the putative functions associated with stage-specific gene categories, we performed GO enrichment analysis (Ashburner et al., 2000) and determined the top GO terms in the biological process (BP), cellular component (CC) and molecular function (MF) categories (Figs 4c, S12). For embryo samples, significant GO terms involved in cell morphogenesis, cell tip growth, developmental cell growth, unidimensional cell growth and JA metabolism were enriched in the early development stages. During the middle seed development stages, when embryonic body plan is established, embryo-associated GO terms were enriched in the maintenance of meristem identity, maintenance of stem cell population, pattern specification and adaxial/abaxial axis specification. Metal ion transport, lipid oxidation, seed dormancy and maturation process-related terms were enriched in late embryogenesis, supporting functions for gene products in storage reserve accumulation. In seed coat samples, which included remnant endosperm, enrichment of GO terms in stage-specific gene categories was prominent in late stages of seed development, and included lipid storage and lipid localization-related terms. These data reveal overlapping and possibly competing efforts to accumulate storage reserves in the late stages of Brassica seed development (Fig. 4c).
In order to investigate the putative functions associated with tissue-specific genes in the six Brassica species, 18 471 embryo-specific and 11 771 seed coat-specific genes were identified (Dataset S10), and these tissue-specific genes were used for GO enrichment analyses (Fig. S13). For embryo-specific genes, cell morphogenesis and unidimensional cell growth were enriched GO terms; whereas in the seed coat, phenylpropanoid metabolic and biosynthetic process-related GO terms were enriched, suggesting key differences in the BPs occurring in the development of these two major compartments of Brassica seeds.

6. Coordinated expression networks in Brassica species seed development

Comparison of dynamic gene expression patterns in different seed tissues provides a framework for understanding gene expression within seeds. To determine the relative correlation coefficient of transcripts in embryo and seed coat tissues during seed development, co-expression analyses were performed. Among the 7973 positively co-expressed genes (Dataset S11), two clusters with similar gene counts and contrasting expression profile trends during seed development were identified (Fig. 5a,b). These two clusters exhibited either continually increasing or continually decreasing expression patterns during seed development. Further investigation of Pearson correlation coefficient (PCC) distribution revealed similar distribution patterns across the six Brassica species analyzed, with slight differences in B. oleracea (Fig. S14). Gene co-expression analyses using global transcript profiles allow estimations of the probability that genes share a common regulatory mechanism (Langfelder & Horvath, 2008).
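The embryo versus seed coat co-expression screen described above can be sketched in a few lines of Python. This is a minimal illustration, assuming each gene has one expression value per matched developmental stage in both tissues; the 0.9 cutoff mirrors the PCC threshold quoted in the Fig. 5 caption, while the function names, gene identifiers and numbers are hypothetical and the original pipeline may differ in normalization and clustering.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return float("nan")  # a flat profile has no defined correlation
    return sxy / sqrt(sxx * syy)


def coexpressed_genes(embryo, seed_coat, r_min=0.9):
    """Return {gene: PCC} for genes whose embryo and seed coat profiles
    correlate above r_min across matched stages.

    `embryo` and `seed_coat` map gene identifiers to per-stage expression lists.
    """
    hits = {}
    for gene in embryo.keys() & seed_coat.keys():
        r = pearson(embryo[gene], seed_coat[gene])
        if r > r_min:  # NaN compares False, so flat profiles are skipped
            hits[gene] = r
    return hits


if __name__ == "__main__":
    # Hypothetical seven-stage profiles (E1-E7 and S1-S7), for illustration only.
    embryo = {"geneA": [1, 2, 3, 4, 5, 6, 7], "geneB": [7, 6, 5, 4, 3, 2, 1]}
    seed_coat = {"geneA": [2, 4, 6, 8, 10, 12, 14], "geneB": [1, 2, 3, 4, 5, 6, 7]}
    print(coexpressed_genes(embryo, seed_coat))
```

Genes passing the filter could then be grouped by profile shape (for example increasing versus decreasing), which is the kind of split reported for the two clusters.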
To examine how the spatiotemporal coordination of gene expression may influence BPs, we first identified expressed genes, defined as those expressed in at least one stage from either embryo or seed coat with > 1 TPM in diploids or > 0.5 TPM in tetraploids, and then performed an unbiased weighted gene co-expression network analysis (WGCNA) using 149 485 embryo-expressed genes and 125 163 seed coat-expressed genes. WGCNA identified 56 embryo modules (clusters) (Fig. 5c; Dataset S12) and 67 seed coat modules (clusters) (Fig. 5d; Dataset S12). Each module was correlated with a module eigengene (ME) and contained 30–39 979 genes. Next, the clade genes of co-expression modules and overlapping expression patterns in the embryo and seed coat were investigated using PCC analysis of cluster MEs for the embryo (E) and seed coat (S) clusters. As shown in Fig. 5(c,d), clusters E01 & S02, E03 & S01, E02 & S05, E14 & S06 and E09 & S08 were the top five cluster pairs, with high similarity of expression patterns (PCC > 0.9) between embryo and seed coat clusters. To gain insight into the functional significance of modules with similar expression patterns, the top two cluster pairs (E01 & S02 and E03 & S01, based on PCC) were selected for GO annotation and enrichment analysis, using the 2235 and 3593 overlapping genes from the embryo and seed coat, respectively (Dataset S12). Genes in cluster pair E01 & S02 were highly expressed during early stages, then decreased in middle and late stages. GO enrichment analysis revealed putative roles for the encoded gene products in circadian rhythm, cell population proliferation, organ development, and response to ABA and organonitrogen compounds. Genes in cluster pair E03 & S01 were expressed at a steady, low level from E1 to E4, then increased from E4 to E7.
GO enrichment analysis revealed putative roles for the encoded gene products in fatty acid (FA) metabolic and biosynthetic processes, seed maturation and lipid storage (Dataset S12). These findings support well-established processes prominent in early or middle to late stages of seed development, including hormone-regulated cell proliferation in early stages, and FA biosynthesis in middle to late stages, which is consistent with findings in Arabidopsis (Xiang et al., 2011; Gao et al., 2019; Hofmann et al., 2019).

Fig. 4 Specific and conserved gene clusters in tetraploid Brassica species and their diploid ancestors. (a, b) Heatmap of expression dynamics for stage-specific gene members in the developing embryo (a) and seed coat (b) from six Brassica species. Genes in each species are associated with a color on the left of each heatmap, and defined in the Species inset. Z-score was applied for each row. Expression levels are indicated by a color scheme, from magenta (high) to blue (low) in corresponding stages, as defined in the inset. Homologs (Homo, from three species) and homoeologs (Homoeo, from six species) shared in corresponding species are labeled with horizontal lines to the left of each heatmap. Rp, B. rapa; Ng, B. nigra; Ol, B. oleracea; Jn, B. juncea; Np, B. napus; Cr, B. carinata. The gene lists used for generating the heatmaps can be found in Supporting Information Dataset S10. (c) Gene ontology (GO; biological processes) enrichment of specific and conserved gene clusters in tetraploid Brassica species and their diploid ancestors. Top GO terms from all stages (UO, E1/S1–E7/S7) are emphasized. X-axis, from left to right indicating early to late seed development. Y-axis, description of enriched GO terms. Enrichment significance was calculated, and adjusted P-values are associated with a color scheme from blue (low) to white (high).

7. Transcriptome dynamics in embryo and seed coat development

In angiosperms, genome-wide activation of transcription, known as zygotic genome activation (ZGA), occurs shortly after double fertilization (Anderson et al., 2017; Chen et al., 2017). ZGA is coordinated with the degradation of maternally deposited transcripts that control the initial stages of embryo development (Zhao et al., 2019). Up- and downregulated genes in the zygote may be the consequence of ZGA, the gradual clearance of maternal mRNA or embryonic developmental programming. To gain insight into the reprogramming of gene expression in early seed development, we investigated DEGs by comparing corresponding embryo and seed coat stages, i.e. zygote stage (E1 vs S1), as well as a comparison with ovule (E1 vs UO and S1 vs UO) for six Brassica species (Dataset S3, S4).
Both E1 vs UO and S1 vs UO favoured upregulation of DEGs, suggesting gene expression reprogramming occurs in the zygote and seed coat immediately after fertilization, with the higher proportion of DEGs observed in E1 vs UO than S1 vs UO supporting greater activation in the zygote. Examining early stage DEGs from E1 vs UO and S1 vs UO comparisons (Fig. S15) revealed a large number of auxin and cytokinin related genes differentially expressed in UO, E1 and S1, including PIN1, PIN3, PIN4, PIN7, PIN8, auxin responsive transcription factors (ARFs), YUCCA (YUC) family genes, cytokinin oxidase, cytokinin response factor genes and histone modification-related genes (Dataset S13). Consistent with previous findings in Arabidopsis, these results support the central role of auxin and cytokinin in early seed development (Jenik & Barton, 2005; Rademacher et al., 2012; Mironova et al., 2017).

Fig. 5 Correlation of embryo and seed coat gene expression patterns during seed development in six Brassica species. (a, b) Clusters of genes from the six Brassica species with a high Pearson correlation coefficient (PCC, r > 0.9) between embryo and seed coat. Two clusters with high PCC were identified, with clusters 1 (a) and 2 (b) containing genes with an overall decreasing or increasing expression pattern during seed development, respectively. Color code reflects ‘Membership’ values in the range of 0–1 calculated based on cluster cores (blue line in the middle) consisting of genes, where red corresponds to high values and blue to low values of Membership scores. The total number of genes in each cluster is labeled after cluster name. X-axis, developmental stages. Y-axis, Z-score-transformed expression level of each gene.
(c, d) Cluster analysis using embryo (c) and seed coat (d) expressed genes derived from tetraploid Brassica species and their diploid ancestors. Weighted gene coexpression network analysis was performed using variance stabilizing transformation of embryo expressed genes in six Brassica species. Module eigengenes (ME) of each cluster (module) were used for plotting. The embryo and seed coat clusters with high similarity of expression patterns (PCC > 0.9) and > 100 common genes were labeled with the same background colors. X-axis, developmental stage. Y-axis, MEs. E, embryo clusters; S, seed coat clusters. The gene lists used for plotting can be found in Supporting Information Dataset S11.

Phytohormones and phytohormone-related gene expression in the embryo, endosperm and seed coat strongly influence seed developmental programs (El-Showk et al., 2013; Locascio et al., 2014; Batista et al., 2019). Therefore, we mined our large DEG list to identify phytohormone-related processes, and to demonstrate the value of this dataset for identifying expressed genes with importance in key seed developmental processes. By querying Arabidopsis datasets (Dean et al., 2011; Xiang et al., 2011; Schwacke et al., 2019), auxin, cytokinin, ethylene, ABA, GA, SA, seed coat and ‘essential for embryo development’ gene lists were compiled and used to investigate the expression of Brassica genes in seed formation (Dataset S14).
Among the phytohormone-related genes, auxin- and cytokinin-related genes were highly expressed in the embryo from E1 to E4, and in the seed coat during late stages of seed development (Figs 6, S16). Likewise, ethylene-, ABA-, GA- and SA-related genes were highly expressed in E1 and E2 stage embryos and in the seed coat from S3 to S7 (Fig. S16a,b,d). Dynamic changes in the proportion of DEGs in the embryo and seed coat may reflect important regulatory switches for the roles of phytohormones in seed development. The majority of genes deemed essential for embryo or seed coat development were highly expressed in these respective seed subcompartments (Fig. 6b,c), with one significant difference observed during late stages of seed development. Specifically, the proportion of highly expressed genes deemed essential for embryo development decreased in the embryo in late stages of seed development, whereas highly expressed seed coat development related genes remained preferentially expressed in the seed coat (Fig. 6b,c). Among the six Brassica species examined, 73 genes essential for embryonic development and eight genes essential for seed coat development appeared conserved and dominantly expressed in the embryo or seed coat, respectively (Dataset S15). In the embryo, FUS3, cytidine triphosphate synthase 2 (CTPS2), shoot meristemless (STM) and pyridoxine biosynthesis 2 (PDX2) were highly expressed in at least five stages of seed development. In the seed coat, dihydrokaempferol 4-reductase (DFR) and transparent testa 8 (TT8) were highly expressed in at least five stages of seed development across the six Brassica species. These genes are predominantly involved in embryo development ending in seed dormancy or flavan-3-ols, condensed tannins and anthocyanin biosynthetic processes in the seed coat, suggesting conservation of these tissue-specific BPs at the transcriptional level across Brassica species (Dataset S15).

8. Spatiotemporal regulation of storage reserve genes

In DEG comparisons amongst the six Brassica species examined, the greatest overlap in shared GO terms among species belonged to the storage reserves category, during late stage seed development (Figs S6–S11). Therefore, we examined the expression of storage protein, carbohydrate and lipid metabolism related genes in the A, B and C subgenomes of diploid and tetraploid species (Dataset S14) to explore the patterns and evolutionary divergence of gene expression associated with the seeds of these species. Analysis of DEG dynamics associated with storage reserves revealed more DEGs with elevated expression related to storage proteins (at E1, E2, E3, E6 and E7), lipid metabolism (at E1, E2 and E6) and carbohydrate metabolism (at E1) in the embryo than in the seed coat (Figs 6d, S16e,f; Dataset S16), whereas more DEGs associated with lipid metabolism (at S3–S5) and carbohydrate metabolism (at S2–S7) were highly expressed in the seed coat. The varied expression of overlapping DEGs over the course of seed development suggests specialized spatiotemporal regulation of storage reserve-related gene expression between seed subcompartments. Co-expression analysis of storage reserve genes in the storage protein, carbohydrate and lipid metabolism related gene categories identified four clusters in the embryo and seed coat (Figs S17–S19). The largest clusters in each gene category showed elevated expression in late stage embryo and seed coat, supporting dominant functions for storage reserve-related genes in maturing seeds. The onset and progression of seed development are associated with regulation by transcription factors (TFs). Using a comprehensive method for the identification of putative TF targets (Methods S1), candidate TFs associated with putative target members in the largest cluster of storage reserve categories, including storage protein, carbohydrate and lipid metabolism, were identified.
In the embryo, the TFs putatively regulating expression of storage-related genes consisted of 638 TFs associated with the storage protein subcategory, 2734 TFs for the carbohydrate metabolism subcategory, and 2076 TFs for the lipid metabolism subcategory. Similarly in the seed coat, TFs putatively regulating expression of storage reserves-related genes were numerous, with 818, 2911 and 2489 TFs identified as putative regulators of genes in the storage protein, carbohydrate metabolism and lipid metabolism subcategories, respectively (Dataset S17). Common TFs, putatively targeting genes in all three storage reserve subcategories, consisted of 268 in the embryo and 287 in the seed coat (Fig. S20). Among the common TFs, the homolog of IDD4 (a TF in the C2H2 BIRD TF family) in B. rapa was identified as a putative key player in regulating storage reserve genes in the embryo, with > 10 potential storage reserve gene targets, including genes encoding storage proteins (CRU1, CRU2, CRU3 and NAT2); lipid metabolism-related genes (CLO1, PLC2, HSD1, ADS2 and NPC1) and carbohydrate metabolism-related genes (PMI1, PHS2 and FBA1). Further investigation of the IDD4 TF and its connection to the regulation of storage reserve genes could aid efforts for the targeted improvement of storage reserve-related traits in Brassicas.

9. Fatty acid biosynthesis and modification in the embryo and seed coat

Brassica rapa, B. juncea, B. napus and B. carinata represent the four main Brassica oilseed crops, with B. nigra grown as a condiment mustard and B. oleracea largely used as a vegetable (Yang et al., 2016). For all six Brassica species, the FA profile in seed storage reserves is an important agricultural trait. To explore the patterns and evolutionary divergence of gene expression associated with FA biosynthetic pathways in U’s Triangle, MAPMAN4 database (Schwacke et al., 2019) queries were used to identify putative FA biosynthesis and modification related genes in the six Brassica species.
The resulting gene lists were used to investigate expression dynamics in the embryo and seed coat (Dataset S18). Most FA biosynthesis and modification genes were highly expressed in the embryo, increasing from E4 to E6 or E7, although a few members of the acyl-acyl carrier protein desaturase (AAD), fatty acid desaturase (FAD) and β-ketoacyl-CoA synthase (KCS) gene families exhibited high expression in the seed coat (Fig. S21; Dataset S18). In particular, members from the AAD gene family exhibited a tissue-specific gene expression pattern, with higher embryo expression seen for AAD5, and higher seed coat expression observed for AAD2 (Fig. 7a). Moreover, plastid-localized FAD6 and FAD7 were expressed in both the embryo and seed coat in all Brassica species (Fig. S21). The similar expression patterns of these genes with that previously identified in Arabidopsis support conservation of FA biosynthesis and desaturation pathway genes at the transcriptional levels in Brassicaceae (Hobbs et al., 2004; Bryant et al., 2016). To examine whether the dynamics of FA biosynthesis and modification-related genes are co-ordinated and consistent with seed FA composition, we performed gas chromatography analysis of embryos for all six Brassica species in stages E3–E7. Six major FAs were identified, including palmitic (C16:0), stearic (C18:0), oleic (C18:1), linoleic (C18:2), linolenic (C18:3) and erucic (C22:1) acids.
Significant differences were observed among Brassica species in the levels of C18:1 and C22:1 (Figs 7b,c, S22; Dataset S19). Brassica nigra, B. oleracea, B. juncea and B. carinata contained close to 40% C22:1, with C18:1 below 20%, whereas B. rapa and B. napus contained close to 60% C18:1, with undetectable C22:1. Further investigation suggested that FA profiles are closely linked to the expression of FATTY ACID ELONGATION1 (FAE1), encoding the enzyme responsible for the two condensation steps in the elongation of C18:1 to C20:1 and C22:1 in developing seeds (James et al., 1995). FAE1 genes were highly expressed in stages E5–E7 of seed development (Fig. 7d) in all Brassicas except B. rapa, and were the most highly expressed members of the KCS gene family (Dataset S18). Low expression of FAE1 was observed during embryo development in the diploid B. rapa (Fig. 7d), matching the low erucic acid (LEA) trait (Fig. S22a); however, this was not the case for B. napus, in which elevated FAE1 expression was observed. In B. napus, LEA is controlled by two FAE1 genes with point mutations in the coding regions (Fourmann et al., 1998; Wu et al., 2008). The high expression pattern observed for the B. napus FAE1 genes in late stages of developing embryos is consistent with the production of nonfunctional FAE1 enzymes which are unable to produce detectable C22:1. Thus, the gene expression data for B. napus presented here are consistent with the consensus that the LEA trait is not a result of altered expression of the FAE1 genes, and suggest that the LEA trait in B. napus differs from that of its ancestor B. rapa. Our analysis of the FAE1 gene family illustrates the power of comparative transcript profiling of seed development across related oilseed Brassica species, as a tool for predicting and interpreting the molecular basis of seed oil composition traits.

Fig. 6 Dynamic changes of tissue-dominant gene numbers in select key gene categories during seed development of tetraploid Brassica species and their diploid ancestors. (a–d) Dot-plots of the percentage of embryo or seed coat dominant genes in select gene categories, including auxin-related genes (a), seed coat development-related genes (b), embryo development essential genes (c) and carbohydrate metabolism-related genes (d). Dominance was determined by identifying differentially expressed genes (DEGs) in embryo and seed coat comparisons, with embryo or seed coat dominance represented by orange circles or blue triangles, respectively. The ratio of dominance was calculated using the total number of dominant DEGs divided by the total number of expressed genes from six Brassica species, for each of the seven stages of seed development indicated by seed illustrations.

Fig. 7 The spatiotemporal regulation of fatty acid (FA)-related genes in tetraploid Brassica species and their diploid ancestors during seed development. (a) Heatmap of the expression levels of AAD family genes that are dominantly expressed in the developing embryo (left, stages E1–E7) or seed coat (right, stages S1–S7). All AAD family genes were included for plotting the heatmap. AAD2 and AAD5 family genes can be observed to express dominantly in embryo and seed coat, respectively, across six Brassica species. (b, c) Validation of FA component in B. napus (b) and B. carinata (c) using gas chromatography (GC) analysis. Ratios of six major FAs from E3 to E7 stages are plotted, with significant differences between B. napus and B. carinata profiles in 18:1 and 22:1. (d) Heatmap of the expression level of FAE1 family genes in six Brassica species. Genes in each species are associated with a color on the left of each heatmap and defined in the Species inset. Expression levels calculated using Log2-normalized counts are indicated by color gradient, from red (high) to gray (low) in the seven stages of seed development.

10. Evolutionary hourglass patterns of seed development in Brassica species

Embryonic evolution has been modeled after an hourglass, with greater ontogenetic divergence in early and late stage embryos separated by a phylotypic middle stage of conservation (Domazet-Lošo & Tautz, 2010; Quint et al., 2012). Our findings are consistent with the hourglass model, with greater gene expression variation observed in early and late stages in both evolutionary and developmental analyses.
To investigate morphogenetic divergence in Brassica, we applied phylotranscriptomic approaches to determine stage-specific average transcriptome age and transcriptome divergence, using RNA-seq data derived from embryo and seed coat tissues of six Brassica species in seven developmental stages. The clades of genes derived from a common ancestor, known as phylostratigraphic (PS) levels, were determined for each gene (Dataset S20) and species. Among the 14 PS levels (Table S2), a greater proportion of genes were observed in PS1 and PS2 (representing the evolutionarily oldest genes), than in PS11 and PS12 (representing the evolutionarily youngest genes). For the six Brassica species examined, the expression distribution of genes with distinct transcriptome ages and transcriptome divergences consistently showed higher expression for old age/low divergence genes and lower expression for young age/high divergence genes across embryo and seed coat development (Figs 8a,b, S23–S28). To test whether gene age and sequence divergence are correlated in Brassica species, we calculated Kendall’s rank correlation coefficient of phylostratum and divergence stratum, which quantifies the degree of monotonic (rank-based) association between phylostratum and divergence stratum for each species in a nonparametric manner. The correlations of phylostratum and divergence stratum ranged from 0.281 to 0.299 and were consistent between the embryo and seed coat in Brassica species (Figs 8c,d, S29). To evaluate evolutionary age and sequence divergence across seed development in the six Brassicas, the transcriptome age index (TAI) and transcriptome divergence index (TDI) were determined. Consistent hourglass expression patterns were observed for all Brassica species examined (Fig. 8e). The hourglass expression patterns showed significant differences between the embryo and seed coat, with greater divergence in seed coat tissues (Fig. 8e), suggesting that TAI and TDI have the potential to capture independent evolutionary trends and trajectories for Brassica species.

III. Discussion

U’s Triangle hypothesis describes the evolutionary relationships among six species of Brassica (Nagaharu & Nagaharu, 1935). The diploid Brassica genomes in U’s Triangle experienced whole genome triplication followed by gene loss and reshuffling of genomic segments, whereas the currently grown cultivars of tetraploid Brassica species appear to have arisen by hybridization, polyploidization and breeding selection (Moghe et al., 2014; Parkin et al., 2014; Murat et al., 2015). The diversity of allelic combinations among Brassica species could affect gene expression and regulation during seed development, resulting in phenotypic variations in Brassica seeds (Parkin et al., 2014; Ziegler et al., 2019). Here, a high-resolution transcriptome atlas for the major subcompartments of the seed, from the unfertilized ovule to the mature embryo and seed coat for U’s Triangle Brassica species is presented. This comprehensive resource provides insights into the gene expression at key stages of seed development in diploid and tetraploid Brassica species and the evolutionary relationships of species in U’s Triangle at the transcriptome level. The biased and dominant expression patterns of subgenomes in tetraploid species reveal that the regulation of subgenome biased and dominant expression may have played a more prominent role in the seed coat than in the embryo after polyploidization. Gene expression comparisons highlight key differences in the conserved regulatory networks and metabolic pathways operating in the embryo and seed coat subcompartments during seed development, revealing differences in storage reserve accumulation and FA metabolism among the Brassica species of U’s Triangle.
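For readers who wish to compute indices of the kind underlying the hourglass patterns, the transcriptome age index is conventionally defined as the expression-weighted mean phylostratum at each stage, i.e. the sum over genes of (phylostratum × expression) divided by the sum of expression; the transcriptome divergence index is analogous, with divergence strata in place of phylostrata. The Python sketch below is a minimal illustration under that standard definition. It assumes non-negative expression values and a stratum assignment for every gene used, the function and gene names are hypothetical, and it does not reproduce any transformation or permutation testing applied in the original analysis.

```python
def transcriptome_index(strata, expression):
    """Expression-weighted mean stratum per developmental stage.

    strata:     dict mapping gene -> stratum (phylostratum for TAI,
                divergence stratum for TDI)
    expression: dict mapping gene -> list of expression values, one per stage
    Returns a list with one index value per stage.
    """
    genes = [g for g in expression if g in strata]
    if not genes:
        raise ValueError("no genes with both a stratum and expression values")
    n_stages = len(expression[genes[0]])
    index = []
    for s in range(n_stages):
        total = sum(expression[g][s] for g in genes)
        if total == 0:
            raise ValueError(f"no expression at stage {s}")
        weighted = sum(strata[g] * expression[g][s] for g in genes)
        index.append(weighted / total)
    return index


if __name__ == "__main__":
    # Hypothetical two-gene example: an old gene (PS1) and a young gene (PS12).
    strata = {"old_gene": 1, "young_gene": 12}
    expression = {"old_gene": [9, 5, 1], "young_gene": [1, 5, 9]}
    print(transcriptome_index(strata, expression))
```

A higher value at a stage indicates that younger (or more divergent) genes contribute more of the transcriptome at that stage, so an hourglass appears as high values early and late with a minimum in between.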
The hourglass expression patterns, spatiotemporal transcriptional dynamics and signatures of the six Brassica species identified herein provide a platform for predicting and characterizing mechanisms of gene regulation and the genetic underpinnings of evolutionary divergence driving the phenotypic differentiation in Brassica species seeds. In this study, the distribution of genes expressed in specific stages of development, along with the observed hourglass pattern of gene expression, supports greater ontogenetic divergence in early and late stages of seed development among the six Brassica species examined. These observations, together with those made in recent transcriptome and global alternative splicing landscape studies of embryo development in Arabidopsis, Brachypodium and wheat (Quint et al., 2012; Gao et al., 2021; Hao et al., 2021), emphasize the importance of characterizing and comparing BPs in early and late stages of seed development in the Brassica genus for uncovering the evolutionary significance of seed diversity.

1. Reprogramming of gene expression during early embryogenesis

In flowering plants, double fertilization produces the diploid zygote and the triploid endosperm, filial tissues that along with the maternal seed coat produce the mature seed (Dean et al., 2011; Xiang et al., 2011; Sreenivasulu & Wobus, 2013). In the zygote and early stages of embryo development in plants, the contribution of maternally inherited transcripts from the egg cell followed by transcriptional activation of the parental genomes after fertilization define the transition from the maternal to the zygotic programming (Autran et al., 2011; Anderson et al., 2017; Chen et al., 2017; Zhao et al., 2019).
In animals, a delay between fertilization and initiation of the maternal-to-zygotic transition is a hallmark of early embryonic development (Schier, 2007; Feng et al., 2010), presumably as a consequence of the reprogramming of gene expression to establish an embryo-specific developmental program. In this study, ZGA and transcriptome reprogramming were highly active in the embryo and seed coat during early seed development (E1, S1, E2 and S2; Fig. S20). Although the seed coat in this study may have contained a small amount of endosperm, the dynamic changes observed likely reflect true gene expression activity in the seed coat. Similar to a previous study on the early developmental stages of the B. napus seed (Ziegler et al., 2019), our transcriptome data for UO, E1, S1, E2 and S2 samples among six Brassica species clearly demonstrated that genes involved in epigenetic changes relating to DNA methylation, demethylation and histone modifications are differentially expressed, and genes related to auxin and cytokinin phytohormones are highly active in zygote and quadrant stages of embryo and seed coat development (Figs 6a, S16c). The large collection of DEGs between the embryo and seed coat provides a rich resource for future efforts to advance understanding of the genetic regulation of seed development. For example, the auxin-related genes IAA18, YUC, PIN, ARF and GH3 were identified as highly expressed in early embryonic development and conserved in the six Brassica species examined (Fig. 6a; Dataset S3, S4). This result is consistent with findings in Arabidopsis and other dicots (Cao et al., 2020), supporting conserved functions for these auxin-related genes. The functional characterization of key phytohormone-related genes identified in this study will be accelerated by the development of rapid and sensitive phytohormone profiling technologies, including mass spectrometry and biosensors with microscale sensitivities (Novák et al., 2017; Cai et al., 2019). Collectively, our results provide an in-depth genome-wide view of patterns of transcriptome reprogramming during early seed development in six interrelated Brassica species.

Fig. 8 Evolutionary age and sequence divergence of various Brassica species. (a, b) The distribution of expressed genes from different phylostratum (PS, a) and divergence stratum (DS, b) categories from PS1 (old age) to PS14 (young age) or DS1 (low divergence) to DS10 (high divergence). The B. napus PS and DS distributions for the E1 embryo stage are shown here (a, b). For the boxplots, the horizontal black lines indicate the median values; the whiskers represent the ranges for the bottom 25% and the top 25% of the data values, excluding outliers; the outliers are considered as any data outside the whiskers. (c) Kendall’s rank correlation coefficient of phylostratum and divergence stratum in B. nigra, which shows the lowest value in Brassica species. (d) Kendall’s rank correlation coefficient of phylostratum and divergence stratum in B. napus, which shows the highest value in Brassica species. X-axis, PS; Y-axis, DS. Each dot indicates one gene; the Kendall value is labeled above each panel. (e) Transcriptome indices across all six Brassica species in embryo and seed coat during seed development. TAI, transcriptome age index; TDI, transcriptome divergence index. X-axis, TAI or TDI values. The trends from low to high are indicated by arrows, from right to left (TAI) and left to right (TDI), respectively. Y-axis, seed developmental stage from E1/S1 (bottom) to E7/S7 (top). Transcriptome indices depicted on the left are expression variability patterns in the embryo across six Brassica species, and follow hourglass patterns. Transcriptome indices depicted on the right represent seed coat expression variability patterns across six Brassica species, and show more divergent patterns, rather than hourglass patterns.

2. Embryo, endosperm and seed coat signal crosstalk in seed development

Embryo, endosperm and seed coat development play central roles in defining agronomically valuable seed traits among crops (Yi et al., 2019; Doll et al., 2020a,b). Compared to the embryo and seed coat, the endosperm of most dicots is a transient nutritive tissue that is almost absent in mature Brassica seeds.
To investigate the biological function of endosperm in Brassica species, the development of methods to isolate endosperm will be indispensable (Chao et al., 2020; Siles et al., 2020). The application of laser capture microdissection has partially resolved this issue in B. napus, with endosperm samples from early seed developmental stages isolated for global transcriptome analyses (Ziegler et al., 2019; Khan et al., 2020). The continued growth and application of technologies for the cell- and tissue-specific transcriptome analysis of Brassica endosperm will expand understanding of subcompartment signal crosstalk within the seed.

In this study, manually isolated embryo and seed coat tissues were clearly distinguished by their expression profiles and regulatory programs. Large sets of genes, including TF-, phytohormone- and storage reserve-related genes, showed embryo-dominant or seed coat-dominant expression patterns during seed development (Figs 6, S16). High overlap was found between the gene sets from the six Brassica species studied herein, as well as with recent transcriptome profiling studies of B. napus seed subcompartments (Ziegler et al., 2019; Khan et al., 2020), suggesting that conserved tissue-dominant expression patterns are shared among the Brassica species in U’s Triangle. Several common homoeologs and expression patterns were observed across the six Brassica species examined, suggesting that these genes play conserved roles in regulating biological pathways in developing seeds. Carbohydrate synthesis-related genes were more specifically expressed in seed coat samples, which included remnants of the endosperm, whereas the genes encoding storage proteins and those involved in FA biosynthesis were coordinately expressed in the embryo and seed coat.
Because genes functioning in the same pathway tend to appear in the same or similar expression modules, the regulatory programs for synthesizing storage proteins and carbohydrates are expected to differ.

Genes that mediate crosstalk amongst the embryo, endosperm and seed coat to coordinate seed development remain to be determined (Xiang et al., 2011; Sreenivasulu & Wobus, 2013; Figueiredo et al., 2016; Wang et al., 2017; Batista et al., 2019). Analysis of transcriptional signatures and DEGs confirmed a key role for phytohormones in tissue fate determination within the seed. As shown in Figs 6(a) and S16(a–d), a large number of phytohormone genes were highly expressed in the embryo during early seed development. Auxin-, GA- and SA-related genes were strongly expressed in the embryo during early and middle seed development phases, followed by higher expression in the seed coat in late phases of seed development. Ethylene- and ABA-related genes showed similar patterns of expression, with higher levels of expression in seed coat than in embryo during the middle and late seed developmental phases (Fig. S16a,b). The dynamic patterns of expression between the embryo and the seed coat highlight the role that shifts of phytohormone-related gene expression likely play in coordinating seed development. Our results support the coordination of embryo and seed coat development through phytohormone pathway crosstalk.

3. Subgenome gene expression and regulation in polyploid species

Co-expression and subgenome comparative analyses provided insight into the dynamic reprogramming of the transcriptome, revealing functional transitions, a complex origin of polyploidy, and evolutionary conservation and divergence during seed development in Brassica species. U’s Triangle of Brassica species encompasses three diploid and three allopolyploid (tetraploid) genomes.
Following polyploidization, homoeolog duplication and reversion of gene numbers towards diploid levels triggered the reprogramming of allopolyploid transcriptomes (Chalhoub et al., 2014; Yang et al., 2016; Xiang et al., 2019). Recent studies on subgenome dominance in B. napus revealed that A and C subgenomes experienced different evolutionary processes at the subgenome level (Khan et al., 2020; Zhou et al., 2020). Compared to the A subgenome, the C subgenome possesses more expressed genes (Zhou et al., 2020) with lower mean expression values (Khan et al., 2020), when analyzing B. rapa and B. oleracea gene expression. In this study, we further analyzed gene expression bias at the subgenome level for all three tetraploid Brassica species (Fig. 4a,b; Dataset S8) and found greater C-bias among expressed subgenome homoeologs of B. napus and B. carinata, and greater B-bias among subgenome homoeologs expressed in B. juncea. The biased expression shifts between subgenomes suggest that genes derived from the A, B and C subgenomes likely play different roles in producing qualitative and/or quantitative differences in the seeds of diploid and polyploid Brassicas.

Additive and nonadditive analyses of homoeologs in tetraploids compared to their diploid ancestors suggest that the majority of additive expression features in two subgenomes arose after allopolyploid formation, and nonadditive expression exhibited more subgenome dominance than transgressive expression features (Fig. 4c,d; Dataset S9). Similar results were reported from transcriptome analyses for nonseed tissues of natural or resynthesized B. napus (Wu et al., 2018; Pan et al., 2019; Li et al., 2020), supporting spatial conservation of these additive and dominant expression features. In this study, we provide a detailed list of genes possessing dominant expression features (Dataset S9), with many validated by previously published reports. For example, dominance features were common in the Glycoside hydrolase 3 (GH3) gene family, which is consistent with a comparative genomics study in B. napus (R. Wang et al., 2019). In contrast to previous analyses focusing on static regulation in one or a few tissues, our datasets on additive/nonadditive features describe the dynamic transcriptional regulation process during the progressive phases of seed development in the embryo and seed coat, revealing the precise developmental stage(s) when additive or nonadditive regulation occurs for each diad in polyploid Brassica species. It has been reported that in resynthesized B. napus, some transcriptional changes do not explain differential protein regulation (Marmagne et al., 2010), suggesting that the additive and dominant features in protein levels may differ from the transcriptional features identified in this study. Although previous studies have generated lists of the proteins involved in differential regulation in the root and stem from newly synthesized B.
napus (Albertin et al., 2006, 2007), proteomic studies on the impact of polyploidization on the embryo and seed coat in Brassica species are lacking; such studies will shed light on the differential regulation of transcript and protein levels after allopolyploid formation.

This study found the expression of genes required for FA biosynthesis to be significantly influenced by modified regulation following polyploidization and domestication in Brassica species of U’s Triangle. With the exception of the transgressive regulation identified in B. juncea (FAD5 homolog) and B. carinata (CAC1-B homolog), numerous dominant expression features were evident in the tetraploid subgenomes (Dataset S9). The A subgenome exhibited the highest dominant expression of FA metabolism-related genes, followed by the C and B subgenomes. The KCS family genes KCS2, KCS10 and KCS19 and the 3-ketoacyl-ACP synthases KAS2 and KASIII were dominantly expressed in the A subgenome, whereas KCS11 and KASI were dominantly expressed in the C and B subgenomes, respectively. This discovery of distinct dominant expression patterns amongst subgenomes for gene members of the same FA metabolism-related gene family has important implications for the evolution of Brassica oil composition traits. In particular, diversity in the expression of FA biosynthesis-related genes after polyploidization and domestication is clearly evident among Brassica species from the different levels of C18:1 and C22:1 in seed oil, with B. nigra, B. oleracea, B. juncea and B. carinata containing relatively high levels of C22:1 (close to 40%) and low levels of C18:1 (< 20%), vs the oil of B. rapa and B. napus in which C22:1 content is undetectable and levels of C18:1 are elevated (close to 60%; Figs 7b,c, S22). This study supports a role for FAE1 in producing substantially differentiated seed oil profiles among six Brassica species. Although contrasting FAE1 expression profiles appear to be contradictory to the LEA observed in B. rapa and B.
napus, the elevated expression of FAE1 genes carrying point mutations in B. napus indicates putative divergence in the functionality of FAE1 between B. rapa and its B. napus descendant. Thus, careful analysis of gene expression profiles, together with oil composition data, suggests that inheritance of the LEA trait in B. napus arose through breeding selection, rather than from its B. rapa ancestor, and illustrates the value of comparative seed transcript profiling across related species as a tool for uncovering the molecular basis of seed oil composition traits.

The expression bias and additive or dominant subgenome regulation identified in this study appear to be a consequence of genome mergers and doubling, but the underlying importance of these expression features for agricultural trait selection remains to be fully determined (Chalhoub et al., 2014; Yang et al., 2016). Discovery of a diversified subgenome origin for polyploids may shed light on the unusual features of selection divergence in Brassica. In the context of polyploid Brassica species, subgenome expression shifts have important functional implications for selecting seeds with increased nutritional value and yield.

IV. Materials and Methods

1. Plant materials and growth conditions

The Brassica species B. napus (DH12075-P, NpAACC), B. juncea (AC Vulcan-J, JnAABB), B. carinata (C901163-C, CrBBCC), B. rapa (Parkland-R, RpAA), B. oleracea (Chinese Kale-O, OlCC) and B. nigra (CR2748-N, NgBB) were used in this study (Table S1). Plants were grown under a 16 h : 8 h, 22°C : 20°C, light : dark photoperiod, with 120–150 µmol m⁻² s⁻¹ light intensity.

2. Embryo and seed coat dissection

Embryo isolations were performed as described previously, using a dissecting microscope, fine forceps (Dumont 55 forceps, 11295-55; Fine Science Tools, Foster City, CA, USA) and needles (10130-05; Fine Science Tools) in a 4.8% sucrose solution with 0.1% RNALater solution (AM7021; Ambion, TX, USA) (Xiang et al., 2011).
Brassica flowers were emasculated and pollinated at the flowering stage, to ensure sufficient and developmentally coordinated seed production for embryo and seed coat isolation. Embryo isolation was performed as described previously (Xiang et al., 2011, 2019; Gao et al., 2019). Through a detailed microscopic analysis for each of the six Brassica species in this study, slight embryo morphological differences over the post-fertilization developmental progression were observed between species. However, the same basic progression through embryogenesis could be identified based on cell numbers and morphological features common among all six Brassica species, which allowed us to distinguish seven common stages of embryo and seed coat development. For this reason, we sampled the collection of seed tissues at different hours and days after pollination for different Brassica species, using embryonic cell numbers and morphological cues. Information about the timing for sampling is listed in Table S1 based on hours (HAP) or days (DAP) after pollination. To minimize contamination of early stages of embryo mRNA by the seed coat and endosperm, each embryo was transferred to fresh isolation buffer in a mini petri dish for a minimum of three sequential washing steps. All washing steps were performed on ice and inspected with a dissecting Leica microscope (Leica Microsystems, Wetzlar, Germany).
After washing, embryos were transferred to Eppendorf tubes on dry ice using fine glass pipettes. An extra validation for embryo contamination at early and middle stages was performed through droplet digital PCR using selected representative seed coat-specific expressed genes (Fig. S30; Table S4; see Methods S1 for details). For seed coat isolations, embryos were manually removed and the remaining seed coat was kept. For each embryo sample in early stages (E1–E3) of development, 25–30 embryos were pooled in each biological replicate sample. For each sample in late embryo stages (E4–E7), a minimum of 10 embryos were pooled in each biological replicate sample. A minimum of 10 seeds were used for corresponding seed coat isolation in each biological replicate sample. A total of three biological replicates for each tissue at each stage of development were used for RNA-seq library construction.

3. Homolog and homoeolog identification in Brassica species

In this study, we defined: (1) homologs, as homologous syntenic genes of the same subgenome for each diploid Brassica ancestor and its two tetraploid descendants (e.g. homologous gene pairs from AARp, AAJn and AANp); (2) homoeologs, as defined in a recent study (Glover et al., 2016) for homologous syntenic genes from all A, B and C subgenomes of the six Brassica species of U’s Triangle, including the three subgenomes of the diploid species and the six subgenomes of the tetraploid species; (3) diads, as homoeologs from two subgenomes in each tetraploid species (B. juncea, B. napus or B. carinata); and (4) triads, as homologs from three diploid species (B. rapa, B. nigra, B. oleracea). To create a gene list for homologs, homoeologs, diads and triads of the six Brassica species in U’s Triangle, we performed reciprocal best hit BLAST among all subgenomes to obtain syntenic genes with 1 : 1 : 1 (homologs), 1 : 1 (diads), 1 : 1 : 1 (triads) and 1 : 1 : 1 : 1 : 1 : 1 : 1 : 1 : 1 (homoeologs) correspondence, respectively.
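The reciprocal best hit step for a single pair of subgenomes can be sketched as follows. This is a minimal illustration that assumes each BLAST result has already been reduced to (query, subject, bit score) tuples; the function names and gene identifiers are hypothetical, and the synteny filtering and the extension to three or nine subgenomes described above are not shown.

```python
def best_hits(hits):
    """Return {query: subject} keeping the highest-scoring subject per query.

    hits: iterable of (query, subject, bitscore) tuples.
    Ties are resolved in favour of the first hit encountered.
    """
    best = {}
    for query, subject, score in hits:
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {query: subject for query, (subject, _) in best.items()}


def reciprocal_best_hits(a_vs_b, b_vs_a):
    """Return sorted (a_gene, b_gene) pairs that are each other's best hit."""
    forward = best_hits(a_vs_b)   # best B-subgenome hit for each A gene
    reverse = best_hits(b_vs_a)   # best A-subgenome hit for each B gene
    return sorted((a, b) for a, b in forward.items() if reverse.get(b) == a)


# Hypothetical identifiers and bit scores for two subgenomes.
a_vs_c = [("A1", "C1", 950.0), ("A1", "C2", 400.0), ("A2", "C2", 300.0)]
c_vs_a = [("C1", "A1", 940.0), ("C2", "A1", 410.0), ("C2", "A2", 290.0)]
pairs = reciprocal_best_hits(a_vs_c, c_vs_a)
print(pairs)
```

Here only A1 and C1 are retained as a 1 : 1 pair: A2's best hit is C2, but C2's best hit is A1, so the relationship is not reciprocal and the pair is dropped.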
The complete list of genes in all categories is included in Dataset S1.

In order to define different categories of genes (e.g. embryo/seed coat development essential genes, phytohormone-related genes, storage reserve-related genes and TFs), best-hit BLAST was performed using corresponding genes from the released Arabidopsis database: storage proteins and embryo developmental essential genes were based on previously published gene lists (Xiang et al., 2011), seed coat development-related genes were based on Dean et al. (2011), lipid-related genes were extracted from the MAPMAN4 (Schwacke et al., 2019) database ‘Lipid metabolism’ category, carbohydrate-related genes were extracted from the MAPMAN4 database ‘Carbohydrate metabolism’ category, FA-related genes were extracted from the MAPMAN4 database ‘fatty acid biosynthesis’ category, and phytohormone-related genes were extracted from the MAPMAN4 database ‘Phytohormone action’ category.

4. Microscopy, RNA-seq, bioinformatic analyses and FA analysis

Details regarding imaging process, FA analysis, sample preparation, experimental procedures and all bioinformatic analyses, along with the associated references, are presented in Methods S1.

Acknowledgements

Assistance with SEM and light microscopy sample preparations provided by the Western College of Veterinary Medicine Imaging Centre, University of Saskatchewan, is gratefully acknowledged. We thank Justin Coulson and Halim Song for RNA-seq library preparations and Bianyun Yu for thoughtful manuscript revisions. This work was supported by the Agri-Food Program of Aquatic and Crop Resource Development Research Division of the National Research Council of Canada, ACRD manuscript no. 56474. The authors declare no competing interests.

Author contributions

DX and RD conceived and coordinated the study. DX, PG, TDQ, HY, LQ, VB, LL, JC, CS and YZ performed experiments.
PG, DX, TDQ, HY, DC, QL and KTN contributed to bioinformatic data analysis, imaging, figure and table preparations. TDQ, PG, AP, EE and NJP created the eFP Browser for embryo and seed coat development. DX, PG, TDQ, RD, YH, WZ, PB, LVK, DK, YW, SK, MS, NP and CSG contributed to manuscript preparation. All authors read and approved the final manuscript. PG and TDQ contributed equally to this work.

ORCID

Raju Datla https://orcid.org/0000-0003-0790-5569
Eddi Esteban https://orcid.org/0000-0001-9016-9202
Peng Gao https://orcid.org/0000-0002-6586-4307
C. Stewart Gillmor https://orcid.org/0000-0003-1009-2167
Sateesh Kagale https://orcid.org/0000-0002-7213-1590
Leon V. Kochian https://orcid.org/0000-0003-3416-089X
David Konkin https://orcid.org/0000-0001-5410-8357
Kirby T. Nilsen https://orcid.org/0000-0002-9477-5549
Asher Pasha https://orcid.org/0000-0002-9315-0520
Nicholas J. Provart https://orcid.org/0000-0001-5551-7232
Li Qin https://orcid.org/0000-0002-1821-9946
Teagen D. Quilichini https://orcid.org/0000-0003-3311-3776
Mark Smith https://orcid.org/0000-0003-4869-6257
Yangdou Wei https://orcid.org/0000-0001-7161-9845
Daoquan Xiang https://orcid.org/0000-0001-7144-1274
Wentao Zhang https://orcid.org/0000-0003-4301-1597

Data availability

All RNA-seq raw data generated from this study have been deposited into the Gene Expression Omnibus under accession no. GSE153257 (http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE153257).
References

Albertin W, Alix K, Balliau T, Brabant P, Davanture M, Malosse C, Valot B, Thiellement H. 2007. Differential regulation of gene products in newly synthesized Brassica napus allotetraploids is not related to protein function nor subcellular localization. BMC Genomics 8: 56.
Albertin W, Balliau T, Brabant P, Chèvre AM, Eber F, Malosse C, Thiellement H. 2006. Numerous and rapid nonstochastic modifications of gene products in newly synthesized Brassica napus allotetraploids. Genetics 173: 1101–1113.
Al-Shehbaz IA, Beilstein MA, Kellogg EA. 2006. Systematics and phylogeny of the Brassicaceae (Cruciferae): an overview. Plant Systematics and Evolution 259: 89–120.
Anderson SN, Johnson CS, Chesnut J, Jones DS, Khanday I, Woodhouse M, Li C, Conrad LJ, Russell SD, Sundaresan V. 2017. The zygotic transition is initiated in unicellular plant zygotes with asymmetric activation of parental genomes. Developmental Cell 43: 349–358.e4.
Ashburner M, Ball CA, Blake JA, Botstein D, Butler H, Cherry JM, Davis AP, Dolinski K, Dwight SS, Eppig JT et al. 2000. Gene ontology: tool for the unification of biology. The Gene Ontology Consortium. Nature Genetics 25: 25–29.
Autran D, Baroux C, Raissig M, Lenormand T, Wittig M, Grob S, Steimer A, Barann M, Klostermeier U, Leblanc O et al. 2011. Maternal epigenetic pathways control parental contributions to Arabidopsis early embryogenesis. Cell 145: 707–719.
Batista RA, Figueiredo DD, Santos-González J, Köhler C. 2019. Auxin regulates endosperm cellularization in Arabidopsis. Genes and Development 33: 466–476.
Belmonte MF, Kirkbride RC, Stone SL, Pelletier JM, Bui AQ, Yeung EC, Hashimoto M, Fei J, Harada CM, Munoz MD et al. 2013. Comprehensive developmental profiles of gene activity in regions and subregions of the Arabidopsis seed. Proceedings of the National Academy of Sciences, USA 110: E435–E444.
Bryant FM, Munoz-Azcarate O, Kelly AA, Beaudoin F, Kurup S, Eastmond PJ. 2016. ACYL-ACYL CARRIER PROTEIN DESATURASE2 and 3 are responsible for making omega-7 fatty acids in the Arabidopsis aleurone. Plant Physiology 172: 154–162.
Cai WJ, Yu L, Wang W, Sun MX, Feng YQ. 2019. Simultaneous determination of multiclass phytohormones in submilligram plant samples by one-pot multifunctional derivatization-assisted liquid chromatography-tandem mass spectrometry. Analytical Chemistry 91: 3492–3499.
Cao J, Li G, Qu D, Li X, Wang Y. 2020. Into the seed: auxin controls seed development and grain yield. International Journal of Molecular Sciences 21: 1662.
Chalhoub B, Denoeud F, Liu S, Parkin IAP, Tang H, Wang X, Chiquet J, Belcram H, Tong C, Samans B et al. 2014. Early allopolyploid evolution in the post-neolithic Brassica napus oilseed genome. Science 345: 950–953.
Chao H, Li T, Luo C, Huang H, Ruan Y, Li X, Niu Y, Fan Y, Sun W, Zhang K et al. 2020. BrassicaEDB: a gene expression database for Brassica crops. International Journal of Molecular Sciences 21: 5831.
Chen J, Strieder N, Krohn NG, Cyprys P, Sprunck S, Engelmann JC, Dresselhaus T. 2017. Zygotic genome activation occurs shortly after fertilization in maize. Plant Cell 29: 2106–2125.
Chen S, Nelson MN, Chèvre AM, Jenczewski E, Li Z, Mason AS, Meng J, Plummer JA, Pradhan A, Siddique KHM et al. 2011. Trigenomic bridges for Brassica improvement. Critical Reviews in Plant Sciences 30: 524–547.
Cheng F, Sun R, Hou X, Zheng H, Zhang F, Zhang Y, Liu B, Liang J, Zhuang M, Liu Y et al. 2016. Subgenome parallel selection is associated with morphotype diversification and convergent crop domestication in Brassica rapa and Brassica oleracea. Nature Genetics 48: 1218–1224.
Clarke WE, Higgins EE, Plieske J, Wieseke R, Sidebottom C, Khedikar Y, Batley J, Edwards D, Meng J, Li R et al. 2016. A high-density SNP genotyping array for Brassica napus and its ancestral diploid species based on optimised selection of single-locus markers in the allotetraploid genome. Theoretical and Applied Genetics 129: 1887–1899.
Coen O, Fiume E, Xu W, De Vos D, Lu J, Pechoux C, Lepiniec L, Magnani E. 2017. Developmental patterning of the sub-epidermal integument cell layer in Arabidopsis seeds. Development 144: 1490–1497.
Davidson RM, Gowda M, Moghe G, Lin H, Vaillancourt B, Shiu S-H, Jiang N, Robin BC. 2012. Comparative transcriptomics of three Poaceae species reveals patterns of gene expression evolution. The Plant Journal 71: 492–502.
Dean G, Cao Y, Xiang D, Provart NJ, Ramsay L, Ahad A, White R, Selvaraj G, Datla R, Haughn G. 2011. Analysis of gene expression patterns during seed coat development in Arabidopsis. Molecular Plant 4: 1074–1091.
Doll NM, Just J, Brunaud V, Caïus J, Grimault A, Depège-Fargeix N, Esteban E, Pasha A, Provart NJ, Ingram GC et al. 2020a. Transcriptomics at maize embryo/endosperm interfaces identifies a transcriptionally distinct endosperm sub-domain adjacent to the embryo scutellum. Plant Cell 32: 833–852.
Doll NM, Royek S, Fujita S, Okuda S, Chamot S, Stintzi A, Widiez T, Hothorn M, Schaller A, Geldner N et al. 2020b. A two-way molecular dialogue between embryo and endosperm is required for seed development. Science 367: 431–435.
Domazet-Lošo T, Tautz D. 2010. A phylogenetically based transcriptome age index mirrors ontogenetic divergence patterns. Nature 468: 815–819.
Downie AB, Zhang D, Dirk LMA, Thacker RR, Pfeiffer JA, Drake JL, Levy AA, Butterfield DA, Buxton JW, Snyder JC. 2003. Communication between the maternal testa and the embryo and/or endosperm affect testa attributes in tomato. Plant Physiology 133: 145–160.
El-Showk S, Ruonala R, Helariutta Y. 2013. Crossing paths: cytokinin signalling and crosstalk. Development 140: 1373–1383.
Feng S, Jacobsen SE, Reik W. 2010. Epigenetic reprogramming in plant and animal development. Science 330: 622–627.
Figueiredo DD, Batista RA, Roszak PJ, Hennig L, Köhler C. 2016. Auxin production in the endosperm drives seed coat development in Arabidopsis. eLife 5: e20542.
Fourmann M, Barret P, Renard M, Pelletier G, Delourme R, Brunel D. 1998. The two genes homologous to Arabidopsis FAE1 co-segregate with the two loci governing erucic acid content in Brassica napus. Theoretical and Applied Genetics 96: 852–858.
Gao P, Quilichini TD, Zhai C, Qin LI, Nilsen KT, Li Q, Sharpe AG, Kochian LV, Zou J, Reddy ASN et al. 2021. Alternative splicing dynamics and evolutionary divergence during embryogenesis in wheat species. Plant Biotechnology Journal 19: 1624–1643.
Gao P, Xiang D, Quilichini TD, Venglat P, Pandey PK, Wang E, Gillmor CS, Datla R. 2019. Gene expression atlas of embryo development in Arabidopsis. Plant Reproduction 32: 93–104.
Garcia D, Gerald JNF, Berger F. 2005. Maternal control of integument cell elongation and zygotic control of endosperm growth are coordinated to determine seed size in Arabidopsis. Plant Cell 17: 52–60.
Glover NM, Redestig H, Dessimoz C. 2016. Homoeologs: what are they and how do we infer them? Trends in Plant Science 21: 609–621.
Hao Z, Zhang Z, Xiang D, Venglat P, Chen J, Gao P, Datla R, Weijers D. 2021. Conserved, divergent and heterochronic gene expression during Brachypodium and Arabidopsis embryo development. Plant Reproduction 34: 207–224.
Haughn G, Chaudhury A. 2005. Genetic analysis of seed coat development in Arabidopsis. Trends in Plant Science 10: 472–477.
Hobbs DH, Flintham JE, Hills MJ. 2004. Genetic control of storage oil synthesis in seeds of Arabidopsis. Plant Physiology 136: 3341–3349.
Hofmann F, Schon MA, Nodine MD. 2019. The embryonic transcriptome of Arabidopsis thaliana. Plant Reproduction 32: 77–91.
James DW, Lim E, Keller J, Plooy I, Ralston E, Dooner HK. 1995. Directed tagging of the Arabidopsis FATTY ACID ELONGATION1 (FAE1) gene with the maize transposon activator. Plant Cell 7: 309–319.
Jenik PD, Barton MK. 2005. Surge and destroy: the role of auxin in plant embryogenesis. Development 132: 3577–3585.
Jiang J, Wang Y, Zhu B, Fang T, Fang Y, Wang Y. 2015. Digital gene expression analysis of gene expression differences within Brassica