Orthologous gene cluster visualisations identify CNV/PAV across the pangenome

A pangenome matrix of orthologous gene clusters was generated using GET_HOMOLOGUES-EST, and the visualisations below were produced from it. Phylogenetic trees were produced using GET_PHYLOMARKERS. The pangenome matrix is available to download here.



Core and accessory gene clusters with minimum occupancy of 1

Core and accessory gene clusters with minimum occupancy of 3

A minimum occupancy of 3 limits the analysis to non-cloud clusters by disregarding singletons and other clusters found in fewer than three varieties. Dispensable sequences (shell/cloud) can then be identified more confidently.
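The occupancy filter can be sketched as follows. This is a minimal illustration, not the GET_HOMOLOGUES-EST implementation: it assumes occupancy means the number of varieties with at least one sequence in a cluster, and the cluster names and counts are invented toy data.

```python
from typing import Dict, List

def occupancy(counts: List[int]) -> int:
    """Number of varieties contributing at least one sequence to a cluster."""
    return sum(1 for c in counts if c > 0)

def filter_by_occupancy(matrix: Dict[str, List[int]],
                        min_occupancy: int) -> Dict[str, List[int]]:
    """Keep only clusters whose occupancy reaches the threshold."""
    return {cid: row for cid, row in matrix.items()
            if occupancy(row) >= min_occupancy}

# Hypothetical toy matrix: cluster -> sequence counts in four varieties.
toy_matrix = {
    "cluster_A": [1, 1, 2, 1],  # present in all four varieties
    "cluster_B": [1, 0, 3, 1],  # occupancy 3
    "cluster_C": [0, 0, 1, 0],  # singleton, occupancy 1
    "cluster_D": [2, 1, 0, 0],  # occupancy 2
}

kept_min1 = filter_by_occupancy(toy_matrix, 1)
kept_min3 = filter_by_occupancy(toy_matrix, 3)
```

With a threshold of 1 every cluster is kept; with a threshold of 3 only cluster_A and cluster_B remain in this toy example.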

Pangenome growth plots

Average nucleotide identity of single-copy clusters



Phylogenetic tree (unrooted) from pangenome matrix



Phylogenetic tree of single-copy clusters (unrooted) from pangenome matrix



Pangenome matrix (clustered heatmap of all PFAM counts with Morex as control - log10-transformed and scaled to unit variance)
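The transformation named in this caption could look like the sketch below. Several details are assumptions rather than the method used for the heatmap: a pseudocount of 1 is added so that zero counts can be logged, values are centred as well as scaled, and scaling is applied per row (one PFAM domain across varieties) using the sample standard deviation.

```python
import math
import statistics
from typing import List

def log10_unit_variance(row: List[float], pseudocount: float = 1.0) -> List[float]:
    """log10-transform counts, then centre and scale to unit variance."""
    logged = [math.log10(c + pseudocount) for c in row]
    mean = statistics.fmean(logged)
    sd = statistics.stdev(logged)  # sample standard deviation
    if sd == 0:
        # A constant row carries no variation; return zeros rather than divide by 0.
        return [0.0] * len(logged)
    return [(x - mean) / sd for x in logged]

# Hypothetical counts of one PFAM domain in three varieties.
scaled = log10_unit_variance([0, 9, 99])
```

For these toy counts the logged values are 0, 1 and 2, so the scaled row is -1, 0 and 1.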





Enriched pangenome matrix (clustered heatmap of enriched PFAM counts with Morex as control - raw counts)





Pangenome matrix (clustered heatmap of all PFAM counts with FT11 as control - log10-transformed and scaled to unit variance)





Enriched pangenome matrix (clustered heatmap of enriched PFAM counts with FT11 as control - raw counts)





Pangenome matrix (hierarchical edge bundle visualisation of clusters)

An interactive visualisation of the pangenome matrix of clusters as a hierarchical edge bundle can be viewed here

The pangenome matrix was filtered to retain clusters with a count of at least 30 sequences in at least one variety. Each node on the outer edge of the visualisation is either a cultivar or a sequence cluster, and each is annotated with further information. The edges linking cultivars to clusters represent the sequence counts in that cluster and are coloured from low counts (blue) through intermediate counts (red) to high counts (green).
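The filtering and edge colouring described here can be sketched as below. The threshold is read as "at least 30", and the three equal-width colour bins are an assumption made for illustration (the interactive plot may use a continuous gradient). The counts are invented toy data.

```python
from typing import Dict, List, Tuple

Matrix = Dict[str, Dict[str, int]]  # cluster -> {variety: sequence count}

def filter_clusters(matrix: Matrix, min_count: int = 30) -> Matrix:
    """Keep clusters that reach min_count sequences in at least one variety."""
    return {cid: row for cid, row in matrix.items()
            if max(row.values()) >= min_count}

def edge_colour(count: int, lo: int, hi: int) -> str:
    """Assumed binning: lowest third blue, middle third red, top third green."""
    if hi == lo:
        return "red"
    position = (count - lo) / (hi - lo)
    if position < 1 / 3:
        return "blue"
    if position < 2 / 3:
        return "red"
    return "green"

def build_edges(matrix: Matrix) -> List[Tuple[str, str, int, str]]:
    """One (variety, cluster, count, colour) edge per non-zero matrix cell."""
    counts = [c for row in matrix.values() for c in row.values() if c > 0]
    if not counts:
        return []
    lo, hi = min(counts), max(counts)
    return [(variety, cid, c, edge_colour(c, lo, hi))
            for cid, row in matrix.items()
            for variety, c in row.items() if c > 0]

# Hypothetical counts for two varieties named in this document.
toy = {
    "cluster_1": {"Morex": 35, "FT11": 2},
    "cluster_2": {"Morex": 5, "FT11": 4},
    "cluster_3": {"Morex": 0, "FT11": 60},
}
filtered = filter_clusters(toy)
edges = build_edges(filtered)
```

In this toy example cluster_2 is dropped because neither variety reaches 30 sequences, and the zero count for Morex in cluster_3 produces no edge.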



Enriched pangenome matrix (hierarchical edge bundle visualisation of enriched clusters with Morex as control)

An interactive visualisation of the enriched pangenome matrix of clusters as a hierarchical edge bundle can be viewed here

The pangenome matrix was enriched using Fisher's exact test (q-value < 0.05) with Morex as the control, and the enriched clusters were visualised as a hierarchical edge bundle.
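A minimal sketch of this kind of enrichment is given below. The page does not state how the contingency tables were built or how q-values were computed, so both are assumptions here: each cluster is tested with a 2x2 table of (count in cluster, count in all other clusters) for a variety versus the control, and q-values come from the Benjamini-Hochberg procedure. All counts are invented.

```python
from math import comb
from typing import Dict, List

def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact test p-value for the table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x: int) -> float:
        # Hypergeometric probability of seeing x in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum all tables that are no more probable than the observed one.
    p = sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))
    return min(1.0, p)

def benjamini_hochberg(pvalues: List[float]) -> List[float]:
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running = 1.0
    for rank in range(m, 0, -1):  # walk from the largest p-value down
        i = order[rank - 1]
        running = min(running, pvalues[i] * m / rank)
        adjusted[i] = running
    return adjusted

def enriched_clusters(variety: Dict[str, int], control: Dict[str, int],
                      alpha: float = 0.05) -> Dict[str, float]:
    """Clusters whose adjusted p-value against the control is below alpha."""
    total_v, total_c = sum(variety.values()), sum(control.values())
    ids = sorted(variety)
    pvals = [fisher_exact_two_sided(variety[i], total_v - variety[i],
                                    control.get(i, 0), total_c - control.get(i, 0))
             for i in ids]
    return {i: q for i, q in zip(ids, benjamini_hochberg(pvals)) if q < alpha}

# Hypothetical per-cluster sequence counts for one variety and the control.
variety_counts = {"c1": 40, "c2": 10, "c3": 10}
control_counts = {"c1": 5, "c2": 10, "c3": 12}
enriched = enriched_clusters(variety_counts, control_counts)
```

The test here is two-sided, so a cluster can be flagged for being either over- or under-represented relative to the control; a one-sided test would be needed to report over-representation only.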



Enriched pangenome matrix (hierarchical edge bundle visualisation of enriched clusters with FT11 as control)

An interactive visualisation of the enriched pangenome matrix of clusters as a hierarchical edge bundle can be viewed here

The pangenome matrix was enriched using Fisher's exact test (q-value < 0.05) with the wild variety FT11 as the control, and the enriched clusters were visualised as a hierarchical edge bundle.



Pangenome matrix (force-directed network visualisation of clusters)

An interactive visualisation of the pangenome matrix of clusters as a force-directed network can be viewed here

The pangenome matrix was filtered to retain clusters with a count of at least 30 sequences in at least one variety. Nodes in the network are clusters and varieties, and the edges connect clusters to varieties according to their sequence counts.
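One way to hand such a matrix to a force-directed layout is a node-link structure, sketched below. The field names ("nodes", "links", "source", "target", "value") follow a convention commonly used by force-layout libraries and are an assumption; the format behind the interactive plot is not described on this page. The counts are invented.

```python
import json
from typing import Dict

Matrix = Dict[str, Dict[str, int]]  # cluster -> {variety: sequence count}

def to_node_link(matrix: Matrix) -> dict:
    """Build variety and cluster nodes, with one weighted link per non-zero count."""
    varieties = sorted({v for row in matrix.values()
                        for v, c in row.items() if c > 0})
    nodes = [{"id": v, "type": "variety"} for v in varieties]
    nodes += [{"id": cid, "type": "cluster"} for cid in sorted(matrix)]
    links = [{"source": cid, "target": v, "value": c}
             for cid in sorted(matrix)
             for v, c in sorted(matrix[cid].items()) if c > 0]
    return {"nodes": nodes, "links": links}

# Hypothetical filtered matrix.
graph = to_node_link({
    "cluster_1": {"Morex": 35, "FT11": 2},
    "cluster_3": {"Morex": 0, "FT11": 60},
})
graph_json = json.dumps(graph, indent=2)
```

The link value carries the sequence count, so a layout can use it to set edge width or attraction strength.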



Enriched pangenome matrix (force-directed network visualisation of enriched clusters with Morex as control)

An interactive visualisation of the enriched pangenome matrix of clusters as a force-directed network can be viewed here

The pangenome matrix was enriched using Fisher's exact test (q-value < 0.05) with Morex as the control, and the enriched clusters were visualised as a force-directed network. Nodes in the network are clusters and varieties, and the edges connect clusters to varieties according to their sequence counts.



Enriched pangenome matrix (force-directed network visualisation of enriched clusters with FT11 as control)

An interactive visualisation of the enriched pangenome matrix of clusters as a force-directed network can be viewed here

The pangenome matrix was enriched using Fisher's exact test (q-value < 0.05) with the wild variety FT11 as the control, and the enriched clusters were visualised as a force-directed network. Nodes in the network are clusters and varieties, and the edges connect clusters to varieties according to their sequence counts.



Pangenome matrix (hierarchical edge bundle visualisation of PFAM counts with Morex as control)

An interactive visualisation of the pangenome matrix of PFAM counts as a hierarchical edge bundle can be viewed here

The pangenome matrix was enriched using Fisher's exact test (q-value < 0.05) with Morex as the control. The enriched subset was visualised as part of all identified PFAM domains within the pangenome matrix. Each node on the outer edge of the visualisation is either a variety or a PFAM domain. The edges linking varieties to PFAMs represent PFAM counts in the connected varieties and are coloured from low counts (blue) through intermediate counts (red) to high counts (green).



Pangenome matrix (hierarchical edge bundle visualisation of PFAM counts with FT11 as control)

An interactive visualisation of the pangenome matrix of PFAM counts as a hierarchical edge bundle can be viewed here

The pangenome matrix was filtered to retain PFAM domains with a count of at least 3 in at least one variety. The pangenome matrix was enriched using Fisher's exact test (q-value < 0.05) with the wild variety FT11 as the control. The enriched subset was visualised as part of all identified PFAM domains within the pangenome matrix. Each node on the outer edge of the visualisation is either a variety or a PFAM domain. The edges linking varieties to PFAMs represent PFAM counts in the connected varieties and are coloured from low counts (blue) through intermediate counts (red) to high counts (green).



Enriched pangenome matrix (hierarchical edge bundle visualisation of enriched PFAM counts with Morex as control)

An interactive visualisation of the pangenome matrix of enriched PFAM counts as a hierarchical edge bundle can be viewed here

The pangenome matrix was enriched using Fisher's exact test (q-value < 0.05) with Morex as the control, and visualised as a hierarchical edge bundle. Each node on the outer edge of the visualisation is either a variety or a PFAM domain. The edges linking varieties to PFAMs represent PFAM counts in the connected varieties and are coloured from low counts (blue) through intermediate counts (red) to high counts (green).



Enriched pangenome matrix (hierarchical edge bundle visualisation of enriched PFAM counts with FT11 as control)

An interactive visualisation of the pangenome matrix of enriched PFAM counts as a hierarchical edge bundle can be viewed here

The pangenome matrix was enriched using Fisher's exact test (q-value < 0.05) with the wild variety FT11 as the control, and visualised as a hierarchical edge bundle. Each node on the outer edge of the visualisation is either a variety or a PFAM domain. The edges linking varieties to PFAMs represent PFAM counts in the connected varieties and are coloured from low counts (blue) through intermediate counts (red) to high counts (green).



Enriched pangenome matrix (force-directed network visualisation of enriched PFAM counts with Morex as control)

An interactive visualisation of the pangenome matrix of enriched PFAM counts as a force-directed network can be viewed here

The pangenome matrix was enriched using Fisher's exact test (q-value < 0.05) with Morex as the control, and visualised as a force-directed network. Each node in the network is either a variety or a PFAM domain. The edges linking varieties to PFAMs represent PFAM counts in the connected varieties.



Enriched pangenome matrix (force-directed network visualisation of enriched PFAM counts with FT11 as control)

An interactive visualisation of the pangenome matrix of enriched PFAM counts as a force-directed network can be viewed here

The pangenome matrix was enriched using Fisher's exact test (q-value < 0.05) with the wild variety FT11 as the control, and visualised as a force-directed network. Each node in the network is either a variety or a PFAM domain. The edges linking varieties to PFAMs represent PFAM counts in the connected varieties.



Phenotype enrichment analysis

Scoary was used to test for enrichment across several traits: cultivar/landrace, winter/spring, malting/feed, eastern/western, and two-rowed/six-rowed, using a naive p-value < 0.05. The resulting enriched pangenome matrices were then filtered to retain clusters with a count of at least 2 sequences in at least one variety, and visualised as hierarchical edge bundles.
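Scoary takes a comma-separated trait table with one row per sample and binary trait columns. The sketch below shows one way the trait pairs listed above could be encoded. It is an illustration under assumptions: one column per label, "NA" where a variety has neither label of a pair, and hypothetical variety names and labels rather than the metadata used in this analysis.

```python
import csv
import io
from typing import Dict, Set

TRAIT_PAIRS = [
    ("cultivar", "landrace"),
    ("winter", "spring"),
    ("malting", "feed"),
    ("eastern", "western"),
    ("two-rowed", "six-rowed"),
]

def encode_traits(metadata: Dict[str, Set[str]]) -> Dict[str, Dict[str, str]]:
    """Turn each variety's labels into 1/0/NA values, one column per label."""
    rows = {}
    for variety, labels in metadata.items():
        row = {}
        for first, second in TRAIT_PAIRS:
            if first in labels:
                row[first], row[second] = "1", "0"
            elif second in labels:
                row[first], row[second] = "0", "1"
            else:
                row[first] = row[second] = "NA"  # label unknown for this pair
        rows[variety] = row
    return rows

def to_csv(rows: Dict[str, Dict[str, str]]) -> str:
    """Serialise the trait table with varieties as rows and traits as columns."""
    traits = [label for pair in TRAIT_PAIRS for label in pair]
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow([""] + traits)
    for variety in sorted(rows):
        writer.writerow([variety] + [rows[variety][t] for t in traits])
    return buffer.getvalue()

# Hypothetical varieties and labels.
metadata = {
    "variety_1": {"cultivar", "spring", "malting", "western", "two-rowed"},
    "variety_2": {"landrace", "winter", "six-rowed"},
}
trait_rows = encode_traits(metadata)
trait_csv = to_csv(trait_rows)
```

Encoding each label as its own column gives one result set per label, which matches the ten per-trait visualisations listed below.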



To visualise the cultivar enriched pangenome matrix, click here



To visualise the landrace enriched pangenome matrix, click here



To visualise the winter enriched pangenome matrix, click here



To visualise the spring enriched pangenome matrix, click here



To visualise the malting enriched pangenome matrix, click here



To visualise the feed enriched pangenome matrix, click here



To visualise the eastern enriched pangenome matrix, click here



To visualise the western enriched pangenome matrix, click here



To visualise the two-rowed enriched pangenome matrix, click here



To visualise the six-rowed enriched pangenome matrix, click here



Scoary enriched sequences were categorised into KEGG pathways

Sequences were searched against KEGG using BlastKOALA, with the taxonomy group set to 'plants' and the database set to 'genus_eukaryotes'.