Pangenome graph

These graph visualisations depict the 20-accession barley pangenome for each chromosome. Each image shows a pangenome graph as a binned, linearised rendering: the embedded paths are drawn against the pangenome sequence as a binary matrix (horizontal bars), and the topology of the graph is drawn beneath the paths as links marking sites where the sequences diverge. The paths are ordered by group, starting with the cultivars: a) Golden Promise, b) Hockett, c) RGT Planet, d) Barke, e) Igri, f) HOR3081, g) Morex; then the landraces: h) HOR8148, i) HOR13821, j) HOR3365, k) HOR9043, l) HOR10350, m) HOR13942, n) HOR21599, o) ZDM01467, p) HOR7552, q) Akashinriki, r) ZDM2064, s) OUN333; and finally the wild type: t) B1K-04-12. The pangenome graphs were produced using the PanGenome Graph Builder (PGGB) v0.5.1.

Segmented pangenome graphs, which split the pangenome into four quadrants based on the Morex reference coordinates to allow closer inspection of different genomic regions, can be downloaded here.



Pangenome graph with inversions

Red bands in the graph visualisation indicate sites of inversions.



Pangenome graph with node depth

Dark regions along the bands in the graph visualisation indicate sites of high complexity, such as the centromere. The green boxes highlight the region with the highest mean node depth, which is likely the centromere.



Compressed pangenome graph with path depth

This compressed view of the graph summarises coverage across all paths as a heatmap colour-coded by depth: dark blue marks the highest coverage and dark red the lowest.



Mean depth coverage across the pangenome graph

The mean path depth was calculated in a 1 kbp sliding window using ODGI, with Morex as the reference. Regions with the highest mean path depth are the regions of highest complexity in the graph and are likely to contain the centromere. Taken together with the node depth and path depth pangenome graphs, these values allow the most likely position of the centromere to be inferred.
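The windowed averaging described above can be illustrated with a minimal Python sketch. This is not the ODGI implementation: the function names, the per-base depth list used as input, and the choice of a step equal to the window size are all assumptions made for illustration.

```python
from typing import List, Tuple

Window = Tuple[int, int, float]  # (start, end, mean depth), 0-based half-open


def windowed_mean_depth(depth: List[float], window: int = 1000,
                        step: int = 1000) -> List[Window]:
    """Average a per-base depth vector over windows along the reference.

    `depth` is a hypothetical per-base depth vector on the reference path.
    `step` is an assumption: the step size used for the figures is not stated.
    """
    if window <= 0 or step <= 0:
        raise ValueError("window and step must be positive")
    windows = []
    for start in range(0, len(depth), step):
        chunk = depth[start:start + window]  # the last window may be shorter
        windows.append((start, start + len(chunk), sum(chunk) / len(chunk)))
    return windows


def deepest_window(windows: List[Window]) -> Window:
    """Return the window with the highest mean depth."""
    return max(windows, key=lambda w: w[2])


# Toy example: a 2.5 kbp region with a high-depth block in the middle.
toy_depth = [1.0] * 1000 + [5.0] * 1000 + [2.0] * 500
toy_windows = windowed_mean_depth(toy_depth)
print(deepest_window(toy_windows))
```

In this toy input the middle window has the highest mean depth, which mirrors how the highest-depth region is read as the candidate centromere position.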



Pangenome graph evaluation: Precision, recall and combined F1 score across the pangenome graph

Whole genome pair-wise alignment with Nucmer as base truth

Using RTG-Tools, the precision, recall and combined F1 score of SNPs were calculated for each pangenome graph and each chromosome, for each haplotype aligned against Morex, using conventional Nucmer alignments as the base truth.
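For reference, the three metrics follow the standard definitions based on counts of true positive, false positive and false negative SNP calls. The Python sketch below shows only those textbook formulas. It is not the RTG-Tools implementation, and the function name and the example counts are hypothetical.

```python
from typing import Tuple


def precision_recall_f1(tp: int, fp: int, fn: int) -> Tuple[float, float, float]:
    """Compute precision, recall and F1 from SNP comparison counts.

    tp: SNPs called from the graph that are also in the base truth
    fp: SNPs called from the graph that are absent from the base truth
    fn: base truth SNPs that were not called from the graph
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    if precision + recall == 0:
        f1 = 0.0
    else:
        f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical counts, for illustration only.
precision, recall, f1 = precision_recall_f1(tp=90, fp=10, fn=30)
print(f"precision={precision:.3f} recall={recall:.3f} F1={f1:.3f}")
```

Because F1 is a harmonic mean, it stays low whenever either precision or recall is low, which is why it is reported alongside the two individual scores.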



High coverage WGS read alignment with BWA/GATK as base truth

Using RTG-Tools, the precision, recall and combined F1 score of SNPs were calculated for each pangenome graph and each chromosome, from high-coverage (30x) reads of Barke, Morex and Igri aligned against RGT Planet, with a conventional BWA alignment and GATK variant calling pipeline as the base truth.