Pangenome graph
These graph visualisations depict the 20-accession barley pangenome for each chromosome.
Each image shows a pangenome graph as a binned, linearised rendering. The embedded paths are drawn against the pangenome sequence as a binary matrix (horizontal bars),
and the topology of the graph is drawn beneath the paths as links marking sites where the sequences diverge. The paths are ordered by accession type, starting with cultivars: a) Golden Promise,
b) Hockett, c) RGT Planet, d) Barke, e) Igri, f) HOR3081, g) Morex; then landraces: h) HOR8148, i) HOR13821, j) HOR3365, k) HOR9043, l) HOR10350, m) HOR13942,
n) HOR21599, o) ZDM01467, p) HOR7552, q) Akashinriki, r) ZDM2064, s) OUN333; and finally the wild type: t) B1K-04-12.
The pangenome graphs were produced using the PanGenome Graph Builder (PGGB) v0.5.1.
Segmented pangenome graphs, split into four quadrants along the pangenome based on the Morex reference coordinates to allow closer inspection of different genomic regions, can be downloaded here.
Pangenome graph with inversions
Red bands in the graph visualisation indicate sites of inversions.
Pangenome graph with node depth
Dark regions along the bands in the graph visualisation indicate sites of high complexity, such as the centromere.
The green boxes highlight the region with the highest mean node depth, likely the centromere.
Compressed pangenome graph with path depth
The compressed view of the graph summarises the path coverage across all paths as a heatmap colour-coded by depth,
with dark blue indicating the highest coverage and dark red the lowest.
Mean depth coverage across the pangenome graph
The mean path depth coverage is calculated in 1 kbp sliding windows using ODGI, with Morex as the reference.
Regions of highest mean path depth indicate the regions of highest complexity in the graph, likely the centromere. Taken together with the node
and path depth pangenome graphs, these values allow the most likely position of the centromere to be inferred.
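The windowed averaging described above can be illustrated with a minimal Python sketch. It is not the ODGI implementation: the function names, the per-base `depths` input and the default step size (equal to the window size) are assumptions made for illustration only.

```python
from typing import List, Tuple

Window = Tuple[int, int, float]  # (start, end, mean depth), end exclusive


def windowed_mean_depth(depths: List[float], window: int = 1000,
                        step: int = 1000) -> List[Window]:
    """Mean depth per window along a reference path.

    `depths` is a hypothetical list of per-base depth values in
    reference (e.g. Morex) coordinates. The step size is an assumption;
    the source states only that the window is 1 kbp.
    """
    if window <= 0 or step <= 0:
        raise ValueError("window and step must be positive")
    results: List[Window] = []
    for start in range(0, len(depths), step):
        end = min(start + window, len(depths))
        chunk = depths[start:end]
        results.append((start, end, sum(chunk) / len(chunk)))
        if end == len(depths):
            break  # the last window has reached the end of the path
    return results


def highest_depth_window(windows: List[Window]) -> Window:
    """Return the window with the highest mean depth."""
    return max(windows, key=lambda w: w[2])
```

Applied to a per-chromosome depth track, `highest_depth_window` would return the coordinates of the candidate high-complexity region, which can then be compared against the node and path depth visualisations.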
Pangenome graph evaluation: Precision, recall and combined F1 score across the pangenome graph
Whole-genome pairwise alignment with Nucmer as ground truth
Using RTG-Tools, the precision, recall and combined F1 score of SNP calls were calculated for each pangenome graph and each chromosome, from each haplotype
aligned against Morex, using conventional Nucmer alignments as the ground truth.
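For readers unfamiliar with these metrics, the following Python sketch shows how precision, recall and F1 relate to counts of true-positive, false-positive and false-negative SNP calls. It uses the standard definitions and is not the RTG-Tools code; the function name and the use of a single true-positive count are simplifying assumptions.

```python
from typing import Tuple


def precision_recall_f1(tp: int, fp: int, fn: int) -> Tuple[float, float, float]:
    """Compute precision, recall and F1 from SNP call counts.

    tp: graph-derived SNPs that are also in the truth set
    fp: graph-derived SNPs that are absent from the truth set
    fn: truth-set SNPs that the graph-derived calls missed
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1
```

With hypothetical counts of 90 true positives, 10 false positives and 30 false negatives, precision is 0.90, recall is 0.75 and F1 is about 0.82.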
High-coverage WGS read alignment with BWA/GATK as ground truth
Using RTG-Tools, the precision, recall and combined F1 score of SNP calls were calculated for each pangenome graph and each chromosome, from high-coverage (30x) reads of Barke, Morex and Igri
aligned against RGT Planet, using a conventional BWA alignment and GATK variant-calling pipeline as the ground truth.