Day 1: Meet the Genomes

Author

Duy PT, Hao CT, Quynh NPN, Nguyen MTS

Published

May 12, 2026

Welcome to the first core day of our workshop! Having established our technical foundation with Linux in the pre-workshop, we now move into the heart of bacterial genomics. Today’s journey, “Meet the Genomes”, is designed to take you from a theoretical understanding of Whole-Genome Sequencing (WGS) to the practical reality of generating high-quality bacterial assemblies from raw data.

Itinerary

Modern microbiology is being revolutionized by genomic data. We will explore how these digital sequences translate into actionable public health insights — from tracking hospital outbreaks to monitoring the global spread of antimicrobial resistance.

The day is structured into three modules:

1.1 Whole-genome sequencing - What & Why | Practical Guides | Slides

We begin with the “what” and “why” of WGS. This module covers the fundamental concepts of sequencing technologies and their key applications, then introduces our case study:

  • Clinical Microbiology: Enhancing diagnostic speed and precision.
  • Outbreak Investigation: Inferring transmission dynamics from the genomic relationships between isolates.
  • AMR Surveillance: Identifying the genetic determinants of drug resistance.
  • Case Study Introduction: We will introduce the real-world Klebsiella pneumoniae dataset from a recent hospital outbreak that will serve as our primary study material.

1.2 Studying Public Genomes | Practical Guides | Slides

Genomic epidemiology relies heavily on comparative analysis. In this module, you will learn to navigate the vast archives of public genomic data:

  • NCBI Resources: Searching for and retrieving reference genomes.
  • Genome Visualization: Hands-on training with Artemis — a powerful tool for browsing through bacterial genomes, inspecting gene regions, and understanding genomic features.
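
As a rough illustration of programmatic retrieval, the sketch below builds a download URL for the NCBI E-utilities `efetch` endpoint. The helper name `build_efetch_url`, the example accession, and the choice of the `gbwithparts` return type (an annotated GenBank record, a format Artemis can open) are assumptions made for this example rather than part of the workshop material; the practical guide may well use the NCBI web interface instead.

```python
from urllib.parse import urlencode

# Public NCBI E-utilities endpoint for fetching records.
EFETCH_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"


def build_efetch_url(accession: str, rettype: str = "gbwithparts") -> str:
    """Return an efetch URL for one nucleotide record (hypothetical helper).

    rettype="gbwithparts" requests a GenBank flat file including the sequence;
    rettype="fasta" requests the sequence only.
    """
    params = {
        "db": "nuccore",      # NCBI Nucleotide database
        "id": accession,      # accession number of the record
        "rettype": rettype,   # record format
        "retmode": "text",    # plain-text response
    }
    return f"{EFETCH_BASE}?{urlencode(params)}"


if __name__ == "__main__":
    # Illustrative accession only; substitute the reference used in the practical.
    print(build_efetch_url("NC_016845.1"))
```

The printed URL can be opened in a browser or passed to a download tool, and the saved file can then be loaded into Artemis.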

1.3 From Raw Data to Assembly | Practical Guides | Slides

To close the day, we’ll put theory into practice. Using a K. pneumoniae outbreak dataset, you will learn the essential bioinformatic pipeline for processing raw short-read sequencing data:

  • Quality Control (QC): Identifying issues in raw sequencing reads using FastQC and MultiQC.
  • De novo Assembly: Using Unicycler to reconstruct draft genomes from short-read sequences.
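
The two steps above can be sketched as a small Python helper that assembles the command lines for FastQC, MultiQC, and Unicycler. This is a minimal sketch, not the workshop's actual pipeline: the function name `build_commands`, the file names, the directory layout, and the thread count are all placeholders chosen for illustration, and the practical guide may use different options.

```python
from pathlib import Path


def build_commands(r1: str, r2: str, sample: str,
                   outdir: str = "results", threads: int = 4) -> dict:
    """Build (but do not run) QC and assembly commands for one paired-end sample.

    All paths and names here are hypothetical placeholders.
    """
    out = Path(outdir)
    qc_dir = out / "fastqc"            # per-sample FastQC reports
    multiqc_dir = out / "multiqc"      # aggregated MultiQC report
    asm_dir = out / "assembly" / sample

    return {
        # FastQC: per-file quality report. The output directory is assumed to
        # exist already (create it first, e.g. with qc_dir.mkdir(parents=True)).
        "fastqc": ["fastqc", "--threads", str(threads),
                   "--outdir", str(qc_dir), r1, r2],
        # MultiQC: scan the FastQC output and summarise it in one report.
        "multiqc": ["multiqc", str(qc_dir), "--outdir", str(multiqc_dir)],
        # Unicycler: short-read-only assembly from the paired reads.
        "unicycler": ["unicycler", "-1", r1, "-2", r2,
                      "-o", str(asm_dir), "-t", str(threads)],
    }


if __name__ == "__main__":
    cmds = build_commands("sample01_R1.fastq.gz", "sample01_R2.fastq.gz", "sample01")
    for name, cmd in cmds.items():
        # Print each command; run it with subprocess.run(cmd, check=True)
        # once the tools are installed and the input files exist.
        print(name, "->", " ".join(cmd))
```

Building the commands as lists keeps each argument separate, so they can be handed directly to `subprocess.run` without shell quoting problems.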

Learning Objectives

By the end of today, you will be able to:

  1. Explain the utility of WGS in clinical and public health microbiology.
  2. Retrieve and visualize bacterial genomes from public databases.
  3. Perform quality control on raw sequencing data.
  4. Generate bacterial genome assemblies using state-of-the-art bioinformatic tools.

Let’s begin our journey into the bacterial genome!
