Library Workshops and Events

Event Details

Data Wrangling and Processing for Genomics Part 1 of 3

OPEN TO:
-Graduate Students

In a previous session, you learned how to use the bash shell to interact with your computer through a command line interface.

In this session, you will apply that knowledge to a common genomics workflow: identifying variants among sequencing samples taken from multiple individuals within a population. We will start with a set of sequenced reads (.fastq files), perform some quality control steps, align those reads to a reference genome, and end by identifying and visualizing variations among the samples.

Part 1 covers the following topics:

  • Background and metadata
  • Assessing read quality
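
As a rough illustration of what assessing read quality involves, the sketch below decodes the per-base quality line of a single FASTQ record. It assumes the Phred+33 encoding, in which each quality character's ASCII code minus 33 gives a score Q, and Q corresponds to an estimated error probability of 10^(-Q/10). The record and the function names are made up for illustration; the session itself may rely on dedicated quality control tools instead.

```python
# Minimal sketch: decode the quality line of one FASTQ record.
# Assumption: Phred+33 encoding. The read below is invented for illustration.

def phred33_scores(quality_line):
    """Convert a FASTQ quality string into a list of integer Phred scores."""
    return [ord(ch) - 33 for ch in quality_line]


def error_probability(q):
    """Estimated probability that a base call with Phred score q is wrong."""
    return 10 ** (-q / 10)


# A FASTQ record has four lines: header, sequence, separator, quality.
record = """@hypothetical_read_1
GATTACA
+
IIII##!
"""

header, sequence, separator, quality = record.strip().split("\n")
scores = phred33_scores(quality)
mean_quality = sum(scores) / len(scores)

# Print each base next to its score and estimated error probability.
for base, q in zip(sequence, scores):
    print(base, q, error_probability(q))
print("mean quality:", round(mean_quality, 2))
```

In this made-up read, the first four bases have high scores and the last three have very low ones, which is the kind of pattern that quality control steps are meant to surface.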

As you progress through this session, keep in mind that even if you won’t use this exact workflow in your research, you will learn important lessons about using command-line bioinformatic tools. What you learn here will help you use a variety of bioinformatic tools with confidence and work more efficiently in your research.

Bring your own laptop.

This lesson assumes a working understanding of the bash shell. If you haven’t completed the Shell Genomics lesson and aren’t familiar with the bash shell, please review those materials before starting this lesson.

This lesson also assumes some familiarity with biological concepts, including the structure of DNA, nucleotide abbreviations, and the concept of genomic variation within a population.

DATE
Tuesday, July 28, 2020
TIME
9:45AM - 10:50AM
PRESENTER
David Molik
Registration has closed.

Contact Info

Center for Digital Scholarship

Hesburgh Library–2nd Floor NE
cds.library.nd.edu
cds@nd.edu

Julie C. Vecchio '04, MPH, MLIS
Co-Interim Director
jvecchio@nd.edu
(574) 631-4900