More than two decades after the draft human genome was celebrated as a scientific milestone, scientists have finally finished the job. The first complete, gap-free sequence of a human genome has been published, an advance expected to pave the way for new insights into health and what makes our species unique. Until now, about 8% of the human genome was missing, including large stretches of highly repetitive sequences, sometimes described as "junk DNA." In reality, though, these repeated sections were omitted because they were technically difficult to sequence, not for lack of interest.
Sequencing a genome is something like slicing up a book into snippets of text, then trying to reconstruct the book by piecing them together again. Stretches of text that contain a lot of common or repeated words and phrases are harder to put in their correct place than more distinctive pieces of text. New "long-read" sequencing techniques that decode big chunks of DNA at once -- enough to capture many repeats -- helped overcome this hurdle. Scientists were able to simplify the puzzle further by using an unusual cell type that contains only DNA inherited from the father (most cells in the body contain two genomes -- one from each parent). Together these two advances allowed them to decode the more than 3 billion letters that make up the human genome.
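The book analogy can be made concrete with a toy example. The Python sketch below is only an illustration, not the method used in the papers: the sequence, the repeat unit and the read lengths are invented, and the function name `ambiguous_fraction` is hypothetical. It cuts a short made-up "genome" into every possible read of a given length and reports what share of those reads appear more than once, meaning they could be placed in more than one spot.

```python
from collections import Counter


def ambiguous_fraction(genome: str, read_len: int) -> float:
    """Return the share of reads of length read_len that occur more than once."""
    # Every possible read: a sliding window over the sequence.
    reads = [genome[i:i + read_len] for i in range(len(genome) - read_len + 1)]
    counts = Counter(reads)
    # A read is ambiguous if the same text appears at another position too.
    ambiguous = sum(1 for read in reads if counts[read] > 1)
    return ambiguous / len(reads)


# Invented toy sequence: two distinctive flanks around a five-copy repeat.
LEFT_FLANK = "ACGTTGCA"
REPEAT_UNIT = "GATTACA"
RIGHT_FLANK = "CCGGTTAA"
toy_genome = LEFT_FLANK + REPEAT_UNIT * 5 + RIGHT_FLANK

short_ambiguity = ambiguous_fraction(toy_genome, 5)   # shorter than one repeat unit
long_ambiguity = ambiguous_fraction(toy_genome, 40)   # longer than the whole repeat block

print(f"5-letter reads with more than one possible position:  {short_ambiguity:.0%}")
print(f"40-letter reads with more than one possible position: {long_ambiguity:.0%}")
```

In this toy case many of the 5-letter reads fall entirely inside the repeat and match several positions, while every 40-letter read reaches into at least one distinctive flank and so fits in only one place. Real assemblers are far more involved, but the same intuition explains why longer reads help with repetitive regions.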
The science behind the sequencing effort and some initial analysis of the new genome regions are outlined in six papers published in the journal Science. You can read more by starting at: https://www.science.org/doi/10.1126/science.abp8653.