Detection and classification of somatic structural variants, and its application in the study of neuronal development

dc.contributor
Universitat de Barcelona. Facultat de Biologia
dc.contributor.author
Planas Fèlix, Mercè
dc.date.accessioned
2021-07-15T10:26:50Z
dc.date.available
2021-10-05T02:00:14Z
dc.date.issued
2020-10-05
dc.identifier.uri
http://hdl.handle.net/10803/672163
dc.description
Thesis carried out at the Barcelona Supercomputing Center (BSC) / Doctoral Programme in Biomedicine
en_US
dc.description.abstract
The identification and analysis of genomic variation across individuals has been central in biology, first through comparative genomics to answer evolutionary questions, and then in the context of biomedicine, where it is currently becoming central to the study of most diseases. Next-generation sequencing technologies are allowing the systematic analysis of thousands of different types of genetic variation, enhancing the identification of disease markers and the understanding of the molecular basis of disease. In recent years, there has been a burst of new methodology for disease-focused genome analysis, coming from hundreds of groups around the world. Specific computational methods and strategies are being designed and improved for the identification and interpretation of genomic variation. The identification and classification of different types of genomic variants in the context of biomedicine is a key and foundational step for the development of personalized medicine. This has been particularly central in the field of cancer genomics, where the research of the past ten to fifteen years has been based on the sequencing of genomic DNA and on the identification and interpretation of (mostly) somatic and germline variation. Throughout these years, a large number of methods for variant detection have been developed, each with a different scope. Despite all these developments, the identification of genomic variants still has room for improvement, not only in sensitivity and specificity, but also at the computational level. Given the emergence of many initiatives for personalized medicine around the world, and the expected number of genomes that will have to be analyzed within health care systems, we require robust algorithms, designed together with a matching implementation that minimizes the computational costs of the analysis.
With this aim, during this thesis I have designed and implemented an algorithm for the efficient processing of genomic data, in close collaboration with computer scientists at our center who defined the implementation, focusing on lowering the energy and the time required for the analysis. This methodology, which relies on a reference-free approach to read classification, has been protected with a patent, and is being used as the foundation for the development of SMuFin2, a more accurate and computationally efficient version of the initial SMuFin from 2014. We show here that our method is able to process whole genome sequences very quickly and with minimal energy consumption compared with existing methods, and that it has great potential for the identification of all ranges of variants, including insertions of non-human DNA. Further developments on SMuFin2 are needed to fully assess its variant calling capabilities.
Despite its importance in the biology of the cell, the role of somatic variation that occurs in healthy tissues remains poorly defined. In the case of development, some hypotheses have been proposed to explain the observed somatic DNA damage that occurs during brain development (e.g., replication stress), but the real impact and the underlying mechanisms of this somatic variation are not yet understood. In order to shed light on the type and potential functional impact of somatic variation in brain development, we established a new collaboration to identify and describe somatic DNA rearrangements induced by Pgbd5 during brain development and in the adult state, in 36 mouse neural tissue samples. The detection of somatic variants in healthy tissues presents more challenges than in the cancer scenario, where a variant is present in a significant number of cells and is easier to detect.
We have identified, classified and interpreted the landscape of somatic variation in neural development, and found interesting differences between adult and embryonic samples, both in variant load and in specific types of variants, as the potential result of the activity of these transposase-like genes.
en_US
dc.format.extent
277 p.
en_US
dc.format.mimetype
application/pdf
dc.language.iso
eng
en_US
dc.publisher
Universitat de Barcelona
dc.rights.license
WARNING. All rights reserved. Access to the contents of this doctoral thesis and its use must respect the rights of the author. It may be used for reference or private study, as well as in research and teaching activities or materials, under the terms established in Article 32 of the Consolidated Text of the Intellectual Property Law (RDL 1/1996). Any other use requires the prior and express authorization of the author. In any case, when using its contents, the full name of the author and the title of the doctoral thesis must be clearly indicated. Its reproduction or other forms of exploitation for profit are not authorized, nor is its public communication from a site other than the TDX service. The presentation of its content in a window or frame external to TDX (framing) is also not authorized. This reservation of rights covers both the contents of the thesis and its abstracts and indexes.
dc.source
TDX (Tesis Doctorals en Xarxa)
dc.subject
Càncer
en_US
dc.subject
Cáncer
en_US
dc.subject
Cancer
en_US
dc.subject
Genètica humana
en_US
dc.subject
Genética humana
en_US
dc.subject
Human genetics
en_US
dc.subject
Bioinformàtica
en_US
dc.subject
Bioinformática
en_US
dc.subject
Bioinformatics
en_US
dc.subject
Genòmica
en_US
dc.subject
Genómica
en_US
dc.subject
Genomics
en_US
dc.subject.other
Experimental Sciences and Mathematics
en_US
dc.title
Detection and classification of somatic structural variants, and its application in the study of neuronal development
en_US
dc.type
info:eu-repo/semantics/doctoralThesis
dc.type
info:eu-repo/semantics/publishedVersion
dc.subject.udc
575
en_US
dc.contributor.director
Torrents Arenales, David
dc.contributor.tutor
Gelpí Buchaca, Josep Lluís
dc.embargo.terms
12 months
en_US
dc.rights.accessLevel
info:eu-repo/semantics/openAccess


Documents

MPF_PhD_THESIS.pdf

52.73Mb PDF
