Data Science: new tools for Astroparticle Physics and its applications to the industry


  • Call:

    IDPASC Portugal - PHD Programme 2019

  • Academic Year:

    2019 / 2020

  • Domain:

    Astroparticle Physics

  • Supervisor:

    Lorenzo Cazon

  • Co-Supervisor:

    Ruben Conceição

  • Institution:

    Instituto Superior Técnico

  • Host Institution:

    Laboratório de Instrumentação e Física Experimental de Partículas

  • Abstract:

    Data Science is an old field with a renewed look, thanks to advances in computing science. It consists of analyzing data sets to find correlations, causal relations, and patterns; building hypotheses and assigning significances to them; assessing the efficiency of an algorithm at finding a signal and its probability of false positives; defining control samples; and simulating and replicating reality according to a model. It also covers accessing, storing, and retrieving data, from moderate to extremely large data sets (Big Data), and creating automatic tools that take decisions (machine learning). Data Science is also key in modern society outside fundamental science: data scientist is one of the most sought-after jobs of the moment across a large variety of companies, for instance social networks, large retail companies, and pharmaceutical, consulting, and telecommunication companies, among others. The goal of this Ph.D. is to develop and explore new tools in the domain of astroparticle physics, namely for the data collected by the Pierre Auger Collaboration: muons collected at the ground and their relation with high-energy hadronic interaction models, and ultra-high-energy cosmic ray arrival directions and their connection with astrophysical sources. The student will also take part in a joint Data Science research project with a well-known company. He/she will apply the standard techniques of physics in general, and the tools developed within this project in particular, aiming at strengthening the synergies between fundamental research and industry. The proposed Ph.D. combines the data analysis performed in academia with the needs of modern industry.