Big Data Processing and Machine Learning for CLOUD/PS215 at CERN


  • Call:

    PT-CERN Call 2021/2

  • Academic Year:


  • Domain:


  • Supervisor:

    Antonio Amorim

  • Co-Supervisor:

    Jonathan Duplissy

  • Institution:

    FCUL (Universidade de Lisboa)

  • Host Institution:

    CENTRA - Center for Astrophysics and Gravitation

  • Abstract:

    Following a very recent line of research in atmospheric observations, we propose to apply and improve machine learning methods for three problems in the context of the CLOUD experiment:

      • Control of the chamber driving mechanisms to reach defined conditions such as humidity and ozone creation. Reaching a given chamber state requires a non-linear combination of factors acting over extended periods. The work will replace the current non-linear “control system” of CLOUD, which relies on trial and adjustment by informed users, with a trained automatic AI agent.

      • Automatic identification of new particle formation (NPF) events. Because of the non-linear characteristics of the phenomena in the chamber, nucleation events are at present detected only by humans inspecting “banana plots”, in which the aerosol concentrations for different diameters are plotted over time. We propose to treat these plots as images and to apply the Region-based Convolutional Neural Network (R-CNN) method for image recognition to identify and classify the nucleation events.

      • Estimation of aerosol growth rates, an important quantitative measure of the evolution of the aerosol distribution. These rates are hard to compute from raw instrument data and must be corrected for many factors such as wall effects. We propose to train a deep neural network on the observed evolution of aerosol distribution data to provide the growth rates under several conditions.
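
    To make the growth-rate quantity concrete, the sketch below shows one simple baseline: track the diameter bin with the highest concentration at each time step and fit a straight line to it, so that the slope approximates the growth rate in nm/h. This is only an illustrative assumption, not the neural-network approach proposed here nor the CLOUD analysis chain. The function names, the data layout (one row of concentrations per time step) and the units are hypothetical, and no correction for wall effects is applied.

```python
from typing import List, Sequence


def mode_diameters(size_dist: Sequence[Sequence[float]],
                   diameters_nm: Sequence[float]) -> List[float]:
    """Return, for each time step, the diameter of the most populated bin.

    size_dist[i][j] is the (hypothetical) concentration at time step i
    in the size bin whose diameter is diameters_nm[j].
    """
    modes = []
    for row in size_dist:
        peak_bin = max(range(len(row)), key=lambda j: row[j])
        modes.append(diameters_nm[peak_bin])
    return modes


def growth_rate_nm_per_h(times_h: Sequence[float],
                         mode_nm: Sequence[float]) -> float:
    """Least-squares slope of mode diameter (nm) against time (h)."""
    n = len(times_h)
    if n < 2 or n != len(mode_nm):
        raise ValueError("need at least two matching time/diameter points")
    mean_t = sum(times_h) / n
    mean_d = sum(mode_nm) / n
    sxx = sum((t - mean_t) ** 2 for t in times_h)
    if sxx == 0:
        raise ValueError("all time stamps are identical")
    sxy = sum((t - mean_t) * (d - mean_d) for t, d in zip(times_h, mode_nm))
    return sxy / sxx


# Synthetic example (invented numbers): the concentration peak moves
# up by one 2 nm bin per hour.
diameters = [2.0, 4.0, 6.0, 8.0]
times = [0.0, 1.0, 2.0, 3.0]
distribution = [
    [9.0, 1.0, 0.0, 0.0],
    [2.0, 9.0, 1.0, 0.0],
    [0.0, 2.0, 9.0, 1.0],
    [0.0, 0.0, 2.0, 9.0],
]
modes = mode_diameters(distribution, diameters)
rate = growth_rate_nm_per_h(times, modes)
```

    A baseline of this kind is sensitive to bin resolution and to noise in the peak position, which is part of why the abstract describes the rates as hard to compute from raw data.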