AG Splinter Meeting 2021:
E-Science & Virtual Observatory

Annual Meeting 2021 of the Astronomische Gesellschaft, Virtual Meeting, Germany

This virtual meeting shows the application of machine learning to all areas of astronomy, and also that more solidly built digital infrastructures are needed for the precious data.

In the last decade, the field of artificial intelligence (AI) and machine learning (ML) has vastly expanded, and several ML methods have recently been used in astronomy. Their big advantage is that they give computers the ability to learn from data without being explicitly programmed. Whereas for classical numerical methods we need to know all (complex) 'rules' beforehand, an ML algorithm can detect patterns automatically. In astronomy, the number of studies that apply ML techniques has risen substantially in recent years. Unsupervised learning algorithms have been used to identify different kinematic components of simulated galaxies, to compare stellar spectra, to classify pulsars, and to find high-redshift quasars. Supervised learning has been used to classify variable stars, to find exoplanets, to link galaxies and dark matter haloes, to classify galaxies morphologically, and to determine the redshift of galaxies. New developments include the application of modern learning approaches, such as semi-supervised, reinforcement, or representation learning, and state-of-the-art ML methods, such as generative adversarial networks, recurrent networks, or encoder-decoder architectures. The ML sessions are inspired by the growing adoption of ML approaches in the astronomy community. We aim to bring together researchers applying ML techniques to data-intensive problems in the fields of exoplanets, stars, the interstellar medium, galaxies, and cosmology. This includes approximating physical processes, analysing large data sets, understanding what a learned model really represents, and connecting tools and insights from astrophysics to the study of ML models. The goal is to discuss and share new approaches, disseminate recent results, understand the limitations, and promote the application of existing algorithms to new problems.
We expect to strengthen the interdisciplinary dialogue, introduce exciting new possibilities to the broader community, and stimulate the production of new approaches to solving challenging open problems in astronomy.

Thursday, 16.09.2021, 16:15 - 18:00, Virtual Room ESC

Friday, 17.09.2021, 09:00 - 11:00 and 14:00 - 16:30, Virtual Room ESC

Convenors

H. Enke (AIP), K. Polsterer (HITS)

Agenda and Presentations

Thursday, 16.09.2021

PUNCH4NFDI Consortium
H. Enke Intent and Scope of NFDI
M. Steinmetz General Structure and Roadmap of PUNCH4NFDI
K. Schwarz Core Task Areas
M. Kramer / H. Enke (p) Data Irreversibility
S. Wagner Synergies, Education
Discussion


Friday, 17.09.2021, 09:00-11:00h


Nikos Gianniotis

Probabilistic flux variation gradient

We present a probabilistic reformulation of the flux variation gradient (FVG) method in the context of photometric reverberation mapping of active galactic nuclei (AGN). The FVG is used to disentangle the AGN and host-galaxy contributions to the total flux in different photometric bands. The method relies on the "observed bluer when brighter" phenomenon, attributed to the superposition of a red host galaxy whose luminosity is constant in time and a blue AGN whose luminosity varies over time. Our main motivation for reformulating the existing FVG method is to quantify the uncertainty of the estimated galaxy contribution. Additionally, the presented probabilistic extension makes it possible to consider observations in multiple bands jointly, and also to account for observational noise. We present details of the probabilistic recasting of the FVG and its application to a select set of AGN.
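For orientation, the classical (non-probabilistic) FVG estimate that this talk sets out to reformulate can be sketched in a few lines. This is only an illustration under stated assumptions, not the speaker's method: the function names, the synthetic fluxes, and the assumed host colour are all hypothetical. In a flux-flux diagram of two bands, the total fluxes scatter along a straight line set by the AGN colour, the host galaxy lies on a line through the origin with an assumed colour slope, and the intersection of the two lines gives the host contribution.

```python
def fit_line(x, y):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def host_flux(flux_red, flux_blue, host_slope):
    """Intersect the AGN variation line with the host colour line.

    host_slope is the assumed blue/red flux ratio of the host galaxy.
    """
    agn_slope, intercept = fit_line(flux_red, flux_blue)
    # Host line: blue = host_slope * red
    # AGN line:  blue = agn_slope * red + intercept
    host_red = intercept / (host_slope - agn_slope)
    return host_red, host_slope * host_red


# Synthetic, noise-free example (all numbers are made up):
# constant host with red = 4.0 and blue = 2.0, plus a varying AGN
# whose blue/red flux ratio is 1.0.
agn_red = [1.0, 2.0, 3.0, 4.0, 5.0]
total_red = [4.0 + a for a in agn_red]
total_blue = [2.0 + 1.0 * a for a in agn_red]
host_red, host_blue = host_flux(total_red, total_blue, host_slope=0.5)
```

A point estimate like this carries no error bar and handles only two bands at a time, which are the limitations the abstract names as motivation for the probabilistic version.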

Fenja Kollasch

UltraPINK returns: Newest developments in visualizing and interacting with Self-Organizing Kohonen Maps

Self-Organizing Kohonen Maps are a promising approach to help with the analysis of the big data sets that are often encountered in astronomy. By showing a dimension of representative shapes, such a Kohonen Map gives a brief overview of the morphological structures appearing in the data set. Last year, we introduced UltraPINK, a web-based frontend for the Parallelized rotation and flipping INvariant Kohonen maps framework (PINK). Its basic functions allowed training and visualization of self-organizing maps, as well as interaction, labeling, and export. Yet the intention behind UltraPINK is bigger. To take one more step on the path towards our final goal of a generic framework for the analysis of astronomical data, we extended UltraPINK with a few features. We will show the new map layouts, communication interfaces with other popular frameworks, and how UltraPINK will improve your experience from just looking at images to interactively exploring your research data.

Caroline Heneka

Learning from 3D tomographic 21cm intensity data

Intensity Mapping (IM) of line emission targets the Universe from the present time up to redshifts beyond ten, when the Universe reionized and the first galaxies formed, from the smallest to the largest scales. Imaging the 21cm signal, with redshift dependency added through frequency, will result in 3D lightcone data that gives valuable insight into the growth of structure, the inter-galactic medium, as well as the properties and environment of ionising sources. Due to the huge amount of data that radio interferometers, and especially the Square Kilometre Array (SKA), will produce, as well as the highly non-Gaussian nature of the fluctuation signal measured, these data necessitate the development of new methods beyond e.g. power spectrum measurements of fluctuations. In this talk I showcase the use of deep networks that are tailored to the 3D structure of tomographic 21cm lightcones of reionisation and cosmic dawn to directly infer e.g. dark matter and astrophysical properties without an underlying Gaussian assumption. I compare different architectures and highlight how a relatively simple 3D convolutional network architecture can be constructed to become the best-performing one. I finish with a glimpse at lower-redshift results for the recent SKA Science Data Challenge 2, where hydrogen sources were to be detected and characterised in a large (TB) low signal-to-noise datacube of the hydrogen 21cm line. I will highlight lessons learned on the application of a range of machine learning methods and architectures to such types of datacubes.

Samir Nepal

A Convolutional Neural Network Approach for Stellar Atmospheric Parameters and Lithium Abundance Determination

The chemical element Lithium is of great interest as its evolution in the Milky Way is not yet well understood. To help tighten the constraints on stellar and galactic chemical evolution models, numerous and precise Lithium abundance determinations are necessary for stars in a large range of evolutionary stages and galactic populations. In the age of industrial stellar abundances, spectroscopic surveys such as GALAH, RAVE, and LAMOST have used data-driven methods to rapidly and precisely determine stellar labels (atmospheric parameters + abundances). The ultimate goal of this work is to prepare the machine learning ground for Lithium measurement in the context of the future spectroscopic surveys 4MOST and WEAVE. To do so, we develop a Convolutional Neural Network (CNN) approach, based on stellar labels and GIRAFFE HR15 spectra of the 6th internal release of the Gaia-ESO survey, to determine atmospheric parameters and Lithium abundances for ~40,000 stars. The HR15 setup is very well adapted for this purpose, being very similar to the HR red arm of both WEAVE and 4MOST. We show that the unique Lithium feature at 6708 Å is successfully singled out by the CNN among the thousands of spectral features. Efficient Lithium measurements are performed in field and open cluster stars, as well as for rare objects like Lithium-rich giants. Such performance is achieved by meticulously building a high-quality and homogeneous training sample. Our findings give very good insights for the future of the 4MOST and WEAVE surveys in terms of Lithium analysis and science output. Due to its capacity to learn very complex relations and its flexibility, this method (CNN) can be easily adapted, with very little re-engineering, to include additional labels, for instance rotational velocities and other chemical abundances. A CNN can also be easily adapted to other spectroscopic surveys with different wavelength ranges.

Thavisha Dharmawardena

Deriving the 3D structure of the Milky Way: A fast and scalable Gaussian Process applied to nearby star-formation regions

The detailed 3D distributions of dust and extinction in the Milky Way have long been sought after. Three-dimensional reconstruction from sparse data is a non-trivial problem, but it is essential to understanding the properties of both the stars obscured by dust and the large-scale dynamics and structure of our Galaxy. We present a new fast and scalable model based on Gaussian processes, implemented using public Python packages, that runs on either CPUs or GPUs. We use a Gaussian process latent variable method combined with variational inference to map the Galaxy on parsec scales using data from large surveys including Gaia, 2MASS, and WISE. The model maintains non-decreasing extinction and non-negative densities throughout, which has proven problematic in previous efforts. Once trained, the model can be used to predict both extinction and density structure on the fly. This allows us to view the large-scale structure of the Milky Way while simultaneously peering into individual molecular clouds, and provides insights into multi-scale processes such as fragmentation in molecular clouds and the spiral structure of our Galaxy. We have applied our new model to the Orion, Cygnus, Perseus and Taurus star formation regions to recover detailed 3D density structures and localise small-scale regions within them. A number of features that are superimposed in 2D extinction maps are deblended in our 3D dust extinction density maps. For example, we find a large filament on the edge of Orion that may host a number of star clusters. We also identify a coherent structure that may link the Taurus and Perseus regions, and show that Cygnus X is located 1300–1500 pc away, in line with VLBI measurements.
By comparing our predicted extinctions to Planck data, we find that known relationships between density and dust processing, where high-extinction lines of sight have the most processed grains, hold up in resolved observations when density is included, and that they exist at smaller scales than previously suggested. This can be used to study the changes in size or composition of dust grains as they are processed in molecular clouds.
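The two constraints named in this abstract are linked: if the density along a line of sight is never negative, its running integral, the extinction, can never decrease. The sketch below shows one common way to guarantee this, by treating the unconstrained quantity as a log-density and exponentiating it. That parametrisation is an assumption made here for illustration, and the function name, grid spacing, and values are hypothetical rather than taken from the talk.

```python
import math


def extinction_profile(log_density, step_pc):
    """Integrate exp(log_density) along one line of sight.

    log_density: unconstrained values on an evenly spaced distance grid.
    step_pc: grid spacing in parsec.
    Returns the cumulative extinction at each grid point (arbitrary units).
    """
    total = 0.0
    profile = []
    for value in log_density:
        density = math.exp(value)   # exponentiation keeps the density positive
        total += density * step_pc  # simple rectangle-rule integration
        profile.append(total)
    return profile


# Hypothetical latent values; any real numbers are allowed.
latent = [-3.0, -1.0, 0.5, -2.0, -4.0]
extinction = extinction_profile(latent, step_pc=10.0)
```

Because every increment is positive, the returned profile is monotonic whatever values the latent function takes, so no constraint has to be imposed on the latent function itself.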

Meetu Verma

Practical application of t-distributed stochastic neighbor embedding in classifying chromospheric spectra

Observational solar physicists are nowadays confronted with huge amounts of data. Initially, this mainly concerned images, but spectra now also enter the realm of Big Data. The number of spectra accumulated at a medium-size telescope such as the Vacuum Tower Telescope (VTT) on Tenerife easily reaches millions over a single observing day. Hence, machine learning tools are required to identify and classify spectra with minimal human intervention. Our exploratory work provides the framework and some ideas on how t-distributed stochastic neighbor embedding (t-SNE) can be adapted to the classification of chromospheric spectra, identifying those spectra related to eruptive events.


Friday, 17.09.2021, 14:00-16:00h


Markus Demleitner

Resource Discovery in the VO: new developments

Resource discovery in the VO is still mainly based on finding sources for previously known data ("Gaia eDR3"). To realise the full potential of open data, it is desirable to enable blind discovery, i.e., the use of physical constraints ("redshifts for objects down to 20 mag in V around the Coma cluster"). This talk will discuss traditional (UCD), new (VODataService coverage) and planned (column statistics) facilities in the VO Registry leading up to that vision.

Arman Khalatyan

An infrastructure for the reproducible scientific workflows

The growing amount of data in astronomy requires a large amount of compute resources. To analyze the data and extract valuable scientific results, modern astronomy requires not only conventional algorithms and methods but also machine learning (ML) algorithms, which have recently become quite common. ML algorithms require specific accelerators such as GPUs, which in turn require a complex hardware and software infrastructure. We will present a concept infrastructure at AIP for reproducible scientific workflows: a cloud-based environment for various microservice workloads, based on a Kubernetes (k8s) cluster, with a focus on the reproducibility of scientific results. Kubernetes is an open-source system for automating deployment, scaling, and management of containerized applications. The environment will enable rapid application development, easy deployment and scaling, and long-term lifecycle maintenance for small and large teams. We will show the application of this concept to the StarHorse project.

Yori Fournier

Machine Learning on Solar Plates

A machine learning library to detect solar spots on APPLAUSE solar plates

Christian Dersch

Creating variable star catalogs from public photographic plate archives

In the era of modern surveys, for example Gaia or ZTF, several catalogs of variable stars exist. However, for photographic plate archives, such catalogs do not yet exist on a larger scale. Starting with a comparative analysis of the Bamberg Southern Sky Patrol (BSSP), which is part of the APPLAUSE database, and the modern ASAS-SN survey, we discuss the requirements on data quality, density, etc. that photographic plate series must meet in order to compile a catalog of variable stars from plate archives. We have two major scientific goals. The first is to provide a measure of the precision of photographic plate measurements by comparing their statistical properties with modern survey data. The second is to study changes of parameters such as the period over time, since, for example, the ASAS-SN measurements were taken 50 years after the BSSP measurements.


Contact

If you have any further questions, please don't hesitate to contact us

Harry Enke: henke [at] aip [dot] de
Kai Polsterer: kai.polsterer [at] h-its [dot] org

EScience & Virtual Observatory Splinters at AG Meetings