Splinter Meeting: E-Science & Virtual Observatory

Annual Meeting of the Astronomische Gesellschaft 2015, Kiel, Germany

Dates

Tuesday, September 15, 15:00 - 18:30

Wednesday, September 16, 15:00 - 17:30

Convenors

H. Enke, K. Polsterer, J.K. Wambsganss

Abstracts

Kai Polsterer

Astronomy in the cloud

Data-driven research has become an important aspect of astronomy, as data sets are growing exponentially and conventional analysis approaches do not scale appropriately. For example, selecting interesting objects is often not efficient enough: rare objects may occur only a few times per million, while simple explicit criteria would deliver thousands of candidates. Computationally more demanding approaches are necessary, and these do not perform well on a single computer with dedicated local storage. In the past, when working together in international collaborations, it was still possible to host a copy of the data at every participating institution. As soon as higher-level data products were extracted or better post-processing was implemented, everyone had to synchronize the locally stored data. With the new generation of overwhelmingly large data sets, this approach has become impossible. Small institutions are already unable to participate in this field of research. Cloud computing could provide a solution to these challenges.

As part of the Widefield ouTlier Finder (WTF) project within the Evolutionary Map of the Universe (EMU) collaboration, we developed a different strategy. This talk presents an approach that uses Amazon Web Services as a cloud infrastructure provider. The basic idea is to store a single copy of the data centrally and to enable all participating institutions to process the data with cloud instances as well as to scale out on cluster architectures. Simple example solutions to common problems are presented.

Ole Streicher

Design and status of the MuseWISE data management system

The Multi Unit Spectroscopic Explorer (MUSE) is a second-generation instrument installed at ESO's Very Large Telescope (VLT). It is an integral-field spectrograph consisting of 24 integral-field units and operates in the visible range of the electromagnetic spectrum. Providing a wide field of view in combination with high spatial resolution, MUSE is a powerful tool for studying galaxy formation, stars and the early stages of stellar evolution, and much more. For a transparent and automated analysis, our data management system MuseWISE integrates ingestion, automated data reduction, QC/QA processes, querying, and long-term storage. It is used as the main tool to process the GTO data of the instrument. This talk gives an overview of MuseWISE and presents its current status.

Markus Demleitner

Introduction to DataLink

DataLink is a new protocol defined by the International Virtual Observatory Association (IVOA), adopted on June 17, 2015. Its main purpose is to allow standards-based access to complex datasets, potentially consisting of many different files, as well as interoperable access to server-side dataset manipulation (e.g., cutout, rebinning). In this talk, we will give a brief overview of the basic mechanisms of DataLink, both in standalone operation (i.e., without using other parts of the IVOA protocol stack) and integrated into the IVOA's Data Access Layer.
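As a rough illustration of standalone operation, the sketch below builds the GET request for a DataLink links endpoint and lists the columns that the DataLink 1.0 response table is required to provide. The endpoint URL and the dataset identifier are hypothetical placeholders, and the helper build_links_url is introduced here for illustration only; no request is actually sent.

```python
from urllib.parse import urlencode

# Columns every DataLink 1.0 {links} response table must provide.
LINKS_COLUMNS = (
    "ID", "access_url", "service_def", "error_message",
    "description", "semantics", "content_type", "content_length",
)


def build_links_url(endpoint, dataset_ids):
    """Return a GET URL asking a {links} endpoint about one or more datasets."""
    # The ID parameter is repeated once per dataset identifier.
    query = urlencode([("ID", dataset_id) for dataset_id in dataset_ids])
    return f"{endpoint}?{query}"


url = build_links_url(
    "http://example.org/datalink/links",    # hypothetical endpoint
    ["ivo://example.org/data?obs/1234"],     # hypothetical dataset identifier
)
print(url)
print(", ".join(LINKS_COLUMNS))
```

Each row of the returned table describes one link for a dataset: either a direct access_url, a service_def pointing to a server-side service such as a cutout, or an error_message.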

S.D. Kügler

Featureless Classification of Lightcurves

Classification of irregularly sampled time series is extremely difficult because the data cannot be represented naturally as a vector directly usable by a classifier. While various statistical features serve as vector representations in the literature, in this work time series are represented by a density model. The density model can capture all information available in the static behavior of a light curve and properly includes measurement errors. Subsequently, a distance matrix is created by measuring the similarity of each pair of light curves with a suitable distance. This distance matrix is then made available to several classifiers. To substantiate the new representation, data from the OGLE (Optical Gravitational Lensing Experiment) and ASAS (All Sky Automated Survey) surveys are used, and it is demonstrated that the proposed representation performs on par with the best currently used feature-based approaches. Since no a priori knowledge is used in the creation (and normalization) of features, the density representation presents an upper bound in terms of information made available to the classifier. Finally, the predictive power of the proposed representation depends only on the choice of similarity measure and classifier; it can therefore be seen as a more principled representation that is also suited for tasks beyond classification, e.g. unsupervised learning.
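The pipeline described above (density model, pairwise distance matrix, classifier) can be sketched as follows. This is a minimal illustration under several assumptions that are not taken from the abstract: the density is a sum of Gaussians with one component per measurement and the measurement error as its width, the distance is the Hellinger distance, the classifier is a nearest-neighbour rule, and the light curves are synthetic toy data rather than OGLE or ASAS measurements.

```python
import numpy as np


def density(mags, errs, grid):
    """Density of magnitudes on `grid`: one Gaussian per point, width = its error."""
    z = (grid[:, None] - mags[None, :]) / errs[None, :]
    kernels = np.exp(-0.5 * z ** 2) / errs[None, :]
    p = kernels.sum(axis=1)
    return p / p.sum()  # discrete probability vector


def hellinger(p, q):
    """Hellinger distance between two discrete densities (0 = identical)."""
    return np.sqrt(0.5 * np.sum((np.sqrt(p) - np.sqrt(q)) ** 2))


def distance_matrix(densities):
    """Symmetric matrix of pairwise distances between all light curves."""
    n = len(densities)
    dist = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            dist[i, j] = dist[j, i] = hellinger(densities[i], densities[j])
    return dist


def predict_1nn(dist, labels, train, test):
    """Give each test curve the label of its nearest training curve."""
    nearest = np.argmin(dist[np.ix_(test, train)], axis=1)
    return labels[train][nearest]


rng = np.random.default_rng(0)


def toy_curve(kind):
    """Synthetic, irregularly sampled light curve (not survey data)."""
    n = int(rng.integers(80, 200))
    t = np.sort(rng.uniform(0.0, 100.0, n))
    err = rng.uniform(0.01, 0.03, n)
    period = rng.uniform(1.0, 5.0)
    if kind == 0:   # sinusoidal variable
        mag = 0.3 * np.sin(2 * np.pi * t / period)
    else:           # eclipse-like: flat with short dimmings
        mag = np.where((t / period) % 1.0 < 0.1, 0.5, 0.0)
    mag = mag + rng.normal(0.0, err)
    return mag - np.median(mag), err


labels = np.array([i % 2 for i in range(40)])
grid = np.linspace(-1.0, 1.0, 200)
densities = [density(*toy_curve(kind), grid) for kind in labels]

dist = distance_matrix(densities)
train, test = np.arange(0, 20), np.arange(20, 40)
predicted = predict_1nn(dist, labels, train, test)
accuracy = np.mean(predicted == labels[test])
print(f"toy accuracy: {accuracy:.2f}")
```

Because the classifier only ever sees the distance matrix, curves with different numbers of observations need no resampling or hand-crafted features; swapping the distance or the classifier changes nothing else in the pipeline.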

A. Becker

Stellar Classification with Photometric Datasets

With the advent of large photometric and spectroscopic surveys, it has become more and more important to develop strategies and methods to make use of the large amount of data. One of the goals is to classify the observed objects with respect to the existing classification systems. In the course of our Magellanic Clouds Massive Stars and Feedback Survey (MCSF) (Bomans et al., 2014), we gathered a spatially complete census of bright stars in the Small and Large Magellanic Clouds with high-quality photometric measurements in broad-band (u, B, V, R, I) and narrow-band (Hα, [OIII], [SII]) filters. To extend the wavelength coverage, the data were matched with archival data from other surveys (e.g. GALEX, 2MASS, Spitzer). Based on this sample, we are developing a method to estimate spectral types of stars using only photometric data. To determine the feasibility, we initially restricted ourselves to the Magellanic Clouds and started a search for Wolf-Rayet stars in both galaxies. Now that the proof of concept works, we have begun to apply our method to other galaxies in which the stellar population can be resolved. Here we will show the current status of our project and discuss the future perspectives and the challenges we are facing in transferring our method to other galaxies.

Christian Dersch

Development of an Open Source Light Curve Classificator

Knowledge discovery, and machine learning in particular, is very useful for automated data analysis. The use of machine learning has grown within the last few years, and it has been shown to be a powerful tool for classification problems. In photometry, the classification of light curves is a main task. The purpose of this work is the development of light curve classifiers as free software, in connection with astroML and the well-known astropy project. In general, machine learning algorithms are broadly available as free software, so the task of this work is the adaptation of these algorithms to light curve classification. The thoroughly analyzed and classified OGLE-III database of variable stars has been chosen to develop, train and test the software by comparing its classifications with results from previous analyses. The final goal is the contribution of a reliable light curve classifier to astroML or a comparable open source project; a possible application is the light curve analysis of data from the Sonneberg plate archive.

Helmuth Meusinger (Poster)

Data Mining in Spectra Database of SDSS by Means of Kohonen Self-Organizing Maps

Over the past 15 years, the Sloan Digital Sky Survey (SDSS) I-III has obtained four million spectra, mostly from galaxies and quasars. Mining such a tremendous data pool must lead to the discovery of very rare spectral types. However, whereas the spectroscopic pipeline of the SDSS is accurate and efficient for the vast majority of spectra, it fails in the case of unusual spectra. We developed the software package ASPECT, which is able to organise large pools of spectra by similarity in a topological map. The approach is based on the Kohonen method of self-organising maps (SOMs), an artificial neural network algorithm that uses unsupervised learning to produce a two-dimensional mapping of high-dimensional input data. The resulting SOM with its clustering properties constitutes an efficient tool for the selection of certain spectral types while simultaneously providing a broader picture of the entire data set. We computed a huge SOM for one million spectra from the SDSS Data Release (DR) 7 and hundreds of smaller SOMs for galaxy and quasar spectra from SDSS DR10 binned in narrow redshift intervals. The computation of a SOM for all spectra from the final SDSS III data release, which is more demanding in terms of hardware and software, is in preparation. So far the SOMs have been applied mainly to the search for rare spectral types: odd quasar spectra such as weak-line quasars or unusual broad absorption line quasars (details are given in the talk in Splinter A), E+A galaxies, supernovae, and C stars. Other applications of SOMs computed with ASPECT are also possible: clustering of photometric spectral energy distributions or of structure functions from light curves.
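To make the Kohonen method more concrete, here is a minimal NumPy sketch of SOM training. It is not the ASPECT implementation: the map size, the learning-rate and neighbourhood schedules, and the synthetic "spectra" (a single Gaussian line at one of two positions plus noise) are all assumptions chosen for illustration.

```python
import numpy as np


def best_matching_unit(weights, x):
    """Grid position (row, col) of the node whose weight vector is closest to x."""
    dist = np.sum((weights - x) ** 2, axis=2)
    row, col = np.unravel_index(np.argmin(dist), dist.shape)
    return int(row), int(col)


def train_som(data, rows=6, cols=6, n_iter=3000, lr0=0.5, seed=0):
    """Train a rows x cols self-organising map on the rows of `data`."""
    rng = np.random.default_rng(seed)
    weights = rng.random((rows, cols, data.shape[1]))
    yy, xx = np.mgrid[0:rows, 0:cols]       # grid coordinates of every node
    sigma0 = max(rows, cols) / 2.0
    for it in range(n_iter):
        frac = it / n_iter
        lr = lr0 * (1.0 - frac)                   # learning rate decays linearly
        sigma = sigma0 * (0.5 / sigma0) ** frac   # neighbourhood shrinks to 0.5
        x = data[rng.integers(len(data))]         # pick one random input vector
        bi, bj = best_matching_unit(weights, x)
        # Gaussian neighbourhood on the map grid, centred on the winning node.
        h = np.exp(-((yy - bi) ** 2 + (xx - bj) ** 2) / (2.0 * sigma ** 2))
        weights += lr * h[:, :, None] * (x - weights)
    return weights


# Toy input: two families of synthetic spectra (not SDSS data).
rng = np.random.default_rng(1)
pixels = np.arange(50)


def toy_spectrum(centre):
    line = np.exp(-0.5 * ((pixels - centre) / 2.0) ** 2)
    return line + rng.normal(0.0, 0.05, pixels.size)


labels = np.array([i % 2 for i in range(200)])
spectra = np.array([toy_spectrum(15 if kind == 0 else 35) for kind in labels])

weights = train_som(spectra)
nodes_a = {best_matching_unit(weights, s) for s in spectra[labels == 0]}
nodes_b = {best_matching_unit(weights, s) for s in spectra[labels == 1]}
print("nodes used by family A:", sorted(nodes_a))
print("nodes used by family B:", sorted(nodes_b))
```

After training, similar inputs land on the same or neighbouring nodes, so selecting a region of the map selects a spectral type; rare objects tend to occupy small, isolated groups of nodes.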

M. Spasovic

First Steps Towards a Photometric Analysis of the Sonneberg Sky Patrol Plates

The Sonneberg Plate Archive is one of the largest in the world, with about 270 000 photographic plates taken mainly at Sonneberg Observatory between 1925 and 1997. Photometric observations were divided into two programs:

• Field Patrol: monitoring of about 80 selected fields of the sky at high resolution, around 2 arcsec/pixel.

• Sky Patrol: monitoring of the entire northern sky with wide-angle cameras at a lower resolution of roughly 17 arcsec/pixel.

Up to the present, almost all photometric measurements on photographic plates have been done by aperture photometry or similar methods. While aperture photometry seems to be of sufficient quality for field patrol plates, the photometric analysis of sky patrol plates appears to be more demanding, because star images overlap at the lower resolution. Early work on the sky patrol plates showed that photometric accuracy can be enhanced with fitting algorithms. The procedure used was a manually supported click-and-fit routine, not suitable for automatic analysis of vast numbers of photographic plates. We will present our progress on the deconvolution of overlapping sources on the plates and compare photometric analyses using different methods. Our goal is to obtain light curves of sufficient quality from sky patrol plates, which can then be classified with machine learning algorithms.
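The difference between aperture photometry and simultaneous fitting for overlapping star images can be illustrated with a toy example. The sketch below is not the procedure used for the Sonneberg plates. It assumes a linear detector, a circular Gaussian point spread function of known width, known source positions and a flat background, none of which holds strictly for photographic plates, whose response is nonlinear. Under those assumptions the fluxes are linear parameters and can be solved for with ordinary least squares.

```python
import numpy as np


def gaussian_psf(shape, x0, y0, sigma):
    """Unit-flux circular Gaussian sampled on the pixel grid."""
    y, x = np.mgrid[0:shape[0], 0:shape[1]]
    g = np.exp(-((x - x0) ** 2 + (y - y0) ** 2) / (2.0 * sigma ** 2))
    return g / (2.0 * np.pi * sigma ** 2)


def aperture_flux(image, x0, y0, radius, background):
    """Background-subtracted sum over a circular aperture."""
    y, x = np.mgrid[0:image.shape[0], 0:image.shape[1]]
    mask = (x - x0) ** 2 + (y - y0) ** 2 <= radius ** 2
    return float(np.sum(image[mask] - background))


def fit_fluxes(image, positions, sigma):
    """Least-squares fluxes of all sources at once, plus a flat background."""
    columns = [gaussian_psf(image.shape, x0, y0, sigma).ravel()
               for x0, y0 in positions]
    columns.append(np.ones(image.size))          # background term
    design = np.column_stack(columns)
    solution, *_ = np.linalg.lstsq(design, image.ravel(), rcond=None)
    return solution[:-1], solution[-1]


# Toy scene: two blended stars, 6 pixels apart, PSF sigma of 3 pixels.
rng = np.random.default_rng(0)
shape, sigma, background = (41, 41), 3.0, 10.0
positions = [(18.0, 20.0), (24.0, 20.0)]
true_fluxes = np.array([1000.0, 400.0])

image = np.full(shape, background)
for (x0, y0), flux in zip(positions, true_fluxes):
    image += flux * gaussian_psf(shape, x0, y0, sigma)
image += rng.normal(0.0, 0.3, shape)             # pixel noise

fitted, fitted_background = fit_fluxes(image, positions, sigma)
apertures = [aperture_flux(image, x0, y0, 6.0, background)
             for x0, y0 in positions]

print("true fluxes:    ", true_fluxes)
print("fitted fluxes:  ", np.round(fitted, 1))
print("aperture fluxes:", np.round(apertures, 1))
```

In this toy scene the aperture around the fainter star also collects light from its brighter neighbour, while the simultaneous fit assigns the blended light to both sources at once.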

Contact

If you have any further questions, please don't hesitate to contact us:

Harry Enke: henke [at] aip [dot] de

Kai Polsterer: kai.polsterer [at] h-its [dot] org

Joachim Wambsganss: jkw [at] ari [dot] uni-heidelberg [dot] de