Description
In astronomy, as in other sciences, neural networks are often trained on simulated data with the aim of applying them to real instrument data. Large-scale astronomical surveys are already producing very large datasets, and machine learning will play a crucial role in enabling us to fully utilize all of the available data. Unfortunately, a model trained on simulated data and then applied to observations can suffer a substantial decrease in accuracy on the new target dataset. Simulated and telescope data represent different data domains, and for an algorithm to work in both, domain-invariant learning is necessary. Here we study the problem of distinguishing between merging and non-merging galaxies in simulated data (Illustris-1 cosmological simulation) and observational data (Sloan Digital Sky Survey). Understanding galaxy mergers is an important step in understanding the evolution of matter in the universe, and our ability to utilize and combine knowledge from different data domains will be very important for these efforts. To enable deep learning algorithms to work in multiple domains, we test two domain adaptation techniques: Maximum Mean Discrepancy (MMD) and Domain Adversarial Neural Networks (DANNs). We show that the addition of domain adaptation improves classification accuracy in the target domain by up to ${\sim}20\%$. With further development, these techniques will allow scientists in different fields to construct machine learning models that can successfully combine knowledge from simulated and instrument data, or from data originating from multiple instruments.
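To make the MMD idea concrete, the following is a minimal sketch of a (biased) squared-MMD estimate between two batches of feature vectors, for example latent features extracted from simulated and observed images. It is not the implementation used in this work: the Gaussian kernel, the single fixed `bandwidth`, and the function names `gaussian_kernel` and `mmd2` are assumptions chosen for illustration.

```python
import numpy as np


def gaussian_kernel(x, y, bandwidth=1.0):
    """Gaussian (RBF) kernel matrix between rows of x and rows of y.

    The kernel choice and the fixed bandwidth are illustrative assumptions.
    """
    # Pairwise squared Euclidean distances, shape (len(x), len(y)).
    sq_dist = (
        np.sum(x**2, axis=1)[:, None]
        + np.sum(y**2, axis=1)[None, :]
        - 2.0 * x @ y.T
    )
    sq_dist = np.maximum(sq_dist, 0.0)  # guard against tiny negative round-off
    return np.exp(-sq_dist / (2.0 * bandwidth**2))


def mmd2(source, target, bandwidth=1.0):
    """Biased estimate of the squared MMD between two feature batches."""
    k_ss = gaussian_kernel(source, source, bandwidth)
    k_tt = gaussian_kernel(target, target, bandwidth)
    k_st = gaussian_kernel(source, target, bandwidth)
    return k_ss.mean() + k_tt.mean() - 2.0 * k_st.mean()


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    sim_features = rng.normal(size=(200, 2))            # stand-in "source" batch
    obs_features = rng.normal(loc=2.0, size=(200, 2))   # shifted "target" batch
    print("MMD^2 between shifted batches:", mmd2(sim_features, obs_features))
```

In a domain adaptation setting, a term of this kind would typically be added to the classification loss so that the network is encouraged to produce features whose distributions match across the two domains; how it is weighted against the classification loss is a tuning choice not specified here.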