J/A+A/666/A122      M31/M33 sources with known spectral type (Maravelias+, 2022)

A machine-learning photometric classifier for massive stars in nearby galaxies.
I. The method.
    Maravelias G., Bonanos A.Z., Tramper F., de Wit S., Yang M., Bonfini P.
    <Astron. Astrophys. 666, A122 (2022)>
    =2022A&A...666A.122M        (SIMBAD/NED BibCode)
ADC_Keywords: Galaxies, nearby ; Spectral types ; Stars, early-type ;
              Stars, late-type ; Stars, supergiant ; Optical
Keywords: stars: massive - stars: mass-loss - stars: evolution -
          galaxies: individual: WLM, M31, IC 1613, M33, Sextans A -
          methods: statistical

Abstract:
    Mass loss is a key parameter in the evolution of massive stars. Despite
    the recent progress in the theoretical understanding of how stars lose
    mass, discrepancies between theory and observations still hold. Moreover,
    episodic mass loss in evolved massive stars is not included in models,
    and the importance of its role in the evolution of massive stars is
    currently undetermined. A major hindrance to determining the role of
    episodic mass loss is the lack of large samples of classified stars.
    Given the recent availability of extensive photometric catalogs from
    various surveys spanning a range of metallicity environments, we aim to
    remedy the situation by applying machine-learning techniques to these
    catalogs. We compiled a large catalog of known massive stars in M31 and
    M33 using IR (Spitzer) and optical (Pan-STARRS) photometry, as well as
    Gaia astrometric information, which helps with foreground source
    detection. We grouped them into seven classes (Blue, Red, Yellow, B[e]
    supergiants, Luminous Blue Variables, Wolf-Rayet stars, and outliers,
    e.g., quasi-stellar objects and background galaxies). As this training
    set is highly imbalanced, we implemented synthetic data generation to
    populate the underrepresented classes and improve separation by
    undersampling the majority class. We built an ensemble classifier
    utilizing color indices as features. The probabilities from three
    machine-learning algorithms (Support Vector Classification, Random
    Forest, and Multilayer Perceptron) were combined to obtain the final
    classification. The overall weighted balanced accuracy of the classifier
    is ∼83%. Red supergiants are always recovered at ∼94%. Blue and Yellow
    supergiants, B[e] supergiants, and background galaxies achieve ∼50-80%.
    Wolf-Rayet sources are detected at ∼45%, while Luminous Blue Variables
    are recovered at ∼30% from one method mainly. This is primarily due to
    the small sample sizes of these classes. In addition, the mixing of
    spectral types, as there are no strict boundaries in the features space
    (color indices) between those classes, complicates the classification.
    In an independent application of the classifier to other galaxies
    (IC 1613, WLM, and Sextans A), we obtained an overall accuracy of ∼70%.
    This discrepancy is attributed to the different metallicity and
    extinction effects of the host galaxies. Motivated by the presence of
    missing values, we investigated the impact of missing data imputation
    using a simple replacement with mean values and an iterative imputer,
    which proved to be more capable. We also investigated the feature
    importance to find that r-i and y-[3.6] are the most important, although
    different classes are sensitive to different features (with potential
    improvement with additional features). The prediction capability of the
    classifier is limited by the available number of sources per class
    (which corresponds to the sampling of their feature space), reflecting
    the rarity of these objects and the possible physical links between
    these massive star phases. Our methodology is also efficient in
    correctly classifying sources with missing data as well as at lower
    metallicities (with some accuracy loss), making it an excellent tool for
    accentuating interesting objects and prioritizing targets for
    observations.

Description:
    The catalog of sources in M31 and M33 galaxies with known spectral
    types, as collected from the literature (see Table 1 for the list of
    references and numbers).
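
The Abstract states that the probabilities from three algorithms were
combined to obtain the final classification, but does not say how. The sketch
below illustrates one common choice, an unweighted mean of the per-class
probabilities (soft voting); this is an assumption and not necessarily the
scheme used in the paper. The short class labels, the function name
combine_probabilities and the example probabilities are all hypothetical.

```python
# Hypothetical short labels for the seven classes named in the Abstract.
CLASSES = ["BSG", "RSG", "YSG", "B[e]SG", "LBV", "WR", "Outlier"]


def combine_probabilities(per_model_probs, labels=CLASSES):
    """Average class-probability vectors from several models.

    per_model_probs: one probability vector per model, each with one entry
    per class. An unweighted mean is assumed here for illustration only.
    Returns (predicted_label, mean_probabilities).
    """
    n_models = len(per_model_probs)
    n_classes = len(per_model_probs[0])
    if any(len(p) != n_classes for p in per_model_probs):
        raise ValueError("all models must report the same number of classes")
    mean = [sum(p[k] for p in per_model_probs) / n_models
            for k in range(n_classes)]
    best = max(range(n_classes), key=mean.__getitem__)
    return labels[best], mean


# Made-up probabilities standing in for SVC, Random Forest and MLP outputs.
example = [
    [0.10, 0.70, 0.10, 0.02, 0.03, 0.03, 0.02],
    [0.05, 0.80, 0.05, 0.02, 0.03, 0.03, 0.02],
    [0.20, 0.50, 0.20, 0.02, 0.03, 0.03, 0.02],
]
label, mean_probs = combine_probabilities(example)
```

A weighted mean (for instance, weights tied to each model's validation score)
would only change the averaging line; the argmax step stays the same.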
File Summary:
--------------------------------------------------------------------------------
 FileName      Lrecl  Records   Explanations
--------------------------------------------------------------------------------
ReadMe            80        .   This file
tablea1.dat       50     2530   Sources with known spectral types
--------------------------------------------------------------------------------

Byte-by-byte Description of file: tablea1.dat
--------------------------------------------------------------------------------
   Bytes Format Units   Label    Explanations
--------------------------------------------------------------------------------
   1-  8  A8    ---     ID       Object ID in current work, M31-NNNN or M33-NNNN
  11- 18  F8.5  deg     RAdeg    Right Ascension (J2000)
  21- 28  F8.5  deg     DEdeg    Declination (J2000)
  31- 46  A16   ---     SpType   Spectral type from literature
  49- 50  I2    ---     r_SpType Reference for spectral type (1)
--------------------------------------------------------------------------------
Note (1): References as follows:
     1 = Gordon et al., 2016ApJ...825...50G, Cat. J/ApJ/825/50
     2 = Massey et al., 2016AJ....152...62M, Cat. J/AJ/152/62
     3 = Massey et al., 2019AJ....157..227M
     4 = Neugent et al., 2019ApJ...875..124N, Cat. J/ApJ/875/124
     5 = Drout et al., 2009ApJ...703..441D, Cat. J/ApJ/703/441
     6 = Massey et al., 2009ApJ...703..420M, Cat. J/ApJ/703/420
     7 = Humphreys et al., 2017ApJ...836...64H, Cat. J/ApJ/836/64
     8 = Neugent et al., 2012ApJ...759...11N, Cat. J/ApJ/759/11
     9 = Kraus, M. 2019, Galaxies, 7, 83
    10 = Kourniotis et al., 2018MNRAS.480.3706K
    11 = Neugent & Massey, 2011ApJ...733..123N, Cat. J/ApJ/733/123
    12 = Massey & Johnson, 1998ApJ...505..793M, Cat. J/ApJ/505/793
    13 = Bruhweiler et al., 2003AJ....125.3082B
    14 = Massey, 1998ApJ...501..153M, Cat. J/ApJ/501/153
    15 = Humphreys et al., 2014ApJ...790...48H, Cat. J/ApJ/790/48
    16 = Massey et al., 1996ApJ...469..629M, Cat. J/ApJ/469/629
    17 = Martin et al., 2017AJ....154...81M, Cat. J/AJ/154/81
    18 = Massey et al., 2007AJ....134.2474M, Cat. J/AJ/134/2474
    19 = Drout et al., 2012ApJ...750...97D, Cat. J/ApJ/750/97
--------------------------------------------------------------------------------

Acknowledgements:
    Grigoris Maravelias, maravelias(at)noa.gr
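
The byte-by-byte description of tablea1.dat maps directly onto fixed-width
string slices. The following is a minimal sketch of a reader in Python using
only the standard library. The names Source, parse_line and read_table are
hypothetical, the sample record is synthetic (built to match the column
layout, not copied from the catalogue), and treating a blank r_SpType as None
is an assumption.

```python
from typing import NamedTuple, Optional


class Source(NamedTuple):
    id: str
    ra_deg: float
    de_deg: float
    sp_type: str
    ref: Optional[int]


# Byte ranges from the byte-by-byte description (1-indexed, inclusive).
FIELDS = {
    "ID": (1, 8),
    "RAdeg": (11, 18),
    "DEdeg": (21, 28),
    "SpType": (31, 46),
    "r_SpType": (49, 50),
}


def parse_line(line: str) -> Source:
    """Parse one 50-byte record of tablea1.dat into a Source tuple."""
    def field(name: str) -> str:
        start, end = FIELDS[name]
        # Convert 1-indexed inclusive bytes to a 0-indexed Python slice.
        return line[start - 1:end].strip()

    ref = field("r_SpType")
    return Source(
        id=field("ID"),
        ra_deg=float(field("RAdeg")),
        de_deg=float(field("DEdeg")),
        sp_type=field("SpType"),
        ref=int(ref) if ref else None,  # assumption: blank means no reference
    )


def read_table(path: str = "tablea1.dat") -> list:
    """Read every non-empty record from the file at `path`."""
    with open(path, encoding="ascii") as fh:
        return [parse_line(row.rstrip("\n")) for row in fh if row.strip()]


# Synthetic record for illustration only; values are made up.
SAMPLE = f"{'M31-0001':<8}  {10.6847:8.5f}  {41.269:8.5f}  {'B1I':<16}  {1:2d}"
sample_source = parse_line(SAMPLE)
```

Keeping the byte ranges in a single table (FIELDS) means the slices can be
checked against the description above line by line. Readers who already use
astropy may prefer its CDS-format ASCII reader, which interprets this ReadMe
directly.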
(End) Patricia Vannier [CDS] 27-May-2022