J/A+A/649/A81     KiDSDR4 QSOs photometric redshifts catalog (Nakoneczny+, 2021)

Photometric selection and redshifts for quasars in the Kilo-Degree Survey
Data Release 4.
    Nakoneczny S.J., Bilicki M., Pollo A., Asgari M., Dvornik A., Erben T.,
    Giblin B., Heymans C., Hildebrandt H., Kannawadi A., Kuijken K.,
    Napolitano N.R., Valentijn E.
   <Astron. Astrophys. 649, A81 (2021)>
   =2021A&A...649A..81N        (SIMBAD/NED BibCode)
ADC_Keywords: Surveys ; Active gal. nuclei ; QSOs ; Galaxy catalogs ;
              Galaxies, photometry ; Redshifts ; Colors
Keywords: methods: data analysis - methods: observational - catalog -
          surveys - quasars: general - large-scale structure of Universe

Abstract:
    We present a catalog of quasars with their corresponding redshifts
    derived from the photometric Kilo-Degree Survey (KiDS) Data Release 4.
    We achieved it by training machine learning (ML) models using optical
    ugri and near-infrared ZYJHKs bands, on objects known from SDSS
    spectroscopy. We define inference subsets from the 45 million objects
    of the KiDS photometric data limited to 9-band detections, based on a
    feature space built from magnitudes and their combinations. We show
    that projections of the high-dimensional feature space on two
    dimensions can be successfully used instead of the standard
    color-color plots, to investigate the photometric estimations, compare
    them with spectroscopic data, and efficiently support the process of
    building a catalog. The model selection and fine-tuning employs two
    subsets of objects: those randomly selected and the faintest ones,
    which allows us to properly fit the bias vs. variance trade-off. We
    test three ML models: Random Forest (RF), XGBoost (XGB) and Artificial
    Neural Network (ANN). We find that XGB is the most robust and
    straightforward model for classification, while ANN performs the best
    for combined classification and redshift. The ANN inference results
    are tested using number counts, Gaia parallaxes and other quasar
    catalogs external to the training set. Based on these tests, we derive
    the minimum classification probability for quasar candidates which
    provides the best purity vs. completeness trade-off: p(QSO_cand)>0.9
    for r<22, and p(QSO_cand)>0.98 for 22<r<23.5. We find 158000 quasar
    candidates in the safe inference subset (r<22), and further 185000 in
    the reliable extrapolation regime (22<r<23.5).
    Test-data purity equals 97%, completeness is 94%, the latter dropping
    by 3% in the extrapolation to data fainter by one magnitude than the
    training set. The photometric redshifts are derived with ANN and
    modeled with Gaussian uncertainties. Test-data redshift error (mean
    and scatter) equals 0.009±0.12 in the safe subset, and -0.0004±0.19 in
    the extrapolation, averaged over redshift range 0.14<z<3.63 (1st and
    99th percentiles). Our success of the extrapolation challenges the way
    that models are optimized and applied at the faint data end. The
    resulting catalog is ready for cosmology and Active Galactic Nucleus
    (AGN) studies.

Description:
    The catalog results from applying artificial neural networks to
    process the KiDS data limited to 9-band detections. The machine
    learning (ML) models are trained on KiDS objects cross-matched with
    the spectroscopic SDSS survey. We address the problem of extrapolation
    to KiDS objects fainter than the SDSS limit by properly generalising
    ML models, and creating inference subsets which describe the
    reliability of estimations: safe, extrapolation, unsafe. We provide
    the suggested cuts on magnitude and probability of photometric
    classification, which are derived from validating the catalog with
    several methods.

File Summary:
--------------------------------------------------------------------------------
 FileName   Lrecl   Records   Explanations
--------------------------------------------------------------------------------
ReadMe         80         .   This file
qsos.dat      155   1095711  *Catalog of quasar candidates
all.dat       155  45469955  *Catalog of all machine learning estimates
--------------------------------------------------------------------------------
Note on qsos.dat: Data limited to: 9-band detections, r<25,
  CLASS_STAR<0.2 or CLASS_STAR>0.8, QSO_PHOTO>0.9.
Note on all.dat: Data limited to 9-band detections.
--------------------------------------------------------------------------------

See also:
 J/A+A/624/A13 : KiDS DR3 QSO catalog (Nakoneczny+, 2019)

Byte-by-byte Description of file: qsos.dat all.dat
--------------------------------------------------------------------------------
   Bytes Format Units   Label     Explanations
--------------------------------------------------------------------------------
   1- 29  A29   ---     KiDSDR4   KiDSDR4 designation,
                                  KiDSDR4 JHHMMSS.sss+DDMMSS.ss
  31- 40  F10.6 deg     RAdeg     [0/360] Centroid sky position right
                                  ascension (J2000)
  42- 51  F10.6 deg     DEdeg     Centroid sky position declination (J2000)
  53- 59  F7.4  mag     rmag      r-band GAaP magnitude with optimal MIN_APER
                                  (extinction corrected)
  61- 68  F8.6  ---     ClassStar SExtractor star-galaxy classifier
  70- 74  I5    ---     Mask      9-band mask information
  76- 87  E12.6 ---     PGalaxy   Probability that the source is a galaxy
  89-100  E12.6 ---     PQSO      Probability that the source is a QSO
 102-113  E12.6 ---     PStar     Probability that the source is a star
 115-120  A6    ---     Class     Object class with the highest probability,
                                  GALAXY, QSO or STAR
 122-130  F9.7  ---     zph       Photometric redshift
 132-141  F10.8 ---     e_zph     Uncertainty of photometric redshift
 143-155  A13   ---     Subset    ML inference subset (1)
--------------------------------------------------------------------------------
Note (1): see Section 2.2 in the paper. Values as follows:
    safe          = safe subset: r<22 and stellarity index ∉ (0.2, 0.8)
    extrapolation = extrapolation subset: r ∈ (22, 25) and stellarity
                    index ∉ (0.2, 0.8)
    unsafe        = unsafe subset: r>25 or stellarity index ∈ (0.2, 0.8)
--------------------------------------------------------------------------------

Acknowledgements:
    Szymon J. Nakoneczny, szymon.nakoneczny(at)ncbj.gov.pl
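
Reading example (illustrative, not part of the original catalogue files):
    The byte-by-byte description of qsos.dat and all.dat can be turned into
    a small fixed-width reader. The following is a minimal Python sketch
    using only the standard library. The names FIELDS and parse_record are
    hypothetical helpers introduced here; the byte ranges and labels are
    copied from the table. Returning None for a blank field is an
    assumption of this sketch, not a documented convention of the catalog.

```python
# Column layout copied from the byte-by-byte description.
# Byte positions are 1-indexed and inclusive, as in the ReadMe.
FIELDS = [
    ("KiDSDR4",   1,   29, str),
    ("RAdeg",     31,  40, float),
    ("DEdeg",     42,  51, float),
    ("rmag",      53,  59, float),
    ("ClassStar", 61,  68, float),
    ("Mask",      70,  74, int),
    ("PGalaxy",   76,  87, float),
    ("PQSO",      89,  100, float),
    ("PStar",     102, 113, float),
    ("Class",     115, 120, str),
    ("zph",       122, 130, float),
    ("e_zph",     132, 141, float),
    ("Subset",    143, 155, str),
]


def parse_record(line):
    """Split one 155-byte record into a dict keyed by column label.

    Blank fields are returned as None (an assumption of this sketch).
    """
    record = {}
    for label, first, last, convert in FIELDS:
        # Convert the 1-indexed inclusive byte range to a Python slice.
        raw = line[first - 1:last].strip()
        record[label] = convert(raw) if raw else None
    return record
```

    A file would then be read line by line, for example with
    [parse_record(l.rstrip("\n")) for l in open("qsos.dat")], assuming the
    file has been downloaded locally under that name.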
(End) Szymon J. Nakoneczny [NCBJ, Poland], Patricia Vannier [CDS] 22-Feb-2021
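
Selection example (illustrative, not part of the original catalogue files):
    The inference subsets of Note (1) and the probability cuts quoted in
    the abstract can be sketched as two small Python functions. The names
    inference_subset and passes_recommended_cut are hypothetical. This
    sketch assumes that the PQSO column corresponds to p(QSO_cand), and it
    assigns the boundary values r=22 and r=25, which the ReadMe leaves
    unspecified, to the extrapolation subset. It is not the authors' code.

```python
def inference_subset(rmag, class_star):
    """Return 'safe', 'extrapolation' or 'unsafe' following Note (1).

    Assumption: r == 22 and r == 25 exactly are not covered by the note;
    this sketch places both in 'extrapolation'.
    """
    ambiguous_stellarity = 0.2 < class_star < 0.8
    if rmag > 25 or ambiguous_stellarity:
        return "unsafe"
    if rmag < 22:
        return "safe"
    return "extrapolation"


def passes_recommended_cut(rmag, p_qso):
    """Apply the purity vs. completeness cuts quoted in the abstract.

    p_qso > 0.9 for r < 22, and p_qso > 0.98 for 22 < r < 23.5.
    Assumption: r == 22 exactly is treated with the stricter faint cut,
    and anything at r >= 23.5 is rejected.
    """
    if rmag < 22:
        return p_qso > 0.9
    if rmag < 23.5:
        return p_qso > 0.98
    return False
```

    Applied to a record with columns rmag, ClassStar and PQSO, a candidate
    would be kept when inference_subset(rmag, ClassStar) is not "unsafe"
    and passes_recommended_cut(rmag, PQSO) is true.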