J/A+A/649/A81     KiDSDR4 QSOs photometric redshifts catalog (Nakoneczny+, 2021)
Photometric selection and redshifts for quasars in the
Kilo-Degree Survey Data Release 4.
    Nakoneczny S.J., Bilicki M., Pollo A., Asgari M., Dvornik A., Erben T.,
    Giblin B., Heymans C., Hildebrandt H., Kannawadi A., Kuijken K.,
    Napolitano N.R., Valentijn E.
    <Astron. Astrophys. 649, A81 (2021)>
    =2021A&A...649A..81N        (SIMBAD/NED BibCode)
ADC_Keywords: Surveys ; Active gal. nuclei ; QSOs ; Galaxy catalogs ;
              Galaxies, photometry ; Redshifts ; Colors
Keywords: methods: data analysis - methods: observational - catalog - surveys -
          quasars: general - large-scale structure of Universe

Abstract:
    We present a catalog of quasars with their corresponding redshifts
    derived from the photometric Kilo-Degree Survey (KiDS) Data Release 4.
    We achieved it by training machine learning (ML) models using optical
    ugri and near-infrared ZYJHKs bands, on objects known from SDSS
    spectroscopy. We define inference subsets from the 45 million objects
    of the KiDS photometric data limited to 9-band detections, based on a
    feature space built from magnitudes and their combinations. We show
    that projections of the high-dimensional feature space on two
    dimensions can be successfully used instead of the standard
    color-color plots, to investigate the photometric estimations, compare
    them with spectroscopic data, and efficiently support the process of
    building a catalog. The model selection and fine-tuning employs two
    subsets of objects: those randomly selected and the faintest ones,
    which allows us to properly fit the bias vs. variance trade-off. We
    test three ML models: Random Forest (RF), XGBoost (XGB) and Artificial
    Neural Network (ANN). We find that XGB is the most robust and
    straightforward model for classification, while ANN performs the best
    for combined classification and redshift. The ANN inference results
    are tested using number counts, Gaia parallaxes and other quasar
    catalogs external to the training set. Based on these tests, we derive
    the minimum classification probability for quasar candidates which
    provides the best purity vs. completeness trade-off: p(QSO_cand)>0.9
    for r<22, and p(QSO_cand)>0.98 for 22<r<23.5. We find 158000 quasar
    candidates in the safe inference subset (r<22), and further 185000 in
    the reliable extrapolation regime (22<r<23.5). Test-data purity equals
    97%, completeness is 94%, the latter dropping by 3% in the
    extrapolation to data fainter by one magnitude than the training set.
    The photometric redshifts are derived with ANN and modeled with
    Gaussian uncertainties. Test-data redshift error (mean and scatter)
    equals 0.009±0.12 in the safe subset, and -0.0004±0.19 in the
    extrapolation, averaged over redshift range 0.14<z<3.63 (1st and 99th
    percentiles). The success of this extrapolation challenges the way that
    models are optimized and applied at the faint data end. The resulting
    catalog is ready for cosmology and Active Galactic Nucleus (AGN)
    studies.
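
The probability thresholds quoted in the abstract can be written as a small
selection function. This is only a sketch under stated assumptions: it takes
the PQSO column of the data files to play the role of p(QSO_cand), and the
function name and the treatment of r exactly equal to 22 are choices made
here, not taken from the paper.

```python
def passes_recommended_cut(rmag, p_qso):
    """Apply the purity vs. completeness cut quoted in the abstract.

    rmag  : r-band magnitude of the source
    p_qso : quasar classification probability (assumed to be PQSO)
    """
    if rmag < 22.0:
        # Safe regime: p(QSO_cand) > 0.9 for r < 22
        return p_qso > 0.9
    if rmag < 23.5:
        # Reliable extrapolation regime: p(QSO_cand) > 0.98 for 22 < r < 23.5.
        # Assumption: r == 22 exactly is given the stricter threshold.
        return p_qso > 0.98
    # No threshold is quoted for r >= 23.5, so such sources are rejected here.
    return False
```

For example, a source with r=22.5 and a probability of 0.95 is rejected by
this function, while the same probability at r=21 is accepted.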

Description:
    The catalog results from applying artificial neural networks to
    process the KiDS data limited to 9-band detections. The machine
    learning (ML) models are trained on KiDS objects cross-matched with
    the spectroscopic SDSS survey. We address the problem of extrapolation
    to KiDS objects fainter than the SDSS limit by properly generalising
    ML models, and creating inference subsets which describe the
    reliability of estimations: safe, extrapolation, unsafe. We provide
    the suggested cuts on magnitude and probability of photometric
    classification, which are derived from validating the catalog with
    several methods.
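
The three inference subsets can be reproduced from the r-band magnitude and
the stellarity index, following the definitions given in Note (1) of the
byte-by-byte description. The sketch below is illustrative: the function name
is hypothetical, and the note does not say where r exactly equal to 22 or 25
belongs, so the boundary handling is an assumption.

```python
def inference_subset(rmag, class_star):
    """Return 'safe', 'extrapolation' or 'unsafe' for one source.

    rmag       : r-band magnitude
    class_star : stellarity index (assumed to be the ClassStar column)
    """
    # Stellarity strictly inside (0.2, 0.8) is ambiguous between point-like
    # and extended sources, which makes the estimate unsafe.
    ambiguous = 0.2 < class_star < 0.8
    if ambiguous or rmag > 25.0:
        return "unsafe"
    if rmag < 22.0:
        return "safe"
    # Assumption: r == 22 and r == 25 fall into the extrapolation subset.
    return "extrapolation"
```

The Subset column of the data files already stores this label, so a function
like this is mainly useful for checking or for applying the same logic to
other data.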

File Summary:
--------------------------------------------------------------------------------
 FileName      Lrecl  Records   Explanations
--------------------------------------------------------------------------------
ReadMe            80        .   This file
qsos.dat         155  1095711  *Catalog of quasar candidates
all.dat          155 45469955  *Catalog of all machine learning estimates
--------------------------------------------------------------------------------
Note on qsos.dat: Data limited to:
   9-band detections, r<25, CLASS_STAR<0.2 or CLASS_STAR>0.8, QSO_PHOTO>0.9.
Note on all.dat: Data limited to 9-band detections.
--------------------------------------------------------------------------------
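
The selection in the note on qsos.dat can be applied to all.dat as a streaming
filter, so that the 45 million records never have to be held in memory. The
sketch below assumes that CLASS_STAR and QSO_PHOTO in the note correspond to
the ClassStar and PQSO columns, and that those fields are never blank; the
function names and the synthetic-line helper are hypothetical.

```python
def iter_qso_candidates(lines):
    """Yield records passing r<25, CLASS_STAR<0.2 or >0.8, QSO_PHOTO>0.9."""
    for line in lines:
        # Byte ranges are 1-indexed and inclusive in the ReadMe,
        # hence the shifted Python slices.
        rmag = float(line[52:59])         # bytes 53-59, rmag
        class_star = float(line[60:68])   # bytes 61-68, ClassStar
        p_qso = float(line[88:100])       # bytes 89-100, PQSO (assumed QSO_PHOTO)
        point_or_extended = class_star < 0.2 or class_star > 0.8
        if rmag < 25.0 and point_or_extended and p_qso > 0.9:
            yield line.rstrip("\n")


def synthetic_line(rmag, class_star, p_qso):
    """Build a made-up 155-byte record holding only the three fields used above."""
    return (" " * 52 + f"{rmag:7.4f}" + " " + f"{class_star:8.6f}"
            + " " * 20 + f"{p_qso:12.6e}" + " " * 55)


# Demonstration on made-up values; not real catalog entries.
demo = [synthetic_line(21.5, 0.95, 0.98), synthetic_line(21.5, 0.50, 0.98)]
selected = list(iter_qso_candidates(demo))
```

With a real file, the same generator can be fed an open file handle, for
instance iter_qso_candidates(open("all.dat")), since a text file iterates
line by line.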

See also:
   J/A+A/624/A13 : KiDS DR3 QSO catalog (Nakoneczny+, 2019)

Byte-by-byte Description of file: qsos.dat all.dat
--------------------------------------------------------------------------------
   Bytes Format Units  Label     Explanations
--------------------------------------------------------------------------------
   1- 29  A29   ---    KiDSDR4   KiDSDR4 designation,
                                  KiDSDR4 JHHMMSS.sss+DDMMSS.ss
  31- 40  F10.6 deg    RAdeg     [0/360] Centroid sky position
                                  right ascension (J2000)
  42- 51  F10.6 deg    DEdeg     Centroid sky position declination (J2000)
  53- 59  F7.4  mag    rmag      r-band GAaP magnitude with optimal MIN_APER
                                  (extinction corrected)
  61- 68  F8.6  ---    ClassStar SExtractor star-galaxy classifier
  70- 74  I5    ---    Mask      9-band mask information
  76- 87  E12.6 ---    PGalaxy   Probability that the source is a galaxy
  89-100  E12.6 ---    PQSO      Probability that the source is a QSO
 102-113  E12.6 ---    PStar     Probability that the source is a star
 115-120  A6    ---    Class     Object class with the highest probability,
                                  GALAXY, QSO or STAR
 122-130  F9.7  ---    zph       Photometric redshift
 132-141  F10.8 ---  e_zph       Uncertainty of photometric redshift
 143-155  A13   ---    Subset    ML inference subset (1)
--------------------------------------------------------------------------------
Note (1): see Section 2.2 in the paper. Values as follows:
  safe          = r<22 and stellarity index (ClassStar) outside (0.2, 0.8)
  extrapolation = 22<r<25 and stellarity index (ClassStar) outside (0.2, 0.8)
  unsafe        = r>25 or stellarity index (ClassStar) inside (0.2, 0.8)
--------------------------------------------------------------------------------
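
The byte-by-byte description above maps directly onto string slices. The
following is a minimal sketch of a record parser for qsos.dat and all.dat; the
byte ranges and labels are copied from the table, while the function names and
the sample record are hypothetical and the sample values are made up for
illustration.

```python
# (label, first byte, last byte, converter); bytes are 1-indexed, inclusive.
COLUMNS = [
    ("KiDSDR4",     1,  29, str),
    ("RAdeg",      31,  40, float),
    ("DEdeg",      42,  51, float),
    ("rmag",       53,  59, float),
    ("ClassStar",  61,  68, float),
    ("Mask",       70,  74, int),
    ("PGalaxy",    76,  87, float),
    ("PQSO",       89, 100, float),
    ("PStar",     102, 113, float),
    ("Class",     115, 120, str),
    ("zph",       122, 130, float),
    ("e_zph",     132, 141, float),
    ("Subset",    143, 155, str),
]


def parse_record(line):
    """Convert one fixed-width line into a dict keyed by column label."""
    record = {}
    for label, first, last, convert in COLUMNS:
        field = line[first - 1:last].strip()
        # Blank fields are returned as None rather than raising an error.
        record[label] = convert(field) if field else None
    return record


def make_synthetic_line(values):
    """Lay out a dict of strings as a 155-byte line (illustration only)."""
    buf = [" "] * 155
    for label, first, last, _ in COLUMNS:
        width = last - first + 1
        buf[first - 1:last] = list(values[label].ljust(width)[:width])
    return "".join(buf)


# Made-up values, not a real catalog entry.
SAMPLE = make_synthetic_line({
    "KiDSDR4": "KiDSDR4 J000000.000-300000.00",
    "RAdeg": "0.000000", "DEdeg": "-30.000000", "rmag": "21.5000",
    "ClassStar": "0.950000", "Mask": "0",
    "PGalaxy": "1.000000e-02", "PQSO": "9.800000e-01",
    "PStar": "1.000000e-02", "Class": "QSO",
    "zph": "1.5000000", "e_zph": "0.10000000", "Subset": "safe",
})
record = parse_record(SAMPLE)
```

Readers who prefer an existing tool may find that the CDS-format reader in
astropy.io.ascii handles this ReadMe directly; the hand-written parser is
shown only to make the byte layout explicit.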

Acknowledgements:
    Szymon J. Nakoneczny, szymon.nakoneczny(at)ncbj.gov.pl

(End)   Szymon J. Nakoneczny [NCBJ, Poland], Patricia Vannier [CDS]  22-Feb-2021

