J/MNRAS/476/2117    Outliers and similarity in APOGEE    (Reis+, 2018)

Detecting outliers and learning complex structures with large spectroscopic
surveys - a case study with APOGEE stars.
    Reis I., Poznanski D., Baron D., Zasowski G., Shahaf S.
   <Mon. Not. R. Astron. Soc. 476, 2117 (2018)>
   =2018MNRAS.476.2117R    (SIMBAD/NED BibCode)

ADC_Keywords: Spectroscopy ; Stars, normal
Keywords: methods: data analysis - stars: general - stars: peculiar

Abstract:
    In this work we apply and expand on a recently introduced outlier
    detection algorithm that is based on an unsupervised random forest. We
    use the algorithm to calculate a similarity measure for stellar spectra
    from the Apache Point Observatory Galactic Evolution Experiment
    (APOGEE). We show that the similarity measure traces non-trivial
    physical properties and contains information about complex structures
    in the data. We use it for visualization and clustering of the dataset,
    and discuss its ability to find groups of highly similar objects,
    including spectroscopic twins. Using the similarity matrix to search
    the dataset for objects allows us to find objects that are impossible
    to find using their best fitting model parameters. This includes
    extreme objects for which the models fail, and rare objects that are
    outside the scope of the model. We use the similarity measure to detect
    outliers in the dataset, and find a number of previously unknown
    Be-type stars, spectroscopic binaries, carbon rich stars, young stars,
    and a few that we cannot interpret. Our work further demonstrates the
    potential for scientific discovery when combining machine learning
    methods with modern survey data.

Description:
    t-SNE is a dimensionality reduction algorithm that is particularly well
    suited for the visualization of high-dimensional datasets. We use t-SNE
    to visualize our distance matrix. A priori, these distances could
    define a space with almost as many dimensions as objects, i.e., tens of
    thousands of dimensions. Since many stars are quite similar, and their
    spectra are defined by a few physical parameters, the minimal spanning
    space might be smaller. By using t-SNE we can examine the structure of
    our sample projected into 2D. We use our distance matrix as input to
    the t-SNE algorithm and in return get a 2D map of the objects in our
    dataset.

    The catalogue lists, for each star in a sample of 183232 APOGEE stars,
    the APOGEE IDs of the 99 stars with the most similar spectra (according
    to the method described in the paper), ordered by similarity.

File Summary:

 FileName      Lrecl  Records   Explanations

ReadMe            80        .   This file
apogeenn.dat    1899   183232   Nearest neighbors APOGEE IDs
distance.dat    1602   183232   Distances to nearest neighbors
tsnecoor.dat      56   193556   t-SNE coordinates (map in paper)
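
The record lengths (Lrecl) listed above can be cross-checked against the
byte-by-byte layouts of the three data files: each record is an 18-character
target name followed by fixed-width columns, each preceded by one blank. The
sketch below only illustrates that arithmetic; the helper name
`record_length` is hypothetical and not part of the catalogue.

```python
def record_length(name_width, n_fields, field_width, sep=1):
    """Length of a fixed-width record: a name column plus n_fields columns,
    each preceded by `sep` blank characters."""
    return name_width + n_fields * (sep + field_width)

# apogeenn.dat: 99 neighbor IDs in A18 columns
apogeenn_lrecl = record_length(18, 99, 18)
# distance.dat: 99 distances in F15.13 columns
distance_lrecl = record_length(18, 99, 15)
# tsnecoor.dat: 2 coordinates in E18.15 columns
tsnecoor_lrecl = record_length(18, 2, 18)

print(apogeenn_lrecl, distance_lrecl, tsnecoor_lrecl)  # 1899 1602 56
```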

See also:
 J/AJ/146/133     : SDSS-III APOGEE DR10 stellar parameters (Meszaros+, 2013)
 J/ApJ/794/125    : IN-SYNC. I. APOGEE stellar parameters (Cottaar+, 2014)
 J/A+A/589/A80    : APOGEE strings (Hacar+, 2016)
 J/A+A/594/A43    : APOGEE/Kepler sample stars abundances (Hawkins+, 2016)
 J/MNRAS/460/3179 : APOGEE stars distance and extinction (Wang+, 2016)

Byte-by-byte Description of file: apogeenn.dat

    Bytes  Format Units Label   Explanations

   1-  18  A18    ---   Target  Target name
  20-  37  A18    ---   NN1     1st nearest neighbor of Target object
  39-  56  A18    ---   NN2     2nd nearest neighbor of Target object
  58-  75  A18    ---   NN3     3rd nearest neighbor of Target object
  77-  94  A18    ---   NN4     4th nearest neighbor of Target object
  96- 113  A18    ---   NN5     5th nearest neighbor of Target object
 115- 132  A18    ---   NN6     6th nearest neighbor of Target object
 134- 151  A18    ---   NN7     7th nearest neighbor of Target object
 153- 170  A18    ---   NN8     8th nearest neighbor of Target object
 172- 189  A18    ---   NN9     9th nearest neighbor of Target object
 191- 208  A18    ---   NN10    10th nearest neighbor of Target object
 210- 227  A18    ---   NN11    11th nearest neighbor of Target object
 229- 246  A18    ---   NN12    12th nearest neighbor of Target object
 248- 265  A18    ---   NN13    13th nearest neighbor of Target object
 267- 284  A18    ---   NN14    14th nearest neighbor of Target object
 286- 303  A18    ---   NN15    15th nearest neighbor of Target object
 305- 322  A18    ---   NN16    16th nearest neighbor of Target object
 324- 341  A18    ---   NN17    17th nearest neighbor of Target object
 343- 360  A18    ---   NN18    18th nearest neighbor of Target object
 362- 379  A18    ---   NN19    19th nearest neighbor of Target object
 381- 398  A18    ---   NN20    20th nearest neighbor of Target object
 400- 417  A18    ---   NN21    21st nearest neighbor of Target object
 419- 436  A18    ---   NN22    22nd nearest neighbor of Target object
 438- 455  A18    ---   NN23    23rd nearest neighbor of Target object
 457- 474  A18    ---   NN24    24th nearest neighbor of Target object
 476- 493  A18    ---   NN25    25th nearest neighbor of Target object
 495- 512  A18    ---   NN26    26th nearest neighbor of Target object
 514- 531  A18    ---   NN27    27th nearest neighbor of Target object
 533- 550  A18    ---   NN28    28th nearest neighbor of Target object
 552- 569  A18    ---   NN29    29th nearest neighbor of Target object
 571- 588  A18    ---   NN30    30th nearest neighbor of Target object
 590- 607  A18    ---   NN31    31st nearest neighbor of Target object
 609- 626  A18    ---   NN32    32nd nearest neighbor of Target object
 628- 645  A18    ---   NN33    33rd nearest neighbor of Target object
 647- 664  A18    ---   NN34    34th nearest neighbor of Target object
 666- 683  A18    ---   NN35    35th nearest neighbor of Target object
 685- 702  A18    ---   NN36    36th nearest neighbor of Target object
 704- 721  A18    ---   NN37    37th nearest neighbor of Target object
 723- 740  A18    ---   NN38    38th nearest neighbor of Target object
 742- 759  A18    ---   NN39    39th nearest neighbor of Target object
 761- 778  A18    ---   NN40    40th nearest neighbor of Target object
 780- 797  A18    ---   NN41    41st nearest neighbor of Target object
 799- 816  A18    ---   NN42    42nd nearest neighbor of Target object
 818- 835  A18    ---   NN43    43rd nearest neighbor of Target object
 837- 854  A18    ---   NN44    44th nearest neighbor of Target object
 856- 873  A18    ---   NN45    45th nearest neighbor of Target object
 875- 892  A18    ---   NN46    46th nearest neighbor of Target object
 894- 911  A18    ---   NN47    47th nearest neighbor of Target object
 913- 930  A18    ---   NN48    48th nearest neighbor of Target object
 932- 949  A18    ---   NN49    49th nearest neighbor of Target object
 951- 968  A18    ---   NN50    50th nearest neighbor of Target object
 970- 987  A18    ---   NN51    51st nearest neighbor of Target object
 989-1006  A18    ---   NN52    52nd nearest neighbor of Target object
1008-1025  A18    ---   NN53    53rd nearest neighbor of Target object
1027-1044  A18    ---   NN54    54th nearest neighbor of Target object
1046-1063  A18    ---   NN55    55th nearest neighbor of Target object
1065-1082  A18    ---   NN56    56th nearest neighbor of Target object
1084-1101  A18    ---   NN57    57th nearest neighbor of Target object
1103-1120  A18    ---   NN58    58th nearest neighbor of Target object
1122-1139  A18    ---   NN59    59th nearest neighbor of Target object
1141-1158  A18    ---   NN60    60th nearest neighbor of Target object
1160-1177  A18    ---   NN61    61st nearest neighbor of Target object
1179-1196  A18    ---   NN62    62nd nearest neighbor of Target object
1198-1215  A18    ---   NN63    63rd nearest neighbor of Target object
1217-1234  A18    ---   NN64    64th nearest neighbor of Target object
1236-1253  A18    ---   NN65    65th nearest neighbor of Target object
1255-1272  A18    ---   NN66    66th nearest neighbor of Target object
1274-1291  A18    ---   NN67    67th nearest neighbor of Target object
1293-1310  A18    ---   NN68    68th nearest neighbor of Target object
1312-1329  A18    ---   NN69    69th nearest neighbor of Target object
1331-1348  A18    ---   NN70    70th nearest neighbor of Target object
1350-1367  A18    ---   NN71    71st nearest neighbor of Target object
1369-1386  A18    ---   NN72    72nd nearest neighbor of Target object
1388-1405  A18    ---   NN73    73rd nearest neighbor of Target object
1407-1424  A18    ---   NN74    74th nearest neighbor of Target object
1426-1443  A18    ---   NN75    75th nearest neighbor of Target object
1445-1462  A18    ---   NN76    76th nearest neighbor of Target object
1464-1481  A18    ---   NN77    77th nearest neighbor of Target object
1483-1500  A18    ---   NN78    78th nearest neighbor of Target object
1502-1519  A18    ---   NN79    79th nearest neighbor of Target object
1521-1538  A18    ---   NN80    80th nearest neighbor of Target object
1540-1557  A18    ---   NN81    81st nearest neighbor of Target object
1559-1576  A18    ---   NN82    82nd nearest neighbor of Target object
1578-1595  A18    ---   NN83    83rd nearest neighbor of Target object
1597-1614  A18    ---   NN84    84th nearest neighbor of Target object
1616-1633  A18    ---   NN85    85th nearest neighbor of Target object
1635-1652  A18    ---   NN86    86th nearest neighbor of Target object
1654-1671  A18    ---   NN87    87th nearest neighbor of Target object
1673-1690  A18    ---   NN88    88th nearest neighbor of Target object
1692-1709  A18    ---   NN89    89th nearest neighbor of Target object
1711-1728  A18    ---   NN90    90th nearest neighbor of Target object
1730-1747  A18    ---   NN91    91st nearest neighbor of Target object
1749-1766  A18    ---   NN92    92nd nearest neighbor of Target object
1768-1785  A18    ---   NN93    93rd nearest neighbor of Target object
1787-1804  A18    ---   NN94    94th nearest neighbor of Target object
1806-1823  A18    ---   NN95    95th nearest neighbor of Target object
1825-1842  A18    ---   NN96    96th nearest neighbor of Target object
1844-1861  A18    ---   NN97    97th nearest neighbor of Target object
1863-1880  A18    ---   NN98    98th nearest neighbor of Target object
1882-1899  A18    ---   NN99    99th nearest neighbor of Target object
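
Because the neighbor columns of apogeenn.dat repeat with a fixed stride of 19
bytes (one blank plus an A18 field), a record can be split by plain string
slicing. The following is a minimal sketch, not an official reader: the
function name `parse_apogeenn_line` and the synthetic IDs are assumptions made
for illustration, and real records should be read from the catalogue file.

```python
def parse_apogeenn_line(line, n_neighbors=99):
    """Split one apogeenn.dat record into (target, [NN1, ..., NN99])."""
    target = line[0:18].strip()              # bytes 1-18
    neighbors = []
    for k in range(n_neighbors):
        start = 19 + 19 * k                  # 0-based offset of byte 20 + 19*k
        neighbors.append(line[start:start + 18].strip())
    return target, neighbors

# Synthetic 18-character IDs (made up, for illustration only).
fake_ids = ["2M%08d+%07d" % (i, i) for i in range(100)]
record = " ".join(fake_ids)                  # 1899 characters, like Lrecl

target, neighbors = parse_apogeenn_line(record)
print(target, neighbors[0], neighbors[-1], len(neighbors))
```

The list index maps directly onto the column labels: `neighbors[0]` is NN1 and
`neighbors[98]` is NN99.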

Byte-by-byte Description of file: distance.dat

    Bytes  Format Units Label   Explanations

   1-  18  A18    ---   Target  Target name
  20-  34  F15.13 ---   Dist1   Distance to 1st nearest neighbor of Target
  36-  50  F15.13 ---   Dist2   Distance to 2nd nearest neighbor of Target
  52-  66  F15.13 ---   Dist3   Distance to 3rd nearest neighbor of Target
  68-  82  F15.13 ---   Dist4   Distance to 4th nearest neighbor of Target
  84-  98  F15.13 ---   Dist5   Distance to 5th nearest neighbor of Target
 100- 114  F15.13 ---   Dist6   Distance to 6th nearest neighbor of Target
 116- 130  F15.13 ---   Dist7   Distance to 7th nearest neighbor of Target
 132- 146  F15.13 ---   Dist8   Distance to 8th nearest neighbor of Target
 148- 162  F15.13 ---   Dist9   Distance to 9th nearest neighbor of Target
 164- 178  F15.13 ---   Dist10  Distance to 10th nearest neighbor of Target
 180- 194  F15.13 ---   Dist11  Distance to 11th nearest neighbor of Target
 196- 210  F15.13 ---   Dist12  Distance to 12th nearest neighbor of Target
 212- 226  F15.13 ---   Dist13  Distance to 13th nearest neighbor of Target
 228- 242  F15.13 ---   Dist14  Distance to 14th nearest neighbor of Target
 244- 258  F15.13 ---   Dist15  Distance to 15th nearest neighbor of Target
 260- 274  F15.13 ---   Dist16  Distance to 16th nearest neighbor of Target
 276- 290  F15.13 ---   Dist17  Distance to 17th nearest neighbor of Target
 292- 306  F15.13 ---   Dist18  Distance to 18th nearest neighbor of Target
 308- 322  F15.13 ---   Dist19  Distance to 19th nearest neighbor of Target
 324- 338  F15.13 ---   Dist20  Distance to 20th nearest neighbor of Target
 340- 354  F15.13 ---   Dist21  Distance to 21st nearest neighbor of Target
 356- 370  F15.13 ---   Dist22  Distance to 22nd nearest neighbor of Target
 372- 386  F15.13 ---   Dist23  Distance to 23rd nearest neighbor of Target
 388- 402  F15.13 ---   Dist24  Distance to 24th nearest neighbor of Target
 404- 418  F15.13 ---   Dist25  Distance to 25th nearest neighbor of Target
 420- 434  F15.13 ---   Dist26  Distance to 26th nearest neighbor of Target
 436- 450  F15.13 ---   Dist27  Distance to 27th nearest neighbor of Target
 452- 466  F15.13 ---   Dist28  Distance to 28th nearest neighbor of Target
 468- 482  F15.13 ---   Dist29  Distance to 29th nearest neighbor of Target
 484- 498  F15.13 ---   Dist30  Distance to 30th nearest neighbor of Target
 500- 514  F15.13 ---   Dist31  Distance to 31st nearest neighbor of Target
 516- 530  F15.13 ---   Dist32  Distance to 32nd nearest neighbor of Target
 532- 546  F15.13 ---   Dist33  Distance to 33rd nearest neighbor of Target
 548- 562  F15.13 ---   Dist34  Distance to 34th nearest neighbor of Target
 564- 578  F15.13 ---   Dist35  Distance to 35th nearest neighbor of Target
 580- 594  F15.13 ---   Dist36  Distance to 36th nearest neighbor of Target
 596- 610  F15.13 ---   Dist37  Distance to 37th nearest neighbor of Target
 612- 626  F15.13 ---   Dist38  Distance to 38th nearest neighbor of Target
 628- 642  F15.13 ---   Dist39  Distance to 39th nearest neighbor of Target
 644- 658  F15.13 ---   Dist40  Distance to 40th nearest neighbor of Target
 660- 674  F15.13 ---   Dist41  Distance to 41st nearest neighbor of Target
 676- 690  F15.13 ---   Dist42  Distance to 42nd nearest neighbor of Target
 692- 706  F15.13 ---   Dist43  Distance to 43rd nearest neighbor of Target
 708- 722  F15.13 ---   Dist44  Distance to 44th nearest neighbor of Target
 724- 738  F15.13 ---   Dist45  Distance to 45th nearest neighbor of Target
 740- 754  F15.13 ---   Dist46  Distance to 46th nearest neighbor of Target
 756- 770  F15.13 ---   Dist47  Distance to 47th nearest neighbor of Target
 772- 786  F15.13 ---   Dist48  Distance to 48th nearest neighbor of Target
 788- 802  F15.13 ---   Dist49  Distance to 49th nearest neighbor of Target
 804- 818  F15.13 ---   Dist50  Distance to 50th nearest neighbor of Target
 820- 834  F15.13 ---   Dist51  Distance to 51st nearest neighbor of Target
 836- 850  F15.13 ---   Dist52  Distance to 52nd nearest neighbor of Target
 852- 866  F15.13 ---   Dist53  Distance to 53rd nearest neighbor of Target
 868- 882  F15.13 ---   Dist54  Distance to 54th nearest neighbor of Target
 884- 898  F15.13 ---   Dist55  Distance to 55th nearest neighbor of Target
 900- 914  F15.13 ---   Dist56  Distance to 56th nearest neighbor of Target
 916- 930  F15.13 ---   Dist57  Distance to 57th nearest neighbor of Target
 932- 946  F15.13 ---   Dist58  Distance to 58th nearest neighbor of Target
 948- 962  F15.13 ---   Dist59  Distance to 59th nearest neighbor of Target
 964- 978  F15.13 ---   Dist60  Distance to 60th nearest neighbor of Target
 980- 994  F15.13 ---   Dist61  Distance to 61st nearest neighbor of Target
 996-1010  F15.13 ---   Dist62  Distance to 62nd nearest neighbor of Target
1012-1026  F15.13 ---   Dist63  Distance to 63rd nearest neighbor of Target
1028-1042  F15.13 ---   Dist64  Distance to 64th nearest neighbor of Target
1044-1058  F15.13 ---   Dist65  Distance to 65th nearest neighbor of Target
1060-1074  F15.13 ---   Dist66  Distance to 66th nearest neighbor of Target
1076-1090  F15.13 ---   Dist67  Distance to 67th nearest neighbor of Target
1092-1106  F15.13 ---   Dist68  Distance to 68th nearest neighbor of Target
1108-1122  F15.13 ---   Dist69  Distance to 69th nearest neighbor of Target
1124-1138  F15.13 ---   Dist70  Distance to 70th nearest neighbor of Target
1140-1154  F15.13 ---   Dist71  Distance to 71st nearest neighbor of Target
1156-1170  F15.13 ---   Dist72  Distance to 72nd nearest neighbor of Target
1172-1186  F15.13 ---   Dist73  Distance to 73rd nearest neighbor of Target
1188-1202  F15.13 ---   Dist74  Distance to 74th nearest neighbor of Target
1204-1218  F15.13 ---   Dist75  Distance to 75th nearest neighbor of Target
1220-1234  F15.13 ---   Dist76  Distance to 76th nearest neighbor of Target
1236-1250  F15.13 ---   Dist77  Distance to 77th nearest neighbor of Target
1252-1266  F15.13 ---   Dist78  Distance to 78th nearest neighbor of Target
1268-1282  F15.13 ---   Dist79  Distance to 79th nearest neighbor of Target
1284-1298  F15.13 ---   Dist80  Distance to 80th nearest neighbor of Target
1300-1314  F15.13 ---   Dist81  Distance to 81st nearest neighbor of Target
1316-1330  F15.13 ---   Dist82  Distance to 82nd nearest neighbor of Target
1332-1346  F15.13 ---   Dist83  Distance to 83rd nearest neighbor of Target
1348-1362  F15.13 ---   Dist84  Distance to 84th nearest neighbor of Target
1364-1378  F15.13 ---   Dist85  Distance to 85th nearest neighbor of Target
1380-1394  F15.13 ---   Dist86  Distance to 86th nearest neighbor of Target
1396-1410  F15.13 ---   Dist87  Distance to 87th nearest neighbor of Target
1412-1426  F15.13 ---   Dist88  Distance to 88th nearest neighbor of Target
1428-1442  F15.13 ---   Dist89  Distance to 89th nearest neighbor of Target
1444-1458  F15.13 ---   Dist90  Distance to 90th nearest neighbor of Target
1460-1474  F15.13 ---   Dist91  Distance to 91st nearest neighbor of Target
1476-1490  F15.13 ---   Dist92  Distance to 92nd nearest neighbor of Target
1492-1506  F15.13 ---   Dist93  Distance to 93rd nearest neighbor of Target
1508-1522  F15.13 ---   Dist94  Distance to 94th nearest neighbor of Target
1524-1538  F15.13 ---   Dist95  Distance to 95th nearest neighbor of Target
1540-1554  F15.13 ---   Dist96  Distance to 96th nearest neighbor of Target
1556-1570  F15.13 ---   Dist97  Distance to 97th nearest neighbor of Target
1572-1586  F15.13 ---   Dist98  Distance to 98th nearest neighbor of Target
1588-1602  F15.13 ---   Dist99  Distance to 99th nearest neighbor of Target
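
The distance columns of distance.dat repeat with a stride of 16 bytes (one
blank plus an F15.13 field), so each record converts to a list of floats by
slicing. The sketch below uses a synthetic record with made-up values, and
`parse_distance_line` is a hypothetical helper name. It also assumes that
Dist1 to Dist99 follow the same neighbor order as apogeenn.dat and that a
larger distance means a less similar spectrum, in which case the values in a
record should not decrease; this is worth verifying on the real file.

```python
def parse_distance_line(line, n_neighbors=99):
    """Split one distance.dat record into (target, [Dist1, ..., Dist99])."""
    target = line[0:18].strip()              # bytes 1-18
    dists = []
    for k in range(n_neighbors):
        start = 19 + 16 * k                  # 0-based offset of byte 20 + 16*k
        dists.append(float(line[start:start + 15]))
    return target, dists

# Synthetic record with made-up values, for illustration only.
fake_values = [0.001 * (k + 1) for k in range(99)]
record = "2M00000000+0000000" + "".join(" %15.13f" % v for v in fake_values)

target, dists = parse_distance_line(record)
# Assumption: distances are listed in neighbor order, nearest first.
non_decreasing = all(a <= b for a, b in zip(dists, dists[1:]))
print(target, len(dists), non_decreasing)
```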

Byte-by-byte Description of file: tsnecoor.dat

    Bytes  Format Units Label   Explanations

   1-  18  A18    ---   Target  Target name
  20-  37  E18.15 ---   t-SNE-X t-SNE map X coordinate
  39-  56  E18.15 ---   t-SNE-Y t-SNE map Y coordinate
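
A tsnecoor.dat record holds a target name and two exponent-format numbers, so
it can be read with three slices. This is a small sketch under the layout
given above: `parse_tsnecoor_line` is a hypothetical name, and the sample
record is synthetic, with its numbers written in an 18-character exponent
notation chosen only so that the record matches the 56-byte length.

```python
def parse_tsnecoor_line(line):
    """Split one tsnecoor.dat record into (target, x, y)."""
    target = line[0:18].strip()              # bytes 1-18
    x = float(line[19:37])                   # bytes 20-37, t-SNE-X
    y = float(line[38:56])                   # bytes 39-56, t-SNE-Y
    return target, x, y

# Synthetic record with made-up coordinates, for illustration only.
record = "2M00000000+0000000 %18.11e %18.11e" % (-12.5, 3.25)

target, x, y = parse_tsnecoor_line(record)
print(target, x, y)
```

Collecting the parsed tuples into a dictionary keyed by target name gives a
direct lookup from an APOGEE ID to its position on the t-SNE map.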

Acknowledgements:
    Itamar Reis, itamarreis(at)mail.tau.ac.il

(End)    Itamar Reis [Tel-Aviv Uni.], Patricia Vannier [CDS]    28-Dec-2017

