ANNOTATED BIBLIOGRAPHY OF
MULTIVARIATE STATISTICAL METHODS IN ASTRONOMY
F. Murtagh and A. Heck
Version: 1986
Application studies involving the use of multivariate
statistical methods in astronomy are referenced, along with
many annotations as to the methods employed and the
significance of the work. Additionally, general works of
reference are listed. In all, more than 150 references are
listed, and an index of authors is included.
INTRODUCTION
When one is faced with large quantities of data, statistical
data analysis and pattern recognition algorithms
can offer considerable time savings, while also ensuring
consistency and "objectivity" of treatment. Being multivariate
(multidimensional), these methods allow the simultaneous treatment
of many variables.
There are many types of multivariate statistical algorithms, but
among the most commonly used are algorithms for Cluster Analysis,
Discriminant Analysis, Principal Components (or Factor) Analysis,
and Regression Analysis.
Given a set of objects, each characterised on the same set of
variables, clustering methods will produce groups of the objects.
The objects in the resulting groups will either be closer to
one another than to non-group members, or satisfy some other
homogeneity or compactness criterion. "Closeness" is most often
defined by the Euclidean distance, but other metrics may well
merit consideration. The question of "standardization" or
"normalization" (centring the objects in the multidimensional
space and rescaling them to have unit variance) may also have to
be addressed before carrying out the clustering.
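The effect of standardization on "closeness" can be illustrated with
a minimal sketch. The four objects and their two variables below are
invented for illustration only, and the helper names (standardize,
euclidean, nearest_neighbour) are hypothetical; they are not taken
from any of the works or packages listed in this bibliography.

```python
import math

def standardize(objects):
    """Centre each variable on zero and rescale it to unit variance.

    Assumes every variable actually varies (non-zero standard deviation).
    """
    n = len(objects)
    m = len(objects[0])
    means = [sum(row[j] for row in objects) / n for j in range(m)]
    stds = [
        math.sqrt(sum((row[j] - means[j]) ** 2 for row in objects) / n)
        for j in range(m)
    ]
    return [[(row[j] - means[j]) / stds[j] for j in range(m)] for row in objects]

def euclidean(a, b):
    """Euclidean distance between two objects."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def nearest_neighbour(objects, i):
    """Index of the object closest to object i (excluding i itself)."""
    others = (k for k in range(len(objects)) if k != i)
    return min(others, key=lambda k: euclidean(objects[i], objects[k]))

# Invented data: the second variable has a much larger numerical range
# than the first, so it dominates the unscaled Euclidean distance.
raw = [[1.0, 1000.0], [1.1, 1300.0], [3.0, 1010.0], [5.0, 3000.0]]

print(nearest_neighbour(raw, 0))               # prints 2
print(nearest_neighbour(standardize(raw), 0))  # prints 1
```

With the raw values, object 0 is closest to object 2; after
standardization it is closest to object 1. Any clustering built on
these distances may therefore change with the choice of scaling.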
Discriminant methods allow assignment of objects to already existing
groups. Such methods may use locally-defined metrics, and thus be
sensitive to different parts of the parameter space; or they may be
based on Bayesian probability. In Discriminant Analysis, the first
step will be to choose a training set; then, in a second step, new
items are assigned to the most appropriate class of items.
Discriminant Analysis has been referred to as "supervised classification"
(because of the need to define the training set, perhaps by
a visual study of a relatively small number of objects), while Cluster
Analysis has been termed "unsupervised classification".
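The two steps of training and assignment can be sketched as follows.
A simple minimum-distance (nearest class mean) rule is used here as
an illustrative stand-in; it is neither the Bayesian nor the
locally-defined-metric approach mentioned above. The function names
(train, assign), the two features and the class labels are all
assumptions made for this example.

```python
import math

def train(training_set):
    """Step 1: compute the mean feature vector of each labelled class."""
    sums, counts = {}, {}
    for features, label in training_set:
        acc = sums.setdefault(label, [0.0] * len(features))
        for j, x in enumerate(features):
            acc[j] += x
        counts[label] = counts.get(label, 0) + 1
    return {label: [s / counts[label] for s in acc] for label, acc in sums.items()}

def assign(class_means, features):
    """Step 2: assign a new item to the class with the nearest mean."""
    return min(class_means, key=lambda label: math.dist(class_means[label], features))

# Invented training set: two measured parameters per object, each
# object labelled by visual inspection.
training_set = [
    ([1.0, 0.2], "star"),
    ([1.2, 0.1], "star"),
    ([3.0, 2.0], "galaxy"),
    ([3.4, 2.4], "galaxy"),
]

class_means = train(training_set)
print(assign(class_means, [1.3, 0.4]))  # prints star
print(assign(class_means, [2.9, 1.8]))  # prints galaxy
```

The quality of the assignments depends entirely on how representative
the training set is, which is why its definition is treated as a
separate first step.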
Principal Components Analysis is used for dimensionality reduction.
The best linear combinations of the axes in the initial parameter space
are sought (the criterion of fit used is a least squares one). It can
be used to study what the most relevant variables are for the objects
or items studied.
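As an illustration, the first principal component is the direction
which minimizes the sum of squared perpendicular distances of the
centred objects from it, and equals the leading eigenvector of the
covariance matrix. The sketch below finds it by power iteration, one
of several possible numerical routes, chosen here only because it
needs no external library. The function name and the data are
assumptions made for this example.

```python
import math

def first_principal_component(objects, iterations=200):
    """Return (unit direction, variance along it) of the first component.

    Power iteration on the covariance matrix. Assumes the data are not
    all identical and that the starting vector is not orthogonal to
    the leading eigenvector.
    """
    n = len(objects)
    m = len(objects[0])
    means = [sum(row[j] for row in objects) / n for j in range(m)]
    centred = [[row[j] - means[j] for j in range(m)] for row in objects]
    cov = [
        [sum(r[a] * r[b] for r in centred) / n for b in range(m)]
        for a in range(m)
    ]
    v = [1.0] * m
    for _ in range(iterations):
        w = [sum(cov[a][b] * v[b] for b in range(m)) for a in range(m)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    variance = sum(
        v[a] * sum(cov[a][b] * v[b] for b in range(m)) for a in range(m)
    )
    return v, variance

# Invented data lying close to the line y = 2x.
data = [[1.0, 2.0], [2.0, 4.1], [3.0, 5.9], [4.0, 8.0]]
direction, variance = first_principal_component(data)
print(direction, variance)
```

For these data the direction found has a slope close to 2, and the
variance along it accounts for nearly all of the total variance, so
a single new axis summarises the two original variables.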
Regression, or curve fitting more generally, is a problem area which is
widely known in the physical sciences.
This bibliography is motivated by increasingly wide interest in the use of
multivariate statistical methods in astronomy. The researcher has, however,
a basic difficulty in going to one of the available on-line bibliographic
databases and, for example, doing a search for all work involving
"clusters"! For this reason, it is helpful to have available
a select bibliography, both of work carried out in astronomy,
and also of the more important works outside astronomy.
The following attempts to be reasonably comprehensive;
the principal objective is that a selection of the literature available on
particular topics be listed, and that, in the case of the general
bibliographies, important works - mainly books - be given.
In some cases where it was felt useful, references are repeated
in different sections; in general, however, it may be noted that books often
have material of relevance for topics other than those under which they are
listed. Computer packages are sometimes listed: often the relevant
documentation and examples provide a quick and painless way to
get information on particular techniques.
Finally, a warm acknowledgement is extended to the many colleagues who,
at one time or another, said: "Oh, there is an article which might be of
interest in a recent issue of ...".
CLUSTER ANALYSIS: ASTRONOMY
Principal Components Analysis has often been used for determining a
classification, and these references are not included here.
The problems covered in the following include: star-galaxy separation,
using digitized image data; spectral classification (the prediction of
spectral type from photometry); taxonomy construction (for asteroids,
stars, and stellar light curves); galaxies; and gamma and X-ray
astronomy. A clustering approach not widely used elsewhere is employed
for studies relating to the Moon, to asteroids and to cosmic sources.
Work relating to interferogram analysis is also represented.
1 J.D. Barrow, S.P. Bhavsar and D.H. Sonoda, "Minimal spanning trees,
filaments and galaxy clustering", Monthly Notices of the
Royal Astronomical Society, 216, 17-35, 1985.
(This article follows the seminal approach of Zahn - see
reference among the general clustering works - in using the MST
for finding visually evident groupings.)
2 R. Bianchi, A. Coradini and M. Fulchignoni, "The statistical
approach to the study of planetary surfaces", The Moon
and the Planets, 22, 293-304, 1980.
(This article contains a general discussion which compares the
so-called G-mode clustering method to other multivariate
statistical methods. Other references by Coradini, Carusi,
and others, also use this method.)
3 R. Bianchi, J.C. Butler, A. Coradini and A.I. Gavrishin, "A
classification of lunar rock and glass samples using the G-mode
central method", The Moon and the Planets, 22, 305-322,
1980.
4 A. Bijaoui, "Methodes mathematiques pour la classification
stellaire", in Classification Stellaire, Compte Rendu
de l'Ecole de Goutelas, ed. D. Ballereau, Observatoire de
Meudon, Meudon, 1979, pp. 1-54.
(This presents a survey of clustering methods.)
5 R. Buccheri, P. Coffaro, G. Colomba, V. Di Gesu, S. Salemi,
"Search of significant features in a direct non-parametric
pattern recognition method. Application to the classification
of multiwire spark chamber pictures", in (eds.) C. de Jager
and Nieuwenhuijzen, Image Processing Techniques in
Astronomy, D. Reidel, Dordrecht, pp. 397-402, 1975.
(A technique is developed for classifying gamma-ray data.)
6 S.A. Butchins, "Automatic image classification",
Astronomy and Astrophysics, 109, 360-365, 1982.
(A method for determining Gaussian clusters, due to Wolf, is
used for star/galaxy separation in photometry.)
7 A. Coradini, M. Fulchignoni and A.I. Gavrishin, "Classification
of lunar rocks and glasses by a new statistical technique",
The Moon, 16, 175-190, 1976.
(The above, along with the references of Bianchi and others,
make use of a novel clustering technique termed the G-mode
method. The above contains a short mathematical description
of the technique proposed.)
8 A. Carusi and E. Massaro, "Statistics and mapping of asteroid
concentrations in the proper elements' space", Astronomy and
Astrophysics Supplement Series, 34, 81-90, 1978.
(This article also uses the so-called G-mode method, employed
by Bianchi, Coradini, and others.)
9 C.R. Cowley and R. Henry, "Numerical taxonomy of Ap and Am stars",
The Astrophysical Journal, 233, 633-643, 1979.
(40 stars are used, characterised on the strength with which
particular atomic spectra - the second spectra of yttrium,
the lanthanides, and the iron group - are represented in the
spectrum. Stars with very similar spectra end up correctly
grouped; and anomalous objects are detected. Clustering using
lanthanides, compared to clustering using iron group data, gives
different results for Ap stars. This is not the case for Am
stars, which thus appear to be less heterogeneous. The need for
physical explanations is thus suggested.)
10 C.R. Cowley, "Cluster analysis of rare earths in stellar spectra",
in Statistical Methods in Astronomy, European Space Agency
Special Publication SP-201, 1983, pp. 153-156.
(About twice the number of stars, as used in the previous reference,
are used here. A greater role is seen for chemical explanations
of stellar abundances and/or spectroscopic patterns over nuclear
hypotheses.)
11 J.K. Davies, N. Eaton, S.F. Green, R.S. McCheyne and A.J. Meadows,
"The classification of asteroids", Vistas in
Astronomy, 26, 243-251, 1982.
(Physical properties of 82 asteroids are used. The dendrogram
obtained is compared with other classification schemes based on
spectral characteristics or colour-colour diagrams. The
clustering approach used is justified also in being able to
pinpoint objects of particular interest for further observation;
and in allowing new forms of data - e.g. broadband infrared
photometry - to be quickly incorporated into the overall
approach of classification-construction.)
12 G.A. De Biase, V. Di Gesu and B. Sacco, "Detection of diffuse
clusters in noise background", Pattern Recognition Letters,
4, 39-44, 1986.
13 P.A. Devijver, "Cluster analysis by mixture identification", in
V. Di Gesu, L. Scarsi, P. Crane, J.H. Friedman and S. Levialdi
(eds.), Data Analysis in Astronomy, Plenum Press, New York,
1984, pp. 29-44.
(A very useful review article, with many references. A
perspective similar to perspectives adopted by many
discriminant analysis methods is used.)
14 V. Di Gesu and B. Sacco, "Some statistical properties of the
minimum spanning forest", Pattern Recognition, 16,
525-531, 1983.
(In this and the following works, the minimal spanning tree or
fuzzy set theory - as is clear from the article titles - is
applied to point pattern recognition problems involving gamma and
X-ray data. For a rejoinder to the foregoing reference, see
R.C. Dubes and R.L. Hoffman, "Remarks on some statistical
properties of the minimum spanning forest", Pattern Recognition,
19, 49-53, 1986. A reply to this article is forthcoming, from the
authors of the original paper.)
15 V. Di Gesu, B. Sacco and G. Tobia, "A clustering method
applied to the analysis of sky maps in gamma-ray astronomy",
Memorie della Societa Astronomica Italiana, 517-528, 1980.
16 V. Di Gesu and M.C. Maccarone, "A method to classify celestial
shapes based on the possibility theory", in G. Sedmak (ed.),
ASTRONET 1983 (Convegno Nazionale Astronet, Brescia,
Published under the auspices of the Italian Astronomical Society),
355-363, 1983.
17 V. Di Gesu and M.C. Maccarone, "Method to classify spread
shapes based on possibility theory", Proceedings of the 7th
International Conference on Pattern Recognition, Vol. 2,
IEEE Computer Society, 1984, pp. 869-871.
18 V. Di Gesu and M.C. Maccarone, "Features selection and
possibility theory", Pattern Recognition, 19, 63-72, 1986.
19 J.V. Feitzinger and E. Braunsfurth, "The spatial distribution of
young objects in the Large Magellanic Cloud - a problem of
pattern recognition", in eds. S. van den Bergh and K.S. de Boer,
Structure and Evolution of the Magellanic Clouds, IAU, 93-94,
1984.
(In an extended abstract, the use of linkages between objects is
described.)
20 I.E. Frank, B.A. Bates and D.E. Brownlee, "Multivariate statistics
to analyze extraterrestrial particles from the ocean floor", in
V. Di Gesu, L. Scarsi, P. Crane, J.H. Friedman and S. Levialdi
(eds.), Data Analysis in Astronomy, Plenum Press, New York,
1984.
21 A. Fresneau, "Clustering properties of stars outside the galactic
disc", in Statistical Methods in Astronomy, European Space Agency
Special Publication SP-201, 1983, pp. 17-20.
(Techniques from the spatial processes area of statistics are used to
assess clustering tendencies of stars.)
22 A. Heck, A. Albert, D. Defays and G. Mersch, "Detection of
errors in spectral classification by cluster analysis",
Astronomy and Astrophysics, 61, 563-566, 1977.
23 A. Heck, D. Egret, Ph. Nobelis and J.C. Turlot, "Statistical
confirmation of the UV spectral classification system based on
IUE low-dispersion stellar spectra", Astrophysics and
Space Science, 120, 223-237, 1986.
(Among other results, it is found that UV standard stars are
located in the neighbourhood of the centres of gravity of
groups found, thereby helping to verify the algorithm implemented.
A number of other papers, by the same authors, analysing
IUE spectra are referenced in this paper. Apart from the use
of a large range of clustering methods, these papers also
introduce a novel weighting procedure - termed the "variable
procrustean bed" - which adjusts for the symmetry/asymmetry
of the spectrum. Therefore, a useful study of certain approaches
to the coding of data is to be found in these papers.)
24 J.P. Huchra and M.J. Geller, "Groups of galaxies. I. Nearby
groups", The Astrophysical Journal, 257, 423-437, 1982.
(The single linkage hierarchical method, or the minimal spanning tree,
have been rediscovered many times - see, for instance, Graham and
Hell, 1985, referenced in the general clustering section. In
this study, a close variant is used for detecting
groups of galaxies using three variables: two positional variables
and redshift.)
25 J.F. Jarvis and J.A. Tyson, "FOCAS: faint object classification
and analysis system", The Astronomical Journal, 86, 476-495, 1981.
(An iterative minimal distance partitioning method is employed
in the FOCAS system to arrive at star/galaxy/other classes.)
26 G. Jasniewicz, "The Boehm-Vitense gap in the Geneva photometric
system", Astronomy and Astrophysics, 141, 116-126, 1984.
(The minimal spanning tree is used on colour-colour diagrams.)
27 A. Kruszewski, "Object searching and analyzing commands",
in MIDAS - Munich Image Data Analysis System,
European Southern Observatory Operating Manual No. 1, Chapter
11, 1985.
(The Inventory routine in MIDAS has a non-hierarchical
iterative optimization algorithm. It can immediately work on
up to 20 parameters, determined for each object in a
scanned image.)
28 M.J. Kurtz, "Classification methods: an introductory
survey", in Statistical Methods in Astronomy,
European Space Agency Special Publication SP-201, 1983.
(Kurtz lists a large number of parameters - and functions
of these parameters - which have been used to differentiate
stars from galaxies.)
29 J. Materne, "The structure of nearby clusters of galaxies.
Hierarchical clustering and an application to the Leo region", Astronomy
and Astrophysics, 63, 401-409, 1978.
(Ward's minimum variance hierarchic method is used, following
discussion of the properties of other hierarchic methods.)
30 M.O. Mennessier, "A cluster analysis of visual and near-infrared
light curves of long period variable stars", in Statistical
Methods in Astronomy, European Space Agency Special Publication
SP-201, 1983, pp. 81-84.
(Light curves - the variation of luminosity with time in a
wavelength range - are analysed. Standardization is applied,
and then three hierarchical methods. "Stable clusters" are
sought from among all of these. The study is continued in the
following.)
31 M.O. Mennessier, "A classification of miras from their visual and
near-infrared light curves: an attempt to correlate them with their
evolution", Astronomy and Astrophysics, 144, 463-470,
1985.
32 MIDAS (Munich Image Data Analysis System), European Southern
Observatory, Garching-bei-Muenchen (Version 4.1, January 1986).
Chapter 13: Multivariate Statistical Methods (F. Murtagh).
(This premier astronomical data reduction package contains a
large number of multivariate algorithms.)
33 M. Moles, A. del Olmo and J. Perea, "Taxonomical analysis of
superclusters. I. The Hercules and Perseus superclusters",
Monthly Notices of the Royal Astronomical Society, 213, 365-380,
1985.
(A non-hierarchical descending method, used previously by
Paturel, is employed.)
34 F. Murtagh, "Clustering techniques and their applications",
Data Analysis and Astronomy (Proceedings of International
Workshop on Data Analysis and Astronomy, Erice, Italy, April
1986) Plenum Press, New York (1986, forthcoming).
35 F. Murtagh and A. Lauberts, "A curve matching problem in
astronomy", (forthcoming), 1986.
(A dissimilarity is defined between galaxy luminosity profiles,
in order to arrive at a spiral-elliptical grouping.)
36 G. Paturel, "Etude de la region de l'amas Virgo par
taxonomie", Astronomy and Astrophysics, 71, 106-114, 1979.
(A descending non-hierarchical method is used.)
37 D.J. Tholen, "Asteroid taxonomy from cluster analysis of
photometry", PhD Thesis, University of Arizona, 1984.
(Between 400 and 600 asteroids are analysed, using good-quality
multi-colour photometric data.)
38 F. Giovannelli, A. Coradini, J.P. Lasota and M.L. Polimene,
"Classification of cosmic sources: a statistical approach",
Astronomy and Astrophysics, 95, 138-142, 1981.
39 B. Pirenne, D. Ponz and H. Dekker, "Automatic analysis of
interferograms", The Messenger, No. 42, 2-3, 1985.
(The minimal spanning tree is used to distinguish fringes; there
is little description of the MST approach in the above article,
but further articles are in preparation and the software - and
accompanying documentation - are available in the European
Southern Observatory's MIDAS image processing system.)
40 A. Zandonella, "Object classification: some methods of
interest in astronomical image analysis",
in Image Processing in Astronomy, eds.
G. Sedmak, N. Capaccioli and R.J. Allen, Osservatorio
Astronomico di Trieste, Trieste, 304-318, 1979.
(This presents a survey of clustering methods.)
CLUSTER ANALYSIS: GENERAL
41 M.R. Anderberg, Cluster Analysis for Applications,
Academic Press, New York, 1973.
(A little dated, but still very much referenced; good especially
for similarities and dissimilarities.)
42 J.P. Benzecri et coll., L'Analyse des Donnees. I. La
Taxinomie, Dunod, Paris, 1979 (3rd ed.).
(Very influential in the French speaking world; extensive
treatment, and impressive formalism.)
43 R.K. Blashfield and M.S. Aldenderfer, "The literature on cluster
analysis", Multivariate Behavioral Research, 13,
271-295, 1978.
44 H.H. Bock, Automatische Klassifikation, Vandenhoeck und
Ruprecht, Goettingen, 1974.
(Encyclopaedic.)
45 CLUSTAN, Clustan Ltd., 16 Kingsburgh Road, Edinburgh EH12 6DZ,
Scotland.
(One of the few exclusively clustering packages available.)
46 B. Everitt, Cluster Analysis, Heinemann Educational Books,
London, 1980 (2nd ed.).
(A very readable, introductory text.)
47 A.D. Gordon, Classification, Chapman and Hall, London, 1981.
(Another recommendable introductory text.)
48 R.L. Graham and P. Hell, "On the history of the minimum spanning
tree problem", Annals of the History of Computing, 7,
43-57, 1985.
(An interesting historical study.)
49 J.A. Hartigan, Clustering Algorithms, Wiley, New York, 1975.
(Often referenced, this book could still be said to be innovative
in its treatment of clustering problems; it contains a wealth of
sample data sets.)
50 M. Jambu and M.O. Lebeaux, Cluster Analysis and Data Analysis,
North-Holland, Amsterdam, 1983.
(Some of the algorithms discussed have been overtaken by, for
instance, the "nearest neighbour chain" or "reciprocal
nearest neighbour" algorithms. These latter are described in
the reference of Murtagh, below.)
51 L. Lebart, A. Morineau and K.M. Warwick, Multivariate Descriptive
Statistical Analysis, Wiley, New York, 1984.
(A useful book, centred on Multiple Correspondence Analysis, but also
including clustering, Principal Components Analysis, and other
methods.)
52 R.C.T. Lee, "Clustering analysis and its applications", in
J.T. Tou (ed.) Advances in Information Systems Science, Vol. 8,
Plenum Press, New York, 1981, pp. 169-292.
(Practically book-length, this is especially useful for the links
between clustering and problems in computing and in Operations
Research.)
53 F. Murtagh, Multidimensional Clustering Algorithms, COMPSTAT
Lectures Volume 4, Physica-Verlag, Wien, 1985.
(Algorithmic details of a range of widely-used clustering methods.)
54 F.J. Rohlf, "Generalization of the gap test for the detection of
multivariate outliers", Biometrics, 31, 93-101, 1975.
(One application of the minimal spanning tree.)
55 G. Salton and M.J. McGill, Introduction to Modern Information
Retrieval, McGraw-Hill, New York, 1983.
(A central reference in the information retrieval area.)
56 P.H.A. Sneath and R.R. Sokal, Numerical Taxonomy, Freeman,
San Francisco, 1973.
(Very influential for biological applications, it also has some
impressive collections of graph representations of clustering
results.)
57 H. Spaeth, Cluster Dissection and Analysis: Theory, Fortran
Programs, Examples, Ellis Horwood, Chichester, 1985.
(Recommendable reference for non-hierarchic, partitioning
methods.)
58 A. Tucker, Applied Combinatorics, Wiley, New York, 1980.
(For background reading on graph theory and combinatorics.)
59 D. Wishart, "Mode analysis: a generalization of nearest neighbour
which reduces chaining effects", in ed. A.J. Cole, Numerical
Taxonomy, Academic Press, New York, 282-311, 1969.
(Discusses various variance-based clustering criteria which,
interestingly, are justified by the difficulties experienced by
more mainstream algorithms in clustering data of the type found
in the H-R diagram.)
60 C.T. Zahn, "Graph-theoretical methods for detecting and describing
Gestalt clusters", IEEE Transactions on Computers, C-20,
68-86, 1971.
(Central reference for the use of the minimal spanning tree for
processing point patterns.)
DISCRIMINANT ANALYSIS: ASTRONOMY
61 H.-M. Adorf, "Classification of low-resolution stellar spectra
via template matching - a simulation study", Data Analysis
and Astronomy, (Proceedings of International Workshop on Data
Analysis and Astronomy, Erice, Italy, April 1986) Plenum Press,
New York (1986, forthcoming).
62 E. Antonello and G. Raffaelli, "An application of discriminant
analysis to variable and nonvariable stars", Publications of
the Astronomical Society of the Pacific, 95, 82-85, 1983.
(Multiple Discriminant Analysis is used.)
63 A. Heck, "An application of multivariate statistical analysis to a
photometric catalogue", Astronomy and Astrophysics, 47,
129-135, 1976.
(Multiple Discriminant Analysis and a stepwise procedure are
applied.)
64 M.J. Kurtz, "Progress in automation techniques for MK classification",
in ed. R.F. Garrison, The MK Process and Stellar Classification,
David Dunlop Observatory, University of Toronto, 1984, pp. 136-152.
(Essentially a k-NN approach is used for assigning spectra to known
stellar spectra classes.)
65 J.F. Jarvis and J.A. Tyson, "FOCAS - Faint object classification
and analysis system", SPIE Instrumentation in Astronomy III,
172, 1979, 422-428.
(See also other references of Tyson/Jarvis and Jarvis/Tyson.)
66 J.F. Jarvis and J.A. Tyson, "Performance verification of an
automated image cataloging system",
SPIE Vol. 264 Applications of Digital Image Processing to
Astronomy, 222-229, 1980.
67 J.F. Jarvis and J.A. Tyson, "FOCAS - Faint object classification
and analysis system", The Astronomical Journal, 86,
1981, 476-495.
(A hyperplane separation surface is determined in a space defined
by 6 parameters used to characterise the objects. This is a
2-stage procedure where the first stage is that of training,
and the second stage uses a partitioning clustering method.)
68 H.T. MacGillivray, R. Martin, N.M. Pratt, V.C. Reddish, H. Seddon,
L.W.G. Alexander, G.S. Walker, P.R. Williams, "A method for the
automatic separation of the images of galaxies and stars from
measurements made with the COSMOS machine", Monthly Notices
of the Royal Astronomical Society, 176, 265-274, 1976.
(Different parameters are appraised for star/galaxy separation.
Kurtz - see reference above under Cluster Analysis -
lists other parameters which have been used for the same objective.)
69 M.L. Malagnini, "A classification algorithm for star-galaxy
counts", in Statistical Methods in Astronomy, European
Space Agency Special Publication SP-201, 1983, pp. 69-72.
(A linear classifier is used and is further employed in the
following reference.)
70 M.L. Malagnini, F. Pasian, M. Pucillo and P. Santin, "FODS: a
system for faint object detection and classification in
astronomy", Astronomy and Astrophysics, 144, 1985,
49-56.
71 "Recommendations for Guide Star Selection System", private
notes, GSSS Group, Space Telescope Science Institute, Baltimore,
1984.
(A Bayesian approach, using the IMSL subroutine library - see
below - is employed in the GSSS system. Documentation will
follow on this, in the future.)
72 W.J. Sebok, "Optimal classification of images into stars or
galaxies - a Bayesian approach", The Astronomical Journal,
84, 1979, 1526-1536.
(The design of a classifier, using galaxy models, is studied
in depth and validated on Schmidt plate data.)
73 J.A. Tyson and J.F. Jarvis, "Evolution of galaxies: automated
faint object counts to 24th magnitude", The Astrophysical
Journal, 230, 1979, L153-L156.
(A continuation of the work of Jarvis and Tyson, 1979, above.)
74 F. Valdes, "Resolution classifier", SPIE Instrumentation in
Astronomy IV, 331, 1982, 465-471.
(A Bayesian classifier is used, which differs from that used
by Sebok, referenced above. The choice is thoroughly justified.
A comparison is also made with the hyperplane fitting method
used in the FOCAS system - see the references of Jarvis and
Tyson. It is concluded that the results obtained within the
model chosen are better
than a hyperplane based approach in parameter space; but that
the latter is computationally more efficient.)
DISCRIMINANT ANALYSIS: GENERAL
75 S.-T. Bow, Pattern Recognition, Marcel Dekker, New York,
1984.
(A textbook detailing a range of Discriminant Analysis methods,
together with clustering and other topics.)
76 C. Chatfield and A.J. Collins, Introduction to Multivariate
Analysis, Chapman and Hall, London, 1980.
(An excellent introductory textbook.)
77 E. Diday, J. Lemaire, J. Pouget and F. Testu, Elements
d'Analyse de Donnees, Dunod, Paris, 1982.
(Describes a large range of methods.)
78 R. Duda and P. Hart, Pattern Classification and Scene
Analysis, Wiley, New York, 1973.
(Excellent treatment of many image processing problems.)
79 R.A. Fisher, "The use of multiple measurements in taxonomic
problems", The Annals of Eugenics, 7, 179-188, 1936.
(Still an often referenced paper; contains the famous Iris data.)
80 K. Fukunaga, Introduction to Statistical Pattern Recognition,
Academic Press, New York, 1972.
81 D.J. Hand, Discrimination and Classification, Wiley,
New York, 1981.
(A comprehensive description of a wide range of methods; very
recommendable.)
82 International Mathematical and Statistical Library (IMSL), Manual
sections on ODFISH, ODNORM.
(A useful range of algorithms is available in this widely used
subroutine library.)
83 M. James, Classification Algorithms, Collins, London, 1985.
(A very readable introduction.)
84 M.G. Kendall, Multivariate Analysis, Griffin, London, 1980
(2nd ed.).
(Dated in relation to computing techniques, but exceptionally
clear and concise in its treatment of many practical problems.)
85 P.A. Lachenbruch, Discriminant Analysis, Hafner Press, New
York, 1975.
86 J.L. Melsa and D.L. Cohn, Decision and Estimation Theory,
McGraw-Hill, New York, 1978.
(A readable decision theoretic perspective.)
87 J.M. Romeder, Methodes et Programmes d'Analyse
Discriminante, Dunod, Paris, 1973.
(A survey of commonly-used techniques.)
88 Statistical Analysis System (SAS), SAS Institute Inc., Box 8000,
Cary, NC 27511-8000, USA; Manual chapters on STEPDISC,
NEIGHBOUR, etc.
(A range of relevant algorithms is available in this, one of the
premier statistical packages.)
PRINCIPAL COMPONENTS ANALYSIS: ASTRONOMY
PCA has been a fairly widely used technique in astronomy.
The following list does not aim to be comprehensive, but
indicates instead the types of problems to which PCA
can be applied. It is also hoped that it may provide a
convenient entry-point to literature on a topic of interest.
References below are concerned with stellar
parallaxes; a large number are concerned with the study of galaxies;
and a large number relate also to spectral reduction.
89 A. Bijaoui, "Application astronomique de la compression de
l'information", Astronomy and Astrophysics, 30, 199-202,
1974.
90 A. Bijaoui, SAI Library, Algorithms for Image Processing,
Nice Observatory, Nice, 1985.
(A large range of subroutines for image processing, including the
Karhunen-Loeve expansion.)
91 P. Brosche, "The manifold of galaxies: Galaxies with known
dynamical properties", Astronomy and Astrophysics, 23,
259-268, 1973.
92 P. Brosche and F.T. Lentes, "The manifold of globular clusters",
Astronomy and Astrophysics, 139, 474-476, 1984.
93 V. Bujarrabal, J. Guibert and C. Balkowski, "Multidimensional
statistical analysis of normal galaxies", Astronomy and
Astrophysics, 104, 1-9, 1981.
94 R. Buser, "A systematic investigation of multicolor photometric
systems. I. The UBV, RGU and uvby systems.",
Astronomy and Astrophysics, 62, 411-424, 1978.
95 C.A. Christian and K.A. Janes, "Multivariate analysis of
spectrophotometry",
Publications of the Astronomical Society of the Pacific, 89,
415-423, 1977.
96 C.A. Christian, "Identification of field stars contaminating the
colour-magnitude diagram of the open cluster Be 21", The
Astrophysical Journal Supplement Series, 49, 555-592, 1982.
97 T.J. Deeming, "Stellar spectral classification. I. Application of
component analysis", Monthly Notices of the Royal Astronomical
Society, 127, 493-516, 1964.
(An often referenced work.)
98 T.J. Deeming, "The analysis of linear correlation in
astronomy", Vistas in Astronomy, 10, 125-, 1968.
(For regression also.)
99 G. Efstathiou and S.M. Fall, "Multivariate analysis of elliptical
galaxies", Monthly Notices of the Royal Astronomical Society,
206, 453-464, 1984.
100 S.M. Faber, "Variations in spectral-energy distributions and
absorption-line strengths among elliptical galaxies", The
Astrophysical Journal, 179, 731-754, 1973.
101 M. Fofi, C. Maceroni, M. Maravalle and P. Paolicchi, "Statistics
of binary stars. I. Multivariate analysis of spectroscopic binaries",
Astronomy and Astrophysics, 124, 313-321, 1983.
(PCA is used, together with a non-hierarchical clustering technique.)
102 M. Fracassini, L.E. Pasinetti, E. Antonello and G. Raffaelli,
"Multivariate analysis of some ultrashort period Cepheids (USPC)",
Astronomy and Astrophysics, 99, 397-399, 1981.
103 M. Fracassini, G. Manzotti, L.E. Pasinetti, G. Raffaelli, E. Antonello
and L. Pastori, "Application of multivariate analysis to the
parameters of astrophysical objects", in Statistical Methods in Astronomy,
European Space Agency Special Publication SP-201, 21-25, 1983.
104 P. Galeotti, "A statistical analysis of metallicity in spiral
galaxies", Astrophysics and Space Science, 75, 511-519,
1981.
105 A. Heck, "An application of multivariate statistical analysis
to a photometric catalogue", Astronomy and Astrophysics,
47, 129-135, 1976.
(PCA is used, along with regression and discriminant analysis.)
106 A. Heck, D. Egret, Ph. Nobelis and J.C. Turlot, "Statistical
confirmation of the UV spectral classification system based on
IUE low-dispersion spectra", Astrophysics and Space
Science, 120, 223-237, 1986.
(Many other articles by these authors,
which also make use of PCA, are referenced in the above.)
107 S.J. Kerridge and A.R. Upgren, "The application of
multivariate analysis to parallax solutions. II. Magnitudes
and colours of comparison stars", The Astronomical Journal,
78, 632-638, 1973.
(See also Upgren and Kerridge, 1971, referenced below.)
108 J. Koornneef, "On the anomaly of the far UV extinction in the 30
Doradus region", Astronomy and Astrophysics, 64, 179-193, 1978.
(PCA is used for deriving a photometric index from 5-channel
photometric data.)
109 M.J. Kurtz, "Automatic spectral classification", PhD Thesis,
Dartmouth College, New Hampshire, 1982.
110 F.T. Lentes, "The manifold of spheroidal galaxies",
Statistical Methods in Astronomy, European Space Agency Special
Publication SP-201, 73-76, 1983.
111 D. Massa and C.F. Lillie, "Vector space methods of photometric
analysis: applications to O stars and interstellar reddening",
The Astrophysical Journal, 221, 833-850, 1978.
112 D. Massa, "Vector space methods of photometric analysis. III. The
two components of ultraviolet reddening", The Astronomical
Journal, 85, 1651-1662, 1980.
113 B. Nicolet, "Geneva photometric boxes. I. A topological approach
of photometry and tests.", Astronomy and Astrophysics,
97, 85-93, 1981.
(PCA is used on colour indices.)
114 S. Okamura, K. Kodaira and M. Watanabe, "Digital surface
photometry of galaxies toward a quantitative classification. III.
A mean concentration index as a parameter representing the
luminosity distribution", The Astrophysical Journal, 280,
7-14, 1984.
115 S. Okamura, "Global structure of Virgo cluster galaxies", in
O.-G. Richter and B. Binggeli (eds.), Proceedings of ESO Workshop on
The Virgo Cluster of Galaxies, ESO Conference and Workshop
Proceedings No. 20, 201-215, 1985.
116 D. Pelat, "A study of H I absorption using Karhunen-Loeve series",
Astronomy and Astrophysics, 40, 285-290, 1975.
117 A. W. Strong, "Data analysis in gamma-ray astronomy: multivariate
likelihood method for correlation studies", Astronomy and
Astrophysics, 150, 273-275, 1985.
(The method presented is not linked to PCA but, since it deals with
the eigenreduction of a correlation matrix, it is clearly very closely
related.)
118 B. Takase, K. Kodaira and S. Okamura, An
Atlas of Selected Galaxies, University of Tokyo Press, VNU Science
Press, 1984.
119 D.J. Tholen, "Asteroid taxonomy from cluster analysis of
photometry", PhD Thesis, University of Arizona, 1984.
120 A.R. Upgren and S.J. Kerridge, "The application of
multivariate analysis to parallax solutions. I. Choice of
reference frames", The Astronomical Journal, 76,
655-664, 1971.
(See also Kerridge and Upgren, 1973, referenced above.)
121 J.P. Vader, "Multivariate analysis of elliptical galaxies in
different environments", The Astrophysical Journal,
306, 390-400, 1986.
(The Virgo and Coma clusters are studied.)
122 C.A. Whitney, "Principal components analysis of spectral data.
I. Methodology for spectral classification", Astronomy and
Astrophysics Supplement Series, 51, 443-461, 1983.
123 B.C. Whitmore, "An objective classification system for spiral
galaxies. I. The two dominant dimensions", The Astrophysical
Journal, 278, 61-80, 1984.
PRINCIPAL COMPONENTS ANALYSIS: GENERAL
124 T.W. Anderson, An Introduction to Multivariate
Statistical Analysis, Wiley, New York, 1984 (2nd ed.).
(For inferential aspects relating to PCA.)
125 C. Chatfield and A.J. Collins, Introduction to Multivariate
Analysis, Chapman and Hall, London, 1980.
(An excellent introductory textbook.)
126 R. Gnanadesikan, Methods for Statistical Data Analysis
of Multivariate Observations, Wiley, New York, 1977.
(For details of PCA, clustering and discrimination.)
127 M. Kendall, Multivariate Analysis, Griffin, London, 1980
(2nd ed.).
(Dated in relation to computing techniques, but exceptionally
clear and concise in its treatment of many practical problems.)
128 L. Lebart, A. Morineau and K.M. Warwick, Multivariate Descriptive
Statistical Analysis, Wiley, New York, 1984.
(An excellent geometric treatment of PCA.)
129 F.H.C. Marriott, The Interpretation of Multiple
Observations, Academic Press, New York, 1974.
(A short, readable textbook.)
REGRESSION: ASTRONOMY
Regression analysis, and fitting problems, have always been
central in the physical sciences. The following selection of
references in this area will therefore simply indicate the
range of possible applications, and in some cases will additionally
illustrate where regression and fitting might profitably
complement other multivariate statistical techniques.
130 R.L. Branham Jr., "Alternatives to least-squares",
The Astronomical Journal, 87, 928-937, 1982.
131 R. Buser, "A systematic investigation of multicolor
photometric systems. II. The transformations between the
UBV and RGU systems.", Astronomy and Astrophysics,
62, 425-430, 1978.
132 C.R. Cowley and G.C.L. Aikman, "Stellar abundances from
line statistics", The Astrophysical Journal,
242, 684-698, 1980.
133 M. Creze, "Influence of the accuracy of stellar
distances on the estimations of kinematical parameters from
radial velocities",
Astronomy and Astrophysics, 9, 405-409, 1970.
134 M. Creze, "Estimation of the parameters of galactic
rotation and solar motion with respect to Population I
Cepheids", Astronomy and Astrophysics, 9,
410-419, 1970.
135 T.J. Deeming, "The analysis of linear correlation in
astronomy", Vistas in Astronomy, 10, 125, 1968.
136 H. Eichhorn, "Least-squares adjustment with probabilistic
constraints", Monthly Notices of the Royal Astronomical
Society, 182, 355-360, 1978.
137 H. Eichhorn and M. Standish, Jr., "Remarks on
nonstandard least-squares problems", The Astronomical
Journal, 86, 156-159, 1981.
138 J.R. Kuhn, "Recovering spectral information from unevenly
sampled data: two machine-efficient solutions", The
Astronomical Journal, 87, 196-202, 1982.
139 J.R. Gott III and E.L. Turner, "An extension of the galaxy
covariance function to small scales", The
Astrophysical Journal, 232, L79-L81, 1979.
140 A. Heck, "Predictions: also an astronomical tool", in
Statistical Methods in Astronomy, European Space
Agency Special Publication SP-201, 1983, pp. 135-143.
(A survey article, with many references. Other articles
in this conference proceedings also use regression and
fitting techniques.)
141 A. Heck and G. Mersch, "Prediction of spectral classification
from photometric observations - application to the
uvby beta photometry and the MK spectral classification.
I. Prediction assuming a luminosity class",
Astronomy and Astrophysics, 83, 287-296, 1980.
(Stepwise multiple regression and isotonic regression are used.)
142 W.H. Jefferys, "On the method of least squares", The
Astronomical Journal, 85, 177-181, 1980.
143 W.H. Jefferys, "On the method of least squares. II.", The
Astronomical Journal, 86, 149-155, 1981.
144 M.O. Mennessier, "Corrections de precession, apex et
rotation galactique estimes a partir de mouvements propres
fondamentaux par une methode de maximum vraisemblance",
Astronomy and Astrophysics, 17, 220-225, 1972.
145 M.O. Mennessier, "On statistical estimates from proper motions.
III.", Astronomy and Astrophysics, 11, 111-122,
1972.
146 G. Mersch and A. Heck, "Prediction of spectral classification
from photometric observations - application to the
uvby beta photometry and the MK spectral classification.
II. General case",
Astronomy and Astrophysics, 85, 93-100, 1980.
147 J.F. Nicoll and I.E. Segal, "Correction of a criticism of the
phenomenological quadratic redshift-distance law",
The Astrophysical Journal, 258, 457-466, 1982.
148 J.F. Nicoll and I.E. Segal, "Null influence of possible local
extragalactic perturbations on tests of redshift-distance laws",
Astronomy and Astrophysics, 115, 398-403, 1982.
149 D.M. Peterson, "Methods in data reduction. I. Another look at
least squares", Publications of the Astronomical Society of
the Pacific, 91, 546-552, 1979.
150 I.E. Segal, "Distance and model dependence of observational
galaxy cluster concepts", Astronomy and Astrophysics, 123,
151-158, 1983.
151 I.E. Segal and J.F. Nicoll, "Uniformity of quasars in the
chronometric cosmology", Astronomy and Astrophysics, 144,
L23-L26, 1985.
REGRESSION: GENERAL
152 P.R. Bevington, Data Reduction and Error Analysis
for the Physical Sciences, McGraw-Hill, New York, 1969.
(A highly recommended text for regression and fitting, with
many examples.)
153 N.R. Draper and H. Smith, Applied Regression
Analysis, Wiley, New York, 1981 (2nd ed.).
154 B.S. Everitt and G. Dunn, Advanced Methods of
Data Exploration and Modelling, Heinemann Educational
Books, London, 1983.
(A discursive overview of topics such as linear models
and analysis of variance; PCA and clustering are also
covered.)
155 D.C. Montgomery and E.A. Peck, Introduction to
Linear Regression Analysis, Wiley, New York, 1982.
156 G.A.F. Seber, Linear Regression Analysis, Wiley,
New York, 1977.
157 G.B. Wetherill, Elementary Statistical Methods,
Chapman and Hall, London, 1967.
(An elementary introduction, with many examples.)
OTHER STATISTICAL METHODS: ASTRONOMY
We have not sought to focus on the application of statistics,
tout court, in astronomy in this bibliography. However,
some of the varied studies listed below constitute valuable
background or survey material.
158 D. Clarke and B.G. Stewart, "Statistical methods of
stellar photometry", Vistas in Astronomy, 29,
27-51, 1986.
159 H. Eelsalu, "Theoretical foundations of stellar statistics",
Academy of Sciences of the Estonian S.S.R., 1982.
(A monograph, giving a general theory for stellar statistical
data.)
160 E.D. Feigelson and P.I. Nelson, "Statistical methods for
astronomical data with upper limits. I. Univariate
distributions", The Astrophysical Journal, 293,
192-206, 1985.
(Survival analysis is used for left-censored data. See also
Isobe et al. below.)
161 A. Heck, J. Manfroid and G. Mersch, "On period determination
methods", Astronomy and Astrophysics Supplement Series,
59, 63-72, 1985.
162 T. Isobe, E.D. Feigelson and P.I. Nelson, "Statistical methods
for astronomical data with upper limits. II. Correlation and
regression", The Astrophysical Journal,
1986 (in press).
(Survival analysis is used on data with upper limits.)
163 D.G. Kendall, "Mathematical statistics in the humanities, and
some related problems in astronomy", in A.C. Atkinson and S.E.
Fienberg (eds.), A Celebration of Statistics, Springer-Verlag,
New York, 1985, pp. 393-408.
(Problems relating to testing for one-dimensionality and for
alignments - of importance in quasar astronomy - are overviewed,
and some other relevant references are to be found in this paper.)
164 J.V. Narlikar, "Statistical techniques in astronomy", Sankhya:
The Indian Journal of Statistics, Series B, Part 2, 44,
125-134, 1982.
(A range of astronomical problems with statistical solutions is
presented.)
165 M.E. Oezel and H. Mayer-Hasselwander, "Application of bootstrap
sampling in gamma-ray astronomy: time variability in pulsed
emission from Crab pulsar", in V. Di Gesu, L. Scarsi, P. Crane,
J.H. Friedman and S. Levialdi (eds.), Data Analysis in Astronomy,
Plenum Press, New York, 1985, pp. 81-86.
166 J. Pelt, "Phase dispersion minimization methods for estimation
of periods from unequally spaced sequences of data", in
Statistical Methods in Astronomy, European Space Agency
Special Publication SP-201, 37-42, 1983.
167 J. Pfleiderer and P. Krommidas, "Statistics under incomplete
knowledge of data", Monthly Notices of the Royal
Astronomical Society, 198, 281-288, 1982.
168 J.D. Scargle, "Studies in astronomical time series analysis.
I. Modelling random processes in the time domain", The
Astrophysical Journal Supplement Series, 45, 1-71, 1981.
169 J.V. Wall, "Practical statistics for astronomers. I. Definitions,
the normal distribution, detection of signal", Quarterly
Journal of the Royal Astronomical Society, 20, 130-152,
1979.
INDEX OF NAMES
AUTHOR                    SEQUENCE NUMBER OF PUBLICATION
Adorf, H.-M. 61
Aikman, G.C.L. 132
Albert, A. 22
Aldenderfer, M.S. 43
Alexander, L.W.G. 68
Anderberg, M.R. 41
Anderson, T.W. 124
Antonello, E. 62,103
Balkowski, C. 93
Barrow, J.D. 1
Bates, B.A. 20
Benzecri, J.P. 42
Bevington, P.R. 152
Bhavsar, S.P. 1
Bianchi, R. 2,3
Bijaoui, A. 4,89,90
Blashfield, R.K. 43
Bock, H.H. 44
Bow, S.-T. 75
Branham Jr., R.L. 130
Braunsfurth, E. 19
Brosche, P. 91,92
Brownlee, D.E. 20
Buccheri, R. 5
Bujarrabal, V. 93
Buser, R. 94,131
Butchins, S.A. 6
Butler, J.C. 3
Carusi, A. 8
Chatfield, C. 76,125
Christian, C.A. 95,96
Clarke, D. 158
CLUSTAN (software) 45
Coffaro, P. 5
Cohn, D.L. 86
Collins, A.J. 76,125
Colomba, G. 5
Coradini, A. 2,3,7,38
COSMOS (software) 68
Cowley, C.R. 9,10,132
Creze, M. 133,134
Davies, J.K. 11
De Biase, G.A. 12
Deeming, T.J. 97,98,135
Defays, D. 22
Dekker, H. 39
Devijver, P.A. 13
Di Gesu, V. 5,12,14,15,16,17,18
Diday, E. 77
Draper, N.R. 153
Dubes, R.C. 14
Duda, R. 78
Dunn, G. 154
Eaton, N. 11
Eelsalu, H. 159
Efstathiou, G. 99
Egret, D. 23,106
Eichhorn, H. 136,137
Everitt, B.S. 46,154
Faber, S.M. 100
Fall, S.M. 99
Feigelson, E.D. 160,162
Feitzinger, J.V. 19
Fisher, R.A. 79
FOCAS (software) 65,67,74
Fofi, M. 101
Fracassini, M. 102,103
Frank, I.E. 20
Fresneau, A. 21
Fukunaga, K. 80
Fulchignoni, M. 2,7
Galeotti, P. 104
Gavrishin, A.I. 3,7
Geller, M.J. 24
Giovannelli, F. 38
Gnanadesikan, R. 126
Gordon, A.D. 47
Gott III, J.R. 139
Graham, R.L. 48
Green, S.F. 11
GSSS (software) 71
Guibert, J. 93
Hand, D.J. 81
Hart, P. 78
Hartigan, J.A. 49
Heck, A. 22,23,63,105,106,140,141,146,161
Hell, P. 48
Henry, R. 9
Hoffman, R.L. 14
Huchra, J.P. 24
IMSL (software) 82
Isobe, T. 162
Jambu, M. 50
James, M. 83
Janes, K.A. 95
Jarvis, J.F. 25,65,66,67,73
Jasniewicz, G. 26
Jefferys, W.H. 142,143
Kendall, D.G. 163
Kendall, M.G. 84,127
Kerridge, S.J. 107,120
Kodaira, K. 114,118
Koornneef, J. 108
Krommidas, P. 167
Kruszewski, A. 27
Kuhn, J.R. 138
Kurtz, M.J. 28,64,109
Lachenbruch, P.A. 85
Lasota, J.P. 38
Lauberts, A. 35
Lebart, L. 51,128
Lebeaux, M.O. 50
Lee, R.C.T. 52
Lemaire, J. 77
Lentes, F.T. 92,110
Lillie, C.F. 111
MacGillivray, H.T. 68
Maccarone, M.C. 16,17,18
Maceroni, C. 101
Malagnini, M.L. 69,70
Manfroid, J. 161
Manzotti, G. 103
Maravalle, M. 101
Marriott, F.H.C. 129
Martin, R. 68
Massa, D. 111,112
Massaro, E. 8
Materne, J. 29
Mayer-Hasselwander, H. 165
McCheyne, R.S. 11
McGill, M.J. 55
Meadows, A.J. 11
Melsa, J.L. 86
Mennessier, M.O. 30,31,144,145
Mersch, G. 22,141,146,161
MIDAS (software) 32,39
Moles, M. 33
Montgomery, D.C. 155
Morineau, A. 51,128
Murtagh, F. 34,35,53
Narlikar, J.V. 164
Nelson, P.I. 160,162
Nicolet, B. 113
Nicoll, J.F. 147,148,151
Nobelis, Ph. 23,106
Okamura, S. 114,115,118
Olmo, A. del 33
Oezel, M.E. 165
Paolicchi, P. 101
Pasian, F. 70
Pasinetti, L.E. 102,103
Pastori, L. 103
Paturel, G. 36
Peck, E.A. 155
Pelat, D. 116
Pelt, J. 166
Perea, J. 33
Peterson, D.M. 149
Pfleiderer, J. 167
Pirenne, B. 39
Polimene, M.L. 38
Ponz, D. 39
Pouget, J. 77
Pratt, N.M. 68
Pucillo, M. 70
Raffaelli, G. 62,102,103
Reddish, V.C. 68
Rohlf, F.J. 54
Romeder, J.M. 87
Sacco, B. 12,14,15
SAI (software) 90
Salemi, S. 5
Salton, G. 55
Santin, P. 70
SAS (software) 88
Scargle, J.D. 168
Seber, G.A.F. 156
Sebok, W.J. 72
Seddon, H. 68
Segal, I.E. 147,148,150,151
Smith, H. 153
Sneath, P.H.A. 56
Sokal, R.R. 56
Sonoda, D.H. 1
Spaeth, H. 57
Standish Jr., M. 137
Stewart, B.G. 158
Strong, A.W. 117
Takase, B. 118
Testu, F. 77
Tholen, D.J. 37,119
Tobia, G. 15
Tucker, A. 58
Turlot, J.C. 23,106
Turner, E.L. 139
Tyson, J.A. 25,65,66,67,73
Upgren, A.R. 107,120
Vader, J.P. 121
Valdes, F. 74
Walker, G.S. 68
Wall, J.V. 169
Warwick, K.M. 51,128
Watanabe, M. 114
Wetherill, G.B. 157
Whitmore, B.C. 123
Whitney, C.A. 122
Williams, P.R. 68
Wishart, D. 59
Zahn, C.T. 60
Zandonella, A. 40