R. Agrawal, J. Gehrke, D. Gunopulos, and P. Raghavan, Automatic subspace clustering of high-dimensional data for data mining applications, ACM SIGMOD International Conference on Management of Data, pp.94-105, 1998.

H. Akaike, A new look at the statistical model identification, IEEE Transactions on Automatic Control, vol.19, issue.6, pp.716-723, 1974.
DOI : 10.1109/TAC.1974.1100705

T. Alexandrov, J. Decker, B. Mertens, A. M. Deelder, R. A. Tollenaar et al., Biomarker discovery in MALDI-TOF serum protein profiles using discrete wavelet transformation, Bioinformatics, vol.25, issue.5, pp.643-649, 2009.
DOI : 10.1093/bioinformatics/btn662

E. Anderson, The irises of the Gaspé Peninsula, Bulletin of the American Iris Society, vol.59, pp.2-5, 1935.

J. Baek, G. McLachlan, and L. Flack, Mixtures of Factor Analyzers with Common Factor Loadings: Applications to the Clustering and Visualization of High-Dimensional Data, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol.32, issue.7, pp.1298-1309, 2010.
DOI : 10.1109/TPAMI.2009.149

R. Bellman, Dynamic Programming, 1957.

C. Biernacki, G. Celeux, and G. Govaert, Assessing a mixture model for clustering with the integrated completed likelihood, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol.22, issue.7, pp.719-725, 2000.
DOI : 10.1109/34.865189

C. Biernacki, G. Celeux, and G. Govaert, Choosing starting values for the EM algorithm for getting the highest likelihood in multivariate Gaussian mixture models, Computational Statistics & Data Analysis, vol.41, issue.3-4, pp.561-575, 2003.
DOI : 10.1016/S0167-9473(02)00163-9

C. Bishop and M. Svensen, GTM: The Generative Topographic Mapping, Neural Computation, vol.10, issue.1, pp.215-234, 1998.
DOI : 10.1162/089976698300017953

S. Boutemedjet, N. Bouguila, and D. Ziou, A Hybrid Feature Extraction Selection Approach for High-Dimensional Non-Gaussian Data Clustering, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol.31, issue.8, pp.1429-1443, 2009.
DOI : 10.1109/TPAMI.2008.155

C. Bouveyron, S. Girard, and C. Schmid, High-dimensional data clustering, Computational Statistics & Data Analysis, vol.52, issue.1, pp.502-519, 2007.
DOI : 10.1016/j.csda.2007.02.009
URL : https://hal.archives-ouvertes.fr/inria-00548573

N. Campbell, Canonical variate analysis - a general model formulation, Australian Journal of Statistics, vol.26, issue.1, pp.86-96, 1984.
DOI : 10.1111/j.1467-842X.1984.tb01271.x

G. Celeux and J. Diebolt, The SEM algorithm: a probabilistic teacher algorithm from the EM algorithm for the mixture problem, Computational Statistics Quarterly, vol.2, issue.1, pp.73-92, 1985.

G. Celeux and G. Govaert, A classification EM algorithm for clustering and two stochastic versions, Computational Statistics & Data Analysis, vol.14, issue.3, pp.315-332, 1992.
DOI : 10.1016/0167-9473(92)90042-E
URL : https://hal.archives-ouvertes.fr/inria-00075196

D. Clausi, K-means Iterative Fisher (KIF) unsupervised clustering algorithm applied to image texture segmentation, Pattern Recognition, vol.35, issue.9, pp.1959-1972, 2002.
DOI : 10.1016/S0031-3203(01)00138-8

C. Ding and T. Li, Adaptive dimension reduction using discriminant analysis and k-means clustering, ICML, 2007.

R. Duda, P. Hart, and D. Stork, Pattern classification, 2000.

R. A. Fisher, The use of multiple measurements in taxonomic problems, Annals of Eugenics, vol.7, issue.2, pp.179-188, 1936.
DOI : 10.1111/j.1469-1809.1936.tb02137.x

D. H. Foley and J. W. Sammon, An Optimal Set of Discriminant Vectors, IEEE Transactions on Computers, vol.24, issue.3, pp.281-289, 1975.
DOI : 10.1109/T-C.1975.224208

C. Fraley and A. Raftery, MCLUST: Software for Model-Based Cluster Analysis, Journal of Classification, vol.16, issue.2, pp.297-306, 1999.
DOI : 10.1007/s003579900058
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.142.8930

C. Fraley and A. Raftery, Model-Based Clustering, Discriminant Analysis, and Density Estimation, Journal of the American Statistical Association, vol.97, issue.458, pp.611-631, 2002.
DOI : 10.1198/016214502760047131

J. H. Friedman, Regularized Discriminant Analysis, Journal of the American Statistical Association, vol.84, issue.405, pp.165-175, 1989.
DOI : 10.1080/01621459.1989.10478752

K. Fukunaga, Introduction to Statistical Pattern Recognition, 1990.

Y. Guo, S. Li, J. Yang, T. Shu, and L. Wu, A generalized Foley-Sammon transform based on generalized Fisher discriminant criterion and its application to face recognition, Pattern Recognition Letters, vol.24, issue.1-3, pp.147-158, 2003.
DOI : 10.1016/S0167-8655(02)00207-6

Y. Hamamoto, Y. Matsuura, T. Kanaoka, and S. Tomita, A note on the orthonormal discriminant vector method for feature extraction, Pattern Recognition, vol.24, issue.7, pp.681-684, 1991.
DOI : 10.1016/0031-3203(91)90035-4

T. Hastie, A. Buja, and R. Tibshirani, Penalized Discriminant Analysis, The Annals of Statistics, vol.23, issue.1, pp.73-102, 1995.
DOI : 10.1214/aos/1176324456
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.35.8378

T. Hastie, R. Tibshirani, and J. Friedman, The elements of statistical learning, 2009.

P. Howland and H. Park, Generalizing discriminant analysis using the generalized singular value decomposition, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol.26, issue.8, pp.995-1006, 2004.
DOI : 10.1109/TPAMI.2004.46

A. Jain, M. Murty, and P. Flynn, Data clustering: a review, ACM Computing Surveys, vol.31, issue.3, pp.264-323, 1999.
DOI : 10.1145/331499.331504

Z. Jin, J. Y. Yang, Z. S. Hu, and Z. Lou, Face recognition based on the uncorrelated optimal discriminant vectors, Pattern Recognition, vol.34, issue.10, pp.2041-2047, 2001.

I. Jolliffe, Principal Component Analysis, 1986.
DOI : 10.1007/978-1-4757-1904-8

G. Kimeldorf and G. Wahba, Some results on Tchebycheffian spline functions, Journal of Mathematical Analysis and Applications, vol.33, issue.1, pp.82-95, 1971.
DOI : 10.1016/0022-247X(71)90184-3

W. Krzanowski, Principles of Multivariate Analysis, 2003.

M. Law, M. Figueiredo, and A. Jain, Simultaneous feature selection and clustering using mixture models, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol.26, issue.9, pp.1154-1166, 2004.
DOI : 10.1109/TPAMI.2004.71
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.142.3284

K. Liu, Y. Cheng, and J. Yang, A generalized optimal set of discriminant vectors, Pattern Recognition, vol.25, issue.7, pp.731-739, 1992.
DOI : 10.1016/0031-3203(92)90136-7

C. Maugis, G. Celeux, and M. Martin-Magniette, Variable Selection for Clustering with Gaussian Mixture Models, Biometrics, vol.65, issue.3, pp.701-709, 2009.
DOI : 10.1111/j.1541-0420.2008.01160.x
URL : https://hal.archives-ouvertes.fr/inria-00153057

G. McLachlan and T. Krishnan, The EM algorithm and extensions, 1997.

G. McLachlan and D. Peel, Finite Mixture Models, 2000.
DOI : 10.1002/0471721182

G. McLachlan, D. Peel, and R. Bean, Modelling high-dimensional data by mixtures of factor analyzers, Computational Statistics & Data Analysis, vol.41, issue.3-4, pp.379-388, 2003.
DOI : 10.1016/S0167-9473(02)00183-4

P. McNicholas and B. Murphy, Parsimonious Gaussian mixture models, Statistics and Computing, vol.18, issue.3, pp.285-296, 2008.
DOI : 10.1007/s11222-008-9056-0

A. Montanari and C. Viroli, Heteroscedastic Factor Mixture Analysis, Statistical Modelling: An International Journal, vol.10, issue.4, pp.441-460, 2010.
DOI : 10.1177/1471082x0901000405

L. Parsons, E. Haque, and H. Liu, Subspace clustering for high dimensional data, ACM SIGKDD Explorations Newsletter, vol.6, issue.1, pp.90-105, 2004.
DOI : 10.1145/1007730.1007731

A. Raftery and N. Dean, Variable Selection for Model-Based Clustering, Journal of the American Statistical Association, vol.101, issue.473, pp.168-178, 2006.
DOI : 10.1198/016214506000000113

D. Rubin and D. Thayer, EM algorithms for ML factor analysis, Psychometrika, vol.47, issue.1, pp.69-76, 1982.
DOI : 10.1007/BF02293851

G. Schwarz, Estimating the Dimension of a Model, The Annals of Statistics, vol.6, issue.2, pp.461-464, 1978.
DOI : 10.1214/aos/1176344136

D. Scott and J. Thompson, Probability density estimation in higher dimensions, Fifteenth Symposium on the Interface, pp.173-179, 1983.

M. E. Tipping and C. Bishop, Mixtures of Probabilistic Principal Component Analyzers, Neural Computation, vol.11, issue.2, pp.443-482, 1999.
DOI : 10.1162/089976699300016728

N. Trendafilov and I. T. Jolliffe, DALASS: Variable selection in discriminant analysis via the LASSO, Computational Statistics & Data Analysis, vol.51, issue.8, pp.3718-3736, 2007.
DOI : 10.1016/j.csda.2006.12.046

M. Verleysen and D. François, The Curse of Dimensionality in Data Mining and Time Series Prediction, 2005.
DOI : 10.1007/11494669_93

J. Ye, Characterization of a family of algorithms for generalized discriminant analysis on undersampled problems, Journal of Machine Learning Research, vol.6, pp.483-502, 2005.

J. Ye, Z. Zhao, and M. Wu, Discriminative k-means for clustering, Advances in Neural Information Processing Systems, pp.1649-1656, 2007.