Analysis of Multivariate and High-Dimensional Data
Cambridge University Press, 2 December 2013
'Big data' poses challenges that require both classical multivariate methods and contemporary techniques from machine learning and engineering. This modern text equips you for the new world - integrating the old and the new, fusing theory and practice and bridging the gap to statistical learning. The theoretical framework includes formal statements that set out clearly the guaranteed 'safe operating zone' for the methods and allow you to assess whether data is in the zone, or near enough. Extensive examples showcase the strengths and limitations of different methods with small classical data, data from medicine, biology, marketing and finance, high-dimensional data from bioinformatics, functional data from proteomics, and simulated data. High-dimension low-sample-size data gets special attention. Several data sets are revisited repeatedly to allow comparison of methods. Generous use of colour, algorithms, Matlab code, and problem sets complete the package. Suitable for master's/graduate students in statistics and researchers in data-rich disciplines.
Contents
Principal Component Analysis
Problems for Part I
Norms, Proximities, Features and Dualities
Factor Analysis
Problems for Part II
Independent Component Analysis
Kernel and More Independent Component Methods
Problems for Part III
Common terms and phrases
Algorithm approach approximation asymptotic breast cancer data calculate Canonical Correlation Analysis canonical correlation matrix CC scores classes classical column configurations consider corresponding d-dimensional random vector data sets defined Definition density estimation derived diagonal dimension Discriminant Analysis discriminant rule dissimilarities distribution eigenvalues eigenvectors entries error Euclidean distance Factor Analysis factor loadings factor scores FastICA feature Figure Fisher’s rule Gaussian HDLSS Independent Component Analysis kernel kurtosis linear Marron maximiser methods misclassified Multidimensional Scaling multivariate negentropy normal notation number of clusters number of variables observations obtained optimal orthogonal pairs panel parameters plots population Principal Component Analysis probability density function Projection Pursuit random vectors ranking vector regression relationship sample covariance matrix scaled data Section shows similar singular value singular value decomposition skewness and kurtosis solutions spectral decomposition statistic step Table Theorem Tibshirani transformed univariate variance within-cluster variability x-axis