
Principal Component Analysis and Randomness Tests for Big Data Analysis

by Mieko Tanaka-Yamawaki and Yumihiko Ikura


Description

This book presents a novel approach to analyzing large numerical data sets (so-called big data). The essence of the approach is to grasp the "meaning" of the data instantly, without going into the details of individual data points. Unlike conventional approaches to principal component analysis, randomness tests, and visualization, the authors' approach offers universality and simplicity of analysis, regardless of data type, structure, or field of science.

First, the mathematical preparation is described. The RMT-PCA and the RMT-test both use the cross-correlation matrix of time series, C = X X^T, where X is a rectangular matrix of N rows and L columns and X^T is the transpose of X. The RMT-PCA uses N sample time series of length L. The RMT-test uses N pieces of length L obtained by cutting a single data sequence into N pieces. Because C is symmetric, namely C = C^T, it can be converted to a diagonal matrix of eigenvalues by a similarity transformation S C S^T using an orthogonal matrix S. When N is sufficiently large, the histogram of the eigenvalue distribution can be compared to the theoretical formula derived in random matrix theory (RMT).
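As a rough illustration of the preparation described above, the following sketch builds C from a matrix of independent Gaussian noise, diagonalizes it with an orthogonal matrix S, and compares the eigenvalue range with the Marchenko-Pastur support edges (1 ± sqrt(N/L))^2. The description does not name the theoretical formula; the Marchenko-Pastur law is assumed here because it is the standard RMT result for correlation matrices of this form. Standardizing each row and dividing by L are also assumptions, made so that the diagonal of C equals 1, and the function names are illustrative only.

```python
import numpy as np

def correlation_matrix(X):
    """Return C = Z Z^T / L, where Z is X with each row standardized.

    Assumption: rows are scaled to zero mean and unit variance, and the
    product is divided by L so that every diagonal element of C is 1.
    """
    L = X.shape[1]
    Z = (X - X.mean(axis=1, keepdims=True)) / X.std(axis=1, keepdims=True)
    return Z @ Z.T / L

def marchenko_pastur_edges(N, L):
    """Support edges of the Marchenko-Pastur law for L > N (assumed formula)."""
    r = np.sqrt(N / L)
    return (1 - r) ** 2, (1 + r) ** 2

rng = np.random.default_rng(0)
N, L = 100, 400                       # N series, each of length L
X = rng.standard_normal((N, L))       # pure noise, no structure
C = correlation_matrix(X)

# eigh returns eigenvalues in ascending order and eigenvectors as columns of V
eigenvalues, V = np.linalg.eigh(C)
S = V.T                               # orthogonal matrix S
D = S @ C @ S.T                       # similarity transformation, diagonal up to rounding

lam_minus, lam_plus = marchenko_pastur_edges(N, L)
print(f"eigenvalue range: [{eigenvalues.min():.3f}, {eigenvalues.max():.3f}]")
print(f"Marchenko-Pastur edges: [{lam_minus:.3f}, {lam_plus:.3f}]")
```

For pure noise the eigenvalues should fall approximately between `lam_minus` and `lam_plus`; eigenvalues well outside that range suggest structure in the data.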

Then the RMT-PCA is applied to high-frequency stock prices in the Japanese and American markets. The approach proves effective in extracting "trendy" business sectors of the financial market over a prescribed time scale. In this case, X consists of N stock price series of length L, and the correlation matrix C is an N by N square matrix whose element in the i-th row and j-th column is the inner product of the length-L price time series of the i-th and j-th stocks.
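The idea of separating "trendy" sectors from noise can be sketched on synthetic data. The example below is not the authors' procedure: it assumes standardized return series, assumes the Marchenko-Pastur upper edge as the noise threshold, and plants a hypothetical sector (the first 10 of 60 series) that shares a common factor. The function name `rmt_pca` and all parameter values are made up for illustration.

```python
import numpy as np

def rmt_pca(returns):
    """Return eigenvalues above the assumed noise threshold, largest first,
    together with their eigenvectors (as columns) and the threshold itself."""
    N, L = returns.shape
    Z = (returns - returns.mean(axis=1, keepdims=True)) / returns.std(axis=1, keepdims=True)
    C = Z @ Z.T / L                          # N x N correlation matrix
    w, V = np.linalg.eigh(C)                 # ascending eigenvalues
    lam_plus = (1 + np.sqrt(N / L)) ** 2     # assumed Marchenko-Pastur upper edge
    keep = w > lam_plus
    return w[keep][::-1], V[:, keep][:, ::-1], lam_plus

rng = np.random.default_rng(1)
N, L = 60, 600
sector = np.arange(10)                       # hypothetical sector: series 0..9
returns = rng.standard_normal((N, L))        # synthetic data, not market data
returns[sector] += 1.5 * rng.standard_normal(L)   # shared factor for the sector

signal_values, signal_vectors, lam_plus = rmt_pca(returns)

# The series with the largest weights in the leading eigenvector
top = signal_vectors[:, 0]
leading = np.sort(np.argsort(np.abs(top))[-len(sector):])
print("eigenvalues above threshold:", np.round(signal_values, 2))
print("series dominating the leading component:", leading)
```

In this toy setting the leading eigenvector concentrates on the planted sector, which mirrors how large components of the principal eigenvectors are read as sector membership.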

Next, the RMT-test is applied to measure randomness of various random number generators, including algorithmically generated random numbers and physically generated random numbers.
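One simple way to turn the eigenvalue comparison into a randomness score is to compare a moment of the eigenvalue distribution with its theoretical value. The sketch below is an assumption-laden illustration, not the book's exact statistic: it cuts a single sequence into N rows of length L, assumes the Marchenko-Pastur second moment 1 + N/L as the reference, and contrasts NumPy's default generator with a deliberately flawed source (a moving sum of three consecutive outputs, which introduces serial correlation). The name `rmt_second_moment_gap` is hypothetical.

```python
import numpy as np

def rmt_second_moment_gap(sequence, N, L):
    """Cut `sequence` into N rows of length L and return |m2 - (1 + N/L)|,
    where m2 is the mean squared eigenvalue of the correlation matrix."""
    X = np.asarray(sequence[: N * L], dtype=float).reshape(N, L)
    Z = (X - X.mean(axis=1, keepdims=True)) / X.std(axis=1, keepdims=True)
    C = Z @ Z.T / L
    eigenvalues = np.linalg.eigvalsh(C)
    m2 = np.mean(eigenvalues ** 2)
    return abs(m2 - (1 + N / L))             # assumed theoretical second moment

rng = np.random.default_rng(2)
N, L = 100, 400

good = rng.random(N * L)                     # NumPy's default generator
# Flawed source: each value is the sum of three consecutive uniform draws,
# so neighbouring values overlap and are serially correlated.
flawed = np.convolve(rng.random(N * L + 2), np.ones(3), mode="valid")

good_gap = rmt_second_moment_gap(good, N, L)
flawed_gap = rmt_second_moment_gap(flawed, N, L)
print(f"gap for default generator: {good_gap:.4f}")
print(f"gap for flawed source:     {flawed_gap:.4f}")
```

A small gap is consistent with randomness at this resolution; a large gap indicates that the sequence deviates from the RMT prediction.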

The book concludes by demonstrating three applications of the RMT-test: (1) a comparison of hash functions, (2) the selection of safe stocks, and (3) the prediction of a stock index by means of a sudden change in randomness.




Presents a practical method for applying PCA and a randomness measure based on the RMT formula

Proposes a new and universal approach to big data analysis that is independent of data type or field

Uses real-world data to derive practical results for stock market forecasts and computer security



Author

Mieko Tanaka-Yamawaki

Topics in "Principal Component Analysis and Randomness Tests for Big Data Analysis"

Big Data Analysis, Evaluation of Random Number Generators, RMT-PCA, RMT-Test, Trendy Sectors of the Stock Market

Details

ISBN: 9784431559061
Publisher: Springer Tokyo
Publication date: 11 September 2022
