Principal component analysis (PCA) is a statistical procedure that uses an orthogonal transformation to convert a set of observations of possibly correlated variables into a set of values of linearly uncorrelated variables called principal components. If there are n observations with p variables, then the number of distinct principal components is \(\min(n-1, p)\). This transformation is defined in such a way that the first principal component has the largest possible variance (that is, accounts for as much of the variability in the data as possible), and each succeeding component in turn has the highest variance possible under the constraint that it is orthogonal to the preceding components. The resulting vectors are an uncorrelated orthogonal basis set. PCA is sensitive to the relative scaling of the original variables.
More …
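As a quick illustration (my own sketch, not part of the excerpt above), here is a minimal PCA example using scikit-learn; the toy data, the standardization step, and the choice of two components are assumptions made purely for demonstration.

```python
import numpy as np
from sklearn.decomposition import PCA

# Toy data: 100 observations of 3 variables, two of which are strongly correlated.
rng = np.random.default_rng(0)
z = rng.normal(size=(100, 1))
X = np.hstack([z + 0.1 * rng.normal(size=(100, 1)),
               2 * z + 0.1 * rng.normal(size=(100, 1)),
               rng.normal(size=(100, 1))])

# PCA is sensitive to relative scaling, so variables are often standardized first.
X = (X - X.mean(axis=0)) / X.std(axis=0)

pca = PCA(n_components=2)              # keep the first two principal components
scores = pca.fit_transform(X)          # uncorrelated component scores
print(pca.explained_variance_ratio_)   # share of variance captured by each component
```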
The assumption that a given time series is stationary ergodic is one of the most general assumptions used in statistics; in particular, it allows for arbitrary long-range serial dependence, and subsumes most of the nonparametric as well as modelling assumptions used in the literature on clustering time series, such as i.i.d., (hidden) Markov, or mixing time series.
More …
In numerical optimization, the Broyden–Fletcher–Goldfarb–Shanno (BFGS) algorithm is an iterative method for solving unconstrained nonlinear optimization problems.
More …
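For a concrete (illustrative) use of BFGS, SciPy exposes it through scipy.optimize.minimize; the Rosenbrock test function and starting point below are standard examples I've chosen, not anything from the excerpt.

```python
from scipy.optimize import minimize, rosen, rosen_der

# Minimize the Rosenbrock function with BFGS, supplying the analytic gradient.
x0 = [1.3, 0.7, 0.8, 1.9, 1.2]
result = minimize(rosen, x0, method="BFGS", jac=rosen_der)

print(result.x)    # should be close to the minimizer (1, 1, ..., 1)
print(result.nit)  # number of BFGS iterations taken
```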
Below is a list of the most interesting data sources I’ve come across:
Big Data: 33 Brilliant And Free Data Sources Anyone Can Use
Awesome Public Datasets – GitHub
More …
In mathematical statistics, the Fisher information is a way of measuring the amount of information that an observable random variable X carries about an unknown parameter θ of a distribution that models X. Formally, it is the variance of the score, or the expected value of the observed information. In Bayesian statistics, the asymptotic distribution of the posterior mode depends on the Fisher information and not on the prior (according to the Bernstein–von Mises theorem, which was anticipated by Laplace for exponential families). The Fisher information is also used in the calculation of the Jeffreys prior, which is used in Bayesian statistics. The Fisher information matrix is used to calculate the covariance matrices associated with maximum-likelihood estimates. It can also be used in the formulation of test statistics, such as the Wald test. The Fisher information has been used to find bounds on the accuracy of neural codes, and it appears in machine learning techniques such as elastic weight consolidation, which reduces catastrophic forgetting in artificial neural networks.
More …
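As a small worked sketch (my own example, not from the text): for a single Bernoulli(θ) observation, the Fisher information is \(1/(\theta(1-\theta))\), and one can check numerically that this matches the variance of the score.

```python
import numpy as np

theta = 0.3
rng = np.random.default_rng(0)
x = rng.binomial(1, theta, size=1_000_000)      # samples from Bernoulli(theta)

# Score function: d/dθ log p(x | θ) = x/θ - (1 - x)/(1 - θ)
score = x / theta - (1 - x) / (1 - theta)

print(score.var())                 # empirical variance of the score
print(1 / (theta * (1 - theta)))   # closed-form Fisher information ≈ 4.76
```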