Ing. Andrés Esteban Páez Torres
Director:
Fabio Augusto González Osorio PhD.
MF is a powerful analysis tool with applications such as clustering, latent topic analysis, and dictionary learning, among others.
Kernel methods allow extracting non-linear patterns from data; however, they have a high space and time cost compared to linear methods.
The amount of available information is growing fast, and there are many opportunities in analysing this information.
Matrix factorization is a family of linear-algebra methods that take a matrix and compute two or more matrices which, when multiplied, reconstruct the input matrix.
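For instance, a rank-r factorization approximates a data matrix X of size n×m as the product X ≈ W H, with W of size n×r, H of size r×m and r ≪ min(n, m); methods such as NMF additionally constrain the factors to be non-negative.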
Kernels are functions that map points from an input space X to a feature space F, where the non-linear patterns become linear.
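For example, the Gaussian kernel k(x, y) = exp(−‖x − y‖² / (2σ²)) equals the inner product ⟨φ(x), φ(y)⟩ for a mapping φ into an infinite-dimensional feature space, so similarities in F can be evaluated without ever constructing φ(x) explicitly.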
Kernel matrix factorization (KMF) is a similar method; however, instead of factorizing the input-space matrix, it factorizes a feature-space matrix.
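A common way to make this tractable (an assumed formulation, since the exact model is not written out here) is to restrict the feature-space basis to the span of mapped points, e.g. Φ(X) ≈ Φ(X) W H, so that every quantity needed during optimization reduces to kernel evaluations.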
Usually it is not possible to compute the explicit mapping into feature space: some feature spaces have infinite dimensions or are simply unknown.
Using the kernel trick is easier; however, computing the kernel function pairwise leads to a Gram matrix of size n×n and a computing time of O(n²).
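The Gram matrix stores K_ij = k(x_i, x_j) for every pair i, j = 1…n; as a rough illustration, n = 10⁶ samples with 8-byte entries would already need about 8 TB of storage, which motivates the large-scale objective below.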
To design, implement and evaluate a new KMF method that is able to compute a kernel-induced feature-space factorization on large-scale volumes of data.
We selected stochastic gradient descent (SGD) as the optimization technique, given that the original loss function can be expressed as a sum of per-sample terms.
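As an assumed concrete instance (the original expression is not reproduced here): writing the feature-space basis as Φ(B) W for a set B of b basis points, a squared-error loss decomposes as J(W, H) = Σ_i ‖φ(x_i) − Φ(B) W h_i‖², one term per sample x_i with code h_i, which is exactly the additive structure SGD exploits.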
Taking the partial derivative with respect to h and setting it equal to 0, we have:
Taking the partial derivative with respect to W and subtracting it from W, we have:
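A minimal sketch of how these two update steps could be implemented, assuming the squared feature-space loss sketched above, a Gaussian kernel, and a fixed set B of basis points; okmf_sgd, lam, lr and the other names are illustrative, not the thesis implementation:

import numpy as np

def gaussian_kernel(P, Q, sigma=1.0):
    # Pairwise Gaussian kernel between the rows of P and the rows of Q.
    d2 = np.sum(P**2, 1)[:, None] + np.sum(Q**2, 1)[None, :] - 2 * P @ Q.T
    return np.exp(-d2 / (2 * sigma**2))

def okmf_sgd(X, B, r, lr=0.01, lam=0.1, epochs=5, sigma=1.0):
    KBB = gaussian_kernel(B, B, sigma)            # kernel over the basis points (b×b)
    W = np.random.rand(B.shape[0], r)             # coefficients of the feature-space basis
    H = np.zeros((r, X.shape[0]))                 # codes, one column per sample
    for _ in range(epochs):
        for i in np.random.permutation(X.shape[0]):
            kb = gaussian_kernel(B, X[i:i+1], sigma).ravel()   # k(B, x_i)
            G = W.T @ KBB @ W + lam * np.eye(r)
            h = np.linalg.solve(G, W.T @ kb)      # closed-form code from dL/dh = 0
            grad_W = 2 * (KBB @ W @ np.outer(h, h) - np.outer(kb, h))
            W -= lr * grad_W                      # stochastic gradient step on W
            H[:, i] = h
    return W, H

Under these assumptions, only the b×b matrix over the basis points and the vector k(B, x_i) are ever formed, so memory stays O(b²) instead of the O(n²) required by the full Gram matrix.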
The selected performance measure is clustering accuracy, which measures the ratio between the number of correctly clustered instances and the total number of instances.
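One standard way to compute this measure (a sketch; clustering_accuracy and the variable names are illustrative) matches clusters to classes with the Hungarian algorithm and then counts the correctly assigned instances:

import numpy as np
from scipy.optimize import linear_sum_assignment

def clustering_accuracy(y_true, y_pred):
    # Contingency table: rows are predicted clusters, columns are true classes.
    clusters, classes = np.unique(y_pred), np.unique(y_true)
    counts = np.zeros((len(clusters), len(classes)), dtype=int)
    for i, c in enumerate(clusters):
        for j, k in enumerate(classes):
            counts[i, j] = np.sum((y_pred == c) & (y_true == k))
    # Hungarian algorithm: best one-to-one mapping from clusters to classes.
    rows, cols = linear_sum_assignment(-counts)
    return counts[rows, cols].sum() / len(y_true)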
Linear kernel
Gaussian kernel