[ML] Machine Learning Techniques: Lecture 15, Matrix Factorization

ML: fundamental techniques
Package: scikit-learn
Course: Machine Learning Techniques
Summary: Lecture 15, Matrix Factorization

Basic Matrix Factorization

The constant term $x_0^{(l)}$ is dropped

Initialize the $\tilde{d}\times 1$ vectors $\left \{ \mathbf{w}_m \right \},\left \{ \mathbf{v}_n \right \}$
Usually drawn at random, to avoid getting stuck in a local optimum
Alternately optimize $E_{in}$ until convergence

Optimize $\mathbf{w}_1,\mathbf{w}_2,\cdots,\mathbf{w}_M$
Update $\mathbf{w}_m$ by running linear regression on the m-th movie's $\left \{ (\mathbf{v}_n,r_{nm}) \right \}$
Optimize $\mathbf{v}_1,\mathbf{v}_2,\cdots,\mathbf{v}_N$
Update $\mathbf{v}_n$ by running linear regression on the n-th user's $\left \{ (\mathbf{w}_m,r_{nm}) \right \}$
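The alternating steps above can be sketched in NumPy as follows. This is a minimal illustration, not the course's reference implementation: the function names, the boolean `mask` marking which $r_{nm}$ are known, and the default values of `d` and `n_iters` are all assumptions made for the example. Each inner update is an ordinary least-squares fit on the known ratings only.

```python
import numpy as np

def als_matrix_factorization(R, mask, d=2, n_iters=20, seed=0):
    """Alternating least squares on the known entries of R (N users x M movies).

    mask[n, m] is True where r_nm is known. Returns V (N, d) and W (M, d),
    so that r_nm is approximated by W[m] @ V[n].
    """
    rng = np.random.default_rng(seed)
    N, M = R.shape
    V = rng.normal(scale=0.1, size=(N, d))  # user features v_n (random init)
    W = rng.normal(scale=0.1, size=(M, d))  # movie features w_m (random init)
    for _ in range(n_iters):
        # Fix V: w_m is a linear regression on {(v_n, r_nm)} of movie m.
        for m in range(M):
            users = mask[:, m]
            if users.any():
                W[m] = np.linalg.lstsq(V[users], R[users, m], rcond=None)[0]
        # Fix W: v_n is a linear regression on {(w_m, r_nm)} of user n.
        for n in range(N):
            movies = mask[n]
            if movies.any():
                V[n] = np.linalg.lstsq(W[movies], R[n, movies], rcond=None)[0]
    return V, W

def squared_error(R, mask, V, W):
    """Sum of squared residuals over the known ratings only."""
    return float(np.sum((R - V @ W.T)[mask] ** 2))
```

Because every block update exactly minimizes the squared error with the other block held fixed, the error on the known ratings cannot increase from one round to the next. `mask` must be a boolean array of the same shape as `R`.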

Linear Autoencoder vs. Matrix Factorization

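Linear Autoencoder ≡ a special case of Matrix Factorization in which $\mathbf{X}$ is complete
The two are compared in a figure (not reproduced here)
For the linear autoencoder, see [ML] Machine Learning Techniques: Lecture 13, Deep Learning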
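A small numerical check can make the equivalence concrete. The sketch below assumes a fully observed random matrix `X` and a hypothetical hidden dimension `k`; it takes the autoencoder weights to be the top right singular vectors of `X` (the top eigenvectors of $\mathbf{X}^T\mathbf{X}$) and compares the encode-then-decode reconstruction with the rank-`k` factorization built from the SVD.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(8, 5))   # complete matrix: every entry is observed
k = 2                         # hypothetical number of hidden dimensions

# The right singular vectors of X are the eigenvectors of X^T X.
U, s, Vt = np.linalg.svd(X, full_matrices=False)
W = Vt[:k].T                  # (5, k) linear autoencoder weights

# Autoencoder view: encode with W, then decode with W^T.
reconstruction = X @ W @ W.T
# Factorization view: product of an (8, k) factor and a (k, 5) factor.
factorization = (U[:, :k] * s[:k]) @ Vt[:k]

print(np.allclose(reconstruction, factorization))
```

With missing entries this shortcut is no longer available, which is why the general case falls back on alternating or stochastic updates over the known ratings.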

Stochastic Gradient Descent Matrix Factorization

Better suited to very large datasets

Initialize the $\tilde{d}\times 1$ vectors $\left \{ \mathbf{w}_m \right \},\left \{ \mathbf{v}_n \right \}$
Usually drawn at random, to avoid getting stuck in a local optimum
for $t=0,1,\cdots ,T$

Randomly pick one of the known $r_{nm}$
Compute the residual $\tilde{r}_{nm}=(r_{nm}-\mathbf{w}_m^T\mathbf{v}_n)$
SGD-update
$$ \begin{align*} \mathbf{v}_n^{new} &\leftarrow \mathbf{v}_n^{old}+\eta \cdot \tilde{r}_{nm}\mathbf{w}_m^{old}\\ \mathbf{w}_m^{new} &\leftarrow \mathbf{w}_m^{old}+\eta \cdot \tilde{r}_{nm}\mathbf{v}_n^{old}\\ \end{align*} $$
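The SGD loop above can be sketched as follows. This is an illustrative version under stated assumptions: ratings are passed as a list of `(n, m, r_nm)` triples, and the function name and the defaults for `d`, `eta`, and `T` are made up for the example. Note that both updates use the old values, so $\mathbf{v}_n^{old}$ is copied before it is overwritten.

```python
import numpy as np

def sgd_matrix_factorization(ratings, N, M, d=2, eta=0.02, T=20000, seed=0):
    """SGD matrix factorization.

    ratings: list of (n, m, r_nm) triples for the known entries.
    Returns V (N, d) and W (M, d), with r_nm approximated by W[m] @ V[n].
    """
    rng = np.random.default_rng(seed)
    V = rng.normal(scale=0.1, size=(N, d))  # user features v_n
    W = rng.normal(scale=0.1, size=(M, d))  # movie features w_m
    for _ in range(T):
        # Randomly pick one known rating.
        n, m, r = ratings[rng.integers(len(ratings))]
        # Residual of the current prediction.
        residual = r - W[m] @ V[n]
        # Keep the old v_n so that both updates use old values.
        v_old = V[n].copy()
        V[n] += eta * residual * W[m]
        W[m] += eta * residual * v_old
    return V, W
```

Each step touches only one user vector and one movie vector, so the cost per update does not depend on the number of ratings, which is what makes the approach practical on large datasets.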

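With a solid understanding of the algorithm, you can adapt it to the problem at hand
In the movie example, ratings closer to the present should in principle reflect a user's current preferences more closely
A refinement built on SGD: restrict the last $T{}'$ updates to sampling from the most recent data rather than from all of it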
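One way to sketch this time-aware variant is shown below. The details are assumptions made for illustration: each rating carries a timestamp as a fourth field, "recent" is defined as the latest `recent_fraction` of the ratings, and the names `T_recent` and `recent_fraction` are hypothetical. The update rule itself is the plain SGD step; only the sampling pool changes for the final `T_recent` iterations.

```python
import numpy as np

def time_aware_sgd_mf(ratings, N, M, d=2, eta=0.02, T=20000,
                      T_recent=2000, recent_fraction=0.2, seed=0):
    """SGD matrix factorization whose last T_recent updates sample only
    from the most recent ratings.

    ratings: list of (n, m, r_nm, timestamp) tuples.
    """
    rng = np.random.default_rng(seed)
    V = rng.normal(scale=0.1, size=(N, d))  # user features v_n
    W = rng.normal(scale=0.1, size=(M, d))  # movie features w_m
    ordered = sorted(ratings, key=lambda rating: rating[3])  # oldest first
    n_recent = max(1, int(len(ordered) * recent_fraction))
    recent = ordered[-n_recent:]
    for t in range(T):
        # Early updates draw from all ratings, the final ones from recent only.
        pool = recent if t >= T - T_recent else ordered
        n, m, r, _ = pool[rng.integers(len(pool))]
        residual = r - W[m] @ V[n]
        v_old = V[n].copy()
        V[n] += eta * residual * W[m]
        W[m] += eta * residual * v_old
    return V, W
```

Ending on recent data biases the final vectors toward current preferences, at the cost of fitting older ratings slightly less well.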

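Summary

Extraction Models

The feature transform is learned as part of training, and a linear model is applied at the end
Pros

Simple: no need to hand-design features
Powerful

Cons

Hard to optimize, because the problem is non-convex
Prone to overfitting, so proper regularization/validation is needed

Types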

Adaptive/Gradient Boosting

[ML] Machine Learning Techniques: Lecture 11, Gradient Boosted Decision Tree
Feature transform

hypotheses $g_t$
Method

functional gradient descent

linear model

weights $\alpha_t$

Neural Network/Deep Learning

[ML] Machine Learning Techniques: Lecture 12, Neural Network
Feature transform

weights $w_{ij}^{(l)}$
Method

Initialize the weights with an autoencoder
Update the weights with SGD (backprop)

linear model

weights $w_{ij}^{(L)}$

RBF Network

[ML] Machine Learning Techniques: Lecture 14, Radial Basis Function Network
Feature transform

RBF centers $\mu_m$

Method

Initialize the centers with k-means clustering

linear model

weights $\beta_m$

k Nearest Neighbor

lazy learning
[ML] Machine Learning Techniques: Lecture 14, Radial Basis Function Network
Feature transform

$\mathbf{x}_n$-neighbor RBF

linear model

weights $y_n$

Matrix Factorization

By symmetry, each side serves as both the feature transform and the linear model for the other

user features $\mathbf{v}_n$
movie features $\mathbf{w}_m$
Method

SGD
alternating least squares

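Code

BinaryVectorEncoding

from sklearn import preprocessing

# Fit a binarizer on categorical labels; classes are stored in sorted order
lb = preprocessing.LabelBinarizer()
lb.fit(['A', 'B', 'A', 'AB', 'O'])

print(lb.classes_)
# ['A' 'AB' 'B' 'O']

# Each label becomes a one-hot row whose columns follow lb.classes_
print(lb.transform(['A', 'B']))
# [[1 0 0 0]
#  [0 0 1 0]]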
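One reading of why this encoding appears in a matrix factorization post (an interpretation, not something the original states outright): when a user ID is encoded as a one-hot vector and there is no constant term, multiplying it by the feature matrix simply selects that user's feature vector $\mathbf{v}_n$. The sketch below uses hypothetical user IDs and a random feature matrix `V` to illustrate this.

```python
import numpy as np
from sklearn import preprocessing

user_ids = ['u1', 'u2', 'u3', 'u4']        # hypothetical user IDs
lb = preprocessing.LabelBinarizer()
X = lb.fit_transform(user_ids)             # (4, 4), one one-hot row per user

rng = np.random.default_rng(0)
V = rng.normal(size=(len(user_ids), 2))    # row n holds v_n, with d~ = 2

# The one-hot row for 'u3' picks out the third row of V.
x = lb.transform(['u3'])                   # shape (1, 4)
print(np.allclose(x @ V, V[2]))
```

Note that `LabelBinarizer` returns a single column when there are only two classes, so this selection trick assumes three or more distinct IDs.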

References

sklearn.preprocessing.LabelBinarizer
sklearn.preprocessing.OneHotEncoder

子風的知識庫
