#
# Copyright (C) Stanislaw Adaszewski, 2020
# License: GPLv3
#


- """
- This module implements the basic convolutional blocks of Decagon.
- Just as a quick reminder, the basic convolution formula here is:
-
- y = A * (x * W)
-
- where:
-
- W is a weight matrix
- A is an adjacency matrix
- x is a matrix of latent representations of a particular type of neighbors.

Because each node aggregates the representations of its neighbors, the
sums would otherwise grow with node degree, so a normalization trick is
necessary for this to work. A must be previously normalized with:

c_{r}^{i,j} = 1 / sqrt(|N_{r}^{i}| * |N_{r}^{j}|)

or

c_{r}^{i} = 1 / |N_{r}^{i}|
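
Both constants depend only on neighborhood sizes. A minimal NumPy
sketch of computing them (an illustration, not Decagon's actual code;
note that the neighborhoods here include the self-loop which the
preprocessing further below adds):

import numpy as np

def norm_constants(A):
    deg = A.sum(1) + 1                            # |N_{r}^{i}|, incl. self-loop
    c_pair = 1.0 / np.sqrt(np.outer(deg, deg))    # c_{r}^{i,j} for every pair
    c_node = 1.0 / deg                            # c_{r}^{i}
    return c_pair, c_node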

Let's work through this step by step to convince ourselves that the
formula is correct.

x = [
    [0, 1, 0, 1],
    [1, 1, 1, 0],
    [0, 0, 0, 1]
]

W = [
    [0, 1],
    [1, 0],
    [0.5, 0.5],
    [0.25, 0.75]
]

A = [
    [0, 1, 0],
    [1, 0, 1],
    [0, 1, 0]
]

so the graph looks like this:

(0) -- (1) -- (2)

and therefore the representations in the next layer should be:

h_{0}^{k+1} = c_{r}^{0,1} * h_{1}^{k} * W + c_{r}^{0} * h_{0}^{k}
h_{1}^{k+1} = c_{r}^{0,1} * h_{0}^{k} * W + c_{r}^{2,1} * h_{2}^{k} * W +
    c_{r}^{1} * h_{1}^{k}
h_{2}^{k+1} = c_{r}^{2,1} * h_{1}^{k} * W + c_{r}^{2} * h_{2}^{k}

In the actual Decagon code we can see that the latter part, propagating
the old representation directly, is gone. I will try to do the same
for now.

So we only have to take care of:

h_{0}^{k+1} = c_{r}^{0,1} * h_{1}^{k} * W
h_{1}^{k+1} = c_{r}^{0,1} * h_{0}^{k} * W + c_{r}^{2,1} * h_{2}^{k} * W
h_{2}^{k+1} = c_{r}^{2,1} * h_{1}^{k} * W
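
In matrix form, these per-node updates are exactly the rows of
A * (x * W) with the normalized A. A sketch of that equivalence
(convolve_rows is a hypothetical helper, not part of this module):

import numpy as np

def convolve_rows(A_norm, x, W):
    # h_{i}^{k+1} = sum_j c_{r}^{i,j} * h_{j}^{k} * W, i.e. the rows
    # of A_norm @ (x @ W) computed one node at a time
    xW = x @ W
    return np.stack([
        sum(A_norm[i, j] * xW[j] for j in range(len(A_norm)))
        for i in range(len(A_norm))
    ])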

If A is square, Decagon's EdgeMinibatchIterator preprocesses it as follows:

A = A + eye(len(A))
rowsum = A.sum(1)
deg_mat_inv_sqrt = diags(power(rowsum, -0.5))
A = dot(A, deg_mat_inv_sqrt)
A = A.transpose()
A = A.dot(deg_mat_inv_sqrt)
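
A runnable dense NumPy equivalent of these steps (a sketch; the real
code works on scipy.sparse matrices, and norm_adj is a name introduced
here for illustration):

import numpy as np

def norm_adj(A):
    # D^{-1/2} * (A + I) * D^{-1/2}, following the steps above
    A = A + np.eye(len(A))
    rowsum = A.sum(1)
    deg_mat_inv_sqrt = np.diag(np.power(rowsum, -0.5))
    A = A.dot(deg_mat_inv_sqrt)
    A = A.transpose()
    return A.dot(deg_mat_inv_sqrt)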

Let's see what this gives in our case:

A = A + eye(len(A))

[
    [1, 1, 0],
    [1, 1, 1],
    [0, 1, 1]
]

rowsum = A.sum(1)

[2, 3, 2]

deg_mat_inv_sqrt = diags(power(rowsum, -0.5))

[
    [1/sqrt(2), 0,         0        ],
    [0,         1/sqrt(3), 0        ],
    [0,         0,         1/sqrt(2)]
]

A = dot(A, deg_mat_inv_sqrt)

[
    [ 1/sqrt(2), 1/sqrt(3), 0         ],
    [ 1/sqrt(2), 1/sqrt(3), 1/sqrt(2) ],
    [ 0,         1/sqrt(3), 1/sqrt(2) ]
]

A = A.transpose()

[
    [ 1/sqrt(2), 1/sqrt(2), 0         ],
    [ 1/sqrt(3), 1/sqrt(3), 1/sqrt(3) ],
    [ 0,         1/sqrt(2), 1/sqrt(2) ]
]

A = A.dot(deg_mat_inv_sqrt)

[
    [ 1/sqrt(2) * 1/sqrt(2), 1/sqrt(2) * 1/sqrt(3), 0                     ],
    [ 1/sqrt(3) * 1/sqrt(2), 1/sqrt(3) * 1/sqrt(3), 1/sqrt(3) * 1/sqrt(2) ],
    [ 0,                     1/sqrt(2) * 1/sqrt(3), 1/sqrt(2) * 1/sqrt(2) ]
]

thus:

[
    [0.5,        0.40824829, 0.        ],
    [0.40824829, 0.33333333, 0.40824829],
    [0.,         0.40824829, 0.5       ]
]

This checks out with the 1 / sqrt(|N_{r}^{i}| * |N_{r}^{j}|) formula.
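
A quick numeric check of the same thing, matching the norm_constants
sketch from earlier (the normalized A is just c_{r}^{i,j} masked by
the self-looped adjacency):

import numpy as np

A = np.array([[0, 1, 0],
              [1, 0, 1],
              [0, 1, 0]], dtype=float)
deg = A.sum(1) + 1                            # [2, 3, 2], self-loop included
c_pair = 1.0 / np.sqrt(np.outer(deg, deg))    # c_{r}^{i,j} for every pair
print(np.round(c_pair * (A + np.eye(3)), 8))  # same matrix as above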

Then, we get back to the main calculation:

y = x * W
y = A * y

y = x * W

[
    [ 1.25, 0.75 ],
    [ 1.5,  1.5  ],
    [ 0.25, 0.75 ]
]

y = A * y

[
    0.5        * [ 1.25, 0.75 ] + 0.40824829 * [ 1.5, 1.5 ],
    0.40824829 * [ 1.25, 0.75 ] + 0.33333333 * [ 1.5, 1.5 ] + 0.40824829 * [ 0.25, 0.75 ],
    0.40824829 * [ 1.5,  1.5  ] + 0.5        * [ 0.25, 0.75 ]
]

that is:

[
    [1.23737243, 0.98737244],
    [1.11237243, 1.11237243],
    [0.73737244, 0.98737244]
].

All checks out nicely, good.
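
As a final sanity check, the whole derivation can be replayed in a few
lines, reusing the norm_adj sketch from above:

import numpy as np

x = np.array([[0, 1, 0, 1],
              [1, 1, 1, 0],
              [0, 0, 0, 1]], dtype=float)
W = np.array([[0, 1],
              [1, 0],
              [0.5, 0.5],
              [0.25, 0.75]])
A = np.array([[0, 1, 0],
              [1, 0, 1],
              [0, 1, 0]], dtype=float)

y = norm_adj(A) @ (x @ W)
print(np.round(y, 8))  # reproduces the matrix above (up to the last digit)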
- """
-
- from .dense import *
- from .sparse import *
- from .universal import *
|