Question
Consider a two-layer feedforward ANN with two inputs a and b, one hidden unit c, and one output unit d. This network has five weights (wca, wcb, wc0, wdc, wd0), where wx0 represents the threshold weight for unit x. Initialize these weights to the values (0.1, 0.1, 0.1, 0.1, 0.1), then give their values after each of the first two training iterations of the BACKPROPAGATION algorithm. Assume learning rate η = 0.3, momentum α = 0.9, incremental weight updates, and the following training examples:
a b d
1 0 1
0 1 0
Explanation / Answer
For stochastic gradient descent on a two-class problem, we use the online version of equations (11.23) and (11.24). Below, $\eta = 0.3$ is the learning rate, $\alpha = 0.9$ is the momentum, $r$ is the target output, $o_c$ and $o_d$ are the outputs of units c and d, and $x_0 = z_0 = 1$ are the bias inputs.

(1) First iteration: processing the first training example, a = 1, b = 0, d = 1 (so r = 1)

$o_c = \mathrm{sigmoid}(x_0 w_{c0} + a\,w_{ca} + b\,w_{cb}) = \mathrm{sigmoid}(0.2) = 1/(1 + e^{-0.2}) = 1/1.819 = 0.55$
$o_d = \mathrm{sigmoid}(z_0 w_{d0} + o_c w_{dc}) = \mathrm{sigmoid}(0.155) = 1/(1 + e^{-0.155}) = 1/1.856 = 0.539$

$\Delta w_{d0} = \eta (r - o_d) z_0 = 0.3 \times (1 - 0.539) \times 1 = 0.138$
$w_{d0} = w_{d0} + \Delta w_{d0} = 0.238$
$\Delta w_{dc} = \eta (r - o_d) o_c = 0.3 \times (1 - 0.539) \times 0.55 = 0.076$
$w_{dc} = w_{dc} + \Delta w_{dc} = 0.176$
$\Delta w_{c0} = \eta (r - o_d) w_{dc}\, o_c (1 - o_c) x_0 = 0.3 \times (1 - 0.539) \times 0.1 \times 0.55 \times (1 - 0.55) \times 1 = 0.0034$
$w_{c0} = w_{c0} + \Delta w_{c0} = 0.1034$
$\Delta w_{ca} = \eta (r - o_d) w_{dc}\, o_c (1 - o_c) a = 0.3 \times (1 - 0.539) \times 0.1 \times 0.55 \times (1 - 0.55) \times 1 = 0.0034$
$w_{ca} = w_{ca} + \Delta w_{ca} = 0.1034$
$\Delta w_{cb} = \eta (r - o_d) w_{dc}\, o_c (1 - o_c) b = 0.3 \times (1 - 0.539) \times 0.1 \times 0.55 \times (1 - 0.55) \times 0 = 0$
$w_{cb} = w_{cb} + \Delta w_{cb} = 0.1$

Note that momentum is not considered here, since this is the first weight update.
So, after the first iteration, the weights (wca, wcb, wc0, wdc, wd0) become (0.1034, 0.1, 0.1034, 0.176, 0.238).

(2) Second iteration: processing the second training example, a = 0, b = 1, d = 0 (so r = 0)

$o_c = \mathrm{sigmoid}(x_0 w_{c0} + a\,w_{ca} + b\,w_{cb}) = \mathrm{sigmoid}(0.2034) = 1/(1 + e^{-0.2034}) = 0.551$
$o_d = \mathrm{sigmoid}(z_0 w_{d0} + o_c w_{dc}) = \mathrm{sigmoid}(0.335) = 1/(1 + e^{-0.335}) = 0.583$

Each update now adds the momentum term $\alpha\,\Delta w^{\text{prev}}$, where $\Delta w^{\text{prev}}$ is the same weight's update from the first iteration.

$\Delta w_{d0} = \eta (r - o_d) z_0 + \alpha\,\Delta w_{d0}^{\text{prev}} = 0.3 \times (0 - 0.583) \times 1 + 0.9 \times 0.138 = -0.0507$
$w_{d0} = w_{d0} + \Delta w_{d0} = 0.238 - 0.0507 = 0.1873$
$\Delta w_{dc} = \eta (r - o_d) o_c + \alpha\,\Delta w_{dc}^{\text{prev}} = 0.3 \times (0 - 0.583) \times 0.551 + 0.9 \times 0.076 = -0.028$
$w_{dc} = w_{dc} + \Delta w_{dc} = 0.176 - 0.028 = 0.148$
$\Delta w_{c0} = \eta (r - o_d) w_{dc}\, o_c (1 - o_c) x_0 + \alpha\,\Delta w_{c0}^{\text{prev}} = 0.3 \times (0 - 0.583) \times 0.176 \times 0.551 \times (1 - 0.551) \times 1 + 0.9 \times 0.0034 = -0.0045$
$w_{c0} = w_{c0} + \Delta w_{c0} = 0.1034 - 0.0045 = 0.0989$
$\Delta w_{ca} = \eta (r - o_d) w_{dc}\, o_c (1 - o_c) a + \alpha\,\Delta w_{ca}^{\text{prev}} = 0 + 0.9 \times 0.0034 = 0.00306$
$w_{ca} = w_{ca} + \Delta w_{ca} = 0.1034 + 0.00306 = 0.1065$
$\Delta w_{cb} = \eta (r - o_d) w_{dc}\, o_c (1 - o_c) b + \alpha\,\Delta w_{cb}^{\text{prev}} = 0.3 \times (0 - 0.583) \times 0.176 \times 0.551 \times (1 - 0.551) \times 1 + 0.9 \times 0 = -0.0076$
$w_{cb} = w_{cb} + \Delta w_{cb} = 0.1 - 0.0076 = 0.0924$

So, after the second iteration, the weights (wca, wcb, wc0, wdc, wd0) become (0.1065, 0.0924, 0.0989, 0.148, 0.1873).
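
The hand calculation can be cross-checked with a short script. The following is a minimal Python sketch, not a reference implementation: the function name `run_backprop` and the dictionary keys are illustrative choices. It assumes the same update rule used in this answer, where the output-layer error term is (r - o_d) without a sigmoid-derivative factor, and where the hidden-layer updates use the value of wdc from before the current update. It works at full floating-point precision, so the results may differ from the rounded hand values in the third or fourth decimal place.

```python
import math


def sigmoid(x):
    """Logistic sigmoid."""
    return 1.0 / (1.0 + math.exp(-x))


def run_backprop(examples, eta=0.3, alpha=0.9):
    """Incremental (online) updates with momentum for the 2-1-1 network.

    Returns a list with a copy of the weights after each training example.
    """
    # All five weights start at 0.1, as stated in the question.
    w = {"ca": 0.1, "cb": 0.1, "c0": 0.1, "dc": 0.1, "d0": 0.1}
    # Previous updates start at zero, so momentum has no effect on step one.
    prev = {k: 0.0 for k in w}
    history = []
    for a, b, r in examples:
        # Forward pass (bias inputs are fixed at 1).
        o_c = sigmoid(w["c0"] + a * w["ca"] + b * w["cb"])
        o_d = sigmoid(w["d0"] + o_c * w["dc"])
        # Output error term as used in the answer: no sigmoid derivative.
        err = r - o_d
        # Error propagated back to hidden unit c, using w_dc before its update.
        back = err * w["dc"] * o_c * (1.0 - o_c)
        step = {
            "d0": err,
            "dc": err * o_c,
            "c0": back,
            "ca": back * a,
            "cb": back * b,
        }
        for k in w:
            # delta_w = eta * (error term * input) + alpha * previous delta_w
            prev[k] = eta * step[k] + alpha * prev[k]
            w[k] += prev[k]
        history.append(dict(w))
    return history


# Training examples from the question, as (a, b, target) tuples.
history = run_backprop([(1, 0, 1), (0, 1, 0)])
for i, weights in enumerate(history, start=1):
    print(i, {k: round(v, 4) for k, v in weights.items()})
```

The printed weights should agree with the hand-computed values for both iterations to roughly three decimal places; the small differences come from the intermediate rounding in the manual calculation.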