Question

Consider a two-layer feedforward ANN with two inputs a and b, one hidden unit c, and one output unit d. This network has five weights (wca, wcb, wc0, wdc, wd0), where wx0 represents the threshold weight for unit x. Initialize these weights to the values (0.1, 0.1, 0.1, 0.1, 0.1), then give their values after each of the first two training iterations of the BACKPROPAGATION algorithm. Assume a learning rate of 0.3, a momentum of 0.9, incremental weight updates, and the following training examples:
a b d
1 0 1
0 1 0

Explanation / Answer

For stochastic gradient descent on a two-class problem, we use the online version of equations (11.23) and (11.24). In the formulas below, $\eta = 0.3$ is the learning rate, $\alpha = 0.9$ is the momentum coefficient, $r$ is the target output, $d$ is the network output, $o_c$ is the output of hidden unit c, and $x_0 = z_0 = 1$ are the constant inputs for the threshold weights.

(1) First iteration: processing the first training example, a = 1, b = 0, d = 1

$o_c = \mathrm{sigmoid}(x_0 W_{c0} + a W_{ca} + b W_{cb}) = \mathrm{sigmoid}(0.2) = \frac{1}{1 + e^{-0.2}} = \frac{1}{1.819} = 0.55$

$d = \mathrm{sigmoid}(z_0 W_{d0} + o_c W_{dc}) = \mathrm{sigmoid}(0.155) = \frac{1}{1 + e^{-0.155}} = \frac{1}{1.856} = 0.539$

$\Delta W_{d0} = \eta (r - d) z_0 = 0.3 \cdot (1 - 0.539) \cdot 1 = 0.138$
$\Rightarrow W_{d0} = W_{d0} + \Delta W_{d0} = 0.238$

$\Delta W_{dc} = \eta (r - d) o_c = 0.3 \cdot (1 - 0.539) \cdot 0.55 = 0.076$
$\Rightarrow W_{dc} = W_{dc} + \Delta W_{dc} = 0.176$

$\Delta W_{c0} = \eta (r - d) W_{dc}\, o_c (1 - o_c) x_0 = 0.3 \cdot (1 - 0.539) \cdot 0.1 \cdot 0.55 \cdot (1 - 0.55) \cdot 1 = 0.0034$
$\Rightarrow W_{c0} = W_{c0} + \Delta W_{c0} = 0.1034$

$\Delta W_{ca} = \eta (r - d) W_{dc}\, o_c (1 - o_c) a = 0.3 \cdot (1 - 0.539) \cdot 0.1 \cdot 0.55 \cdot (1 - 0.55) \cdot 1 = 0.0034$
$\Rightarrow W_{ca} = W_{ca} + \Delta W_{ca} = 0.1034$

$\Delta W_{cb} = \eta (r - d) W_{dc}\, o_c (1 - o_c) b = 0.3 \cdot (1 - 0.539) \cdot 0.1 \cdot 0.55 \cdot (1 - 0.55) \cdot 0 = 0$
$\Rightarrow W_{cb} = W_{cb} + \Delta W_{cb} = 0.1$

Note that momentum is not considered here, since this is the first weight update. The hidden-layer updates use the value of $W_{dc}$ from before this iteration's update (0.1).

So, after the first iteration, the weights (wca, wcb, wc0, wdc, wd0) become (0.1034, 0.1, 0.1034, 0.176, 0.238).

(2) Second iteration: processing the second training example, a = 0, b = 1, d = 0

$o_c = \mathrm{sigmoid}(x_0 W_{c0} + a W_{ca} + b W_{cb}) = \mathrm{sigmoid}(0.2034) = \frac{1}{1 + e^{-0.2034}} = 0.551$

$d = \mathrm{sigmoid}(z_0 W_{d0} + o_c W_{dc}) = \mathrm{sigmoid}(0.335) = \frac{1}{1 + e^{-0.335}} = 0.583$

$\Delta W_{d0} = \eta (r - d) z_0 + \alpha \Delta W_{d0} = 0.3 \cdot (0 - 0.583) \cdot 1 + 0.9 \cdot 0.138 = -0.0507$
$\Rightarrow W_{d0} = W_{d0} + \Delta W_{d0} = 0.238 - 0.0507 = 0.1873$

$\Delta W_{dc} = \eta (r - d) o_c + \alpha \Delta W_{dc} = 0.3 \cdot (0 - 0.583) \cdot 0.551 + 0.9 \cdot 0.076 = -0.028$
$\Rightarrow W_{dc} = W_{dc} + \Delta W_{dc} = 0.176 - 0.028 = 0.148$

$\Delta W_{c0} = \eta (r - d) W_{dc}\, o_c (1 - o_c) x_0 + \alpha \Delta W_{c0} = 0.3 \cdot (0 - 0.583) \cdot 0.176 \cdot 0.551 \cdot (1 - 0.551) \cdot 1 + 0.9 \cdot 0.0034 = -0.0045$
$\Rightarrow W_{c0} = W_{c0} + \Delta W_{c0} = 0.1034 - 0.0045 = 0.0989$

$\Delta W_{ca} = \eta (r - d) W_{dc}\, o_c (1 - o_c) a + \alpha \Delta W_{ca} = 0 + 0.9 \cdot 0.0034 = 0.00306$
$\Rightarrow W_{ca} = W_{ca} + \Delta W_{ca} = 0.1034 + 0.00306 = 0.1065$

$\Delta W_{cb} = \eta (r - d) W_{dc}\, o_c (1 - o_c) b + \alpha \Delta W_{cb} = 0.3 \cdot (0 - 0.583) \cdot 0.176 \cdot 0.551 \cdot (1 - 0.551) \cdot 1 + 0.9 \cdot 0 = -0.0076$
$\Rightarrow W_{cb} = W_{cb} + \Delta W_{cb} = 0.1 - 0.0076 = 0.0924$

So, after the second iteration, the weights (wca, wcb, wc0, wdc, wd0) become (0.1065, 0.0924, 0.0989, 0.148, 0.1873).
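The hand calculation above can be cross-checked with a short script. This is only a minimal sketch under the same assumptions as the answer: sigmoid units, an output error term of (r - d) with no d(1 - d) factor, hidden-layer updates that use the value of wdc from before the current update, and momentum applied to the previous weight change. The names `sigmoid`, `train`, `eta`, `alpha`, and the dictionary keys are illustrative choices, not part of the original answer.

```python
import math


def sigmoid(x):
    """Logistic function 1 / (1 + e^(-x))."""
    return 1.0 / (1.0 + math.exp(-x))


def train(examples, eta=0.3, alpha=0.9, init=0.1):
    """Run one incremental update per example; return the weights after each."""
    w = {k: init for k in ("wca", "wcb", "wc0", "wdc", "wd0")}
    prev = {k: 0.0 for k in w}  # previous weight changes, used by momentum
    history = []
    for a, b, r in examples:
        # Forward pass (threshold weights see a constant input of 1).
        oc = sigmoid(w["wc0"] + a * w["wca"] + b * w["wcb"])
        d = sigmoid(w["wd0"] + oc * w["wdc"])

        # Error terms, computed with the weights from before this update.
        err = r - d                            # assumed output error term
        hid = err * w["wdc"] * oc * (1 - oc)   # back-propagated to unit c

        grad = {"wca": hid * a, "wcb": hid * b, "wc0": hid,
                "wdc": err * oc, "wd0": err}

        # Weight change = learning-rate step + momentum * previous change.
        for k in w:
            prev[k] = eta * grad[k] + alpha * prev[k]
            w[k] += prev[k]
        history.append(dict(w))
    return history


history = train([(1, 0, 1), (0, 1, 0)])
for i, weights in enumerate(history, start=1):
    print(i, {k: round(v, 4) for k, v in weights.items()})
```

Because the script carries full floating-point precision while the hand calculation rounds intermediate values, the printed weights should agree with the answer's figures only to within roughly 0.001.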