Question
Piecewise Regression. In this problem, we will design an algorithm for a form of regression. We are given a list of n points x, and we want to output another list y which is close to x but which has very few changepoints. A changepoint in y is an index i such that y_i ≠ y_{i+1}. More formally, for a vector y, the segments S of y are the sets of contiguous indices on which y takes the same value. For example, the vector y = (0, 0, 1, 2, 2, 1) has segments S = {{1,2}, {3}, {4,5}, {6}}. Notice that the segments form a partition of the set {1, ..., n}. We use s ∈ S to denote a segment, and we must have y_i = y_j for i, j ∈ s. We use |s| to denote the number of elements in the segment. Intuitively, if y has few changepoints, then it should have few segments, or relatedly, it should have large segments. This question formulates two optimization problems based on this intuition.

(a) For a fixed list x, define the cost of a vector y with segments S as

cost_1(y) = Σ_{i=1}^{n} (y_i − x_i)² − Σ_{s∈S} |s|

Give an O(n) algorithm that takes as input x and finds the vector y that minimizes this cost.

(b) Redefine the cost:

cost_2(y) = Σ_{i=1}^{n} (y_i − x_i)² − Σ_{s∈S} |s|²

Give an O(n³) algorithm that takes as input x and finds the vector y that minimizes this cost.

Fact. In both cases, if s is a segment in the optimal solution, then the optimal y has y_i = μ_s for all i ∈ s, where μ_s = (1/|s|) Σ_{j∈s} x_j is the mean of x over the segment. This minimizes the squared error, since all y_i must take the same value within the segment. You may use this fact without proof. Assume that computing the mean of k numbers takes O(k) time.

Explanation / Answer
(a) Algorithm:
Key observation: the segments of any vector y form a partition of {1, ..., n}, so Σ_{s∈S} |s| = n no matter which y we choose. The second term of cost_1 is therefore the constant n, and minimizing cost_1(y) reduces to minimizing the squared error Σ_{i=1}^{n} (y_i − x_i)², which equals 0 exactly when y = x.
1. Let sum_of_segment_lengths = n //Σ_{s∈S} |s| = n for every y, since the segments partition {1, ..., n}
2. For each i from 1 to n, set y_i = x_i //drives every term (y_i − x_i)² to 0
3. Return y //cost_1(y) = 0 − n = −n, the minimum possible value
This is a single pass over the input, so the running time is O(n). Sketches of this step and of a dynamic program for part (b) follow below.
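
For concreteness, here is a minimal sketch of the part (a) algorithm. Python is an assumption (the problem fixes no language), and the function name fit_cost1 is illustrative, not from the problem statement.

    def fit_cost1(x):
        """Part (a): minimize cost_1(y) = sum_i (y_i - x_i)^2 - sum_{s in S} |s|.

        The segments of any y partition {1, ..., n}, so the second sum is
        always n; minimizing the cost means zeroing the squared error: y = x.
        """
        return list(x)  # O(n): copy the input; cost_1(y) = 0 - n = -n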
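
(b) Part (b) asks for an O(n³) algorithm under cost_2. A standard approach is dynamic programming on the start index of the last segment, using the Fact to fix y within each candidate segment to that segment's mean. The sketch below is one such implementation under the cost_2 definition above, not necessarily the intended reference solution; the names dp, choice, and piecewise_fit_cost2 are illustrative. There are O(n²) (start, end) pairs and O(n) work per pair to compute the mean and squared error, giving O(n³) total.

    def piecewise_fit_cost2(x):
        """Part (b): minimize cost_2(y) = sum_i (y_i - x_i)^2 - sum_{s in S} |s|^2.

        dp[i]     = minimum cost achievable on the prefix x[0..i-1]
        choice[i] = start index of the last segment in that optimum
        """
        n = len(x)
        dp = [0.0] + [float("inf")] * n
        choice = [0] * (n + 1)
        for i in range(1, n + 1):            # prefix x[0..i-1]
            for j in range(i):               # last segment is x[j..i-1]
                seg = x[j:i]
                mu = sum(seg) / len(seg)     # Fact: optimal y on a segment is its mean
                err = sum((v - mu) ** 2 for v in seg)
                cand = dp[j] + err - len(seg) ** 2  # squared error minus the |s|^2 reward
                if cand < dp[i]:
                    dp[i] = cand
                    choice[i] = j
        # Walk the chosen boundaries backwards to build y.
        y = [0.0] * n
        i = n
        while i > 0:
            j = choice[i]
            mu = sum(x[j:i]) / (i - j)
            y[j:i] = [mu] * (i - j)
            i = j
        return y

Note on correctness: if two adjacent chosen segments happen to receive the same mean, they merge into a single segment of y, which only increases the |s|² reward; so the returned y costs at most dp[n], while dp[n] is at most the optimal cost (the optimal segmentation is one of the candidates). Hence the returned y is optimal.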