1
2 """
3 Created on Mon Sep 9 14:16:37 2013
4
5 Copyright (c) 2013-2014, CEA/DSV/I2BM/Neurospin. All rights reserved.
6
7 @author: Tommy Löfstedt
8 @email: tommy.loefstedt@cea.fr
9 @license: BSD 3-clause.
10 """
11 import numpy as np
12
13 from grad import grad_l1
14 from grad import grad_l2_squared
15 from grad import grad_tvmu
16 from utils import bisection_method
17
18 __all__ = ['load']
19
20
def load(l, k, g, beta, M, e, A, mu, snr=None, intercept=False):
    """Returns data generated such that we know the exact solution.

    The data generated by this function is fit to the Linear regression + L1 +
    L2 + Smoothed total variation function, i.e.:

        f(b) = (1 / 2).|Xb - y|² + l.|b|_1 + (k / 2).|b|² + g.TVmu(b),

    where |.|_1 is the L1 norm, |.|² is the squared L2 norm and TVmu is the
    smoothed total variation penalty.

    Parameters
    ----------
    l : Non-negative float. The L1 regularisation parameter.

    k : Non-negative float. The L2 regularisation parameter.

    g : Non-negative float. The total variation regularisation parameter.

    beta : Numpy array (p-by-1). The regression vector to generate data from.

    M : Numpy array (n-by-p). The matrix to use when building data. This
            matrix carries the desired correlation structure of the generated
            data. The generated data will be a column-scaled version of this
            matrix.

    e : Numpy array (n-by-1). The error vector e = Xb - y. This vector carries
            the desired distribution of the residual.

    A : Numpy or (usually) scipy.sparse array (K-by-p). The linear operator
            for the Nesterov function.

    mu : Non-negative float. The Nesterov smoothing regularisation parameter.

    snr : Positive float. Signal-to-noise ratio between model and residual.

    intercept : Boolean. Whether or not to include an intercept variable. This
            variable is not penalised. Note that if intercept is True, then e
            will be centred.

    Returns
    -------
    X : Numpy array (n-by-p). The generated X matrix.

    y : Numpy array (n-by-1). The generated y vector.

    beta : Numpy array (p-by-1). The regression vector with the correct snr.
    """
    l = float(l)
    k = float(k)
    g = float(g)

    if intercept:
        # Centre the residual so the unpenalised intercept can absorb its
        # mean.
        e = e - np.mean(e)

    if snr is not None:
        def f(x):
            # Difference between the achieved and the requested SNR for the
            # candidate scaling x of beta. The bisection searches for a root.
            X, y = _generate(l, k, g, x * beta, M, e, A, mu, intercept)

            return (np.linalg.norm(np.dot(X, x * beta))
                    / np.linalg.norm(e)) - snr

        # The root (the scaling that attains the requested SNR) lies in
        # [0, sqrt(snr)] — TODO confirm against bisection_method's contract.
        snr = bisection_method(f, low=0.0, high=np.sqrt(snr), maxiter=30)

        beta = beta * snr

    X, y = _generate(l, k, g, beta, M, e, A, mu, intercept)

    return X, y, beta
95
96
def _generate(l, k, g, beta, M, e, A, mu, intercept):
    """Build (X, y) so that beta satisfies the zero-(sub)gradient condition.

    Each column of M is rescaled so that X'(Xb - y) cancels the combined
    penalty gradient, i.e. X'e = -(l.gradL1 + k.gradL2 + g.gradTVmu), making
    beta a solution of the smoothed objective. When intercept is True the
    first variable is unpenalised and its column is left as all ones.

    Parameters
    ----------
    l, k, g : Non-negative floats. L1, L2 and TV regularisation parameters.

    beta : Numpy array (p-by-1). The target regression vector.

    M : Numpy array (n-by-p). Carries the desired correlation structure.

    e : Numpy array (n-by-1). The desired residual vector, e = Xb - y.

    A : Numpy or scipy.sparse array (K-by-p). Linear operator for TVmu.

    mu : Non-negative float. Nesterov smoothing parameter.

    intercept : Boolean. Whether the first variable is an intercept.

    Returns
    -------
    X : Numpy array (n-by-p). The generated data matrix.

    y : Numpy array (n-by-1). The generated response, y = Xb - e.
    """
    p = beta.shape[0]

    # The intercept (first) variable is not penalised, so the penalty
    # gradients are computed on the remaining variables only.
    if intercept:
        gradL1 = grad_l1(beta[1:, :])
        gradL2 = grad_l2_squared(beta[1:, :])
        gradTVmu = grad_tvmu(beta[1:, :], A, mu)
    else:
        gradL1 = grad_l1(beta)
        gradL2 = grad_l2_squared(beta)
        gradTVmu = grad_tvmu(beta, A, mu)

    # Per-column scaling factors: alpha_j = -(penalty gradient)_j / (M'e)_j.
    alpha = -(l * gradL1 + k * gradL2 + g * gradTVmu)
    Mte = np.dot(M.T, e)
    if intercept:
        alpha = np.divide(alpha, Mte[1:, :])
    else:
        alpha = np.divide(alpha, Mte)

    # Column-scale M by alpha via broadcasting (replaces a per-column Python
    # loop, and avoids Python-2-only xrange). With an intercept the first
    # column stays all ones.
    X = np.ones(M.shape)
    if intercept:
        X[:, 1:p] = M[:, 1:p] * alpha[:, 0]
    else:
        X[:, :p] = M[:, :p] * alpha[:, 0]

    y = np.dot(X, beta) - e

    return X, y
128