parsimony.datasets.simulate.regression

load(size=[[100, 100]], rho=[0.05], delta=0.1, eps=None, density=0.5, snr=100.0, locally_smooth=False)

Generates random data for regression purposes. Builds data with a
regression model of the form

    y = X.beta + e.

Parameters
----------
size : A list or a list of lists. The shapes of the block matrices to
        generate. All blocks must have the same number of rows.

rho : A scalar or a list of the average correlation between off-diagonal
        elements of S.

delta : Baseline noise between groups. Only used if the number of groups is
        greater than one and locally_smooth=False. The baseline noise is
        computed as

            delta * rho_min,

        and you must provide a delta such that 0 <= delta < 1.

eps : Maximum entry-wise random noise. This parameter determines the
        distribution of the noise. The noise is approximately normally
        distributed. If locally_smooth=False the mean is

            delta * rho_min

        and the variance is

            (eps * (1 - max(rho))) ** 2.0 / 10.

        If locally_smooth=True, the mean is zero and the variance is

            (eps * (1.0 - max(rho)) / (1.0 + max(rho))) ** 2.0 / 10.

        You can thus control the noise by this parameter, but note that you
        must have

            0 <= eps < 1.

density : Determines how much of the regression vector is non-zero. If
        density=1.0, the regression vector is dense; density=0.0 would
        mean a zero vector. However, note that you should let

            density * p >= 1,

        where p is the number of columns in size.

snr : The signal-to-noise ratio. The dependent variable is computed as

            y = X.beta + e

        and Var(e) = (||X.beta||² / (n - 1)) / snr.

locally_smooth : If True, uses ToeplitzCorrelation (with "local
        smoothing"); if False, uses ConstantCorrelation.
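
The noise formulas given for delta and eps above can be checked by hand with
a small helper. The following is only a sketch of the documented arithmetic,
not the library's own code: the function name noise_moments is hypothetical,
rho_min is assumed to mean min(rho), and eps is assumed to be a number (the
behaviour for the default eps=None is not described here).

```python
import numpy as np


def noise_moments(rho, delta=0.1, eps=0.5, locally_smooth=False):
    """Return (mean, variance) of the entry-wise noise as documented above.

    Hypothetical helper for illustration; not part of parsimony.
    """
    rho = np.atleast_1d(np.asarray(rho, dtype=float))
    if not 0.0 <= delta < 1.0:
        raise ValueError("delta must satisfy 0 <= delta < 1")
    if not 0.0 <= eps < 1.0:
        raise ValueError("eps must satisfy 0 <= eps < 1")

    rho_max = rho.max()
    if locally_smooth:
        # Documented case locally_smooth=True: zero mean.
        mean = 0.0
        var = (eps * (1.0 - rho_max) / (1.0 + rho_max)) ** 2.0 / 10.0
    else:
        # Documented case locally_smooth=False.
        # Assumption: rho_min is the smallest entry of rho.
        mean = delta * rho.min()
        var = (eps * (1.0 - rho_max)) ** 2.0 / 10.0
    return mean, var


mean, var = noise_moments([0.05, 0.2], delta=0.1, eps=0.5)
print(mean, var)
```

With rho=[0.05, 0.2], delta=0.1 and eps=0.5, the mean is 0.1 * 0.05 and the
variance is (0.5 * 0.8) ** 2 / 10.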

Returns
-------
X : The matrix of independent variables.

y : The dependent variable.

beta : The regression vector.

e : The noise/residual vector.
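
Examples
--------
The relation between the returned X, y, beta and e can be illustrated with
plain NumPy. This is a minimal sketch under stated assumptions, not the
implementation of load: X here is an uncorrelated stand-in for the block
matrices, n is taken to be the number of rows of X, density * p is assumed to
be the number of non-zero entries of beta, and Var(e) is taken as the sample
variance with n - 1 in the denominator.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 100, 100
density, snr = 0.5, 100.0

# Stand-in design matrix (assumption: load builds a correlated X instead).
X = rng.standard_normal((n, p))

# Sparse regression vector with density * p non-zero entries (assumption).
beta = np.zeros((p, 1))
num_nonzero = max(1, int(round(density * p)))
idx = rng.choice(p, size=num_nonzero, replace=False)
beta[idx, 0] = rng.standard_normal(num_nonzero)

# Target noise variance: Var(e) = (||X.beta||^2 / (n - 1)) / snr.
signal = X @ beta
target_var = (np.sum(signal ** 2) / (n - 1)) / snr

# Draw noise and rescale it so its sample variance equals the target.
e = rng.standard_normal((n, 1))
e *= np.sqrt(target_var / np.var(e, ddof=1))

y = signal + e  # y = X.beta + e

print(X.shape, y.shape, beta.shape, e.shape)
```

Rescaling e to hit the target variance exactly is a choice made for this
sketch; the library may simply draw noise with that variance instead.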
