
Module regression


Created on Tue Jun 18 09:22:40 2013

Copyright (c) 2013-2014, CEA/DSV/I2BM/Neurospin. All rights reserved.


Author: Tommy Löfstedt

License: BSD 3-clause.

Functions
 
load(size=[[100, 100]], rho=[0.05], delta=0.1, eps=None, density=0.5, snr=100.0, locally_smooth=False)
Generates random data for regression purposes.
Variables
  __package__ = 'parsimony.datasets.simulate'
Function Details

load(size=[[100, 100]], rho=[0.05], delta=0.1, eps=None, density=0.5, snr=100.0, locally_smooth=False)

Generates random data for regression purposes. Builds data with a
regression model of the form

    y = X.beta + e.

Parameters
----------
size : A list or a list of lists. The shapes of the block matrices to
        generate. All blocks must have the same number of rows.

rho : A scalar or a list of scalars. The average correlation, i.e. the
        average of the off-diagonal elements of the correlation matrix S.

delta : Baseline noise between groups. Only used if the number of groups is
        greater than one and locally_smooth=False. The baseline noise is
        computed as

            delta * rho_min,

        and you must provide a delta such that 0 <= delta < 1.
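
The role of delta can be illustrated with a small block matrix. This is only a sketch under stated assumptions: the helper name block_correlation is hypothetical and not part of parsimony, rho_min is taken to mean min(rho), and the entry-wise noise controlled by eps is left out. The actual ConstantCorrelation implementation may differ.

```python
import numpy as np


def block_correlation(sizes, rho, delta):
    """Noise-free block correlation matrix (hypothetical sketch).

    sizes : number of variables in each group.
    rho   : within-group correlation, one value per group.
    delta : between-group baseline factor, 0 <= delta < 1.
    """
    if not 0.0 <= delta < 1.0:
        raise ValueError("delta must satisfy 0 <= delta < 1")
    p = sum(sizes)
    # Entries between different groups get the baseline delta * rho_min.
    S = np.full((p, p), delta * min(rho))
    start = 0
    for size, r in zip(sizes, rho):
        stop = start + size
        S[start:stop, start:stop] = r  # within-group correlation
        start = stop
    np.fill_diagonal(S, 1.0)
    return S


S = block_correlation(sizes=[2, 3], rho=[0.05, 0.2], delta=0.1)
```

With these values the between-group entries equal 0.1 * 0.05, which stays below every within-group correlation because delta < 1.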

eps : Maximum entry-wise random noise. This parameter determines the
        distribution of the noise. The noise is approximately normally
        distributed. If locally_smooth=False the mean is

            delta * rho_min

        and the variance is

            (eps * (1 - max(rho))) ** 2.0 / 10.

        If locally_smooth=True, the mean is zero and the variance is

            (eps * (1.0 - max(rho)) / (1.0 + max(rho))) ** 2.0 / 10.

        You can thus control the noise by this parameter, but note that you
        must have

            0 <= eps < 1.
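
The two cases above can be written out as a small helper. This sketch only evaluates the formulas stated in this docstring; the function name noise_moments is hypothetical and not part of parsimony, and rho_min is assumed to mean min(rho).

```python
def noise_moments(rho, delta, eps, locally_smooth=False):
    """Approximate mean and variance of the entry-wise noise.

    Hypothetical helper that evaluates the documented formulas only.
    Assumes rho is a list and that rho_min means min(rho).
    """
    if not 0.0 <= delta < 1.0:
        raise ValueError("delta must satisfy 0 <= delta < 1")
    if not 0.0 <= eps < 1.0:
        raise ValueError("eps must satisfy 0 <= eps < 1")
    rho_max = max(rho)
    if locally_smooth:
        mean = 0.0
        var = (eps * (1.0 - rho_max) / (1.0 + rho_max)) ** 2.0 / 10.0
    else:
        mean = delta * min(rho)
        var = (eps * (1.0 - rho_max)) ** 2.0 / 10.0
    return mean, var


mean, var = noise_moments(rho=[0.05, 0.2], delta=0.1, eps=0.5)
```

In both cases the variance shrinks as max(rho) approaches 1, so highly correlated blocks leave less room for noise.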

density : The fraction of the regression vector that is non-zero. If
        density=1.0 the regression vector is dense, and if density=0.0 it
        is the zero vector. Note, however, that you should let

            density * p >= 1,

        where p is the number of columns in size.
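
One way to picture the effect of density is to zero out all but about density * p entries of a dense vector. The sketch below is an assumption about the mechanism, not parsimony's code: the helper name sparsify is hypothetical, and the library may choose which entries to keep in a different way.

```python
import numpy as np


def sparsify(beta, density, rng):
    """Keep about density * p entries of beta and zero out the rest.

    Hypothetical helper; parsimony may select the entries differently.
    """
    p = beta.shape[0]
    if density * p < 1:
        raise ValueError("density * p must be at least 1")
    n_keep = int(round(density * p))
    # Pick the positions that stay non-zero, without repetition.
    keep = rng.choice(p, size=n_keep, replace=False)
    sparse = np.zeros_like(beta)
    sparse[keep] = beta[keep]
    return sparse


rng = np.random.default_rng(0)
beta = rng.uniform(0.5, 1.0, size=(10, 1))  # dense, strictly non-zero
beta_sparse = sparsify(beta, density=0.5, rng=rng)
```

The check density * p >= 1 guarantees that at least one coefficient survives, so the signal X.beta is not identically zero.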

snr : The signal-to-noise ratio. The dependent variable is computed as

            y = X.beta + e

        and Var(e) = (||X.beta||² / (n - 1)) / snr, where n is the number of
        rows in X.
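
The variance formula above can be checked numerically. The sketch below draws noise and rescales it so that its sample variance matches the target; the rescaling step is an assumption made for illustration, and X and beta are arbitrary random stand-ins rather than the output of load.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p, snr = 100, 5, 100.0

# Arbitrary stand-ins for the design matrix and regression vector.
X = rng.standard_normal((n, p))
beta = rng.standard_normal((p, 1))
signal = X @ beta

# Target noise variance: (||X.beta||^2 / (n - 1)) / snr.
target_var = (np.linalg.norm(signal) ** 2 / (n - 1)) / snr

# Draw noise, then rescale it so its sample variance equals the target.
e = rng.standard_normal((n, 1))
e *= np.sqrt(target_var / np.var(e, ddof=1))

y = signal + e
```

A larger snr gives a smaller noise variance, so y tracks X.beta more closely.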

locally_smooth : If True, uses ToeplitzCorrelation (with "local
        smoothing"); if False, uses ConstantCorrelation.

Returns
-------
X : The matrix of independent variables.

y : The dependent variable.

beta : The regression vector.

e : The noise/residual vector.