picos.expressions.samples
Implements Samples.
Classes
- class picos.expressions.samples.Samples(samples=None, forced_original_shape=None, **kwargs)
Bases: object
A collection of data points.
- Example
>>> from picos.expressions import Samples
>>> # Load the column-major vectorization of six matrices.
>>> data = [[[1*i, 3*i],
...          [2*i, 4*i]] for i in range(1, 7)]
>>> S = Samples(data)
>>> S
<Samples: (6 4-dimensional samples)>
>>> [S.num, S.dim, S.original_shape] # Metadata.
[6, 4, (2, 2)]
>>> S.matrix # All samples as the columns of one matrix.
<4×6 Real Constant: [4×6]>
>>> print(S.matrix)
[ 1.00e+00 2.00e+00 3.00e+00 4.00e+00 5.00e+00 6.00e+00]
[ 2.00e+00 4.00e+00 6.00e+00 8.00e+00 1.00e+01 1.20e+01]
[ 3.00e+00 6.00e+00 9.00e+00 1.20e+01 1.50e+01 1.80e+01]
[ 4.00e+00 8.00e+00 1.20e+01 1.60e+01 2.00e+01 2.40e+01]
>>> print(S[0].T) # The first sample (transposed for brevity).
[ 1.00e+00 2.00e+00 3.00e+00 4.00e+00]
>>> print(S.mean.T) # The sample mean (transposed for brevity).
[ 3.50e+00 7.00e+00 1.05e+01 1.40e+01]
>>> print(S.covariance) # The sample covariance matrix.
[ 3.50e+00 7.00e+00 1.05e+01 1.40e+01]
[ 7.00e+00 1.40e+01 2.10e+01 2.80e+01]
[ 1.05e+01 2.10e+01 3.15e+01 4.20e+01]
[ 1.40e+01 2.80e+01 4.20e+01 5.60e+01]
>>> print(S.original[0]) # The first sample in its original shape.
[ 1.00e+00 3.00e+00]
[ 2.00e+00 4.00e+00]
>>> U = S.select([0, 2, 4]) # Select a subset of samples by indices.
>>> print(U.matrix)
[ 1.00e+00 3.00e+00 5.00e+00]
[ 2.00e+00 6.00e+00 1.00e+01]
[ 3.00e+00 9.00e+00 1.50e+01]
[ 4.00e+00 1.20e+01 2.00e+01]
>>> T, V = S.partition() # Split into training and validation samples.
>>> print(T.matrix)
[ 1.00e+00 2.00e+00 3.00e+00]
[ 2.00e+00 4.00e+00 6.00e+00]
[ 3.00e+00 6.00e+00 9.00e+00]
[ 4.00e+00 8.00e+00 1.20e+01]
>>> print(V.matrix)
[ 4.00e+00 5.00e+00 6.00e+00]
[ 8.00e+00 1.00e+01 1.20e+01]
[ 1.20e+01 1.50e+01 1.80e+01]
[ 1.60e+01 2.00e+01 2.40e+01]
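To make the numbers in the example easier to check by hand, here is a minimal pure-Python sketch of column-major vectorization followed by the sample mean and sample covariance. It does not use PICOS. The helper names `vec`, `sample_mean`, and `sample_covariance` are hypothetical, and the n - 1 denominator is an assumption inferred from the covariance values printed in the example.

```python
def vec(matrix):
    """Column-major vectorization of a matrix given as a list of rows."""
    rows, cols = len(matrix), len(matrix[0])
    return [matrix[r][c] for c in range(cols) for r in range(rows)]


def sample_mean(samples):
    """Entry-wise mean of equally long sample vectors."""
    n = len(samples)
    return [sum(entries) / n for entries in zip(*samples)]


def sample_covariance(samples):
    """Sample covariance matrix, assuming an n - 1 denominator."""
    n = len(samples)
    mu = sample_mean(samples)
    centered = [[x - m for x, m in zip(s, mu)] for s in samples]
    dim = len(mu)
    return [
        [sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(dim)]
        for i in range(dim)
    ]


# Same data as in the example: six 2x2 matrices given row by row.
data = [[[1 * i, 3 * i], [2 * i, 4 * i]] for i in range(1, 7)]
samples = [vec(m) for m in data]

print(samples[0])                      # [1, 2, 3, 4]
print(sample_mean(samples))            # [3.5, 7.0, 10.5, 14.0]
print(sample_covariance(samples)[0])   # [3.5, 7.0, 10.5, 14.0]
```

The first sample, the mean, and the first covariance row agree with the values that `S[0]`, `S.mean`, and `S.covariance` print in the example.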
- __init__(samples, forced_original_shape=None, always_copy=True)
Load a number of data points (samples).
- Parameters
samples –
Any of the following:
- A tuple or list of constants, each of which denotes a sample vector. Matrices are vectorized but their original_shape is stored and may be used by PICOS internally.
- A constant row or column vector whose entries denote scalar samples.
- A constant matrix whose columns denote the samples.
- Another Samples instance. If possible, it is returned as is (Samples instances are immutable), otherwise a shallow copy with the necessary modifications is returned instead.
In any case, constants may be given as constant numeric data values (anything recognized by load_data) or as constant PICOS expressions.
forced_original_shape – Overwrites original_shape with the given shape.
always_copy (bool) – If this is False, then data that is provided in the form of CVXOPT types is not copied but referenced if possible. This can speed up instance creation but will introduce inconsistencies if the original data is modified. Note that this argument has no impact if the samples argument already is a Samples instance; in this case data is never copied.
- static __new__(cls, samples=None, forced_original_shape=None, **kwargs)
Prepare a Samples instance.
- kfold(k)
Perform k-fold cross-validation (without shuffling).
If random shuffling is desired, write S.shuffled().kfold(k) where S is your Samples instance. To make the shuffling reproducible, see shuffled.
- Returns list(tuple)
A list of k training set and validation set pairs.
Warning
If the number of samples n is not a multiple of k, then the last n mod k samples will appear in every training set but in no validation set.
- Example
>>> from picos.expressions import Samples
>>> n, k = 7, 3
>>> S = Samples(range(n))
>>> for i, (T, V) in enumerate(S.kfold(k)):
...     print("Partition {}:\nT = {}V = {}"
...         .format(i + 1, T.matrix, V.matrix))
Partition 1:
T = [ 2.00e+00 3.00e+00 4.00e+00 5.00e+00 6.00e+00]
V = [ 0.00e+00 1.00e+00]
Partition 2:
T = [ 0.00e+00 1.00e+00 4.00e+00 5.00e+00 6.00e+00]
V = [ 2.00e+00 3.00e+00]
Partition 3:
T = [ 0.00e+00 1.00e+00 2.00e+00 3.00e+00 6.00e+00]
V = [ 4.00e+00 5.00e+00]
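The splitting rule, including the leftover samples mentioned in the warning, can be sketched on plain indices. This is an illustration that does not use PICOS. The function name `kfold_indices` and its argument check are hypothetical, and the rule (consecutive validation folds of size n // k) is an assumption inferred from the example output rather than taken from the implementation.

```python
def kfold_indices(n, k):
    """Return k (training, validation) index pairs without shuffling."""
    if not 1 <= k <= n:
        # Assumption of this sketch only; the library may validate differently.
        raise ValueError("k must be between 1 and n")
    fold = n // k  # Validation fold size; the last n % k indices are left over.
    pairs = []
    for i in range(k):
        validation = list(range(i * fold, (i + 1) * fold))
        training = [j for j in range(n) if j not in validation]
        pairs.append((training, validation))
    return pairs


for number, (training, validation) in enumerate(kfold_indices(7, 3), start=1):
    print("Partition {}: T = {} V = {}".format(number, training, validation))
```

With n = 7 and k = 3, index 6 ends up in every training list and in no validation list, which matches the partitions printed in the example.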
- select(indices)
Return a new Samples instance with only selected samples.
- Parameters
indices – The indices of the samples to select.
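As a rough illustration of selection by index, the sketch below picks samples out of a plain tuple and returns a new tuple, leaving the input untouched. It does not use PICOS, and the helper name `select_samples` is hypothetical.

```python
def select_samples(samples, indices):
    """Return a new tuple holding only the samples at the given indices."""
    return tuple(samples[i] for i in indices)


# Six 4-dimensional samples, the i-th being i * (1, 2, 3, 4).
samples = tuple((1 * i, 2 * i, 3 * i, 4 * i) for i in range(1, 7))
subset = select_samples(samples, [0, 2, 4])

print(subset)        # ((1, 2, 3, 4), (3, 6, 9, 12), (5, 10, 15, 20))
print(len(samples))  # 6, the input is unchanged
```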
- shuffled(rng=None)
Return a randomly shuffled instance of the samples.
- Parameters
rng – A function that generates a random float in [0, 1). Defaults to whatever random.shuffle defaults to.
- Example
>>> from picos.expressions import Samples
>>> S = Samples(range(6))
>>> print(S.matrix)
[ 0.00e+00 1.00e+00 2.00e+00 3.00e+00 4.00e+00 5.00e+00]
>>> rng = lambda: 0.5 # Fake RNG for reproducibility.
>>> print(S.shuffled(rng).matrix)
[ 0.00e+00 5.00e+00 1.00e+00 4.00e+00 2.00e+00 3.00e+00]
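To see how a function returning floats in [0, 1) can drive a shuffle, the following is a minimal Fisher-Yates sketch in pure Python. The helper name `shuffled_with_rng` is hypothetical, and whether PICOS shuffles in exactly this way is an assumption. The sketch does, however, reproduce the order shown in the example for the constant RNG.

```python
def shuffled_with_rng(items, rng):
    """Return a shuffled copy of items, driven by rng() in [0, 1)."""
    result = list(items)
    # Fisher-Yates: walk from the last position down to the second one.
    for i in reversed(range(1, len(result))):
        j = int(rng() * (i + 1))  # Index in 0..i chosen by the RNG.
        result[i], result[j] = result[j], result[i]
    return result


print(shuffled_with_rng(range(6), lambda: 0.5))  # [0, 5, 1, 4, 2, 3]
```

Passing a seeded generator's method, for instance `random.Random(42).random`, in place of the constant lambda gives a shuffle that is both reproducible and non-trivial.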
- property dim
Sample dimension.
- property matrix
A matrix whose columns are the samples.
- property num
Number of samples.
- property original_shape
Original shape of the samples before vectorization.