randsampleFS

randsampleFS generates a random sample of k elements from the integers 1 to n (k<=n)


Syntax

y=randsampleFS(n,k)
y=randsampleFS(n,k,method)

Description

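y = randsampleFS(n, k) returns a random sample of k elements from the integers 1 to n, using the default method.

y = randsampleFS(n, k, method) draws the sample with the approach specified by method, which is either a scalar method code or a vector of weights.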

Examples


randsampleFS with default options.

default method (1) is used.

randsampleFS(1000,10)

ans =

   772    20   141   730   541   714   229   674    12   529

randsampleFS with optional argument set to method (2).

method = 2;
randsampleFS(100,10,method)

ans =

    17    27    37    47    57    67    77    87    97     7

Related Examples


randsampleFS with optional arguments set to method (3).

method = 3;
% Here the sample size (10) is large relative to the population size (100), so repetitions are likely.
randsampleFS(100,10,method)

randsampleFS Weighted Sampling Without Replacement.

Extract k=10 numbers from the integers in [-1000, -900] with gamma-distributed weights.

population = -1000:1:-900;
n = numel(population);
wgts = sort(random('gamma',0.3,2,n,1),'descend');
k=10;
y = randsampleFS(n,k,wgts);
sample  = population(y);
plot(wgts,'.r')
hold on;
text(y,wgts(y),'X');
title('Weight distribution with the extracted numbers superimposed')

Input Arguments


`n` — Size of the population: the sample is selected from the integers 1 to n. Scalar, a positive integer.

Data Types: single|double

`k` — The number of elements to be selected. Non-negative integer, with k<=n.

Data Types: single|double

Optional Arguments

`method` — Sampling methods. Scalar or vector.

Method used to extract the subsets. See the More About section for details.

Default is method = 0.

- Scalar from 0 to 3 determining the method used to extract (without replacement) the random sample.

- Vector of weights: in such a case, a weighted sampling without replacement algorithm is applied using that vector of weights.

Example: randsampleFS(100,10,2)

Data Types: single|double

Output Arguments


`y` — A column vector of k values sampled at random from the integers 1:n. For methods 0, 1, 2 and weighted sampling the elements extracted are unique; for method 3 (included for historical reasons) there is no guarantee that the elements extracted are unique.

Data Types: single|double

More About


Additional Details

Method 0 uses the MATLAB function randperm. In old MATLAB releases randperm was slower than the FSDA function shuffling, which is used in method 1 (for example, in R2009a - MATLAB 7.8 - randperm was at least 50 slower).

If method=1 the approach depends on the population and sample sizes:

- if $n < 1000$ and $k < n/(10 + 0.007n)$, that is, if the population is relatively small and the desired sample is small compared to the population, we repeatedly sample with replacement until there are k unique values;

- otherwise, we do a random permutation of the population and return the first k elements.

The threshold $k < n/(10 + 0.007n)$ was determined by simulation under MATLAB R2016b. Previously, the threshold was $n < 4*k$.
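The two branches of method 1 can be sketched as follows. This is a minimal illustration written from the description above, not the FSDA source: the variable names and the use of randi, unique and randperm are assumptions, and the values of n and k are arbitrary.

```matlab
n = 500;   % population size (illustrative value)
k = 5;     % sample size (illustrative value)

if n < 1000 && k < n/(10 + 0.007*n)
    % Small sample from a small population: draw with replacement
    % and keep drawing until k distinct values have been collected.
    y = zeros(1, 0);
    while numel(y) < k
        draw = randi(n, 1, k - numel(y));   % candidates, may repeat
        y = unique([y, draw], 'stable');    % drop duplicates, keep order
    end
else
    % Otherwise: permute the whole population and keep the first k.
    p = randperm(n);
    y = p(1:k);
end
```

With n = 500 the threshold is 500/(10 + 3.5), roughly 37, so k = 5 takes the first branch; setting k = 100 would take the permutation branch instead.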

If method=2 systematic sampling is used, where the starting point is random and the step is also random.
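A minimal sketch of systematic sampling with a random start and a random step is given below. How randsampleFS draws the step is not stated here, so the sketch assumes a step drawn uniformly from 1 to floor(n/k) and indices that wrap around past n; under that assumption the k indices are always distinct, because the total span step*(k-1) is smaller than n.

```matlab
n = 100;   % population size (illustrative value)
k = 10;    % sample size (illustrative value)

step  = randi(floor(n/k));   % assumed rule: random step in 1..floor(n/k)
start = randi(n);            % random starting point in 1..n

% Walk through the population in equal steps, wrapping around past n.
y = mod(start - 1 + step*(0:k-1), n) + 1;
```

Only two random numbers are drawn regardless of k, which is why this approach is cheap.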

If method=3 random sampling is based on the old but well known Linear Congruential Generator (LCG) method. In this case there is no guarantee to get unique numbers. The method is included for historical reasons.
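To illustrate why uniqueness is not guaranteed, here is a small sketch of a linear congruential generator, $x_{i+1} = (a x_i + c) \bmod m$, whose output is mapped onto 1..n. The constants a, c, m and the seed are common textbook choices used only for illustration; they are assumptions and not necessarily the values used by randsampleFS.

```matlab
n = 100;   % population size (illustrative value)
k = 10;    % sample size (illustrative value)

% Illustrative LCG constants (assumed, not taken from FSDA).
a = 1664525;
c = 1013904223;
m = 2^32;

x = 12345;          % seed (arbitrary)
y = zeros(k, 1);
for i = 1:k
    x = mod(a*x + c, m);        % next state, an integer in 0..m-1
    y(i) = floor(n*x/m) + 1;    % map the state onto 1..n
end
```

Nothing in the loop checks earlier draws, so two states can map to the same integer and y may contain repeated values.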

If method is a vector of n weights, then weighted sampling without replacement is applied. Our implementation follows Efraimidis and Spirakis (2006). The MATLAB function datasample follows Wong and Easton (1980), which is also quite fast; note, however, that datasample may be very slow if applied repeatedly, because of the large amount of time spent on option checking.
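The core idea of Efraimidis and Spirakis (2006) can be sketched in a few lines: each item i receives the key $u_i^{1/w_i}$, with $u_i$ uniform on (0,1), and the k items with the largest keys form the sample. The code below is a minimal illustration of that idea, not the FSDA implementation; the weights and variable names are assumptions.

```matlab
n = 101;        % population size (illustrative value)
k = 10;         % sample size (illustrative value)
w = (1:n)';     % illustrative positive weights, larger for larger indices

u    = rand(n, 1);       % one uniform draw per item
keys = u.^(1./w);        % key u_i^(1/w_i): heavier items tend to get larger keys

[~, order] = sort(keys, 'descend');
y = order(1:k);          % indices of the k largest keys
```

Because each index appears once in the sorted order, the extracted elements are unique by construction.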

Remark on computational performance. Method 2 (systematic sampling) is by far the fastest for any practical population size $n$. For example, for $n \approx 10^6$ method 2 is two orders of magnitude faster than method 1. With recent MATLAB releases (after R2011b) method 0 (which uses the compiled MATLAB function randperm) has comparable performance, at least for reasonably small $k$. In releases before R2012a, randperm was considerably slower.

References

Fisher, R.A. and Yates, F. (1948), "Statistical tables for biological, agricultural and medical research (3rd ed.)", Oliver & Boyd, pp. 26-27. [For Method 1]

Cochran, W.G. (1977), "Sampling techniques (Third ed.)", Wiley. [For Method 2]

Knuth, D.E. (1997), "The Art of Computer Programming, Volume 2: Seminumerical Algorithms, Third Edition" Addison-Wesley, pp. 10-26. [For Method 3. For details see: Section 3.2.1: The Linear Congruential Method]

Efraimidis, P.S. and Spirakis, P.G. (2006), Weighted random sampling with a reservoir, "Information Processing Letters", Vol. 97, pp. 181-185. [For Weighted Sampling Without Replacement]

Wong, C.K. and Easton, M.C. (1980), An Efficient Method for Weighted Sampling Without Replacement, "SIAM Journal on Computing", Vol. 9, pp. 111-113. [For the algorithm used by MATLAB function datasample]
