leftretail.blogg.se

Python random data generator













Do you know if there is a way to get Python's random.sample to work with a generator object? I am trying to get a random sample from a very large text corpus. The problem is that random.sample() raises the following error:

    TypeError: object of type 'generator' has no len()

I was thinking that maybe there is some way of doing this with something from itertools, but I couldn't find anything with a bit of searching.

A somewhat made-up example:

    import random

    def list_item(ls):
        for item in ls:
            yield item

    random.sample(list_item(range(100)), 20)

As per MartinPieters's request, I did some timing of the three currently proposed methods. It turns out that array.insert (the results.insert call below) has a serious drawback when it comes to large sample sizes.

The code I used to time the methods:

    from heapq import nlargest
    import random
    import timeit

    def iterSample(iterable, samplesize):
        results = []
        for i, v in enumerate(iterable):
            r = random.randint(0, i)
            if r < samplesize:
                if i < samplesize:
                    results.insert(r, v)  # add first samplesize items in random order
                else:
                    results[r] = v  # at a decreasing rate, replace random items
        if len(results) < samplesize:
            raise ValueError("Sample larger than population.")
        return results

    def sample_from_iterable(iterable, samplesize):
        return (x for _, x in nlargest(samplesize, ((random.random(), x) for x in iterable)))

    def iter_sample_fast(iterable, samplesize):
        results = []
        iterator = iter(iterable)
        try:
            # Fill in the first samplesize elements
            for _ in range(samplesize):
                results.append(next(iterator))
        except StopIteration:
            raise ValueError("Sample larger than population.")
        random.shuffle(results)  # Randomize their positions
        for i, v in enumerate(iterator, samplesize):
            r = random.randint(0, i)
            if r < samplesize:
                results[r] = v  # at a decreasing rate, replace random items
        return results

    if __name__ == '__main__':
        pop_sizes = [...]  # values truncated in the source
        k_sizes = [...]    # values truncated in the source

        for pop_size, k_size in zip(pop_sizes, k_sizes):
            pop = range(pop_size)
            t1 = timeit.Timer(stmt='iterSample(pop, %i)' % k_size, setup='from __main__ import iterSample, pop')
            t2 = timeit.Timer(stmt='sample_from_iterable(pop, %i)' % k_size, setup='from __main__ import sample_from_iterable, pop')
            t3 = timeit.Timer(stmt='iter_sample_fast(pop, %i)' % k_size, setup='from __main__ import iter_sample_fast, pop')
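One way to sample from a generator without ever calling len() on it is reservoir sampling, and itertools.islice can handle the initial fill. The following is only a minimal sketch under stated assumptions: the names reservoir_sample and fake_corpus and the rng parameter are made up for illustration and are not part of the random module or of the code in the question.

```python
import itertools
import random


def reservoir_sample(iterable, k, rng=random):
    """Return k items drawn uniformly at random from any iterable.

    Hypothetical helper: a plain reservoir sampler (Algorithm R).
    """
    iterator = iter(iterable)
    # Fill the reservoir with the first k items; islice never needs len().
    reservoir = list(itertools.islice(iterator, k))
    if len(reservoir) < k:
        raise ValueError("Sample larger than population.")
    # The item at 0-based index i replaces a random slot with probability k / (i + 1).
    for i, item in enumerate(iterator, start=k):
        j = rng.randint(0, i)  # randint is inclusive on both ends
        if j < k:
            reservoir[j] = item
    return reservoir


def fake_corpus(n_lines):
    """Hypothetical stand-in for a large text corpus read line by line."""
    for n in range(n_lines):
        yield f"line {n}"


# Seeded generator so the run is repeatable.
sample = reservoir_sample(fake_corpus(100_000), 20, rng=random.Random(0))
print(len(sample))
```

The generator is consumed exactly once and only k items are held in memory at a time. The result is a uniform random subset, but its order is not shuffled; call random.shuffle on it if the order matters.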













