I am trying to learn more about yield in Python. I found some code that uses yield to implement the Reservoir Sampling algorithm, as follows:

from random import Random

def RandomSelect(knum, rand=None):
    ''' (int, func) -> list

    Reservoir Sampling implementation
    '''
    selection = None
    k_elems_list = []
    count = 0

    if rand is None:
        rand = Random()

    while True:
        item = yield selection
        if len(k_elems_list) < knum:
            k_elems_list.append(item)
        else:
            # Randomly replace elements in the reservoir with a decreasing probability
            r = rand.randint(0, count)
            if r < knum:
                k_elems_list[r] = item
        count += 1
    print(k_elems_list)

To break out of the while loop, I added the following check right after item = yield selection:

        if item == -1: # reached the end of the data, so break
            break

Question 1: Is there a better way to break out of the while loop?

I call the function RandomSelect like this:

myList = [1,2,3,4,5,6,7,8,-1]
cr = RandomSelect(3)
next(cr) # advance to the yield statement, otherwise I can't call send
try:
    for val in myList:
        cr.send(val)
except StopIteration:
    pass
finally:
    del cr

I have to catch the StopIteration exception explicitly.
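
For context, here is a minimal sketch (written for Python 3) of why the exception shows up at all: once the break lets the generator body finish, the send() call that triggered it raises StopIteration in the caller. The name stop_on_sentinel is a hypothetical toy stand-in, not the RandomSelect code above.

```python
def stop_on_sentinel():
    """Toy coroutine (hypothetical): ends as soon as it receives -1."""
    while True:
        item = yield
        if item == -1:
            break  # leaving the loop lets the generator body finish


cr = stop_on_sentinel()
next(cr)      # prime: run up to the first yield
cr.send(1)    # fine, the generator pauses at yield again

try:
    cr.send(-1)   # the body finishes here, so send() raises StopIteration
    raised = False
except StopIteration:
    raised = True
```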

Question 2: Is there a better way to handle the StopIteration in this code?

1 Answer

I think a slightly cleaner way to accomplish this, one that addresses both of your questions, is to explicitly close the generator by calling its close() method. That raises GeneratorExit inside the generator at the paused yield, which the code below catches in order to break out of the loop. Doing so also means a StopIteration doesn't need to be "swallowed". Another benefit is that it's no longer necessary to add the -1 sentinel value at the end of the list.

from random import Random

def RandomSelect(knum, rand=None):
    ''' (int, func) -> list

    Reservoir Sampling implementation
    '''
    selection = None
    k_elems_list = []
    count = 0
    if rand is None:
        rand = Random()   

    while True:
        try:
            item = yield selection
        except GeneratorExit:
            break
        if len(k_elems_list) < knum:
            k_elems_list.append(item)
        else:
            # Randomly replace elements in the reservoir with a decreasing probability
            r = rand.randint(0, count)
            if r < knum:
                k_elems_list[r] = item
        count += 1
    print(k_elems_list)

myList = [1,2,3,4,5,6,7,8]
cr = RandomSelect(3)
next(cr) # advance to the yield statement, otherwise I can't call send
for val in myList:
    cr.send(val)
cr.close()    
del cr 
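
To see what close() does in isolation, here is a minimal sketch written for Python 3. The collector coroutine and the sink list are hypothetical names used only for illustration. It uses try/finally rather than except GeneratorExit, which is an alternative way to run code at shutdown, and it also shows that no StopIteration reaches the caller.

```python
def collector(sink):
    """Toy coroutine (hypothetical): appends every received item to `sink`."""
    try:
        while True:
            item = yield
            sink.append(item)
    finally:
        # Runs when close() raises GeneratorExit at the paused yield
        sink.append("closed")


received = []
cr = collector(received)
next(cr)                 # prime the coroutine
for val in [1, 2, 3]:
    cr.send(val)
cr.close()               # returns normally; nothing to catch in the caller
cr.close()               # closing an already finished generator does nothing
```

Because the GeneratorExit is allowed to propagate out of the generator here, close() simply returns to the caller, so the driver loop needs no try/except.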

A minor additional enhancement (about something you didn't ask about) would be to make it unnecessary to manually advance to the yield statement before calling send(). A good way to accomplish that would be with a decorator function similar to the one named consumer() that David Beazley described in his Generator Tricks For Systems Programmers presentation at PyCon 2008:

def coroutine(func):
    """ Decorator that takes care of starting a coroutine automatically. """
    def start(*args, **kwargs):
        cr = func(*args, **kwargs)
        next(cr)
        return cr
    return start

@coroutine
def RandomSelect(knum, rand=None):
          .
          .
          .
    print(k_elems_list)

myList = [1,2,3,4,5,6,7,8]
cr = RandomSelect(3)
# next(cr) # NO LONGER NECESSARY
for val in myList:
    cr.send(val)
cr.close()    
del cr
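
Putting the pieces together, a self-contained variant for Python 3 might look like the sketch below. It is not the code above verbatim: the name random_select and the caller-supplied reservoir list are assumptions made here so the sample can be inspected after close() instead of being printed, and the fixed seed is only for repeatability.

```python
import functools
from random import Random


def coroutine(func):
    """Decorator that primes a generator so send() works immediately."""
    @functools.wraps(func)
    def start(*args, **kwargs):
        cr = func(*args, **kwargs)
        next(cr)  # advance to the first yield
        return cr
    return start


@coroutine
def random_select(knum, reservoir, rand=None):
    """Reservoir sampling that fills the caller-supplied list `reservoir`."""
    if rand is None:
        rand = Random()
    count = 0  # number of items seen before the current one
    while True:
        item = yield
        if len(reservoir) < knum:
            reservoir.append(item)
        else:
            # randint is inclusive on both ends, so the new item is
            # kept with probability knum / (count + 1)
            r = rand.randint(0, count)
            if r < knum:
                reservoir[r] = item
        count += 1


sample = []
cr = random_select(3, sample, Random(42))  # seed chosen arbitrarily
for val in range(1, 9):
    cr.send(val)
cr.close()  # GeneratorExit propagates out of the loop; close() returns normally
```

Since nothing has to run after the loop in this variant, the generator does not need to catch GeneratorExit at all.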
answered 2014-06-16T08:39:38.527