[Zope3-dev] randid()

Tim Peters tim@zope.com
Sat, 23 Mar 2002 16:49:01 -0500


[Jeremy]
> ...
> But if randid() really just returns a random number, and the caller is
> responsible for guaranteeing uniqueness, then this seems like the
> simplest approach:
>
>    random.randrange(1, 2**30)
>
> This is much simpler than the current randid() function, which has
> three default arguments, two calls to random functions, and a
> multiply.

If we want to cover the whole range of a 32-bit int, the cheapest way is
likely

    return int(random.random() * 4294967296.0 - 2147483648.0)

Since 0 <= random.random() < 1.0, the product P there is 0 <= P < 2**32,
then subtracting leaves a float S with -2**31 <= S < 2**32-2**31 == 2**31,
and int() then leaves that covering the full range of a signed 32-bit int.

Note that it's a good deal cheaper to do the multiply yourself than to make
random.randrange() slog thru a minefield of complicated arguments trying to
figure what to multiply.

> My question about randid() started because that code seemed
> to be doing a lot of work for reasons that weren't very clear.

Indeed it needs to be documented!

A note about the birthday paradox:  for things that run a long time, what
bites isn't so much that you can expect to hit the first duplicate in
O(sqrt(range)) calls, but that the odds of collisions skyrocket as time goes
on.

def f():
    from random import random
    d = {}
    unique = 0
    while 1:
        x = int(random() * 4294967296.0 - 2147483648.0)
        if x in d:
            print "duplicate after", unique, "tries"
            unique = 0
        else:
           d[x] = 1
           unique += 1

f()

Capture file written by TCAP 3.1, TCAP Copyright(c) 1999 Tim Kannel

C:\Code\python\PCbuild>python rr.py
duplicate after 77804 tries  # note -- "close to" 2**16 before 1st dup
duplicate after 38857 tries
duplicate after 146 tries
duplicate after 21449 tries
duplicate after 7026 tries
duplicate after 9763 tries
duplicate after 22246 tries
duplicate after 66748 tries
duplicate after 7877 tries
duplicate after 36785 tries
duplicate after 8255 tries
duplicate after 16111 tries
duplicate after 8972 tries
duplicate after 15895 tries
duplicate after 18251 tries
duplicate after 204 tries
duplicate after 16982 tries
duplicate after 16611 tries
duplicate after 2755 tries
duplicate after 8478 tries
duplicate after 2304 tries
duplicate after 22449 tries
duplicate after 4677 tries
duplicate after 24157 tries
duplicate after 1973 tries
duplicate after 4218 tries
duplicate after 6588 tries
duplicate after 6174 tries
duplicate after 24172 tries
duplicate after 3010 tries
duplicate after 15750 tries
duplicate after 2898 tries
duplicate after 3948 tries
duplicate after 1987 tries
duplicate after 1659 tries
duplicate after 9753 tries
duplicate after 10539 tries
duplicate after 11843 tries
duplicate after 9524 tries
duplicate after 629 tries
duplicate after 3080 tries
duplicate after 4173 tries
duplicate after 1509 tries
duplicate after 2624 tries
duplicate after 12628 tries
duplicate after 9420 tries
duplicate after 7860 tries
duplicate after 753 tries
duplicate after 3012 tries
duplicate after 7645 tries
duplicate after 51422 tries
duplicate after 12761 tries
duplicate after 574 tries
duplicate after 10263 tries
duplicate after 5391 tries
duplicate after 9114 tries
duplicate after 1408 tries
duplicate after 1518 tries
duplicate after 14000 tries
duplicate after 1044 tries
duplicate after 20367 tries
duplicate after 2639 tries
duplicate after 18771 tries
duplicate after 1256 tries
duplicate after 788 tries
duplicate after 2623 tries
duplicate after 8339 tries
duplicate after 3433 tries
duplicate after 997 tries
duplicate after 8828 tries
duplicate after 2139 tries
duplicate after 8463 tries
duplicate after 4150 tries
duplicate after 1330 tries
duplicate after 1054 tries
duplicate after 144 tries
duplicate after 4389 tries
duplicate after 1901 tries
duplicate after 1706 tries
duplicate after 5290 tries
duplicate after 208 tries
duplicate after 869 tries
duplicate after 81 tries
duplicate after 5865 tries
duplicate after 1359 tries
duplicate after 65 tries
duplicate after 3379 tries
duplicate after 1797 tries
duplicate after 112 tries
duplicate after 12069 tries
duplicate after 4265 tries
duplicate after 680 tries
duplicate after 4340 tries
duplicate after 7658 tries
duplicate after 11926 tries
duplicate after 6241 tries
duplicate after 16250 tries
duplicate after 498 tries
duplicate after 19036 tries
duplicate after 3077 tries
duplicate after 9644 tries
duplicate after 571 tries
duplicate after 4477 tries
duplicate after 2303 tries
duplicate after 2655 tries
duplicate after 2587 tries
duplicate after 4408 tries
duplicate after 6024 tries
duplicate after 3216 tries
duplicate after 8498 tries
duplicate after 4326 tries
duplicate after 1746 tries
duplicate after 3880 tries
duplicate after 5247 tries
duplicate after 2083 tries
duplicate after 224 tries
duplicate after 11847 tries
duplicate after 6906 tries
duplicate after 5163 tries
duplicate after 6664 tries
duplicate after 2509 tries
duplicate after 2810 tries
duplicate after 4251 tries
duplicate after 107 tries
duplicate after 2790 tries
duplicate after 8865 tries
duplicate after 4351 tries
duplicate after 3235 tries
duplicate after 634 tries
duplicate after 626 tries
duplicate after 2890 tries
duplicate after 1577 tries
duplicate after 10945 tries
duplicate after 1143 tries
duplicate after 5174 tries
duplicate after 5935 tries
duplicate after 2049 tries
duplicate after 14704 tries
duplicate after 420 tries
duplicate after 3311 tries
duplicate after 1973 tries
duplicate after 591 tries
duplicate after 12857 tries
duplicate after 6018 tries
duplicate after 838 tries
duplicate after 1766 tries
duplicate after 12135 tries
duplicate after 2459 tries
duplicate after 9623 tries
duplicate after 20151 tries
duplicate after 2003 tries
duplicate after 444 tries
Traceback (most recent call last):
  File "rr.py", line 14, in ?
    f()
  File "rr.py", line 12, in f
    unique += 1
KeyboardInterrupt