Howard Chu wrote:
Realistically, at about 2Kbytes per entry, a 32 bit machine could directly address at most ~2^20 entries (~1M) so an IDL that maxes out at 1.86M before losing precision may be perfectly acceptable.
For 64 bit machines it seems 50 bits (1 petabyte) is the largest physical address space currently available. That puts a practical upper bound, assuming 4Kbytes per entry, at around 2^37 entries (~128 billion) directly addressable. It would take 2^34 bytes to handle that directly, 16GBytes, which again is still impractical. So again, we need some kind of range representation in addition to the actual bit vectors.
I think using <base,vector> with fixed-length vectors will be the way forward. When an IDL reaches the maximum length, and a new <base,vector> needs to be added, the entire IDL will be scaled by 2. On most 32 bit installs this will probably never happen; on 64 bit installs it will happen infrequently.
It may be possible/desirable to simply scale one piece of the IDL at a time. Anything that can coalesce two vectors into one will be sufficient to allow the next insertion to proceed. The scale factor can be stored in the unused bits of the <base>...
Gavin.
On 10/19/08, Howard Chuhyc@symas.com wrote:
Seems like I keep poking at this idea again every couple years.
If we use a<base,vector> we can cover a much wider range without losing any precision. For a 32 bit machine, 32 bits per word would cover 5 bits worth of entry IDs. So 27 bits would comprise the base, and the next 32 bits would represent up to 32 entry IDs.
That allows us to represent up to 256 entryIDs in only 288 bits (instead of the 16384 bits we currently need). That would allow us to track about 1.86M entryIDs in the same space we currently use to track only 64K entryIDs.
It seems we would still need some kind of range representation to handle IDLs with more than 1.86M entries.