[RC5] packing fields

bwilson at fers.com
Mon Jan 15 11:32:54 EST 2001


Hi, Tom.

The "parallel index/sort file" you describe is the "index" capability in
Sybase.  We're already using it.

It is entirely possible to index variable-length fields in SQL.  Numeric
fields are not variable length, however, so that's a moot point.
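
To make the point about indexing variable-length fields concrete, here is a
minimal sketch. It uses SQLite through Python purely as a stand-in, since the
stats database itself runs on Sybase; the table name, column names, and index
name are hypothetical and are not taken from the real stats schema.

```python
import sqlite3

# SQLite is only a stand-in here; the real stats database runs on Sybase.
# Table, column, and index names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE participant ("
    " id INTEGER PRIMARY KEY,"
    " email VARCHAR(64) NOT NULL,"  # variable-length field
    " blocks INTEGER NOT NULL)"
)
conn.executemany(
    "INSERT INTO participant (email, blocks) VALUES (?, ?)",
    [(f"user{i}@example.com", i % 50) for i in range(1000)],
)

# An ordinary index on the variable-length column.
conn.execute("CREATE INDEX idx_participant_email ON participant (email)")

# Ask the query planner how it would run a lookup by email.
plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT id, blocks FROM participant WHERE email = ?",
    ("user42@example.com",),
).fetchall()
plan_text = " ".join(row[-1] for row in plan)  # last column holds the detail text
print(plan_text)
```

The printed plan should name idx_participant_email, which indicates that the
lookup goes through the index rather than scanning the whole table.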

The "packing data into binary fields" method I mentioned was intended to
be a ridiculous example.  It would probably slow down retrieval by a
factor of 100 or more, and would certainly at least quadruple the
complexity of coding for it, for perhaps a 10% savings in disk space.  I
can't fathom an indexing scheme that could overcome such drawbacks.
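
A small sketch of why packed binary fields are awkward to query, again using
SQLite through Python as a stand-in for Sybase. The two tables, the
two-field record layout, and the helper functions are all hypothetical; the
sketch shows only the shape of the problem and makes no claim about the actual
slowdown.

```python
import sqlite3
import struct

# Hypothetical schema: the same data stored once as ordinary columns and
# once packed into a single binary field.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE plain (id INTEGER PRIMARY KEY,"
    " team_id INTEGER NOT NULL, blocks INTEGER NOT NULL)"
)
conn.execute("CREATE INDEX idx_plain_team ON plain (team_id)")
conn.execute("CREATE TABLE packed (id INTEGER PRIMARY KEY, payload BLOB NOT NULL)")

rows = [(i % 20, i) for i in range(1000)]  # (team_id, blocks)
conn.executemany("INSERT INTO plain (team_id, blocks) VALUES (?, ?)", rows)
conn.executemany(
    "INSERT INTO packed (payload) VALUES (?)",
    [(struct.pack("<HI", team, blocks),) for team, blocks in rows],
)


def team_total_plain(team_id):
    # The database can filter on team_id itself and may use the index.
    cur = conn.execute("SELECT SUM(blocks) FROM plain WHERE team_id = ?", (team_id,))
    return cur.fetchone()[0]


def team_total_packed(team_id):
    # The database cannot see inside the blob, so every row is fetched
    # and unpacked in application code before it can be filtered.
    total = 0
    for (payload,) in conn.execute("SELECT payload FROM packed"):
        team, blocks = struct.unpack("<HI", payload)
        if team == team_id:
            total += blocks
    return total


print(team_total_plain(7), team_total_packed(7))
```

Both functions return the same total, but the packed version needs extra
packing and unpacking code and has to read every row to answer a question
about one team.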

Sybase SQL is more than capable of dealing with million- or billion-record
databases when the database is normalized and properly tuned to take
advantage of SQL's strengths.  We've made some big improvements, but
haven't moved RC5-64 into that framework.  Once we have, we can talk about
making the next big design improvements and taking it a step higher.  I
expect that the new framework will allow us to cut the downtime from hours
to minutes.  We have ideas in mind that could cut it to zero, but hardware
has been a major limitation.

We (the d.net stats developers) are in unanimous agreement that SQL is an
appropriate tool for our stats, based on existing skills, existing code,
platform requirements, our throughput needs, and our directions for the
future.

Just to reiterate the plan, here's the near-term agenda for stats:

1)  Bandwidth becomes available at the UD office.
2)  New stats server is brought online.
3)  RC5-64 data is migrated from old-stats to new-stats.
4)  New features and design improvements are considered for implementation.

Somewhere in there, we also have to allow for the possibility of problems,
or of starting OGR-26.

There are also some minor improvements that are still being ironed out
today, and some fixes that are more urgently needed than others (removing
team_id vulnerability to the Sybase identity bug, making retires and
team-joins stats-time instead of real-time).  We're also beginning to plan
implementation of opcodeauth infrastructure in Sybase.

__
Bruce Wilson, Manager, FERS Business Services
bwilson at fers.com, 312.245.1750, http://www.fers.com/
PGP KeyID: 5430B995, http://www.lasthome.net/~bwilson/
"A difference that makes no difference is no difference." --Spock




tomdv at datatx.com
Sent by: owner-rc5 at lists.distributed.net
2001-01-14 04:57
Please respond to rc5


        To:     rc5 at lists.distributed.net
        cc:
        Subject:        Re: [RC5] packing fields



Dear Bruce,

okay, if I understand you correctly: if you use binary fields with
variable record length, SQL doesn't allow indexing or sorting. However,
you could create a parallel index/sort file. On the IBM AS/400, these are
called 'logical' files, while the data is stored in the 'physical' file.
Agreed, this may take a bit more disk space, but the index file itself
contains only the access path to the data in the physical file. It does,
however, speed things up enormously (a factor of 10 or more), especially
when dealing with files with some millions of records inside, which you
guys probably already have.
The IBM AS/400 was derived from their mainframes, so it has all kinds of
tricks for handling databases with huge files and different ways of
accessing them.
Maybe there are some mainframe wizards out there who could give you some
of these tricks to help out? I mean, the longer the project runs, the
bigger the data will get, and the longer it will take to calculate
whatever averages and so on. You already had to take the stats server
offline for calculation purposes; what's it going to be next time?
And just upgrading your machine all the time is the horror of all
EDP managers.
Just a thought,
Tom

--
To unsubscribe, send 'unsubscribe rc5' to majordomo at lists.distributed.net
rc5-digest subscribers replace rc5 with rc5-digest







