The implementation of buzhug was made with a main objective : make all operations, and especially selections, as fast as possible. When a choice had to be made between different options, I took the one that optimized the speed, even if this meant more files to manage and more space used on disk
This is why a base is not kept in a single file, but in a directory with many files. There are two sorts of files :
- those who hold the data, the "field files" : one file per field
- those who are used internally by buzhug
The field files
In the base definition, a field is defined by its name and its type. A field file is created for each of the fields, and two other field files are added by buzhug :
- one for the record identifiers : field __id__
- one for the record version number (used for concurrency control) : field __version__
The field files store information in a different way depending on the type of the field. Field types can be split into two categories :
- fixed-length types : int (in the range -2**30 +1, 2**30-1), float, links to other bases (it uses an integer)
- variable-length types : str, unicode ; although date and datetime are fixed-length types, they are stored as variable-length types for performance reasons
For fixed-length types, the values are stored as strings using a conversion function. Because of the way the select() method works, it is essential that this conversion function preserves the ordering of values, that is : if you have two values v1 and v2 and the conversion function is conv() then you must have :
if v1 > v2 then conv(v1) > conv(v2)
and more generally,
cmp(v1,v2) == cmp(conv(v1),conv(v2))
The conversion functions can be found in the methods to_block() and from_block() of the classes in the module buzhug_classes
For variable-length files, the value is converted to a string without any line break, so that it can be stored on one line in the file. Python has a loop for line in _file:
which is extremely fast ; I found that it is much faster to browse a file with date and datetime using this loop than reading blocks of fixed-length data with read(), this is why they are stored as variable length types
For all types, the "blocks" that store a value begin with a flag which can take 3 different values :
- - : means that the field is initialized to a value
- ! : the field is not initialized, its value is None (this happens if you use insert() mithout specifying a value for this field)
- # : means that the record has been deleted
Internal files
The files defined for internal use are :
- __info__ : stores the field definition : name and type of the fields
- __pos__ : the position file. It is organized in logical "rows", one row per record in the base. Like in field files, if a record is deleted, the row begins by # ; otherwise it begins by -
The row is structured in 4-byte blocks, one block per field in the base (including the internal fiels __id__ and __version__), in the same order as in the base definition. The block for a field is the physical position of the value of this field in the field file
For instance if the base has only one field, 'name', and you enter some names in the base, the field file name will be structured like this :
Field file 'name' |
position | value |
0 | -pierre\n |
8 | -claire\n |
16 | -simon\n |
23 | -camille\n |
32 | -jean\n |
38 | -florence\n |
48 | -marie-anne\n |
When records are inserted, buzhug will populate the field files __id__ and __version__ this way :
Field file '__id__' |
position | value (hex) |
0 | 2D 40 00 00 00 |
5 | 2D 40 00 00 01 |
10 | 2D 40 00 00 02 |
15 | 2D 40 00 00 03 |
20 | 2D 40 00 00 04 |
25 | 2D 40 00 00 05 |
30 | 2D 40 00 00 06 |
|
Field file '__version__' |
position | value (hex) |
0 | 2D 40 00 00 00 |
5 | 2D 40 00 00 00 |
10 | 2D 40 00 00 00 |
15 | 2D 40 00 00 00 |
20 | 2D 40 00 00 00 |
25 | 2D 40 00 00 00 |
30 | 2D 40 00 00 00 |
|
2D is hex for the character -
In the position file, the row for the record with name 'jean' will store the position of its fields __id__,__version__ and name in their respective field files, here : 20, 20, 32 (hex 14, 14, 20)
These values are converted into 4-byte blocks with struct.pack('>i',value)
, so the position file will be :
Position file |
position | flag |
position in __id__ file |
position in __version__ file |
position in name file |
0 | 2D | 00 00 00 00 | 00 00 00 00 | 00 00 00 00 |
13 | 2D | 00 00 00 05 | 00 00 00 05 | 00 00 00 08 |
26 | 2D | 00 00 00 0A | 00 00 00 0A | 00 00 00 10 |
39 | 2D | 00 00 00 0F | 00 00 00 0F | 00 00 00 17 |
52 | 2D | 00 00 00 14 | 00 00 00 14 | 00 00 00 20 |
65 | 2D | 00 00 00 19 | 00 00 00 19 | 00 00 00 28 |
78 | 2D | 00 00 00 1E | 00 00 00 1E | 00 00 00 30 |
- _id_pos : stores a mapping between record identifiers and its row in the position file. The row is stored as a 4-byte string ; this string is at the position 5*id (always the leading 1-byte flag)
It is used to access records by their id almost immediately. Lookup by id follows these steps :
- compute the position in _id_pos : 5*id
- if the flag is set to # (deleted record) raise an IndexError
- otherwise read 4 bytes from position 5*id+1
- if read() returns an empty string, id is greater than the greatest id in the base : raise an IndexError
- otherwise, transform this 4-byte string into an integer :
row = struct.unpack('>i',s)
- get the position of the row in the position file :
pos = row * L
where L is the length of a row in the position file (in the example, L = 13)
- compute the position of the fields in their field files : for each field, get the 4-byte string holding the position, transform this string into a long integer
- use this position to get the data at the right place in field files
All these operations are simple and fast, so that lookup by id is itself almost immediate
When a base is created and records are inserted, the values are stored sequentially in field files and in the position file ; the first inserted record has the identifier 0, the second 1 and so on. At this stage the mapping between identifiers and rows in the position file is simple
Suppose you have inserted 10 records. You delete record number 3 ; the 3rd row in the position file is marked as deleted (the flag is set to #). If another record is inserted, the row 3 is reused for the new record ; this is done to avoid wasting useless space on disk. The new record will have id = 10 and the information stored in _id_pos for this record will be 3
- __del_rows__ : stores the deleted rows in field files. When a record is deleted, the flag in field files is set to # ; if it is the n-th element in the field file (n is equal for all field files) then n is stored in __del_rows__
This file is used to speed up the opening of existing bases. Buzhug must keep a reference to deleted rows, and it is faster to read a file with the row numbers than to take a field file and browse all the items in it to see which ones have the flag set to #
Deleting a record
When a record is deleted, the field files and position file are marked as deleted with the flag set to #. For instance if the record for 'jean' is deleted, here is how the field files and position file will look like :
Field file 'name' |
pos | value |
0 | -pierre\n |
8 | -claire\n |
16 | -simon\n |
23 | -camille\n |
32 | #jean\n |
38 | -florence\n |
48 | -marie-anne\n |
|
Field file '__id__' |
pos | value (hex) |
0 | 2D 40 00 00 00 |
5 | 2D 40 00 00 01 |
10 | 2D 40 00 00 02 |
15 | 2D 40 00 00 03 |
20 | 23 40 00 00 04 |
25 | 2D 40 00 00 05 |
30 | 2D 40 00 00 06 |
|
Field file '__version__' |
pos | value (hex) |
0 | 2D 40 00 00 00 |
5 | 2D 40 00 00 00 |
10 | 2D 40 00 00 00 |
15 | 2D 40 00 00 00 |
20 | 23 40 00 00 00 |
25 | 2D 40 00 00 00 |
30 | 2D 40 00 00 00 |
|
Position file |
pos | flag |
pos __id__ |
pos __version__ |
pos name |
0 | 2D | 00 00 00 00 | 00 00 00 00 | 00 00 00 00 |
13 | 2D | 00 00 00 05 | 00 00 00 05 | 00 00 00 08 |
26 | 2D | 00 00 00 0A | 00 00 00 0A | 00 00 00 10 |
39 | 2D | 00 00 00 0F | 00 00 00 0F | 00 00 00 17 |
52 | 23 | 00 00 00 14 | 00 00 00 14 | 00 00 00 20 |
65 | 2D | 00 00 00 19 | 00 00 00 19 | 00 00 00 28 |
78 | 2D | 00 00 00 1E | 00 00 00 1E | 00 00 00 30 |
|
The change is minimal (and therefore very fast) : flag '-' (hex 2D) replaced by '#' (hex 23) in 4 files
The problem is that the now useless information about 'jean' remains physically on disk ; when many records have been deleted this can become a problem of memory and speed of selections
Cleanup
So from time to time the method cleanup() should be applied. It removes the useless records from the field files and updates the position file to the new positions in the field files
After cleanup, the files will be :
Field file 'name' |
pos | value |
0 | -pierre\n |
8 | -claire\n |
16 | -simon\n |
23 | -camille\n |
32 | -florence\n |
42 | -marie-anne\n |
|
Field file '__id__' |
pos | value (hex) |
0 | 2D 40 00 00 00 |
5 | 2D 40 00 00 01 |
10 | 2D 40 00 00 02 |
15 | 2D 40 00 00 03 |
20 | 2D 40 00 00 05 |
25 | 2D 40 00 00 06 |
|
Field file '__version__' |
pos | value (hex) |
0 | 2D 40 00 00 00 |
5 | 2D 40 00 00 00 |
10 | 2D 40 00 00 00 |
15 | 2D 40 00 00 00 |
20 | 2D 40 00 00 00 |
25 | 2D 40 00 00 00 |
|
Position file |
pos | flag |
pos __id__ |
pos __version__ |
pos name |
0 | 2D | 00 00 00 00 | 00 00 00 00 | 00 00 00 00 |
13 | 2D | 00 00 00 05 | 00 00 00 05 | 00 00 00 08 |
26 | 2D | 00 00 00 0A | 00 00 00 0A | 00 00 00 10 |
39 | 2D | 00 00 00 0F | 00 00 00 0F | 00 00 00 17 |
52 | 23 | 00 00 00 14 | 00 00 00 14 | 00 00 00 20 |
65 | 2D | 00 00 00 14 | 00 00 00 14 | 00 00 00 20 |
78 | 2D | 00 00 00 19 | 00 00 00 19 | 00 00 00 2A |
|
Note that the row for the deleted record is still in the position file ; it takes a useless space, but this is not a real problem because the row will be reused if another record is inserted. Also note that the file _id_pos is not modified by cleanup(), because the rows in the position file remain at the same place for a given record
Updating
When a record is updated, one of these two cases may occur :
- all the modified values are fixed-length values
- at least one of the modified values is a variable-length value
The first case is the easiest to manage : buzhug finds the position in the field files and modifies the value at the same place ; there is no need to update the position file
In the second case, the update takes place in 3 steps :
- the record is deleted (the flags are set to #)
- the new values (with the same __id__) are inserted at the end of the field files
- the position file is updated to replace the previous positions in field files by the new ones
In all cases, the field __version__ is modified (incremented by 1). This field is used to check that, at the time when update() is called, the record stored in the base has not been modified since it was selected (which may happen if the access to the table is shared between many simultaneous users)