rbtree test

Rbtree testings

gin index utilizes maintenance_work_mem for bulk insert to gin index (CREATE INDEX command). ginInsertEntry() creates tree in memory and when tree exceeds maintenance_work_mem, it flushed to disk. Since 8.4, bulk insert is also used to speedup inserts of new entries (gin fastupdate feature). Btree used in GIN code behaves very bad with presorted data and even limiting of maxdepth didn't help , so we decided to play with red-black tree, which is self-balanced tree.

Test environment: My desktop machine with Linux 2.6.24.3-smp, i686 Intel(R) Core(TM)2 Duo CPU E6750 @ 2.66GHz GenuineIntel, 8 GB RAM,

Pg settings (only relevant):
shared_buffers = 512MB 
work_mem = 32MB
maintenance_work_mem = 256MB
checkpoint_segments = 16
effective_cache_size = 1GB

1. Original test from http://archives.postgresql.org/pgsql-performance/2009-03/msg00340.php

Test data are arrays of length 6 with identical values, which sequentially increased from row to row.

--http://archives.postgresql.org/pgsql-performance/2009-03/msg00340.php
drop table a;
-- insert and create index
select array[n,n,n,n,n,n] as ar into a from generate_series(1,1000000) s n;
create index arr_gin onsee ginInsertEntry(), a using gin ( ar );

truncate a;
drop index arr_gin ;

-- insert after index
create index arr_gin on a using gin ( ar );
insert into a select array[n, n, n, n, n, n] from generate_series(1,1000000) as n;

test=# select * from a limit 2;
      ar
---------------
 {1,1,1,1,1,1}
 {2,2,2,2,2,2}

Logs of 3 runs on HEAD:

DROP TABLE             DROP TABLE           DROP TABLE
Time: 1040.375 ms      Time: 1043.168 ms    Time: 1040.275 ms
SELECT                 SELECT               SELECT
Time: 2058.678 ms      Time: 1841.830 ms    Time: 1809.520 ms
CREATE INDEX           CREATE INDEX         CREATE INDEX
Time: 6517.479 ms      Time: 6492.176 ms    Time: 7650.608 ms
TRUNCATE TABLE         TRUNCATE TABLE       TRUNCATE TABLE
Time: 75.211 ms        Time: 68.449 ms      Time: 66.927 ms
DROP INDEX             DROP INDEX           DROP INDEX
Time: 2.417 ms         Time: 2.627 ms       Time: 2.532 ms
CREATE INDEX           CREATE INDEX         CREATE INDEX
Time: 1.106 ms         Time: 1.088 ms       Time: 1.140 ms
INSERT 0 1000000       INSERT 0 1000000     INSERT 0 1000000
Time: 9600.281 ms      Time: 9309.620 ms    Time: 8826.133 ms

The same for HEAD+rbtree-0.9

DROP TABLE              DROP TABLE           DROP TABLE
Time: 1039.268 ms       Time: 1039.257 ms    Time: 34.951 ms
SELECT                  SELECT               SELECT
Time: 1806.630 ms       Time: 2195.750 ms    Time: 2632.235 ms
CREATE INDEX            CREATE INDEX         CREATE INDEX
Time: 9346.265 ms       Time: 6877.033 ms    Time: 7012.791 ms
TRUNCATE TABLE          TRUNCATE TABLE       TRUNCATE TABLE
Time: 56.326 ms         Time: 41.171 ms      Time: 1225.004 ms
DROP INDEX              DROP INDEX           DROP INDEX
Time: 2.514 ms          Time: 2.649 ms       Time: 2.720 ms
CREATE INDEX            CREATE INDEX         CREATE INDEX
Time: 1.249 ms          Time: 1.696 ms       Time: 1.377 ms
INSERT 0 1000000        INSERT 0 1000000     INSERT 0 1000000
Time: 14326.864 ms      Time: 9293.894 ms    Time: 8357.859 ms

The same for HEAD+rbtree-0.10

DROP TABLE             DROP TABLE           DROP TABLE
Time: 1046.992 ms      Time: 1039.482 ms    Time: 1040.836 ms
SELECT                 SELECT               SELECT
Time: 2005.670 ms      Time: 2092.013 ms    Time: 2068.169 ms
CREATE INDEX           CREATE INDEX         CREATE INDEX
Time: 6750.898 ms      Time: 8019.199 ms    Time: 7984.485 ms
TRUNCATE TABLE         TRUNCATE TABLE       TRUNCATE TABLE
Time: 45.027 ms        Time: 53.622 ms      Time: 50.384 ms
DROP INDEX             DROP INDEX           DROP INDEX
Time: 2.398 ms         Time: 2.600 ms       Time: 2.557 ms
CREATE INDEX           CREATE INDEX         CREATE INDEX
Time: 1.780 ms         Time: 1.352 ms       Time: 1.534 ms
INSERT 0 1000000       INSERT 0 1000000     INSERT 0 1000000
Time: 8783.836 ms      Time: 8971.075 ms    Time: 8446.531 ms

rbtree is used only in bulk insert into index (ins,idx sequence). We can use timings with inserts (idx,ins) to verify test results.

Summary:

The best results for CREATE INDEX are: 6492.176 (Btree+Hack), 6877.033 (rbtree-0.9), 6750.898 (rbtree-0.10). We see that difference between 6492.176 and 6750.898 is about 4%, which is in the accuracy of experiments.

2. Random array test - 100 000 integer arrays, array length is random between 5 and 1000, cardinality is 500 000. Data: http://www.sai.msu.su/~megera/postgres/files/arr-5-1000-500000-100000.sql.gz

   zcat arr-5-1000-500000-100000.sql.gz| psql test
   create index a_idx on tt using gin(a);
         HEAD     HEAD+rbtree-0.9      HEAD+rbtree-0.10
idx  97833.943    108109.422           93022.371
     94076.822    95377.997            94435.548
     92993.065    95167.072            91723.859

Summary:

  There is no visible performance difference between HEAD (btree+hack) and
  HEAD+rbtree-0.10.

3. Real data from Wikipedia links Data: http://www.sai.msu.su/~megera/postgres/files/links2.sql.gz

   zcat links2.sql.gz | psql test
   Indexes will built as follows - btree, gin(idin), gin(idout)
   
test=# \d links2
     Table "public.links2"
 Column |   Type    | Modifiers 
--------+-----------+-----------
 id     | integer   | 
 idout  | integer[] | 
 idin   | integer[] | 
Indexes:
    "id_idx_link2" btree (id)
    "idin_rbtree_idx" gin (idin)
    "idout_rbtree_idx" gin (idout)
          HEAD             HEAD+rbtree-0.10
ins    180160.087        165348.671
btree  32624.773         31304.554 
idin   9006050.013       767966.869
idout  3535996.914       720310.768
   

Summary: HEAD+rbtree-0.10 is substantially (more than 10 times) better that HEAD with btree hack !

Links: