Gin stands for generalized inverted index (not a drink, but genie). See Gin for more information.
PostgreSQL 8.2dev, standard postgresql.conf. Notice, gin uses maintenance_work_mem to accumulate postings in memory during index creation. Default value is 16Mb, we used 64Mb.
Quick results: time to create index (bulk) is about the same for intarray and gin, while search using gin is about an order of magnitude faster than gist__intbig_ops !
Test data for table tt(a integer[]) were generated using RANDOM function for three variables:
=# explain analyze select * from tt where a @> '{2}'; QUERY PLAN --------------------------------------------------------------------------------------------------------------------- Bitmap Heap Scan on tt (cost=6.35..349.75 rows=100 width=226) (actual time=6.529..42.784 rows=9371 loops=1) Recheck Cond: (a @> '{2}'::integer[]) -> Bitmap Index Scan on gin_idx (cost=0.00..6.35 rows=100 width=0) (actual time=4.798..4.798 rows=9371 loops=1) Index Cond: (a @> '{2}'::integer[]) Total runtime: 52.566 ms (5 rows) # select count(*) from tt where a @> '{2}'; count ------- 9371 (1 row)
=# explain analyze select * from tt where a @> '{2}'; QUERY PLAN ----------------------------------------------------------------------------------------------------------------------- Bitmap Heap Scan on tt (cost=6.35..349.75 rows=100 width=227) (actual time=99.690..447.405 rows=9371 loops=1) Filter: (a @> '{2}'::integer[]) -> Bitmap Index Scan on int_idx (cost=0.00..6.35 rows=100 width=0) (actual time=97.742..97.742 rows=9371 loops=1) Index Cond: (a @> '{2}'::integer[]) Total runtime: 457.207 ms (5 rows) Time: 458.190 ms
300,000 documents (108Mb) loaded for about 44s and indexed (bulk mode) - 62s(16Mb) and 66s(64Mb). Index size is 55Mb. Search is very fast:
# explain analyze select count(*) from aa where a @> '{oil}'; QUERY PLAN ------------------------------------------------------------------------------------------------------------------------------ Aggregate (cost=1067.10..1067.11 rows=1 width=0) (actual time=304.686..304.687 rows=1 loops=1) -> Bitmap Heap Scan on aa (cost=9.05..1066.35 rows=300 width=0) (actual time=19.771..283.364 rows=19203 loops=1) Recheck Cond: (a @> '{oil}'::text[]) -> Bitmap Index Scan on txt_idx (cost=0.00..9.05 rows=300 width=0) (actual time=11.705..11.705 rows=19203 loops=1) Index Cond: (a @> '{oil}'::text[]) Total runtime: 304.855 ms (6 rows)
postgres=# \d titles Table "public.titles" Column | Type | Modifiers -----------+--------------------------+----------- date | timestamp with time zone | title | text | fts_index | tsvector | postgres=# \timing Timing is on. postgres=# create index fts_idx on titles using gin(fts_index); CREATE INDEX Time: 236752.569 ms