PostgreSQL Indexs
Postgresql Index:
HEAP:Storage area for storing the whole row of the table.
This is divided into multiple pages (as shown in the above picture) and each page size is by default 8KB. Within each page, each item pointer (e.g. 0,1, 2, ….) points to data within the page.
Index Storage:
This storage stores only key values i.e. columns value contained by index.
This is also divided into multiple pages and each page size is by default 8KB.
Tuple Identifier (TID):
TID is 6 bytes number which consists of two parts.
The first part is 4-byte page number and remaining 2 bytes tuple index inside the page.
The combination of these two numbers uniquely points to the storage location for a particular tuple.
========================================================================
postgresql index types:
=======================
- B-tree
- Hash
- GiST
- sp-Gist
- GIN
- BRIN
=============================================
B-Tree Index:
B-Tree is the default and the most commonly used index type. Specifying a primary key or a unique within a CREATE TABLE statement causes PostgreSQL to create B-Tree indexes. CREATE INDEX statements without the USING clause will also create B-Tree indexes:example:
Before Index:
product=# explain analyze select * from t1 where id=5;
QUERY PLAN
------------------------------------------------------------------------------------------------------
Seq Scan on t1 (cost=0.00..179057.19 rows=1 width=11) (actual time=47.515..6802.694 rows=1 loops=1)
Filter: (id = 5)
Rows Removed by Filter: 9999999
Planning Time: 0.491 ms
Execution Time: 6802.792 ms
(5 rows)
After Index:
product=# create index in_t1_id on t1 using btree(id);
CREATE INDEX
product=# explain analyze select * from t1 where id=5;
QUERY PLAN
----------------------------------------------------------------------------------------------------------------
Index Scan using in_t1_id on t1 (cost=0.43..8.45 rows=1 width=11) (actual time=25.031..25.039 rows=1 loops=1)
Index Cond: (id = 5)
Planning Time: 5.143 ms
Execution Time: 25.102 ms
(4 rows)
B-tree index features introduced in v12:
----------------------------------------
Reduce locking overhead for B-tree index inserts for improved performance.
Introduce REINDEX CONCURRENTLY to make it easier to rebuild an index without down-time.
Improve performance for index-only scans on indexes with many attributes.
Add a view pg_stat_progress_create_index to report progress for CREATE INDEX and REINDEX.
Hash Index:
Hash index operations are not presently WAL-logged, so hash indexes might need to be rebuilt with REINDEX after a database crash if there were unwritten changes. Also, changes to hash indexes are not replicated over streaming or file-based replication after the initial base backup, so they give wrong answers to queries that subsequently use them. For these reasons, hash index use is presently discouraged.example: create index in_t1_id on t1 using hash(id);
GIN Index:
GIN stands for Generalized Inverted Index, commonly referred to as GIN, are most useful when you have data types that contain multiple values in a single column. GIN is designed for handling cases where the items to be indexed are composite values, and the queries to be handled by the index need to search for element values that appear within the composite items.The most common data types that fall into this bucket are:
hStore
Arrays
Range types
JSONB
tsvector
tsquery
which supports indexed queries using these operators:
<@
@>
=
&&
Example:
db=# create table ts(doc text, doc_tsv tsvector);
CREATE TABLE
db=# insert into ts(doc) values
db-# ('Can a sheet slitter slit sheets?'),
db-# ('How many sheets could a sheet slitter slit?'),
db-# ('I slit a sheet, a sheet I slit.'),
db-# ('Upon a slitted sheet I sit.'),
db-# ('Whoever slit the sheets is a good sheet slitter.'),
db-# ('I am a sheet slitter.'),
db-# ('I slit sheets.'),
db-# ('I am the sleekest sheet slitter that ever slit sheets.'),
db-# ('She slits the sheet she sits on.');
INSERT 0 9
db=# update ts set doc_tsv = to_tsvector(doc);
UPDATE 9
db=# explain analyze select doc from ts where doc_tsv @@ to_tsquery('many & slitter');
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------
Seq Scan on ts (cost=10000000000.00..10000000241.00 rows=1 width=32) (actual time=0.104..0.161 rows=1 loops=1)
Filter: (doc_tsv @@ to_tsquery('many & slitter'::text))
Rows Removed by Filter: 8
Planning Time: 0.136 ms
Execution Time: 0.180 ms
(5 rows)
db=# create index on ts using gin(doc_tsv);
CREATE INDEX
db=# explain analyze select doc from ts where doc_tsv @@ to_tsquery('many & slitter');
QUERY PLAN ------------------------------------------------------------------------------------------------------------------------ Bitmap Heap Scan on ts (cost=12.25..16.51 rows=1 width=32) (actual time=0.066..0.067 rows=1 loops=1)
Recheck Cond: (doc_tsv @@ to_tsquery('many & slitter'::text))
Heap Blocks: exact=1
-> Bitmap Index Scan on ts_doc_tsv_idx (cost=0.00..12.25 rows=1 width=0) (actual time=0.057..0.057 rows=1 loops=1)
Index Cond: (doc_tsv @@ to_tsquery('many & slitter'::text))
Planning Time: 2.390 ms
Execution Time: 0.125 ms
(7 rows)
Gist Index:
GiST stands for Generalized Search Tree.GiST indexes allow a building of general tree structures.
GiST indexes are useful in indexing geometric data types and full-text search.
example:point,line,box,circle
which support indexed queries using these operators:
<<
&<
&>
>>
<<|
&<|
|&>
|>>
@>
<@
~=
&&
example::
db=# create table points(p point);
CREATE TABLE
db=# insert into points(p) values
db-# (point '(1,1)'), (point '(3,2)'), (point '(6,3)'),
db-# (point '(5,5)'), (point '(7,8)'), (point '(8,6)');
INSERT 0 6
db=# explain analyze select * from points where p <@ box '(2,1),(7,4)';
QUERY PLAN
--------------------------------------------------------------------------------------------------
Seq Scan on points (cost=0.00..33.13 rows=2 width=16) (actual time=0.047..0.049 rows=2 loops=1)
Filter: (p <@ '(7,4),(2,1)'::box)
Rows Removed by Filter: 4
Planning Time: 1.621 ms
Execution Time: 0.082 ms
(5 rows)
db=# create index on points using gist(p);
CREATE INDEX
db=# explain analyze select * from points where p <@ box '(2,1),(7,4)';
QUERY PLAN
-------------------------------------------------------------------------------------------------
Seq Scan on points (cost=0.00..1.07 rows=1 width=16) (actual time=0.021..0.023 rows=2 loops=1)
Filter: (p <@ '(7,4),(2,1)'::box)
Rows Removed by Filter: 4
Planning Time: 3.361 ms
Execution Time: 0.048 ms
(5 rows)
db=# set enable_seqscan=off;
SET
db=# explain analyze select * from points where p <@ box '(2,1),(7,4)';
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------
Index Only Scan using points_p_idx on points (cost=0.13..8.15 rows=1 width=16) (actual time=0.269..0.271 rows=2 loops=1)
Index Cond: (p <@ '(7,4),(2,1)'::box)
Heap Fetches: 2
Planning Time: 0.145 ms
Execution Time: 0.313 ms
(5 rows)
db=# explain (costs off) select * from points where p <@ box '(2,1),(7,4)';
QUERY PLAN
----------------------------------------------
Index Only Scan using points_p_idx on points
Index Cond: (p <@ '(7,4),(2,1)'::box)
(2 rows)
example:point,line,box,circle
sp-gist:
SP-GiST is an abbreviation for space-partitioned GiST. SP-GiST supports partitioned search trees,which facilitate development of a wide range of different non-balanced data structures, such as quad-trees, k-d trees, and radix trees (tries).
which support indexed queries using these operators:
<<
>>
~=
<@
<^
>^
example::
==========
postgres=# create table points(p point);
CREATE TABLE
postgres=# insert into points(p) values
postgres-# (point '(1,1)'), (point '(3,2)'), (point '(6,3)'),
postgres-# (point '(5,5)'), (point '(7,8)'), (point '(8,6)');
INSERT 0 6
postgres=# select amop.amopopr::regoperator, amop.amopstrategy
postgres-# from pg_opclass opc, pg_opfamily opf, pg_am am, pg_amop amop
postgres-# where opc.opcname = 'quad_point_ops'
postgres-# and opf.oid = opc.opcfamily
postgres-# and am.oid = opf.opfmethod
postgres-# and amop.amopfamily = opc.opcfamily
postgres-# and am.amname = 'spgist'
postgres-# and amop.amoplefttype = opc.opcintype;
amopopr | amopstrategy
-----------------+--------------
<<(point,point) | 1
>>(point,point) | 5
~=(point,point) | 6
<^(point,point) | 10
>^(point,point) | 11
<@(point,box) | 8
(6 rows)
postgres=# explain (costs off) select * from points where p >^ point '(2,7)';
QUERY PLAN
---------------------------------
Seq Scan on points
Filter: (p >^ '(2,7)'::point)
(2 rows)
postgres=# set enable_seqscan = off;
SET
postgres=# explain (costs off) select * from points where p >^ point '(2,7)';
QUERY PLAN
---------------------------------
Seq Scan on points
Filter: (p >^ '(2,7)'::point)
(2 rows)
postgres=# create index points_quad_idx on points using spgist(p);
CREATE INDEX
postgres=# explain (costs off) select * from points where p >^ point '(2,7)';
QUERY PLAN
-------------------------------------------------
Index Only Scan using points_quad_idx on points
Index Cond: (p >^ '(2,7)'::point)
(2 rows)
Brin Index:
BRIN stands for Block Range Index. BRIN is designed for handling very large tables.example: create index in_t1_id on t1 using brin(id);
Comments
Post a Comment