The Visibility Map is a simple data structure associated with every heap relation (table). It is a "relation fork"; an on-disk ancillary file associated with a particular relation (table or index). Note that index relations (that is, indexes) do not have a visibility map associated with them. The visibility map is concerned with tracking which tuples are visible to all transactions at a high level. Tuples from one transaction may or may not be visible to any given other transaction, depending on whether or not their originating transaction actually committed (yet, or ever, if the transaction aborted), and when that occurred relative to our transaction's current snapshot. Note that the exact behaviour depends on our transaction isolation level. Note also that it is quite possible for one transaction to see one physical tuple/set of values for one logical tuple, while another transaction sees other, distinct values for that same logical tuple, because, in effect, each of the two transaction has a differing idea of what constitutes "now". This is the core idea of MVCC. When there is absolute consensus that all physical tuples (row versions) in a heap page are visible, the page's corresponding bit may be set.

Another relation fork that you may be familiar with is the freespace map. In contrast to the visibility map, there is a FSM for both heap and index relations (with the sole exception of hash index relations, which have none).

The purpose of the freespace map is to quickly locate a page with enough free space to hold a tuple to be stored, or to determine if no such page exists and the relation has to be extended.

In PostgreSQL 8.4, the current freespace map implementation was added. It made the freespace map an on-disk relation fork. The previous implementation required administrators to guestimate the number of relations, and the required freespace map size for each, so that the freespace map existed only in a fixed allocation of shared memory. This tended to result in wasted space due to undersizing, as the core system's storage manager needlessly extended relations.

 [peter@peterlaptop 12935]$ ls -l -h -a
 -rw-------. 1 peter peter 8.0K Sep 28 00:00 12910
 -rw-------. 1 peter peter  24K Sep 28 00:00 12910_fsm
 -rw-------. 1 peter peter 8.0K Sep 28 00:00 12910_vm
 ***SNIP***

The FSM is structured as a binary tree . There is one leaf node per heap page, with non-leaf nodes stores the maximum amount of free space for any of its children. So, unlike EXPLAIN output's node costs, the values are not cumulative.

The visibility map is a simpler structure. There is one bit for each page in the heap relation that the visibility map corresponds to.

The primary practical reason for having and maintaining the visibility map is to optimise VACUUM. A set bit indicates that all tuples on the corresponding heap page are known to be visible to all transactions, and therefore that vacuuming the page is unnecessary. Like the new freespace map implementation, the visibility map was added in Postgres 8.4.

The visibility map is conservative in that a set bit (1) indicates that all tuples are visible on the page, but an unset bit (0) indicates that that condition may or may not be true .

The Visibility Map (and other relation forks)

相關文章