Misplaced Pages

Block nested loop

Article snapshot taken from Wikipedia with creative commons attribution-sharealike license. Give it a read and then ask your questions in the chat. We can research this topic together.
Algorithm

A block-nested loop (BNL) is an algorithm used to join two relations in a relational database.

This algorithm is a variation of the simple nested loop join and joins two relations R {\displaystyle R} and S {\displaystyle S} (the "outer" and "inner" join operands, respectively). Suppose | R | < | S | {\displaystyle |R|<|S|} . In a traditional nested loop join, S {\displaystyle S} will be scanned once for every tuple of R {\displaystyle R} . If there are many qualifying R {\displaystyle R} tuples, and particularly if there is no applicable index for the join key on S {\displaystyle S} , this operation will be very expensive.

The block nested loop join algorithm improves on the simple nested loop join by only scanning S {\displaystyle S} once for every group of R {\displaystyle R} tuples. Here groups are disjoint sets of tuples in R {\displaystyle R} and the union of all groups has the same tuples as R {\displaystyle R} . For example, one variant of the block nested loop join reads an entire page of R {\displaystyle R} tuples into memory and loads them into a hash table. It then scans S {\displaystyle S} , and probes the hash table to find S {\displaystyle S} tuples that match any of the tuples in the current page of R {\displaystyle R} . This reduces the number of scans of S {\displaystyle S} that are necessary.

algorithm block_nested_loop_join is
    for each page pr in R do
        for each page ps in S do
            for each tuple r in pr do
                for each tuple s in ps do
                    if r and s satisfy the join condition then
                        yield tuple <r,s>


A more aggressive variant of this algorithm loads as many pages of R {\displaystyle R} as can be fit in the available memory, loading all such tuples into a hash table, and then repeatedly scans S {\displaystyle S} . This further reduces the number of scans of S {\displaystyle S} that are necessary. In fact, this algorithm is essentially a special-case of the classic hash join algorithm.

The block nested loop runs in O ( P r P s / M ) {\displaystyle O(P_{r}P_{s}/M)} I/Os where M {\displaystyle M} is the number of available pages of internal memory and P r {\displaystyle P_{r}} and P s {\displaystyle P_{s}} is size of R {\displaystyle R} and S {\displaystyle S} respectively in pages. Note that block nested loop runs in O ( P r + P s ) {\displaystyle O(P_{r}+P_{s})} I/Os if R {\displaystyle R} fits in the available internal memory.

References

  1. "8.2.1.14 Block Nested-Loop and Batched Key Access Joins". MySQL 5.6 Reference Manual. Oracle Corporation. Retrieved 2 August 2015.
  2. "Block Nested Loop Join". MariaDB. MariaDB Corporation Ab. Retrieved 2 August 2015.
Category: