Misplaced Pages

Flattening transformation

Article snapshot taken from Wikipedia with creative commons attribution-sharealike license. Give it a read and then ask your questions in the chat. We can research this topic together.
The topic of this article may not meet Misplaced Pages's general notability guideline. Please help to demonstrate the notability of the topic by citing reliable secondary sources that are independent of the topic and provide significant coverage of it beyond a mere trivial mention. If notability cannot be shown, the article is likely to be merged, redirected, or deleted.
Find sources: "Flattening transformation" – news · newspapers · books · scholar · JSTOR (January 2023) (Learn how and when to remove this message)

The flattening transformation is an algorithm that transforms nested data parallelism into flat data parallelism. It was pioneered by Guy Blelloch as part of the NESL programming language. The flattening transformation is also sometimes called vectorization, but is completely unrelated to automatic vectorization. The original flattening algorithm was concerned solely with first-order multidimensional arrays containing primitive types, but was extended to handle higher-order and recursive data types in the work on Data Parallel Haskell.

Overview

Flattening works by lifting functions to operate on arrays instead of on single values. For example, a function f : A B {\displaystyle f:A\rightarrow B} is lifted to a function f : [ A ] [ B ] {\displaystyle f':\rightarrow } . This means an expression m a p   f   x {\displaystyle map~f~x} can be replaced with an application of the lifted function: f   x {\displaystyle f'~x} . Intuitively, flattening thus works by replacing all function applications with applications of the corresponding lifted function.

After flattening, arrays are represented as single-dimensional value vector V containing scalar elements, alongside auxiliary information recording the nested structure, typically in the form of a boolean flag vector F. The flag vector indicates, for the corresponding element in the value vector, whether it is the beginning of a new segment. For example, the two-dimensional irregular array A = [ [ 1 , 2 , 3 ] , [ 4 , 5 ] , [ ] , [ 6 ] ] {\displaystyle A=,,,]} can be represented as the data vector V = [ 1 , 2 , 3 , 4 , 5 , 6 , 7 ] {\displaystyle V=} alongside the flag vector F = [ 1 , 0 , 0 , 1 , 0 , 1 , 1 ] {\displaystyle F=} .

This flag vector is necessary in order to correctly flatten nested parallelism. For example, it is used in the flattening of prefix sum to segmented scan.

Flattening can increase the asymptotic work and space complexity of the original program, leading to a much less efficient result.

Usage

Flattening was originally developed for vector machines such as the Connection Machine, and often produces code that is not a good fit for modern multicore CPUs. However, the principles underlying its simpler cases can be found in constructs such as the vmap in Google Jax.

References

  1. Blelloch, Guy (1995). "NESL: A Nested Data-Parallel Language". {{cite journal}}: Cite journal requires |journal= (help)
  2. Data parallel Haskell: a status report
  3. Spoonhower, Daniel; Harper; Blelloch; Gibbons (2008). "Space profiling for parallel functional programs". {{cite journal}}: Cite journal requires |journal= (help)
  4. Bergstrom, Lars; Fluet, Matthew; Rainey, Mike; Reppy, John (2013), "Data-Only Flattening for Nested Data Parallelism", PPoPP, pp. 81–92, doi:10.1145/2442516.2442525, ISBN 978-1-4503-1922-5, S2CID 1005665
Categories: