Z-order (curve)

From wiki.gis.com
(Redirected from Z-order)
Jump to: navigation, search
Four iterations of the Z-order curve.
Lebesgue-3d-step2.png
Z-order curve iterations extended to three dimensions.

Z-order, Morton-order or Morton code first proposed in 1966 by G. M. Morton,[1] is a space-filling curve which is often used in computer science. Due to its good locality-preserving behavior it is used in data structures for mapping multidimensional data to one dimension. The z-value of a point in multidimensions is simply calculated by interleaving the binary representations of its coordinate values. Once the data are sorted into this ordering, any one-dimensional data structure can be used such as binary search trees, B-trees, skip lists or (with low significant bits truncated) hash tables. The resulting ordering can equivalently be described as the order would get from a depth-first traversal of a quadtree; because of its close connection with quadtrees, the Z-ordering can be used to efficiently construct quadtrees and related higher dimensional data structures.[2]

Coordinate values

The figure below shows the Z-values for the two dimensional case with integer coordinates 0 ≤ x ≤ 7, 0 ≤ y ≤ 7 (shown both in decimal and binary). Interleaving the binary coordinate values yields binary z-values as shown. Connecting the z-values in their numerical order produces the recursively Z-shaped curve.

Z-curve.png

Use with one-dimensional data structures for range searching

Although well locality-preserving, for efficient range searches an algorithm is necessary for calculating, from a point encountered in the data structure, the next Z-value which is in the multidimensional search range:

Bigmin.png


In this example, the range being queried (x=2..3, y=2..6) is indicated by the dotted rectangle. Its highest Z-value (MAX) is 45. In this example, the value F=19 is encountered when searching a data structure in increasing Z-value direction, so we would have to search in the interval between F and MAX (hatched area). To speed up the search, one would calculate the next Z-value which is in the search range, called BIGMIN (36 in the example) and only search in the interval between BIGMIN and MAX (bold values), thus skipping most of the hatched area. Searching in decreasing direction is analogous with LITMAX which is the highest Z-value in the query range lower than F. The BIGMIN problem has first been stated and its solution shown in Tropf and Herzog.[3] This solution is also used in UB-trees ("GetNextZ-address"). As the approach does not depend on the one dimensional data structure chosen, there is still free choice of structuring the data, so well known methods such as balanced trees can be used to cope with dynamic data (in contrast for example to R-trees where special considerations are necessary). Similarly, this independence makes it easier to incorporate the method into existing databases.

Applying the method hierarchically (according to the data structure at hand), optionally in both increasing and decreasing direction, yields highly efficient multidimensional range search which is important in both commercial and technical applications, e.g. as a procedure underlying nearest neighbor searches. Z-order is one of the few multidimensional access methods that has found its way into commercial database systems (Oracle database 1995,[4] Transbase 2000 [5]).

As long ago as 1966, G.M.Morton proposed Z-order for file sequencing of a static two-dimensional geographical database. Areal data units are contained in one or a few quadratic frames represented by their sizes and lower right corner Z-values, the sizes complying with the Z-order hierarchy at the corner position. With high probability, changing to an adjacent frame is done with one or a few relatively small scanning steps.

Related structures

As an alternative, the Hilbert curve has been suggested as it has a better order-preserving behavior, but here the calculations are much more complicated, leading to significant processor overhead. BIGMIN source code for both Z-curve and Hilbert-curve were described in a patent by H.Tropf.[6]

For a recent overview on multidimensional data processing, including e.g., nearest neighbor searches, see Hanan Samet's textbook.[7]

Applications in linear algebra

The Strassen algorithm for matrix multiplication is based on splitting the matrices in four blocks, and then recursively each of these blocks in four smaller blocks, until the blocks are single elements (or more practically, until reaching matrices so small that the trivial algorithm is faster). Arranging the matrix elements in Z-order then improves locality, and has the additional advantage (compared to row- or column-major ordering) that the subroutine for multiplying two blocks does not need to know the total size of the matrix, but only the size of the blocks and their location in memory.

See also

  • UB-tree
  • Hilbert curve
  • Hilbert R-tree
  • Spatial index
  • locality preserving hashing
  • Matrix representation

References

  1. Morton, G. M. (1966), A computer Oriented Geodetic Data Base; and a New Technique in File Sequencing, Technical Report, Ottawa, Canada: IBM Ltd. .
  2. Bern, M.; Eppstein, D.; Teng, S.-H. (1999), "Parallel construction of quadtrees and quality triangulations", Int. J. Comp. Geom. & Appl. 9 (6): 517–532, doi:10.1142/S0218195999000303 .
  3. Tropf, H.; Herzog, H. (1981), "Multidimensional Range Search in Dynamically Balanced Trees", Angewandte Informatik 2: 71–77, http://www.vision-tools.com/h-tropf/multidimensionalrangequery.pdf .
  4. Gaede, Volker; Guenther, Oliver (1998), "Multidimensional access methods", ACM Computing Surveys 30 (2): 170–231, doi:10.1145/280277.280279, http://www-static.cc.gatech.edu/computing/Database/readinggroup/articles/p170-gaede.pdf .
  5. Ramsak, Frank; Markl, Volker; Fenk, Robert; Zirkel, Martin; Elhardt, Klaus; Bayer, Rudolf (2000), "Integrating the UB-tree into a Database System Kernel", Int. Conf. on Very Large Databases (VLDB), pp. 263–272, http://www.mistral.in.tum.de/results/publications/RMF+00.pdf 
  6. Template:Citation/patent.
  7. Samet, H. (2006), Foundations of Multidimensional and Metric Data Structures, San Francisco: Morgan-Kaufmann .