Jenks optimization

From wiki.gis.com

Jenks' Optimization

George Frederick Jenks

Jenks optimization, also known as Jenks Natural Breaks Classification, is an algorithm that classifies features by placing class boundaries at natural breaks in the data values. Jenks' optimization, also referred to as the Goodness of Variance Fit (GVF) method, was initially used in choropleth mapping. Choropleth maps are primarily used to display quantitative areal variables.[1]

Jenks Optimization Algorithm in GIS

In GIS, Jenks' optimization is a method of statistical data classification that partitions data into classes using an algorithm that groups data values according to their distribution.[2] It seeks to minimize the variance within classes and maximize the variance between classes. The use of the algorithm can be illustrated in four steps:

  • Step 1: The user selects the attribute, x, to be classified and specifies the number of classes required, k.
  • Step 2: A set of k‑1 random or uniformly spaced values is generated in the range [min{x}, max{x}]. These are used as initial class boundaries.
  • Step 3: The mean value of each initial class is computed, along with the sum of squared deviations of class members from their class mean. The total sum of squared deviations (TSSD) across all classes is recorded.
  • Step 4: Individual values in each class are reassigned to adjacent classes by adjusting the class boundaries, to see whether the TSSD can be reduced. This iterative process ends when the improvement in TSSD falls below a threshold level, i.e. when the within-class variance is as small as possible and the between-class variance is as large as possible. True optimization is not assured, so the entire process can optionally be repeated from Step 1 or 2 and the resulting TSSD values compared.
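
The four steps above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not a reference implementation: the function names tssd and jenks_iterative are hypothetical, the initial boundaries are drawn as random split positions in the sorted data (rather than as raw values in [min{x}, max{x}]) so that no class starts empty, and the loop stops when no single-step boundary move lowers the TSSD, which stands in for the threshold test of Step 4.

```python
import random

def tssd(sorted_values, cuts):
    """Total sum of squared deviations of class members from their class mean.

    `cuts` holds the k-1 indices at which the sorted list is split.
    """
    bounds = [0] + list(cuts) + [len(sorted_values)]
    total = 0.0
    for start, end in zip(bounds, bounds[1:]):
        members = sorted_values[start:end]
        mean = sum(members) / len(members)
        total += sum((v - mean) ** 2 for v in members)
    return total

def jenks_iterative(values, k, seed=None):
    """Heuristic sketch of Steps 1-4; the optimum is not guaranteed."""
    data = sorted(values)                      # Step 1: attribute x, k classes
    n = len(data)
    if not 1 <= k <= n:
        raise ValueError("k must be between 1 and the number of values")
    rng = random.Random(seed)
    # Step 2 (assumption): k-1 distinct random split indices in the sorted data.
    cuts = sorted(rng.sample(range(1, n), k - 1))
    best = tssd(data, cuts)                    # Step 3: record the TSSD
    improved = True
    while improved:                            # Step 4: adjust boundaries
        improved = False
        for i in range(len(cuts)):
            for step in (-1, 1):
                trial = cuts.copy()
                trial[i] += step               # move one value to a neighbour
                low = trial[i - 1] if i > 0 else 0
                high = trial[i + 1] if i + 1 < len(trial) else n
                if not low < trial[i] < high:
                    continue                   # move would empty a class
                score = tssd(data, trial)
                if score < best - 1e-12:       # keep only real improvements
                    cuts, best = trial, score
                    improved = True
    classes = [data[a:b] for a, b in zip([0] + cuts, cuts + [n])]
    return classes, best

classes, score = jenks_iterative([4, 5, 9, 10, 1, 2, 20, 21], k=3, seed=0)
print(classes, score)
```

Because the search only accepts moves that lower the TSSD, it can stop at a local minimum. Calling jenks_iterative several times with different seeds and keeping the result with the smallest TSSD corresponds to the optional repetition described in Step 4.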


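[Figure: GVF2.PNG, the Goodness of Variance Fit equation]


The equation in the figure describes the calculation required to compute the Goodness of Variance Fit (GVF) in Jenks' algorithm. This equation can also be expressed as GVF = (SDAM - SDCM) / SDAM
where
SDAM = sum of squared deviations of each observation x from the array mean (the mean of the whole data set)
SDCM = sum of squared deviations of each x from its class mean
GVF ranges from 0 (no fit) to 1 (perfect fit).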
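
The ratio GVF = (SDAM - SDCM) / SDAM can be computed directly from a finished classification. The sketch below is illustrative only: the function name gvf and the sample classes are made up for this example, and it assumes the data are not all identical (otherwise SDAM is zero and the ratio is undefined).

```python
def gvf(classes):
    """Goodness of Variance Fit for a list of classes (each a list of numbers)."""
    values = [v for members in classes for v in members]
    array_mean = sum(values) / len(values)
    # SDAM: squared deviations of every observation from the array mean.
    sdam = sum((v - array_mean) ** 2 for v in values)
    if sdam == 0:
        raise ValueError("GVF is undefined when all values are identical")
    # SDCM: squared deviations of every observation from its own class mean.
    sdcm = 0.0
    for members in classes:
        class_mean = sum(members) / len(members)
        sdcm += sum((v - class_mean) ** 2 for v in members)
    return (sdam - sdcm) / sdam

# Hypothetical data: three tight, well-separated classes.
example = [[1, 2, 3], [10, 11, 12], [50, 51, 52]]
print(gvf(example))  # SDAM = 4088, SDCM = 6, so GVF = 4082/4088, about 0.9985
```

A value close to 1 means that almost all of the variation in the data lies between the classes rather than within them; putting every value into a single class gives a GVF of 0.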


Further Reading

  1. http://www.ferris.edu/faculty/burtchr/sure329/choropleth_map/sld021.htm
  2. http://www.spatialanalysisonline.com/output/html/Univariateclassificationschemes.html#_Ref116892931
  3. Jenks, George F. 1967. "The Data Model Concept in Statistical Mapping", International Yearbook of Cartography 7: 186-190.

References

  1. Xiao, N. Choropleth Mapping. Accessed 29 July 2010.
  2. Coulson, M.R.C. 1987. "In The Matter Of Class Intervals For Choropleth Maps: With Particular Reference To The Work Of George F Jenks", Cartographica 24 (2): 16-39.