Geohash

From wiki.gis.com

Geohash is a latitude/longitude geocode system invented by Gustavo Niemeyer while writing the web service at geohash.org, and placed in the public domain. It is a hierarchical spatial data structure that subdivides space into grid-shaped buckets.

Geohashes offer properties such as arbitrary precision and the ability to shorten a code by removing characters from its end, at the cost of gradually losing precision.

Because precision degrades gradually in this way, nearby places will often (but not always) share a common prefix.
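As a rough illustration of how precision depends on code length, the sketch below computes the worst-case error of a Geohash of a given length. It assumes the bit layout described in the decoding example later in this article (5 bits per character, alternating longitude and latitude, longitude first); the function name cell_error is hypothetical.

```python
def cell_error(length: int) -> tuple[float, float]:
    """Return (latitude_error, longitude_error) in degrees for a hash length."""
    total_bits = 5 * length            # each base-32 character carries 5 bits
    lon_bits = (total_bits + 1) // 2   # even positions (0, 2, 4, ...) are longitude
    lat_bits = total_bits // 2         # odd positions (1, 3, 5, ...) are latitude
    # Each bit halves the interval, so the error is half the remaining width.
    return 90.0 / 2 ** lat_bits, 180.0 / 2 ** lon_bits


for n in range(1, 9):
    lat_err, lon_err = cell_error(n)
    print(f"{n} chars: lat +/-{lat_err:.6f} deg, lon +/-{lon_err:.6f} deg")
```

Dropping one trailing character removes 5 bits, so each error grows by a factor of 4 or 8 depending on how those bits are split between the two axes.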

Service

The purpose of the geohash.org service, launched in February 2008, is to offer short URLs which uniquely identify positions on the Earth, so that referencing them in emails, forums, and websites is more convenient.

To obtain the Geohash, the user provides an address to be geocoded, or latitude and longitude coordinates, in a single input box (most commonly used formats for latitude and longitude pairs are accepted), and performs the request.

Besides showing the latitude and longitude corresponding to the given Geohash, users who navigate to a Geohash at geohash.org are also presented with an embedded map, and may download a GPX file, or transfer the waypoint directly to certain GPS receivers. Links are also provided to external sites that may provide further details around the specified location.

For example, the coordinate pair 57.64911,10.40744 produces a hash of u4pruydqqvj, which can be used in the URL http://geohash.org/u4pruydqqvj
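The article only walks through decoding, but encoding can be sketched as the same procedure run in reverse. The code below is a minimal sketch under that assumption, not the geohash.org implementation; the names BASE32 and encode are hypothetical.

```python
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"


def encode(lat: float, lon: float, length: int = 11) -> str:
    """Encode a latitude/longitude pair as a Geohash of the given length."""
    lat_lo, lat_hi = -90.0, 90.0
    lon_lo, lon_hi = -180.0, 180.0
    chars = []
    value = 0
    for i in range(5 * length):
        if i % 2 == 0:
            # Even bit positions refine the longitude interval.
            mid = (lon_lo + lon_hi) / 2
            bit = 1 if lon >= mid else 0
            if bit:
                lon_lo = mid
            else:
                lon_hi = mid
        else:
            # Odd bit positions refine the latitude interval.
            mid = (lat_lo + lat_hi) / 2
            bit = 1 if lat >= mid else 0
            if bit:
                lat_lo = mid
            else:
                lat_hi = mid
        value = (value << 1) | bit
        if i % 5 == 4:
            # Every 5 bits form one base-32 character.
            chars.append(BASE32[value])
            value = 0
    return "".join(chars)


print(encode(57.64911, 10.40744))  # u4pruydqqvj
```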

Uses

The main uses of Geohashes are:

  • as a unique identifier;
  • to represent point data, e.g. in databases.

Geohashes have also been proposed for geotagging.

Example

Using the hash ezs42 as an example, here is how it is decoded into a decimal latitude and longitude.

Decode from base 32

The first step is decoding it from base 32 using the following character map:

Decimal  0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15
Base 32  0  1  2  3  4  5  6  7  8  9  b  c  d  e  f  g

Decimal 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31
Base 32  h  j  k  m  n  p  q  r  s  t  u  v  w  x  y  z

This operation results in the bits 01101 11111 11000 00100 00010. Counting bit positions from the left, starting at 0 (and treating 0 as even), the bits at even positions form the longitude code (0111110000000), while the bits at odd positions form the latitude code (101111001001).
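The base-32 step and the even/odd split can be sketched as follows. The helper names to_bits and split_bits are hypothetical; the alphabet string is simply the character map above written in order.

```python
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"


def to_bits(geohash: str) -> str:
    """Map each character to its 5-bit value and concatenate the results."""
    return "".join(format(BASE32.index(c), "05b") for c in geohash)


def split_bits(bits: str) -> tuple[str, str]:
    """Return (longitude_bits, latitude_bits) from the interleaved bit string."""
    return bits[0::2], bits[1::2]  # even positions, odd positions


bits = to_bits("ezs42")
lon_bits, lat_bits = split_bits(bits)
print(bits)      # 0110111111110000010000010
print(lon_bits)  # 0111110000000
print(lat_bits)  # 101111001001
```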

Decode binary to decimal

Each binary code is then used in a series of divisions, considering one bit at a time, again from the left to the right side. For the latitude value, the interval -90 to +90 is divided by 2, producing two intervals: -90 to 0, and 0 to +90. Since the first bit is 1, the higher interval is chosen, and becomes the current interval. The procedure is repeated for all bits in the code. Finally, the latitude value is the center of the resulting interval. Longitudes are processed in an equivalent way, keeping in mind that the initial interval is -180 to +180.

Finishing the procedure should yield approximately latitude 42.6 and longitude -5.6.
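The interval-halving step can be sketched as a small function that takes a bit string and the initial interval. The name bits_to_value is hypothetical; returning the error alongside the value is an addition for illustration.

```python
def bits_to_value(bits: str, lo: float, hi: float) -> tuple[float, float]:
    """Bisect [lo, hi] once per bit; return (center, error) of the final interval."""
    for bit in bits:
        mid = (lo + hi) / 2
        if bit == "1":
            lo = mid  # a 1 selects the upper half
        else:
            hi = mid  # a 0 selects the lower half
    return (lo + hi) / 2, (hi - lo) / 2


lat, lat_err = bits_to_value("101111001001", -90.0, 90.0)
lon, lon_err = bits_to_value("0111110000000", -180.0, 180.0)
print(f"latitude  {lat:.3f} +/- {lat_err:.3f}")
print(f"longitude {lon:.3f} +/- {lon_err:.3f}")
```

For ezs42 this gives a latitude near 42.605 and a longitude near -5.603, each with an error of about 0.022 degrees.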

Worked example

Here's a worked example decoding 101111001001 into 42.6. To start with, we know the latitude is somewhere in the range -90 to 90. With no bits, we'd have to guess the latitude was 0, giving us an error of +/-90. With one bit, we can decide whether it's in the range -90 to 0, or 0 to 90. The first bit is high, so we know our latitude is somewhere between 0 and 90. Without any more bits, we'd guess the latitude was 45, giving us an error of +/-45.

Each subsequent bit halves this error. This table shows the effect of each bit. In each row, min and max bound the range before the bit is applied, and mid is its midpoint. A low bit selects the lower half (min to mid), and a high bit selects the upper half (mid to max).

The last column, val, shows the latitude estimate after the bit is applied, which is simply the midpoint of the selected half. Each subsequent bit makes this value more precise.

bit       min       mid       max       val
1     -90.000     0.000    90.000    45.000
0       0.000    45.000    90.000    22.500
1       0.000    22.500    45.000    33.750
1      22.500    33.750    45.000    39.375
1      33.750    39.375    45.000    42.188
1      39.375    42.188    45.000    43.594
0      42.188    43.594    45.000    42.891
0      42.188    42.891    43.594    42.539
1      42.188    42.539    42.891    42.715
0      42.539    42.715    42.891    42.627
0      42.539    42.627    42.715    42.583
1      42.539    42.583    42.627    42.605

(The numbers in the above table have been rounded to 3 decimal places for clarity).
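To check the table, the rows can be regenerated with a short trace. This is a sketch with a hypothetical function name, trace; the extra err column is not in the table and is added here to show the error halving with each bit.

```python
def trace(bits: str, lo: float = -90.0, hi: float = 90.0) -> list[tuple]:
    """Return one (bit, min, mid, max, val, err) row per bit, as in the table."""
    rows = []
    for bit in bits:
        row_min, row_max = lo, hi       # range before this bit is applied
        mid = (lo + hi) / 2
        if bit == "1":
            lo = mid                    # keep the upper half
        else:
            hi = mid                    # keep the lower half
        val = (lo + hi) / 2             # midpoint of the selected half
        err = (hi - lo) / 2             # halves with every bit
        rows.append((bit, row_min, mid, row_max, val, err))
    return rows


rows = trace("101111001001")
print("bit       min       mid       max       val       err")
for bit, row_min, mid, row_max, val, err in rows:
    print(f"{bit}  {row_min:10.3f}{mid:10.3f}{row_max:10.3f}{val:10.3f}{err:10.3f}")
```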

Limitations

One limitation of the Geohash algorithm arises when it is used to find points near each other by looking for a common prefix. Locations that are close to each other but lie on opposite sides of the Equator or a meridian can produce Geohash codes with no common prefix[1]. Still, this does not rule out using Geohashes in proximity searches.
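The edge case can be reproduced with two points a fraction of a degree apart on either side of the Equator. The sketch below assumes an encoder that mirrors the decoding procedure in this article; the names encode and common_prefix_len, and the sample coordinates, are chosen for illustration only.

```python
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"


def encode(lat: float, lon: float, length: int = 8) -> str:
    """Compact encoder: alternate longitude and latitude bits, longitude first."""
    ranges = [[-180.0, 180.0], [-90.0, 90.0]]  # index 0: longitude, 1: latitude
    coords = (lon, lat)
    value, out = 0, []
    for i in range(5 * length):
        axis = i % 2
        lo, hi = ranges[axis]
        mid = (lo + hi) / 2
        bit = 1 if coords[axis] >= mid else 0
        ranges[axis][0 if bit else 1] = mid    # keep upper or lower half
        value = value * 2 + bit
        if i % 5 == 4:
            out.append(BASE32[value])
            value = 0
    return "".join(out)


def common_prefix_len(a: str, b: str) -> int:
    """Count the leading characters two codes share."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


# Two points roughly 20 metres apart, straddling the Equator.
north = encode(0.0001, 10.0)
south = encode(-0.0001, 10.0)
print(north, south, common_prefix_len(north, south))

# Two nearby points that do not straddle a major cell boundary.
a = encode(57.64911, 10.40744)
b = encode(57.64912, 10.40745)
print(a, b, common_prefix_len(a, b))
```

The first pair differs in its very first character because the latitude bit at position 1 already separates the two hemispheres, while the second pair shares a long prefix. A proximity search based on prefixes therefore typically also has to examine neighbouring cells.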

License and patents

The Geohash geocode was put in the public domain by its inventor on the date of its public announcement, February 26, 2008.[2]

While comparable algorithms have been successfully patented[3] and have had copyright claimed on them[4][5], Geohash is based on an entirely different algorithm and approach.

See also

External links

References