Physical Address
304 North Cardinal St.
Dorchester Center, MA 02124

When you query a traditional (relational) database column, a B-tree index usually handles the search efficiently.
Spatial data, however, is not just a single dimension; it’s at least two (latitude, longitude) and often more (3D coordinates, time, attributes). This makes spatial indexing a multidimensional problem. Without spatial indexing, every query would require scanning the entire dataset something that quickly becomes unmanageable with millions of points.
It’s not something most of us will ever encounter, but with all things involving data it’s good to have an understanding of what possible solutions exist and how you could potentially leverage them.
Spatial indexes must handle:
Unlike a normal index on a text or number field, spatial indexes must consider geometry and spatial relationships between multiple coordinates.
Invented by: Antonin Guttman (1984)
Used in: PostGIS, Oracle Spatial, SQLite/SpatiaLite, MySQL GIS
Concept:
SQL Example: PostGIS R-Tree with GiST
-- Create an R-Tree index on the geometry column
CREATE INDEX idx_parks_geom ON parks USING GIST (geom);Python Example — Query using R-Tree in Shapely + Rtree
from rtree import index from shapely.geometry
import Point
# Create an R-tree index
idx = index.Index()
points = [Point(0,0), Point(1,1), Point(5,5)]
for i, p in enumerate(points):
idx.insert(i, (p.x, p.y, p.x, p.y))
# Query for points within a bounding box r
esults = list(idx.intersection((0,0,2,2)))
print(results) # [0, 1]Used in: GIS tile rendering, game engines, image processing, Google Maps tiling
Concept:
Python Example — QuadTree Point Search:
class QuadTree:
def __init__(self, bounds, capacity):
self.bounds = bounds # (xmin, ymin, xmax, ymax)
self.capacity = capacity
self.points = []
self.divided = False
def insert(self, point):
if not self._in_bounds(point):
return False
if len(self.points) < self.capacity: self.points.append(point)
return True
else:
if not self.divided:
self._subdivide()
return
(
self.nw.insert(point) or
self.ne.insert(point) or
self.sw.insert(point) or
self.se.insert(point)
)
def _in_bounds(self, point):
x, y = point
xmin, ymin, xmax, ymax = self.bounds
return xmin <= x <= xmax and ymin <= y <= ymax
def _subdivide(self):
xmin, ymin, xmax, ymax = self.bounds
xmid = (xmin + xmax) / 2
ymid = (ymin + ymax) / 2
self.nw = QuadTree((xmin, ymid, xmid, ymax), self.capacity)
self.ne = QuadTree((xmid, ymid, xmax, ymax), self.capacity)
self.sw = QuadTree((xmin, ymin, xmid, ymid), self.capacity)
self.se = QuadTree((xmid, ymin, xmax, ymax), self.capacity)
self.divided = TrueUsed in: Elasticsearch, MongoDB, DynamoDB, Redis
Concept:
Python Example — Geohash Encoding
import geohash
lat, lon = 38.8977, -77.0365
# White House
code = geohash.encode(lat, lon, precision=7)
print(code) # e.g., "dqcjqcp"Concept:
Why useful:
Spatial indexing is the reason modern GIS queries return results in milliseconds instead of hours.
Choosing the right index depends on data shape, query patterns, and storage architecture.
If your dataset is growing into billions of points or polygons, you’ll almost always want: