Location Intelligence, Spatial Data Analysis | Pitney Bowes

Seeing the patterns in IoT data using GeoHash

By Rose Winterton

With growing volumes of data coming from moving devices connected to the Internet of Things, location intelligence has never been more important. Plotting a sea of transactional data points on a map just creates confusion; you can’t see the forest for the trees. One way to make sense of this volume of data is to aggregate it into a geospatial grid.

One commonly used grid type is GeoHash, a hierarchical grid structure available in the public domain. It lets us take coordinate pairs coming from a mobile device or other sensor and assign each pair to the grid cell that contains it, a process sometimes termed spatial binning. Each cell has a unique grid ID. Once the transactional data is associated with a grid ID, it can be enriched with other datasets or easily aggregated to produce statistics.
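To make spatial binning concrete, here is a minimal Python sketch of the public-domain GeoHash encoding. The function name `geohash_encode` and the default `precision` are illustrative choices, not part of any particular product. It repeatedly halves the longitude and latitude ranges, records one bit per split, and packs every five bits into a base-32 character.

```python
# Base-32 alphabet used by GeoHash (digits plus letters, omitting a, i, l, o).
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"


def geohash_encode(lat, lon, precision=7):
    """Return the GeoHash cell ID (a string of `precision` characters)
    for a latitude/longitude pair given in decimal degrees."""
    lat_lo, lat_hi = -90.0, 90.0
    lon_lo, lon_hi = -180.0, 180.0
    chars = []
    bits = 0          # bits collected for the current character
    bit_count = 0
    use_lon = True    # bits alternate, starting with longitude

    while len(chars) < precision:
        if use_lon:
            mid = (lon_lo + lon_hi) / 2
            if lon >= mid:
                bits = (bits << 1) | 1
                lon_lo = mid
            else:
                bits <<= 1
                lon_hi = mid
        else:
            mid = (lat_lo + lat_hi) / 2
            if lat >= mid:
                bits = (bits << 1) | 1
                lat_lo = mid
            else:
                bits <<= 1
                lat_hi = mid
        use_lon = not use_lon
        bit_count += 1
        if bit_count == 5:            # five bits make one base-32 character
            chars.append(BASE32[bits])
            bits = 0
            bit_count = 0
    return "".join(chars)


cell_id = geohash_encode(42.6, -5.6, precision=5)
print(cell_id)
```

Because the grid is hierarchical, a shorter ID is always a prefix of the longer ID for the same point, so coarser cells can be obtained by truncating the string.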

Take an example from mobile telecommunications. Billions of call records across the network can be spatially binned into the GeoHash grid using Hadoop, with each record tagged with a grid ID. We can then take all the records for a single grid ID and compute the average of a measured value for that grid cell, which gives us a representation of the actual signal quality on the network. Once the calculations are done, the output can be written to vector tile (MVT) format and viewed against a contextual map backdrop. This process is highly effective in Hadoop because the binning and aggregation can be massively parallelized.
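The per-cell averaging step can be sketched in a few lines of plain Python. This is only an illustration on a single machine, not the Hadoop job itself; the record layout (a grid ID paired with a signal reading in dBm) and the name `mean_by_cell` are assumptions made for the example.

```python
from collections import defaultdict


def mean_by_cell(records):
    """Average a measured value per grid cell.

    `records` is an iterable of (cell_id, value) pairs, where cell_id is
    the grid ID already assigned to the record (hypothetical layout).
    """
    # Keep a running [sum, count] per cell rather than every value.
    totals = defaultdict(lambda: [0.0, 0])
    for cell_id, value in records:
        acc = totals[cell_id]
        acc[0] += value
        acc[1] += 1
    return {cell: total / count for cell, (total, count) in totals.items()}


# Made-up sample readings, for illustration only.
records = [("ezs42", -95.0), ("ezs42", -105.0), ("u4pru", -80.0)]
cell_means = mean_by_cell(records)
print(cell_means)
```

Sums and counts from separate partitions of the data can simply be added together before the final division, which is one reason this kind of aggregation parallelizes well.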

The patterns of high and low signal quality can be correlated against other spatial datasets such as transport networks, points of interest or competitive data. The information gained from these visual analytics can be used in planning marketing campaigns, network optimization or customer service. In addition, once the steady-state coverage is determined, new data can be assessed against it in real time. If a transaction indicates a dropped call in an area where signal coverage is usually good, the carrier can proactively flag the record, check whether there is a fault on the network, and identify a suggested next best action for the customer.
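The real-time check described above might look something like the following sketch. Everything here is an assumption for illustration: the event fields (`cell_id`, `dropped`), the baseline being a mapping from grid ID to average signal in dBm, and the -90 dBm cut-off for "usually good" coverage.

```python
def should_flag(event, baseline, good_threshold_dbm=-90.0):
    """Return True when a dropped call occurs in a cell whose
    steady-state signal is normally good.

    event    -- dict with hypothetical keys "cell_id" and "dropped"
    baseline -- dict mapping grid ID to its usual average signal (dBm)
    good_threshold_dbm -- assumed cut-off for "good" coverage
    """
    if not event["dropped"]:
        return False
    usual = baseline.get(event["cell_id"])
    if usual is None:
        # No steady-state data for this cell, so nothing to compare against.
        return False
    return usual >= good_threshold_dbm


# Made-up baseline values, for illustration only.
baseline = {"ezs42": -100.0, "u4pru": -80.0}
print(should_flag({"cell_id": "u4pru", "dropped": True}, baseline))
```

A flagged record would then be passed on for network fault investigation or customer follow-up; that downstream handling is outside the scope of this sketch.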

Check out our latest Spectrum for Big Data release, including GeoHash gridding.