SQL Server Spatial Data Types

Posted: June 30, 2017 in Geo Coordinates, Open Geospatial Consortium (OGC), Spatial Data Type, SQL Server, SQL Server 2014
Tags:

SQL Server Spatial

Author:  Cole Francis, Architect

THE PROBLEM

So, wrap your head around this for a minute.  Let’s suppose that you’re a small company that sells casualty insurance to property owners in South Florida.

Your business is thriving, but you feel like you’ve completely saturated the market in that region, so now you want to expand your offerings to a small territory in central Florida.

After conducting rigorous research on the demographic data, you realize that hurricane insurance along the central-east coast is white hot right now.  You conclude that this is the result of booming home turnovers within that area.

The demographic area that your team has decided to pursue starts at West Palm Beach and extends all the way down to Miami.  What’s more, most of the activity in the area appears to be occurring between the coast and 10 miles inland.

So, you assemble your Sales and Marketing Team, and you provide them with the results of your fact finding.  Then you ask them to formulate a strategy that will allow the company to maximize their insurance sales efforts within that area.

Days later, the team reassembles and tells you what they found out about the area.  It turns out they recommend targeting a small territory of homes, which have recently sold along the A1A in West Palm Beach along Ocean Blvd, suffering from sudden turnover.

The team also explains that the sudden turnover is predominantly due to a large concentration of aging homeowners in the area that are selling their large homes and are opting for smaller living arrangements.

Much to everyone’s glee, the area also comes with a strong per capita household income, has a low crime rate, and has an average occurrence of natural disasters.

The entire team is excited about pursuing the new region per the demographic data, so their next step is to carefully map out the latitude/longitude coordinates of the area using their favorite mapping website.  Here are the coordinates they’ve used to construct their target market area.  Do you notice how they form a nice little polygon?

A1A

Next, you march down to the Palm Beach County Clerk’s Office and request an Excel Spreadsheet containing the addresses and latitude/longitude coordinates for all new and existing homes sales in the area for the past ninety days.

Of course, the spreadsheet the county offers lists far more addresses and geocoordinates than the small demographic region that your business is targeting.

Therefore, it’s up to your company to pare down the county’s results to only include addresses that are inside your target demographic polygon.  Realizing the manual complexity of this effort, you hand the data and other project artifacts over to your technical team to figure out.

Regardless, in most cases you can easily tell if a home’s latitude/longitude coordinates are well within the acceptable range by just visually inspecting them.

To this point, what I’m showing you in the picture below are the coordinates of those homes that are on the fringe of acceptability, meaning we can’t easily tell if the residences are inside or outside our target demographic polygon using a simple visual inspection.

Given this, you’ll need a quick way to process this data and pare the results down to only those households that the Sales and Marketing Team wants to pursue in the defined area.  So, how can you do this?

Well, you can either manually check them in Google Maps, but this would mean that you either have a really small set of data or a whole lot of time on your hands.  😁

Or, you can try a more automated approach using a platform like SQL Server.  Why SQL Server you ask?  The answer really lies in that there are two spatial data types in SQL Server that can help us quickly solve this problem.  The data types are called Geography and Geometry.

It’s important that you understand the difference between these two data type objects, including what they are and how they are used, because they’re similar but definitely not the same.

THE GEOGRAPHY DATATYPE

Although the geography data type sounds like the right fit for what we’re about to accomplish, it actually compounds the problem.  This is because the Geography data type is used for terrestrial spatial data covering the convex surface of the Earth.

Because of its ellipsoidal nature, any polygons that you use to define an area cannot exceed a single hemisphere and must specify the correct ring orientation.

It’s for these very reasons that simply drawing a polygon somewhere on its surface doesn’t give you enough information to make accurate determinations about geocoordinates that get fed into it.

For instance, if I were to draw an area around the Equator of the globe, and then I were to ask you if a specific latitude/longitude coordinate fell inside or outside the boundary I just drew, you wouldn’t have enough information to answer my question.

Why you might ask?  It’s because you would have to know if I was targeting the northern or southern ring of the Equator, the western or eastern hemisphere, and whether the polygon I constructed was meant to include or exclude the target latitude/longitude coordinates.

THE GEOMETRY DATATYPE

When I think of the SQL Server geometry data type and apply it to this problem, one of the first things that comes to my mind is a 1991 book, “Inventing the Flat Earth”, written by retired University of California Professor, Jeffrey Burton Russell.  In the book, Russell discusses how the “flat Earth” myth was disseminated by early 19th century writers like Antoinne-Jean Letronne, and others of course.

In the case of this data type, Microsoft SQL Server takes an opposite viewpoint of Russell.   Instead, they provide a geometry data type that allows us to construct the problem and solution using what you might think of as a “steroidal planar Earth object”, which conforms to the Open Geospatial Consortium (OGC).

I actually coined the term “Steroidal Flat Earth Object”, because the object’s range extends far beyond the -90 to 90 latitudes and -180 to 180 longitude maximums defining the Earth’s geographic range.

Because it’s just a very large, single-dimension plane, it’s not necessary to define if the target coordinates lie to the West or East of the Prime Meridian when using this data type.

However, there are still some basic rules that need to be followed in order to construct a well-formed polygon.  One is that the sequence order in which the vertices get added is important, just as I mentioned earlier in this article.  If you get the sequence wrong, then your polygon could have issues like the example shown below.

PolygonVertices

The other important rule is that you must make sure that your final coordinate is equal to your first coordinate in order to officially close the loop on your polygon.  If you don’t close the loop on the polygon, then SQL Server throws an error when you try to execute the code.

But, once you get things right, the result is a built-in SQL Server math function that offers an accurate determination on whether a point (i.e. a single coordinate) or a line (i.e. one or more coordinates) intersects or lies inside the perimeter of a well-defined polygon.

Pretty nifty, huh?  So, here’s the solution…

THE SOLUTION (COPY & PASTE THIS INTO SQL SERVER, AND THEN RUN IT)

--create a variable table
declare @coordinates table
(
id int identity (1,1),
coordinate geometry
)
declare @recordCount as int

-- @area represents the demographic area
-- *note: The sequence in how you add each point to construct
-- the polygon is very important.
declare @area as geometry

--turn off the verbose SQL Server logging
set nocount on

-- enter the fringe lat/long County coordinates into the table (this process would normally be automated).
insert into @coordinates (coordinate)
select geometry::Point(26.688379,-80.034762,4326)
insert into @coordinates (coordinate)
select geometry::Point(26.684662,-80.037804,4326)
insert into @coordinates (coordinate)
select geometry::Point(26.679250,-80.035808,4326)
insert into @coordinates (coordinate)
select geometry::Point(26.674490,-80.037095,4326)
insert into @coordinates (coordinate)
select geometry::Point(26.675722,-80.039778,4326)
insert into @coordinates (coordinate)
select geometry::Point(26.700533,-80.034413,4326)
insert into @coordinates (coordinate)
select geometry::Point(26.675402,-80.036230,4326)
insert into @coordinates (coordinate)
select geometry::Point(26.694902,-80.037331,4326)
insert into @coordinates (coordinate)
select geometry::Point(26.692697,-80.039852,4326)
insert into @coordinates (coordinate)
select geometry::Point(26.693809,-80.041225,4326)
insert into @coordinates (coordinate)
select geometry::Point(26.677084,-80.037010,4326)

-- set up the demographic area by constructing a polygon of coordinates.-- *note: the final coordinate MUST match the first coordinate in order-- to close the polygon.  If it’s left open, then it won’t work.
-- *note: the sequence of the coordinates used to construct the polygon is also important.
set @area = geometry::STGeomFromText('POLYGON(( 26.697583 -80.033367, 26.691103 -80.033635, 26.681781 -80.035083, 26.679250 -80.035094, 26.679116 -80.035287, 26.674653 -80.035791, 26.672870 -80.035952, 26.668939 -80.035920, 26.668498 -80.035748, 26.665876 -80.036177, 26.664601 -80.036467, 26.661456 -80.036467, 26.661101 -80.038184, 26.661417 -80.038431, 26.663517 -80.038420, 26.665228 -80.038855, 26.665391 -80.038791, 26.666254 -80.038855, 26.666340 -80.038764, 26.666891 -80.038764, 26.667706 -80.038657, 26.669571 -80.038802, 26.670300 -80.038684, 26.670923 -80.039001, 26.671575 -80.039108, 26.671652 -80.039044, 26.673430 -80.039151, 26.673507 -80.039516, 26.675731 -80.039773, 26.683496 -80.038325, 26.686564 -80.037338, 26.687676 -80.037005, 26.689516 -80.037048, 26.691011 -80.037863, 26.692219 -80.039322, 26.692161 -80.040084, 26.693081 -80.040041, 26.693110 -80.038957, 26.694452 -80.038914, 26.694682 -80.038807, 26.694720 -80.040910, 26.694375 -80.041125, 26.694394 -80.042058, 26.697583 -80.033367))', 4326)

select @recordCount = count(id) FROM @coordinates
while (@recordCount > 0)
begin 
     declare @forwardPoint geometry 
     declare @identity int 
     select Top(1) @identity=id, @forwardPoint=coordinate from @coordinates 
     set @forwardPoint = @forwardPoint.MakeValid();

     if (@forwardPoint.STIntersection(@area).ToString() <> 'GEOMETRYCOLLECTION EMPTY') and (@forwardPoint.STIntersection(@area).ToString() <> 'GEOMETRYCOLLECTION EMPTY') 
     begin 
          print 'Id ' + cast(@recordCount as varchar(10)) + ' is inside the target demographic area for lat/lng ' + cast(@forwardPoint as varchar(100)) 
     end 
     else 
     begin 
          print 'Id ' + cast(@recordCount as varchar(10)) + ' is outside the target demographic area for lat/lng ' + cast(@forwardPoint as varchar(100)) 
     end
     
     delete from @coordinates where id = @identity select @recordCount = count(id) FROM @coordinates
 end

-- turn verbose logging back on
set nocount on


THE END RESULT IS THIS

The Sales Team will now pursue every household (represented as a lat/lng coordinate below) that falls inside the target demographic area.  Coordinates that fall outside the target area will be ignored by the Sales Team for now:

Id 11 is inside the target demographic area for lat/lng POINT (26.688379 -80.034762)
Id 10 is inside the target demographic area for lat/lng POINT (26.684662 -80.037804)
Id 9 is inside the target demographic area for lat/lng POINT (26.67925 -80.035808)
Id 8 is inside the target demographic area for lat/lng POINT (26.67449 -80.037095)
Id 7 is outside the target demographic area for lat/lng POINT (26.675722 -80.039778)
Id 6 is outside the target demographic area for lat/lng POINT (26.700533 -80.034413)
Id 5 is inside the target demographic area for lat/lng POINT (26.675402 -80.03623)
Id 4 is inside the target demographic area for lat/lng POINT (26.694902 -80.037331)
Id 3 is inside the target demographic area for lat/lng POINT (26.692697 -80.039852)
Id 2 is outside the target demographic area for lat/lng POINT (26.693809 -80.041225)
Id 1 is inside the target demographic area for lat/lng POINT (26.677084 -80.03701)

For more information on Microsoft SQL Server Spatial Data Types, click here.

Hi, I’m Cole Francis, a Solution Architect for The PSC Group in Schaumburg, IL.  I’ve been successfully designing, developing, and delivering custom software solutions for an impressive and extensive list of well-branded clients for over twenty years.

Thanks for reading and keep on coding! 🙂

Advertisements

Your Feedback is Welcome

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s