For US addresses, the geocode that we return is based on public data from the US Census Bureau, known as TIGER data. For international addresses, we use Loqate's data, which includes rooftop accuracy for many countries. The US geocodes are interpolated from the TIGER data sets, which gives you street-level accuracy (rather than rooftop accuracy) at a lower cost than private data. Wikipedia has some excellent examples of how interpolated results work.
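Street-level geocodes built from address ranges are commonly estimated by linear interpolation along a street segment. The following is a minimal sketch of that general idea, not a description of our production pipeline. The function name, its parameters, and the sample coordinates are all hypothetical, and the street is assumed to be a straight line between two known endpoints.

```python
def interpolate_address(house_number, range_start, range_end, start_point, end_point):
    """Estimate (lat, lon) for a house number on a straight street segment.

    Hypothetical helper: range_start/range_end are the house numbers at the
    two ends of the segment, start_point/end_point their (lat, lon) pairs.
    """
    if range_start == range_end:
        fraction = 0.5  # single-number range: place the pin mid-segment
    else:
        fraction = (house_number - range_start) / (range_end - range_start)
    # Clamp so numbers outside the range stay on the segment.
    fraction = min(1.0, max(0.0, fraction))
    lat = start_point[0] + fraction * (end_point[0] - start_point[0])
    lon = start_point[1] + fraction * (end_point[1] - start_point[1])
    return lat, lon


# Made-up segment: house numbers 100-200 running due north.
estimate = interpolate_address(150, 100, 200, (40.000, -111.000), (40.002, -111.000))
print(estimate)
```

Because the estimate assumes evenly spaced lots along the segment, the pin lands on the street near the address rather than on the building itself, which is the difference between street-level and rooftop accuracy.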
We have 38,683,796 unique lat/long coordinates for the US. Within that set, here is a breakdown of the various levels of geocode precision:

| Precision | Description | Coordinates | Share |
|---|---|---|---|
| Zip9 | 9-digit ZIP precision. Usually block-level. | 33,361,331 | 86.2% |
| Zip8 | 8-digit ZIP precision. | 4,412,599 | 11.4% |
| Zip7 | 7-digit ZIP precision. | 726,485 | 1.9% |
| Zip6 | 6-digit ZIP precision. | 141,395 | 0.4% |
| Zip5 | 5-digit ZIP precision. Usually city- or facility-level. | 41,986 | 0.1% |
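The share of each precision level can be recomputed directly from the counts in the breakdown. This is a small sketch for checking the arithmetic; the variable names are ours, and the counts are copied from the figures above.

```python
# Coordinate counts per precision level, copied from the breakdown above.
counts = {
    "Zip9": 33_361_331,
    "Zip8": 4_412_599,
    "Zip7": 726_485,
    "Zip6": 141_395,
    "Zip5": 41_986,
}

total = sum(counts.values())
shares = {level: 100 * n / total for level, n in counts.items()}

print(f"Total coordinates: {total:,}")
for level, pct in shares.items():
    print(f"{level}: {counts[level]:>12,}  {pct:5.1f}%")
```

Summing the counts reproduces the stated total, so the five levels account for every coordinate in the set.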
Geocoding is a challenging task. The complexity is greatly compounded if absolute precision is required. There are companies that provide privately collected, more precise data, but they are significantly more expensive. We recommend determining what level of precision is needed for your project and then comparing that to your budget.
SmartyStreets has worked with other organizations to improve the accuracy of coordinate data. While our base data set comes from public sources, we've aggregated those results with better, more accurate resources.
We don't recommend using our data to find somebody's front door. Even though Zip9 precision is often good enough to get you there intuitively, the pin may not drop precisely on the front door. Why?
Private geocode data comes from various sources, such as Google, Bing, Navteq, Garmin, and other such services. They each make their own maps and gather their own data. They compete on accuracy, resolution, and update frequency, and they tend to have very precise data. Generally this data is gathered by a fleet of vehicles that travel public streets, collecting map and GPS data as they go. Delivery services gather their own data over time as they arrive at individual doorsteps. Still others overlay high-resolution images onto a known map, then manually pick out the structures and assign the corresponding geocodes.
SmartyStreets is not in this business, but we're still happy to gather high-quality data and provide it to you.
Even the highest-quality, curated data set may have errors. If a property has two structures on it, a modest house and a large barn, the barn may get tagged as the principal structure because of its size.
Further, a precise latitude and longitude doesn't always correspond to a physical address. For example, the mailbox for a house can be several hundred feet away from the house itself.
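If you want to quantify that kind of gap for your own data, one common approach is to measure the great-circle distance between a returned geocode and a known reference point. The sketch below uses the haversine formula with a spherical Earth approximation. The function name, the sample coordinates, and the 100-meter tolerance are illustrative assumptions, not values from our service.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in meters (spherical approximation)


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


# Made-up example: a geocoded pin versus a surveyed front-door location.
pin = (40.2338, -111.6585)
front_door = (40.2341, -111.6590)
tolerance_m = 100  # hypothetical project requirement

offset = haversine_m(*pin, *front_door)
print(f"Offset: {offset:.0f} m, within tolerance: {offset <= tolerance_m}")
```

Running a check like this over a sample of addresses you already know well gives a concrete sense of whether block-level precision is enough for your project.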
How expensive is this curated data we're talking about? It depends. Prices can range from several thousand to hundreds of thousands of dollars per year, depending on your usage volume. Most data comes with strings attached, like a restrictive Acceptable Use Policy or an attribution requirement. Either of those might hurt your business model. If you absolutely need that precision, then go for it. If block-level accuracy is enough most of the time, then our data should suffice.