Extract addresses from text

Find and verify addresses in paragraphs of text, posts, tweets, emails, etc.
This is an experimental beta feature.  See the readme and FAQ.



This will use API lookups on your current subscription.





What's this?

We've been working on some pretty cool things lately, including finding addresses in arbitrary data. Submit text (paragraphs, emails, web pages, and more) to identify valid addresses and other segments that look like addresses.

Recommended uses

  • Small-to-medium-sized snippets of plain text
  • Email messages
  • Transcripts
  • Articles
  • Documents
  • Web pages
  • Clusters of URLs with addresses
  • Tweets
  • Forum posts

Not recommended

  • Messy spreadsheet data
    (try our list service instead)
  • Whole CSV files
    (choose just the necessary columns of data)
  • Severely malformed content
    (if a human can't figure it out, neither can a computer)
  • Non-English text/addresses
    (we're not a translation service)
  • Extra data with lots of numbers
    (noisier data means more false positives)


This tool requires a modern browser such as Chrome, Firefox, Safari 6+, or Internet Explorer 9+.

Using the API

You can use this service with our API from your own code. To get started, try the quick-start tutorial and read the documentation in our knowledge base (KB).


Subscription usage

Your current LiveAddress API subscription is used to validate what appear to be addresses in your text. Lookups are deducted from your account based on how many segments of your input text look like addresses. Sometimes, finding one valid address may use more than one lookup. Longer inputs, and inputs with more snippets that look like they might be addresses, will use more lookups on your account. Keep an eye on your usage, and remember you can upgrade your plan at any time, all the way up to unlimited.


Inspect the results

Valid addresses are returned with the substring (segment of text) in which the address was found. That snippet may be exactly the address and nothing more, or it may contain the validated address along with some surrounding text.

Invalid addresses are also returned with their associated segments. In some cases, that snippet may not contain any address at all. Other times, it may contain a valid address inside it that LiveAddress wasn't able to figure out. And yet other times, it may contain an address which could not be verified because it is invalid.

Input strings with non-ASCII (Unicode, or "fancy") characters are allowed, but the character positioning may not be precise, since strings are converted to a more basic ASCII encoding during the extraction process. So if your input string contains Unicode characters, addresses can still be extracted, but the reported positions of their associated segments may not match up perfectly with your original text.
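To see why positions can drift, here is a minimal Python sketch. It assumes one common way of folding text to ASCII (Unicode NFKD normalization, then dropping anything that is still non-ASCII). How the service actually converts your input is not described here, so the helper `to_ascii` and the sample text are purely illustrative.

```python
import unicodedata


def to_ascii(text: str) -> str:
    """Fold text to ASCII: decompose characters, then drop what is still non-ASCII.

    Assumption: this is only one plausible conversion, not the service's own.
    """
    decomposed = unicodedata.normalize("NFKD", text)
    return decomposed.encode("ascii", "ignore").decode("ascii")


original = "Caf\u00e9 \u2022 123 Main St"  # "Café • 123 Main St"
folded = to_ascii(original)

# The bullet has no ASCII equivalent and is dropped,
# so everything after it shifts one position to the left.
original_pos = original.index("123")
folded_pos = folded.index("123")
print(original_pos, folded_pos)
```

In this sample the address starts one character earlier in the folded string than in the original. If you need to highlight addresses in your own text, searching for the returned segment may be more reliable than trusting character offsets.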


Policies and guarantees

This service is still experimental. It is not covered by our Service Level Agreement and we can provide no guarantees at this time, except that the valid address candidates which are returned are indeed valid addresses.

According to our Terms of Service, we may monitor requests and inputs, totally anonymously, solely to improve this service.

We can make no guarantees as to the number of lookups that will be deducted from your subscription, as that depends entirely upon your input. We suggest that you try a few small samples from your actual use cases and monitor your subscription usage to get a general idea.


Feedback

Please contact us with your suggestions, or to tell us why you love it or find it useful.

FAQ



Maximum input length?

Right now, it's 64 KB. We hope to be able to increase this in the future.
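If your text is longer than the limit, one option is to split it before submitting. The sketch below is only an illustration: it assumes the limit means 64 × 1024 bytes of UTF-8 (the page does not say whether it counts bytes or characters), and it splits on line boundaries so that a single-line address is never cut in half. The names `MAX_BYTES` and `chunk_text` are made up for this example.

```python
MAX_BYTES = 64 * 1024  # assumption: "64 KB" means 65,536 bytes of UTF-8


def chunk_text(text: str, limit: int = MAX_BYTES) -> list[str]:
    """Split text into chunks of at most `limit` UTF-8 bytes, breaking only between lines."""
    chunks: list[str] = []
    current: list[str] = []
    size = 0
    for line in text.splitlines(keepends=True):
        line_size = len(line.encode("utf-8"))
        if line_size > limit:
            raise ValueError("a single line exceeds the size limit")
        # Start a new chunk when adding this line would overflow the current one.
        if current and size + line_size > limit:
            chunks.append("".join(current))
            current, size = [], 0
        current.append(line)
        size += line_size
    if current:
        chunks.append("".join(current))
    return chunks


parts = chunk_text("a\nb\nc\n", limit=4)
print(parts)
```

Addresses that span several lines can still be split across two chunks with this approach, so you may prefer to break on blank lines or paragraphs if your input has them.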

Why did it think [some text] was an address?

Because it has some characteristics that identify it as a possible address. US addresses come in all forms, shapes, and sizes, and this extractor -- particularly in aggressive mode -- will tend to be more liberal in choosing addresses from text. It's a fact: false positives happen. Sometimes it'd even be hard for a human to tell if a segment could be an address or not. LiveAddress does its best, and certainly is much faster than manually going through the input yourself. Accuracy ultimately depends on the complexity of your input. Note that the contents of spreadsheet files (CSV, Excel, etc.) are often full of extra data which can confuse LiveAddress extraction. For that type of data, use our list service.

Why did one address validate twice, or as two addresses?

Sometimes segments which are recognized as addresses contain more than one address, or contain extra text not actually part of the address, especially if addresses are close together or there is "noise" (extra address-like data) in or around the address. Sometimes a "dual address" is found, which may be totally valid according to the USPS. In order to compensate for this as much as possible, and to provide as many results to you as possible, LiveAddress will validate an address two different ways, just in case.

Why didn't it find an address?

There could be many reasons. Did you try in aggressive mode? It will use more lookups on your account, but it's more likely to find it. If aggressive mode still doesn't find it, the address may be too long, may be missing a state or ZIP code, or may be surrounded with or contain too much "noise." In some cases, turning aggressive mode off will be more accurate, especially if your input is pretty clean and the addresses are more obvious.

How many lookups will this use per address?

That entirely depends upon your input and settings. While we can't be sure because each case is so different, a simple paragraph with a basic address in it might use 1-4 lookups, depending on whether aggressive mode is enabled. Complex input that's harder to decipher may use more lookups. While it's impossible to know beforehand, you can get a good idea for your specific use case by trying small snippets of what your input will look like and watching the usage on your account.
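As a rough way to turn those small trials into a budget, the sketch below scales the lookups used by your samples up to the size of your full input. It assumes usage grows roughly in proportion to input size, which will not hold if your samples are noisier or cleaner than the rest of your data. The function name and the sample figures are invented for illustration; the lookup counts would come from watching the usage on your account.

```python
def estimate_lookups(samples: list[tuple[int, int]], total_bytes: int) -> int:
    """Extrapolate lookups for `total_bytes` of input from (sample_bytes, lookups_used) pairs.

    Assumption: lookups scale linearly with input size. Treat the result as a ballpark only.
    """
    sample_bytes = sum(size for size, _ in samples)
    sample_lookups = sum(lookups for _, lookups in samples)
    if sample_bytes <= 0:
        raise ValueError("samples must contain some input")
    # Integer ceiling division, so the estimate is rounded up.
    return (sample_lookups * total_bytes + sample_bytes - 1) // sample_bytes


# Two trial snippets: 1,200 bytes used 3 lookups, 800 bytes used 3 lookups.
trial_runs = [(1200, 3), (800, 3)]
estimate = estimate_lookups(trial_runs, total_bytes=50_000)
print(estimate)
```

With these made-up figures the estimate is 150 lookups for 50,000 bytes of similar text.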