Area Code Data Retrieval

Last week I was tasked with populating a database with area codes/prefix combinations and the geographic location they map to. This was an interesting task that required me to retrieve data from a foreign API. For those of you that are not sure what I am talking about when I say area code/prefix let me take a second to explain. A telephone number consists of 10 digits (not everywhere but for the most part) the first three digits are the area code and the next three digits are the prefix.  This of course does not take into account international country codes like 001 that get prepended to a telephone number. When you have a particular area code/ prefix combination you can use it to figure out which State/Province and City the owner of that combination resides in.  If you would like to research this further visit the North American Numbering Plan (NANP) website.

In order to populate my database I used two separate API’s: local calling guide’s XML query interface, and Telephone Number Identification (tnID) search functionality.

To start, I downloaded a list of all of the area codes used and the countries they belonged to. I got this list from the NANP website but thinking back on it now I could have just went from numbers 200 – 999 and got that information from  the API’s. (This of course would have taken more time since not all of those area codes are in use.) I then wanted to use that list to get all of the available prefixes for each of the area codes and then finally the city they mapped onto.

The approach I took in getting the data was a bit flawed. I used local calling’s query interface to first get a list of rate centers per area code, then I used the rate center’s exch code to get all of the prefixes available for that area code. Finally I used their xmllocalprefix function to get the city information. You can imagine that this is a lot of data. You need to go through each area code, retrieve a list of rate centers and then retrieve the city information. I believe it took up to a minute to get all of the data for a single area code. This is definitely a long time however i was thinking I would create a script to do this automatically – press go once, wait some time and done. Boy was I wrong. Also I needed all of these steps because the local calling’s API did not provide a more direct way of getting all of the prefixes per area code.

First Attempt

My initial script consisted of an HTML form with a textarea and a submit button. The idea was that I would copy and paste the area code/ country information (“416 – Canada \n 905- Canada…”) from the NANP list I mentioned above and then press submit and let my PHP script do the work. Essentially, a POST request was sent with all of the area codes and then the PHP script would go through each area code and get the city information in the manner I described above. I learned quickly that if a POST request takes two long to process it times out! Leaving only 3 or so of the area codes processed.

Second Attempt

In order to get around the POST timeout I decided to do a PHP Header redirect after each area code has processed. Since the redirect lost the area code – country data that was in the textarea, I had to use a SESSION variable to store that information. I now had two separate files. The first file initialized the SESSION variable if it wasn’t already initialized, then it called the second file. The second file processed the next area code in the SESSION, removed it from the session, and then called the first file. This seemed like a good way to do things however it resulted in a “too many redirects” error 😦 On the upside sometimes I was able to process up to 15 area codes at a time which was a big improvement from the first script.

Third and Final Attempt

After my first attempt failed I took a step back and thought about what to do next. I figured out that I should use some JavaScript magic to make it appear like something else is happening. After all when you browse a website the server never complains about too many clicks. I edited the first file. I added a document onLoad event. Now when the document loaded it would display some information on the screen before it loaded the second page. The first piece of information was what area code was just processed and the second piece was what area code will be processed next. This was brilliant since it actually let me know what was going on behind the scenes. Before this i was using SQL select statement on the database to see what data actually got stored. This flow worked perfectly. No actual errors. However I still was not getting what I wanted. Apparently after a couple of redirects PHP’s SESSION variable gets whipped (most likely some PHP config variables needed editing). Which meant that after 15 or so go around  my area code SESSION variable would get re-initialized and the script would attempt to store the data for the very first area code.  This really sucked and since I have already taken way to long to complete my task I decided to split my area code data into smaller chunks which meant running the script a dozen times or so.

tnID

Telephone Number Identification (tnID) search functionality would have probably been a better source for my data. I didn’t end up using it to store data into the database because I did not figure out how to use it until I was done my task. It wasn’t a total loss because I did end up comparing what I had in the database with what the search results returned.

Data Cleanup

After some investigations into the database I had I notices that some items did not make sense. For example I had a lot of ‘Washington Zone 1’. I needed to clean this up but I wasn’t about to spend a whole day on doing so. This time I has an advantage. I know both the area code and the prefix. After some googling I stumbled on peoplesearchaffiliates.com and their API. So I made a new script. I still had two files. The first file indicated which State needed to get updated as well as listed information on which prefixes were already checked. The second one, a PHP file got the results from the api (http://api.peoplesearchaffiliates.com/cgi-bin/rpd-api.cgi?phone=%5Bentirephonenumber%5D), updated the database and redirected to the first file with updated querysting parameters. The query string parameters indicated which prefix needed to be updated next. With this process, there were no time-outs or errors.

 

View all of my blogs