Saturday, April 18, 2009

Social Security Numbers

Assuming that you have a legitimate need to capture social security numbers (a human resources app, for example), you will want to validate and format them consistently. The actual code for both languages combined is a little too long to post here, so I'll just talk about the algorithm.

A social security number is a nine-digit number and can be matched with the following regular expression: ^\d{3}\-?\d{2}\-?\d{4}$. If it can't pass this test, it's not a valid social security number. However, passing this simple test doesn't guarantee validity, so we need to keep checking.

  • No digit group can consist of only zeros, and the first group cannot be 666. We check for these errors with the following regular expression: ((000|666)\-?\d{2}\-?\d{4}|\d{3}\-?00\-?\d{4}|\d{3}\-?\d{2}\-?0000).

  • Numbers from 987-65-4320 to 987-65-4329 are reserved for use in advertisements, and other previously legitimate numbers have been invalidated because of use in advertisements. We check for these errors with the following regular expression: 987\-?65\-?432\d|042\-?10\-?3580|062\-?36\-?0749|078\-?05\-?1120|095\-?07\-?3645|128\-?03\-?6045|135\-?01\-?6629|141\-?18\-?6941|165\-?(16|18|20|22|24)\-?7999|189\-?09\-?2294|212\-?09\-?(7694|9999)|219\-?09\-?9999|306\-?30\-?2348|308\-?12\-?5070|468\-?28\-?8779|549\-?24\-?1889

  • Last but not least, the first three digits are never higher than 772 (well, not yet; this could change in the future). For this I converted the first group from a string to a number and did a simple numeric comparison.
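As a rough illustration, the checks above could be combined like this in Python. This is only a sketch, not the code posted on Snipplr: the names format_ssn, SHAPE, ZERO_OR_666, ADVERTISED and MAX_AREA are made up for the example, and the 772 ceiling is simply the limit described above, which may change.

```python
import re

# Hypothetical names throughout; this is a sketch of the algorithm only.
# Hyphens are left unescaped because "-" has no special meaning outside
# a character class. re.ASCII keeps \d limited to the digits 0-9.

# Step 1: basic shape, nine digits with optional hyphens between groups.
SHAPE = re.compile(r"(\d{3})-?(\d{2})-?(\d{4})", re.ASCII)

# Step 2: any all-zero group, or a first group of 666.
ZERO_OR_666 = re.compile(
    r"(000|666)-?\d{2}-?\d{4}|\d{3}-?00-?\d{4}|\d{3}-?\d{2}-?0000",
    re.ASCII,
)

# Step 3: numbers reserved for, or invalidated by, use in advertisements.
ADVERTISED = re.compile(
    r"987-?65-?432\d|042-?10-?3580|062-?36-?0749|078-?05-?1120"
    r"|095-?07-?3645|128-?03-?6045|135-?01-?6629|141-?18-?6941"
    r"|165-?(16|18|20|22|24)-?7999|189-?09-?2294|212-?09-?(7694|9999)"
    r"|219-?09-?9999|306-?30-?2348|308-?12-?5070|468-?28-?8779"
    r"|549-?24-?1889",
    re.ASCII,
)

# Step 4: highest first group in use at the time of writing (assumption
# taken from the text above; adjust if the limit changes).
MAX_AREA = 772


def format_ssn(raw):
    """Return the number as 'AAA-GG-SSSS' if it passes every check, else None."""
    shape = SHAPE.fullmatch(raw)
    if shape is None:
        return None
    if ZERO_OR_666.fullmatch(raw) or ADVERTISED.fullmatch(raw):
        return None
    if int(shape.group(1)) > MAX_AREA:
        return None
    # Rejoin the three captured groups with hyphens for consistent output.
    return "-".join(shape.groups())
```

Calling format_ssn("123456789") returns "123-45-6789", while inputs such as "666-12-3456" or "078-05-1120" return None. Using fullmatch instead of ^ and $ means a trailing newline is rejected rather than silently accepted.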

The complete, commented source code is available on Snipplr:

Next week we'll head north of the border to my country and see what the Canadian equivalent of the Social Security Number is, and how we can validate it.


Deryck said...

It seems Drupal's policy on outside code allows GPL-licensed code only. Would you be willing to change the license over to GPL?

Scott said...

That's the biggest reason why I didn't license my code under GPL in the first place: I didn't want to restrict people from using it in non-GPL works.

I know that's not the answer you want to hear, but I do have a solution for you. The code is not very complicated, so you could probably write your own equivalent using my code only as reference.

What I recommend for attribution is just a link back to this blog post. I personally find it helpful when going through someone's code (or even my own several months later) to see where the heck something came from.