I'm designing a database table and once again asking myself the same stupid question: How long should the firstname field be?
Does anyone have a list of reasonable lengths for the most common fields, such as first name, last name, and email address?
W3C's recommendation:
If designing a form or database that will accept names from people
with a variety of backgrounds, you should ask yourself whether you
really need to have separate fields for given name and family name.
… Bear in mind that names in some cultures can be quite a lot longer
than your own. … Avoid limiting the field size for names in your
database. In particular, do not assume that a four-character
Japanese name in UTF-8 will fit in four bytes – you are likely to
actually need 12.
https://www.w3.org/International/questions/qa-personal-names
For database fields, VARCHAR(255)
is a safe default choice, unless you can actually come up with a good reason to use something else. For typical web applications, performance won't be a problem. Don't prematurely optimize.
I just queried my database with millions of customers in the USA.
The maximum first name length was 46. I go with 50. (Of course, only 500 of those were over 25, and they were all cases where data imports resulted in extra junk winding up in that field.)
Last name was similar to first name.
Email addresses maxed out at 62
characters. Most of the longer ones
were actually lists of email
addresses separated by semicolons.
Street address maxes out at 95
characters. The long ones were all
valid.
Max city length was 35.
This should be a decent statistical spread for people in the US. If you have localization to consider, the numbers could vary significantly.
UK Government Data Standards Catalogue details the UK standards for this kind of thing.
It suggests 35 characters for each of Given Name and Family Name, or 70 characters for a single field to hold the Full Name, and 255 characters for an email address. Amongst other things..
Some probably correct column lengths
Min Max
Hostname 1 255
Domain Name 4 253
Email Address 7 254
Email Address [1] 3 254
Telephone Number 10 15
Telephone Number [2] 3 26
HTTP(S) URL w domain name 11 2083
URL [3] 6 2083
Postal Code [4] 2 11
IP Address (incl ipv6) 7 45
Longitude numeric 9,6
Latitude numeric 8,6
Money[5] numeric 19,4
[1] Allow local domains or TLD-only domains
[2] Allow short numbers like 911 and extensions like 16045551212x12345
[3] Allow local domains, tv:// scheme
[4] http://en.wikipedia.org/wiki/List_of_postal_codes. Use max 12 if storing dash or space
[5] http://stackoverflow.com/questions/224462/storing-money-in-a-decimal-column-what-precision-and-scale
A long rant on personal names
A personal name is either a Polynym (a name with multiple sortable components), a Mononym (a name with only one component), or a Pictonym (a name represented by a picture - this exists due to people like Prince).
A person can have multiple names, playing roles, such as LEGAL, MARITAL, MAIDEN, PREFERRED, SOBRIQUET, PSEUDONYM, etc. You might have business rules, such as "a person can only have one legal name at a time, but multiple pseudonyms at a time".
Some examples:
names: [
{
type:"POLYNYM",
role:"LEGAL",
given:"George",
middle:"Herman",
moniker:"Babe",
surname:"Ruth",
generation:"JUNIOR"
},
{
type:"MONONYM",
role:"SOBRIQUET",
mononym:"The Bambino" /* mononyms can be more than one word, but only one component */
},
{
type:"MONONYM",
role:"SOBRIQUET",
mononym:"The Sultan of Swat"
}
]
or
names: [
{
type:"POLYNYM",
role:"PREFERRED",
given:"Malcolm",
surname:"X"
},
{
type:"POLYNYM",
role:"BIRTH",
given:"Malcolm",
surname:"Little"
},
{
type:"POLYNYM",
role:"LEGAL",
given:"Malik",
surname:"El-Shabazz"
}
]
or
names:[
{
type:"POLYNYM",
role:"LEGAL",
given:"Prince",
middle:"Rogers",
surname:"Nelson"
},
{
type:"MONONYM",
role:"SOBRIQUET",
mononym:"Prince"
},
{
type:"PICTONYM",
role:"LEGAL",
url:"http://upload.wikimedia.org/wikipedia/en/thumb/a/af/Prince_logo.svg/130px-Prince_logo.svg.png"
}
]
or
names:[
{
type:"POLYNYM",
role:"LEGAL",
given:"Juan Pablo",
surname:"Fernández de Calderón",
secondarySurname:"García-Iglesias" /* hispanic people often have two surnames. it can be impolite to use the wrong one. Portuguese and Spaniards differ as to which surname is important */
}
]
Given names, middle names, surnames can be multiple words such as "Billy Bob" Thornton
, or Ralph "Vaughn Williams"
.
I would say to err on the high side. Since you'll probably be using varchar, any extra space you allow won't actually use up any extra space unless somebody needs it. I would say for names (first or last), go at least 50 chars, and for email address, make it at least 128. There are some really long email addresses out there.
Another thing I like to do is go to Lipsum.com and ask it to generate some text. That way you can get a good idea of just what 100 bytes looks like.
I pretty much always use a power of 2 unless there is a good reason not to, such as a customer facing interface where some other number has special meaning to the customer.
If you stick to powers of 2 it keeps you within a limited set of common sizes, which itself is a good thing, and it makes it easier to guess the size of unknown objects you may encounter. I see a fair number of other people doing this, and there is something aesthetically pleasing about it. It generally gives me a good feeling when I see this, it means the designer was thinking like an engineer or mathematician. Though I'd probably be concerned if only prime numbers were used. :)
I wanted to find the same and the UK Government Data Standards mentioned in the accepted answer sounded ideal. However none of these seemed to exist any more - after an extended search I found it in an archive here: http://webarchive.nationalarchives.gov.uk/+/http://www.cabinetoffice.gov.uk/govtalk/schemasstandards/e-gif/datastandards.aspx. Need to download the zip, extract it and then open default.htm in the html folder.
firstname : 35
lastname : 35
email : 255
url : 60+ according to server and browser
city : 45
address : 90
Just looking though my email archives, there are a number of pretty long "first" names (of course what is meant by first is variable by culture). One example is Krishnamurthy - which is 13 letters long. A good guess might be 20 to 25 letters based on this. Email should be much longer since you might have firstname.lastname@somedomain.com. Also, gmail and some other mail programs allow you to use firstname.lastname+sometag@somedomain.com where "sometag" is anything you want to put there so that you can use it to sort incoming emails. I frequently run into web forms that don't allow me to put in my full email address without considering any tags. So, if you need a fixed email field maybe something like 25.25+15@20.3 in characters for a total of 90 characters (if I did my math right!).
These might be useful to someone;
youtube max channel length = 20
facebook max name length = 50
twitter max handle length = 15
email max length = 255
http://www.interoadvisory.com/2015/08/6-areas-inside-of-linkedin-with-character-limits/
I usually go with:
Firstname: 30 chars
Lastname: 30 chars
Email: 50 chars
Address: 200 chars
If I am concerned about long fields for the names, I might sometimes go with 50 for the name fields too, since storage space is rarely an issue these days.
If you need to consider localisation (for those of us outside the US!) and it's possible in your environment, I'd suggest:
Define data types for each component of the name - NOTE: some cultures have more than two names! Then have a type for the full name,
Then localisation becomes simple (as far as names are concerned).
The same applies to addresses, BTW - different formats!
it is varchar right? So it then doesn't matter if you use 50 or 25, better be safe and use 50, that said I believe the longest I have seen is about 19 or so. Last names are longer