“What column should I use for X and Y?”
This is a common question that web developers ask themselves whenever they’ve been given the task of creating a database-driven application.
Hopefully, this article will act as a useful guide for some of the more common problems that developers will face.
Storing a person’s name.
According to the UK Government Data Standards Catalogue, the maximum length of a person’s full name is 70 characters. That’s 35 characters for a given name and 35 characters for a surname.
In MySQL, this translates into a VARCHAR(70) column, which is a reasonable length for a person’s full name. In the vast majority of cases, you won’t even come close to hitting the maximum limit.
However, if you’re the type of person that worries about these things and you’re inclined to play it extra safe, then just set the column to VARCHAR(100) and be done with it. Anything above 100 characters would be ridiculous.
Storing a person’s date of birth.
A person’s date of birth should be stored in a DATE column. In most cases, you should shy away from storing the user’s numerical age.
I’ve often come across websites that ask me to enter my age instead of my date of birth. More than a decade later, my now-dormant profile still says that I am 21 years old.
Creating an auto-incrementing primary key.
Before choosing a primary key type, you should take a look at MySQL’s documentation page on Integer Types.
Pay attention to each type’s maximum value and ask yourself the question: How many rows will this table hold?
Will you really need to use a BIGINT? Is a TINYINT enough for my needs? In a lot of cases, beginner web developers will over-estimate the number of rows that their tables will hold.
Storing phone numbers.
Firstly, a phone number isn’t really a number. In reality, it is actually just a collection of digits. There are a few things to bear in mind here:
- Phone numbers can have extension numbers.
- It is unlikely that you will ever need to carry out a mathematical equation on a phone number.
- There are country codes and area codes.
- Some countries do not have area codes (Malta, for example).
- Leading zeros may prove problematic if you are storing phone numbers in a numerical column.
Most developers agree that using a text-based column is the best solution.
Of course, all of this depends on your application. A web application that sends SMS messages is going to need far more accuracy than a submission form that asks for contact information.
Storing money.
In the vast majority of cases, a DECIMAL(10, 2) column will do just fine. However, if you are after some extra precision, then a DECIMAL(10, 4) column will give you an extra 2 digits after the decimal mark (leading to results such as 101.1939 and 19.1292).
Note that we use DECIMAL because it is MySQL’s data type for exact arithmetic (unlike FLOAT).
Storing latitude and longitude values.
Once again, we can use DECIMAL because it is used for exact arithmetic.
As you probably already know, a latitude value can be between -90 and +90 degrees, whereas a longitude value can be between -180 and +180 degrees. This means that a latitude value can be stored in a DECIMAL(10, 8) column and a longitude value can be stored in a DECIMAL(11, 8) column.
Note: If your application deals heavily with geographical distances, you might want to take a look at MySQL’s Spatial Extensions.
Storing IP addresses.
You can store IP addresses in an UNSIGNED INT column. Use the MySQL function INET_ATON to convert an IP such as 127.0.0.1 into an integer that represents its numeric value.
Then, as soon as you want to transform the resulting integer back into 127.0.0.1, you can use the function INET_NTOA. NOTE: As pointed out in the comments section below, this may prove to be problematic as IPV6 becomes more widespread.
Storing email addresses.
Email addresses should stored in a VARCHAR(255) column. I say 255 because that is the number recommended by RFC 5321.
Storing large amounts of text.
When it comes to storing large amounts of text in MySQL, the TEXT data type is your friend. Note that there are different types of TEXT columns and that each one has a maximum length:
- TINYTEXT will store up to 255 characters.
- TEXT will store up to 65,535 characters.
- MEDIUMTEXT has a maximum length of 16,777,215 characters.
- LONGTEXT will store up to 4,294,967,295.
If you find that your text is being truncated, it’s probably because you chose the wrong TEXT type.
Storing passwords.
Passwords can be stored in a VARCHAR or CHAR column. The length of the column will depend on the type of hashing algorithm that you are using. Examples:
- MD5: VARCHAR(32)
- SHA1: VARCHAR(40)
- BCRYPT: VARCHAR(60)
Note that MD5 and SHA1 are not considered to be safe for password hashing!
Storing true or false values.
Storing boolean values in MySQL is pretty simple. If you are using version 5.0.3 or above, you can make use of the BIT data type. Otherwise, you can use an unsigned TINYINT(1) column, with 0 being FALSE and 1 being TRUE.
Storing files in MySQL.
Storing files in a database is a hotly-contested topic among developers and database administrators. However, if storing files in a database is a must for you, then you can use the BLOB data type.
Storing a person’s height and weight.
When it comes to storing height and weight, you’ll need to stick to a certain measurement type. In most cases, it is best to use cm for height and kg for weight.
Whether your users prefer entering stones, lbs, or feet doesn’t really matter, simply because you can calculate and convert those numbers on the server side or client side before interacting with the database.
For example, converting KG into Pounds is pretty easy with PHP:
function kgToLb($kg){ return $kg * 2.20462 }
Because of how easy it is to convert between measurements, locale should not be the deciding factor in what type of column you use. In most cases, an integer value that represents CM or KG will meet your requirements.
Storing gender.
Luckily enough, there’s an international standard for storing human sexes. Why re-create the wheel?
- 0: Not Known.
- 1: Male.
- 2: Female.
- 9: Not applicable.
These can be stored in an UNSIGNED TINYINT column.
Storing International ZIP codes / POST codes.
International ZIP codes can be a mess. They can be of varying lengths with various different formats.
Some will have spaces. Some will be completely numerical. Other countries don’t even have ZIP codes. To play it safe, you can store ZIP codes in a VARCHAR(20) column.
Usernames.
Typically, a VARCHAR(20) will do just fine for an alphanumerical username.
Addresses.
In most cases, breaking addresses up into smaller components is a better solution than just lumping the entire address into a singular text-based column.
Taking the latter approach will make it extremely difficult to determine city names, state names, etc. Splitting the address up into smaller components allows for better accuracy and categorization:
- House number / name
- Address Line 1 / Street name
- Address Line 2
- Address Line 3
- City / Town / Village
- State / County / Province
- ZIP / Postal Code
- Country
One nit pick – if you use NET_ATON and integers to store IP’s it will bite you as IP6 becomes more prevalent.