I've got a messages table in MySQL which records messages between users. Apart from the typical ids and message types (all integer types) I need to save the actual message text as either VARCHAR or TEXT. I'm setting a front-end limit of 3000 characters which means the messages would never be inserted into the db as longer than this.
Is there a rationale for going with either VARCHAR(3000) or TEXT? There's something about just writing VARCHAR(3000) that feels somewhat counter-intuitive. I've been through other similar posts on Stack Overflow, but it would be good to get views specific to this type of common message storing.
Disclaimer: I'm not a MySQL expert ... but this is my understanding of the issues.
I think TEXT is stored outside the MySQL row, while I think VARCHAR is stored as part of the row. There is a maximum row length for MySQL rows, so using a large VARCHAR limits how much other data you can store in the row.
Also due to VARCHAR forming part of the row, I suspect that queries looking at that field will be slightly faster than those using a TEXT chunk.
TEXT and BLOB are stored off the table, with the table just having a pointer to the location of the actual storage.
VARCHAR is stored inline with the table. VARCHAR is faster when the size is reasonable; which one is faster overall depends on your data and your hardware, so you'd want to benchmark a real-world scenario with your data.
Update: Whether VARCHAR or TEXT is stored inline or off-record depends on data size, column size, row_format, and MySQL version. It does not depend on "text" vs "varchar".
Short answer: No practical, performance, or storage difference.
Long answer:
There is essentially no difference (in MySQL) between VARCHAR(3000) (or any other large limit) and TEXT. The former will truncate at 3000 characters; the latter will truncate at 65535 bytes. (I make a distinction between bytes and characters because a character can take multiple bytes.)
For smaller limits in VARCHAR, there are some advantages over TEXT.
- What counts as "smaller" depends on, among other things, the CHARACTER SET.
- INDEXes are limited in how big a column can be indexed. (767 or 3072 bytes; this is version and settings dependent.)
- Temporary tables generated by SELECTs are handled in two different ways: MEMORY (faster) or MyISAM (slower). When 'large' columns are involved, the slower technique is automatically picked. (Significant changes coming in version 8.0; so this bullet item is subject to change.)
- TEXT datatypes (as opposed to VARCHAR) jump straight to MyISAM. That is, TINYTEXT is automatically worse for generated temp tables than the equivalent VARCHAR. (But this takes the discussion in a third direction!)
- VARBINARY is like VARCHAR; BLOB is like TEXT.
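The bytes-versus-characters distinction and the index size limits mentioned in this answer can be made concrete with a little arithmetic. This is only a rough sketch: the function names are hypothetical, and the per-character widths are the maximum bytes per character for each MySQL character set (the actual stored size depends on the text itself).

```python
# Maximum bytes one character can occupy in each MySQL character set.
MAX_BYTES_PER_CHAR = {"latin1": 1, "utf8mb3": 3, "utf8mb4": 4}

TEXT_MAX_BYTES = 65_535            # TEXT limit quoted in the answer, in bytes
INDEX_LIMITS_BYTES = (767, 3072)   # index size limits quoted in the answer


def max_bytes(char_limit: int, charset: str) -> int:
    """Worst-case byte length of `char_limit` characters in `charset`."""
    return char_limit * MAX_BYTES_PER_CHAR[charset]


def indexable_chars(index_limit_bytes: int, charset: str) -> int:
    """Number of worst-case characters that fit under an index byte limit."""
    return index_limit_bytes // MAX_BYTES_PER_CHAR[charset]


if __name__ == "__main__":
    message = "café"  # 4 characters, 5 bytes in UTF-8
    print(len(message), len(message.encode("utf-8")))

    # A 3000-character message in utf8mb4 needs at most 12000 bytes,
    # which is well under the 65535-byte TEXT limit.
    print(max_bytes(3000, "utf8mb4"), TEXT_MAX_BYTES)

    # How many utf8mb4 characters can be indexed under each limit.
    for limit in INDEX_LIMITS_BYTES:
        print(limit, indexable_chars(limit, "utf8mb4"))
```

Under these assumptions, a 3000-character limit fits comfortably in either type, while a full-column index on it would exceed both index limits in utf8mb4.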
Rebuttal to other answers
The original question asked one thing (which datatype to use); the accepted answer answered something else (off-record storage). That answer is now out of date.
When this thread was started and answered, there were only two "row formats" in InnoDB. Soon afterwards, two more formats (DYNAMIC and COMPRESSED) were introduced.
The storage location for TEXT and VARCHAR() is based on size, not on name of datatype. For an updated discussion of on/off-record storage of large text/blob columns, see this.
Just to clarify the best practice:
Text format messages should almost always be stored as TEXT (they end up being arbitrarily long)
String attributes should be stored as VARCHAR (the destination user name, the subject, etc...).
I understand that you've got a front end limit, which is great until it isn't. *grin* The trick is to think of the DB as separate from the applications that connect to it. Just because one application puts a limit on the data, doesn't mean that the data is intrinsically limited.
What is it about the messages themselves that forces them to never be more than 3000 characters? If it's just an arbitrary application constraint (say, for a text box or something), use a TEXT field at the data layer.
Can you predict how long the user input would be?
The preceding answers don't insist enough on the main problem: even in very simple queries like
a temporary table can be required, and if a VARCHAR field is involved, it is converted to a CHAR field in the temporary table. So if your table has, say, 500 000 rows with a VARCHAR(65000) field, this column alone will use 6.5*5*10^9 bytes. Such temp tables can't be handled in memory and are written to disk. The impact can be expected to be catastrophic.
Source (with metrics): https://nicj.net/mysql-text-vs-varchar-performance/ (This refers to the handling of TEXT vs VARCHAR in the "standard"(?) MyISAM storage engine. It may be different in others, e.g., InnoDB.)
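To make the temporary-table arithmetic in this answer concrete, here is a rough sketch. It takes the answer's claim at face value (every VARCHAR value is padded to its declared length in the temporary table) and assumes one byte per character unless stated otherwise; the function name is hypothetical.

```python
def fixed_width_temp_bytes(rows: int, declared_chars: int,
                           bytes_per_char: int = 1) -> int:
    """Bytes used by one column if every row is padded to the declared length.

    Assumes the temporary table stores the column as fixed-width CHAR,
    as described in the answer; this is not measured from a real server.
    """
    return rows * declared_chars * bytes_per_char


if __name__ == "__main__":
    rows = 500_000

    # The answer's example: VARCHAR(65000), one byte per character.
    print(fixed_width_temp_bytes(rows, 65_000))      # 32500000000 bytes

    # The question's case: VARCHAR(3000), one byte per character.
    print(fixed_width_temp_bytes(rows, 3_000))       # 1500000000 bytes

    # Same column with a 4-byte-per-character set such as utf8mb4.
    print(fixed_width_temp_bytes(rows, 3_000, 4))    # 6000000000 bytes
```

Even at the question's 3000-character limit, the padded size is in the gigabyte range for this row count, so it is worth benchmarking on the storage engine and MySQL version actually in use.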