I have ~10 different document types which share 10-15 common fields. Each document type also has additional fields, and 3 of them have up to 30-40 additional fields.
I was considering using a different mapping type for each document type. But if I understand correctly how mappings work, Elasticsearch will internally use one mapping with 150-200 fields. Because no document has a value for every field, I will end up with a lot of sparse data.
According to this article (Index vs. Type), Elasticsearch is (was?) not very good at dealing with sparse data, so that would be an argument for having a separate index for each document type. But some document types have very few documents, so a separate index for them would be overkill.
My question: How bad are sparse documents? Or am I better off with a separate index for each type even though some indexes will only contain a few documents?
Yes, different types within an index share the same mapping structure. Each type just adds a “_type” field to every document, which is automatically used for filtering when searching on a specific type.
Citing from Index vs. Type:
Fields that exist in one type will also consume resources for documents of types where this field does not exist. This is a general issue with Lucene indices: they don’t like sparsity.
As you may be aware, each separate index has its own overhead, while types do not cope well with sparse documents.
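To get a feel for how sparse a single shared mapping would be, here is a rough back-of-the-envelope sketch in Python. The field counts and document counts are made-up numbers loosely based on the question (about 12 common fields, three types with about 35 extra fields, seven types with about 5), and it assumes every document fills all fields of its own type. It only estimates the share of empty field slots; it says nothing about how much that costs inside Lucene.

```python
def sparsity(common_fields, extra_fields_per_type, docs_per_type):
    """Fraction of (document, field) slots left empty in one shared mapping.

    Assumes each document populates every field of its own type,
    which is an optimistic simplification.
    """
    # The shared mapping is the union of all fields across all types.
    total_fields = common_fields + sum(extra_fields_per_type)
    total_slots = total_fields * sum(docs_per_type)
    # Each document only fills the common fields plus its own type's extras.
    filled_slots = sum(
        (common_fields + extra) * docs
        for extra, docs in zip(extra_fields_per_type, docs_per_type)
    )
    return 1 - filled_slots / total_slots


# Hypothetical numbers, not taken from a real cluster.
extra_fields = [35, 35, 35, 5, 5, 5, 5, 5, 5, 5]
doc_counts = [100_000, 50_000, 50_000, 500, 500, 200, 200, 100, 100, 50]

print(f"fields in shared mapping: {12 + sum(extra_fields)}")
print(f"empty slots: {sparsity(12, extra_fields, doc_counts):.0%}")
```

With your real counts plugged in, a high percentage would point towards splitting out the large, field-heavy types, while the small types matter little either way.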
I would suggest
Keep in mind that you should keep a reasonable number of shards in your cluster, which can be achieved by reducing the number of shards for indices that don’t require a high write throughput and/or will store low numbers of documents.
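As a minimal sketch of what that could look like, the Python snippet below builds the settings body you would send when creating an index. `number_of_shards` and `number_of_replicas` are standard Elasticsearch index settings; the helper name and the shard counts chosen here are only illustrative assumptions, so tune them to your own data volumes.

```python
import json


def index_settings(shards, replicas=1):
    """Build an index-creation body with an explicit shard count.

    Hypothetical helper: it only assembles the JSON, it does not
    talk to a cluster.
    """
    return {
        "settings": {
            "index": {
                "number_of_shards": shards,
                "number_of_replicas": replicas,
            }
        }
    }


# A small, rarely written document type can live in a single-shard index.
small_index_body = index_settings(shards=1)

# A large, write-heavy type might get more shards (the number is an assumption).
large_index_body = index_settings(shards=3)

print(json.dumps(small_index_body, indent=2))
```

The printed JSON is the body of the index-creation request (PUT on the index name). Note that the shard count is fixed at creation time, so it has to be chosen up front.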
Choosing between an index and a type has various implications. It depends on the computing power of your nodes, how many documents each type will store, and so on.
If each index would contain only a few documents, I would recommend going with types, because each index creates its own shards, which would be overkill for such a small set of documents.
You could refer to this SO Answer as well.