Using Django I have a field which is of type JSONField. I am wanting to get a distinct count on a nested key/value in the json. With a normal field you can just do soemthing like the following
model.objects.values('field_name')\
.annotate(total=Count('field_name')).order_by('-total')
This does not work for a JSONField.
Example Model
class Pet(models.Model):
data = JSONField()
Example of data
{
'name':'sparky',
'animal':'dog',
'diet':{
'breakfast':'biscuits',
'dinner':'meat',
}
}
Trying
Pet.objects.values('data__diet__dinner')\
.annotate(total=Count('data__diet__dinner')).order_by('-total')
Exception of
TypeError: unhashable type: 'list'
What is the correct way to execute this?
You can use jsonb_extract_path_text via a Func object as an alternative to the field transform:
Pet.annotate(dinner=Func(
F('data'), Value('diet'), Value('dinner'),
function='jsonb_extract_path_text')) \
.values('dinner') \
.annotate(total=Count('dinner'))
The reason why the field transform data__diet__dinner
fails is an error within Django when you go deeper than just one level into the json structure and use GROUP BY
in the SQL. The first level (name
, animal
, diet
) should work fine.
The reason seems to be that for nested transforms, Django changes the SQL syntax used, switching from a single value to a list to specify the path into the json structure.
This is the syntax used for non-nested json transforms (= first level):
"appname_pet"."data" -> 'diet'
And this is the syntax used for nested transforms (deeper than first level):
"appname_pet"."data" #> ARRAY['diet', 'dinner']
While constructing the query, Django chokes on that list while working out the required GROUP BY
clauses. This does not seem to be an inevitable restriction; the support for transforms is quite new, and this is possibly one of the kinks that haven't been worked out yet. So if you open a Django ticket, this might just work a few versions down the line.