I've been battling with some SQL and can't seem to get my head around it.
I have two tables, one with the list of categories and another with all my articles.
What I'm trying to do is find how many articles are present for each category.
Here is the SQL I have so far:
SELECT DISTINCT COUNT( po.post_Cat_ID ) AS Occurances, ca.cat_Title
FROM Posts po, Categories ca
WHERE ca.cat_ID = LEFT( po.post_Cat_ID, 2 )
The reason I use LEFT is to get only the main categories, as my categories are numbered like this, for example:
Science = 01
Medicine = 0101
Sport = 02
Posts on, say, aspirin would therefore have a category ID of 0101. (LEFT would then trim 0101, 0102, 0103 etc. to just 01.) Basically I'm not interested in the subcategories.
Thanks in advance
Result
SELECT DISTINCT COUNT( po.post_Cat_ID ) AS Occurances, ca.cat_Title
FROM Posts po, Categories ca
WHERE ca.cat_ID = LEFT( po.post_Cat_ID, 2 )
GROUP BY LEFT( po.post_Cat_ID, 2 )
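For readers whose MySQL server has ONLY_FULL_GROUP_BY enabled, the same count can be sketched with an explicit JOIN that groups by the selected category columns. This is a variant using the table and column names above, and it assumes cat_ID and post_Cat_ID are stored as strings such as '01' and '0101'. DISTINCT is dropped because GROUP BY already yields one row per category.

```sql
-- One row per main category, counting the posts filed under it or under
-- any of its subcategories (matched on the first two characters).
SELECT ca.cat_Title,
       COUNT(po.post_Cat_ID) AS Occurances
FROM Categories ca
JOIN Posts po
  ON ca.cat_ID = LEFT(po.post_Cat_ID, 2)
GROUP BY ca.cat_ID, ca.cat_Title;
```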
P.S. Thanks @nullpointer, it works for the moment, and I'll look into restructuring. For other readers, here's the link again:
http://mikehillyer.com/articles/managing-hierarchical-data-in-mysql/
Let me suggest that you restructure the schema instead. What you want here is to represent a hierarchical structure (categories), which is not really straightforward to do with relational databases. Two common solutions are the adjacency list and the nested set.
The adjacency list is the more straightforward, tree-like structure: each row of the categories table stores a reference to its parent category. Unfortunately, this model is hard to work with using SQL.
Instead, we can use the nested set approach. Here every node has lft and rgt values, and each node's values fall between its parent's lft and rgt values.
So in order to retrieve a count for a certain category, you can simply query the count of nodes whose lft and rgt values fall between those of the category you want, assuming your article table stores the ID of the category each article belongs to.
This is discussed in more detail at:
http://mikehillyer.com/articles/managing-hierarchical-data-in-mysql/
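As a rough sketch of the nested set approach described above: the table and column names (categories, articles, category_id, name, title) and the lft/rgt numbers below are illustrative assumptions, not the tables from the original answer.

```sql
-- Hypothetical nested set layout: a child's (lft, rgt) lies inside its parent's.
CREATE TABLE categories (
    category_id INT PRIMARY KEY,
    name        VARCHAR(50) NOT NULL,
    lft         INT NOT NULL,
    rgt         INT NOT NULL
);

-- Hypothetical article table: each article points at exactly one category.
CREATE TABLE articles (
    article_id  INT PRIMARY KEY,
    title       VARCHAR(100) NOT NULL,
    category_id INT NOT NULL
);

-- Science (1..4) encloses Medicine (2..3); Sport (5..6) stands alone.
INSERT INTO categories (category_id, name, lft, rgt) VALUES
    (1, 'Science',  1, 4),
    (2, 'Medicine', 2, 3),
    (3, 'Sport',    5, 6);

INSERT INTO articles (article_id, title, category_id) VALUES
    (1, 'Aspirin',        2),
    (2, 'Physics basics', 1),
    (3, 'Football',       3);

-- Count the articles filed under Science or any of its descendants.
SELECT COUNT(*) AS article_count
FROM articles a
JOIN categories node   ON node.category_id = a.category_id
JOIN categories parent ON node.lft BETWEEN parent.lft AND parent.rgt
WHERE parent.name = 'Science';
```

With these sample rows the count is 2: the Aspirin article (filed under Medicine) and the article filed directly under Science.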
I'll propose another solution: use tags rather than categories. You can use multiple tags for a given article and simply get the count of all articles matching a certain tag. This will be a lot easier to work with and also give you a lot more flexibility.
To accomplish this, you'll need a many-to-many relationship between articles and tags, which is usually implemented with a junction table.
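A minimal sketch of such a layout follows. Only the names articles_tags, article_id and tag_id come from this answer; the articles and tags tables, their other columns, and the sample rows are assumptions made for illustration.

```sql
CREATE TABLE articles (
    article_id INT PRIMARY KEY,
    title      VARCHAR(100) NOT NULL
);

CREATE TABLE tags (
    tag_id INT PRIMARY KEY,
    name   VARCHAR(50) NOT NULL UNIQUE
);

-- Junction table: one row per (article, tag) pair.
CREATE TABLE articles_tags (
    article_id INT NOT NULL,
    tag_id     INT NOT NULL,
    PRIMARY KEY (article_id, tag_id),
    FOREIGN KEY (article_id) REFERENCES articles (article_id),
    FOREIGN KEY (tag_id)     REFERENCES tags (tag_id)
);

INSERT INTO articles (article_id, title) VALUES (1, 'Aspirin');
INSERT INTO tags (tag_id, name) VALUES (1, 'science'), (2, 'medicine');

-- Tag article 1 with both tags.
INSERT INTO articles_tags (article_id, tag_id) VALUES (1, 1), (1, 2);

-- Number of articles per tag; tags without articles show a count of 0.
SELECT t.name, COUNT(atag.article_id) AS article_count
FROM tags t
LEFT JOIN articles_tags atag ON atag.tag_id = t.tag_id
GROUP BY t.tag_id, t.name;
```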
To tag an article, you simply INSERT multiple entries into the articles_tags table with the correct article_id and tag_id. Then you can use JOINs as usual to get what you want.
Add a column to Categories which gives the main category that each category is in (with main categories giving themselves).
Select from this where cat_id = main_cat_id to find the main categories; join back onto the table itself on left.cat_id = right.main_cat_id to find the child categories (left and right being the two sides of the self-join), then onto posts by matching the child's cat_id to the post's category ID. Group by left.cat_id and project over cat_id and count(*).
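The steps above can be sketched as follows, using the Categories and Posts tables from the question in MySQL. The main_cat_id column type and the UPDATE that fills it are assumptions, as are the aliases main and sub, which stand in for the left and right sides of the self-join.

```sql
-- Add the new column and fill it from the first two characters of cat_ID,
-- so main categories point at themselves ('01' -> '01', '0101' -> '01').
ALTER TABLE Categories ADD COLUMN main_cat_id CHAR(2);
UPDATE Categories SET main_cat_id = LEFT(cat_ID, 2);

-- main: rows that are their own main category.
-- sub:  every category belonging to it, including the main category itself.
SELECT main.cat_ID, main.cat_Title, COUNT(*) AS Occurances
FROM Categories main
JOIN Categories sub ON sub.main_cat_id = main.cat_ID
JOIN Posts po       ON po.post_Cat_ID = sub.cat_ID
WHERE main.cat_ID = main.main_cat_id
GROUP BY main.cat_ID, main.cat_Title;
```

Because the join onto Posts is an equality on the full category ID, this only counts posts whose category actually has a row in Categories.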
I tried this in PostgreSQL 8.4, and I don't see why this wouldn't work in MySQL, as the query is pretty basic. My query groups by title rather than ID.
UPDATE: I also had a shot at making this work with a string operation, as the OP tried, with the query written in standard-compliant SQL as accepted by PostgreSQL rather than in MySQL's dialect. It works fine. I can't offer a meaningful comparison as to speed, but the query plan for this did look a bit simpler than that for the two-way join.
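One way such a string-based query could look in standard SQL is to use SUBSTRING ... FROM ... FOR in place of MySQL's LEFT. This is a sketch using the question's table and column names and grouping by title, not necessarily the query that was run:

```sql
-- Match each post to its main category via the first two characters
-- of the post's category ID, then count posts per category title.
SELECT ca.cat_Title, COUNT(*) AS Occurances
FROM Categories ca
JOIN Posts po
  ON ca.cat_ID = SUBSTRING(po.post_Cat_ID FROM 1 FOR 2)
GROUP BY ca.cat_Title;
```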