I'm migrating a database a new one to change One-to-Many relationships to Many-to-Many (and to improve the column naming scheme). [Edit: I've created a SQLFiddle for this.]
oldDB newDB
========================= =======================================
individuals people
- individual_id - id
- individual_name_first - first_name
- individual_name_last - last_name
- individual_name_other - additional_identifier
- individual_position - role
- individual_group_code - (replaced with people-groups table)
(There are duplicate rows
in this table for individ's
who are in more than one
group.)
groups groups
- (no id in oldDB) - id
- group_code - short_name
- group_name - full_name
people_groups
- id
- person_id
- group_id
- start_date
- end_date
Specifically, I'm having trouble with creating the linking table between people
and groups
.
I've already created the people
and groups
tables:
CREATE TABLE IF NOT EXISTS people (
id int(11) UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
first_name varchar(50) NOT NULL,
last_name varchar(50) NOT NULL,
additional_identifier varchar(50) DEFAULT NULL COMMENT 'In case of duplicate first and last names',
role varchar(50) DEFAULT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
ALTER TABLE newDB.people ADD UNIQUE `name` (last_name, first_name, additional_identifier);
INSERT INTO newDB.people
(id, first_name, last_name, role)
SELECT
individual_id, individual_name_last, individual_name_first, individual_position, COUNT(*)
FROM
oldDB.individuals
GROUP BY
individual_name_last, individual_name_first;
CREATE TABLE IF NOT EXISTS newDB.groups(
id INT(11) UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
full_name VARCHAR(255) NOT NULL UNIQUE,
short_name VARCHAR(255) NOT NULL UNIQUE
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
INSERT INTO newDB.groups
(full_name, short_name)
SELECT
group_name, group_code
FROM
oldDB.groups;
Next I will CREATE
the newDB.people-groups
table, but first I'm making sure I can SELECT
the right values:
SELECT
newDB.groups.id 'group id',
newDB.people.id 'person id',
individual_group_code 'group short name',
individual_name_last 'last name',
individual_name_first 'first name'
FROM
oldDB.individuals
LEFT JOIN
newDB.groups ON(
newDB.groups.short_name = oldDB.individuals.individual_group_code
)
LEFT JOIN newDB.people ON (
newDB.people.last_name = oldDB.individuals.individual_name_last
AND
newDB.people.first_name = oldDB.individuals.individual_name_first
)
GROUP BY
individual_name_last ASC,
individual_name_first ASC,
individual_group_code
The first LEFT JOIN
is just for displaying the group short names for easy verification. The second LEFT JOIN
is important: it is supposed to allow for the 'person id'
output to be pulled from newDB.people.id
. Instead, I'm just getting NULL in that column for all rows of output. Everything else is displaying correctly. What am I missing?
Edit:
Desired output
Here is what I am hoping to get. (I generated it by replacing newDB.people.id 'person id'
with oldDB.individuals.individual_id 'person id'
. To exemplify the problem with it is that person 925 and 1232 are the same person in two different groups. The new database simplifies that to having person 925.)
Actual output
Edit 2:
Here is a SQLFiddle that does work. Why doesn't it work in my phpmyadmin?
You are performing a
group by
(3 columns) with 5 columns of non-aggregated columns in the select list. Also, not that it matters, there are no aggregates in the column output.MySQL treats that as a
distinct
(for the 3 columns) and brings back the first row it encounters in the MRU cache, and if no cache, the first ones encountered in the clustered index or physical ordering to satisfy the 2 non-grouped-by columns.In other words, it is a user error. A snafu. I recommend cleaning up you intention with the
GROUP BY
.Somewhat related, read a recent answer of mine Here related to
ONLY_FULL_GROUP_BY
. See at the bottom of that link is MySQL Handling of GROUP BY which in my view is a gloss over of the real problems and non-standards that MySQL allowed which rendered unexpected and hard to explain data from violations of that Standard.So what did the MySQL dev team do? They implemented the standard by default (starting in version 5.7) to disallow the types of queries you just performed.
Edit1
Your query, with no
GROUP BY
but with anorder by newGroups.id,people.id
, on a version 5.7.14 server: