Dynamic SQL column value duplicate and difference

2019-01-29 11:04发布

问题:

I have n-columns that may contain unique or duplicated values meaning each column may have different values or the same ones.

What I want to achieve is an MS Access script that can analyze n-columns that could do the following:

  • If the columns if only one has a value and the rest are blank then the final value will be the only one seen (Note: this one is already accomplished).

    +-----+----+----+-----+
    |Var1 |Var2|Var3|Final|
    +-----+----+----+-----+
    |1    |    |    |1    |
    +-----+----+----+-----+
    |     |    |3   |3    |
    +-----+----+----+-----+
    |     |yes |    |yes  |
    +-----+----+----+-----+
    
  • If there are more than one different values then ask the user to choose between these values (Update: precedence of variables instead of user selection ie. Var1 > Var2 > ... > VarN).

    +----+----+----+-----+
    |Var1|Var2|Var3|Final|
    +----+----+----+-----+
    |1   |3   |    |?    |
    +----+----+----+-----+
    |ok  |    |3   |?    |
    +----+----+----+-----+
    |    |yes |5   |?    |
    +----+----+----+-----+
    
  • If there are more than one same values then proceed to use that as the final value.

    +----+----+----+-----+
    |Var1|Var2|Var3|Final|
    +----+----+----+-----+
    |1   |1   |    |1    |
    +----+----+----+-----+
    |ok  |    |ok  |ok   |
    +----+----+----+-----+
    |    |yes |yes |yes  |
    +----+----+----+-----+
    

回答1:

On reviewing the answers to your previous questions on this topic they seem to rely pretty heavily on IIf(Var1 <> Var2, ... constructs. If you really want an n-column solution for n of any non-trivial size the first thing you need to do is convert your columns into rows. For sample data [Table1]:

RowID  Var1  Var2  Var3  Final
-----  ----  ----  ----  -----
    1  1                      
    2              3          
    3        yes              
    4  1     3                
    5  ok          3          
    6        yes   5          
    7  1     1                
    8  ok          ok         
    9        yes   yes        

Create a saved query in Access named [Unpivoted]:

SELECT RowID, "Var1" AS var, Var1 AS val FROM Table1
UNION ALL
SELECT RowID, "Var2" AS var, Var2 AS val FROM Table1
UNION ALL
SELECT RowID, "Var3" AS var, Var3 AS val FROM Table1

That query returns the following, and will be the basis from which we work:

RowID  var   val
-----  ----  ---
    1  Var1  1  
    2  Var1     
    3  Var1     
    4  Var1  1  
    5  Var1  ok 
    6  Var1     
    7  Var1  1  
    8  Var1  ok 
    9  Var1     
    1  Var2     
    2  Var2     
    3  Var2  yes
    4  Var2  3  
    5  Var2     
    6  Var2  yes
    7  Var2  1  
    8  Var2     
    9  Var2  yes
    1  Var3     
    2  Var3  3  
    3  Var3     
    4  Var3     
    5  Var3  3  
    6  Var3  5  
    7  Var3     
    8  Var3  ok 
    9  Var3  yes

Our first step will be to pull out the values where RowID has just one distinct non-Null value. We'll save this query as [NoSelectionRequired]

SELECT RowID, First(val) AS FirstOfval
FROM 
    (
        SELECT DISTINCT RowID, val
        FROM Unpivoted
        WHERE val IS NOT NULL
    )
GROUP BY RowID
HAVING COUNT(*) = 1

That returns

RowID  FirstOfval
-----  ----------
    1  1         
    2  3         
    3  yes       
    7  1         
    8  ok        
    9  yes       

So we can cover Cases 1 and 3 with the following query (see also Edit1, below)

UPDATE Table1 SET Final = DLookup("FirstOfval", "NoSelectionRequired", "RowID=" & RowID)

After we run that query [Table1] now looks like this:

RowID  Var1  Var2  Var3  Final
-----  ----  ----  ----  -----
    1  1                 1    
    2              3     3    
    3        yes         yes  
    4  1     3                
    5  ok          3          
    6        yes   5          
    7  1     1           1    
    8  ok          ok    ok   
    9        yes   yes   yes  

Now... for Case 2:

You need to provide a mechanism for the users to pick the values they want. One way you could do that would be to use a [Selections] table...

...that you could populate with a statement like this

INSERT INTO Selections ([RowID], [Value])
SELECT RowID, val 
FROM Unpivoted
WHERE val IS NOT NULL
    AND RowID IN
        (
            SELECT RowID FROM
                (
                    SELECT DISTINCT RowID, val
                    FROM Unpivoted
                    WHERE val IS NOT NULL
                )
            GROUP BY RowID
            HAVING COUNT(*) > 1
        )
ORDER BY RowID, val

so your [Selections] table contains

RowID  Value  Selected
-----  -----  --------
    4  1      False   
    4  3      False   
    5  3      False   
    5  ok     False   
    6  5      False   
    6  yes    False   

Slap that into a continuous form with checkboxes for the [Selected] field and a bit of logic behind it to only allow one selection per RowID, then once the user has made their choices:

RowID  Value  Selected
-----  -----  --------
    4  1      True    
    4  3      False   
    5  3      True   
    5  ok     False    
    6  5      False   
    6  yes    True    

then you can update the remaining rows in [Table1] (see also Edit1, below)

UPDATE Table1 SET Final = DLookup("Value","Selections","RowID=" & RowID & " AND Selected")
WHERE RowID IN (SELECT RowID FROM Selections)

and there you have it:

RowID  Var1  Var2  Var3  Final
-----  ----  ----  ----  -----
    1  1                 1    
    2              3     3    
    3        yes         yes  
    4  1     3           1    
    5  ok          3     3    
    6        yes   5     yes  
    7  1     1           1    
    8  ok          ok    ok   
    9        yes   yes   yes  

Edit1 - re: comment

For larger datasets the two queries that write the results to the [Final] column may take a long time to run because they are not terribly efficient. In that case the solution would be to dump the results to a temporary table and then use that to update the main table.

For the first update (Cases 1 and 3), instead of doing

UPDATE Table1 SET Final = DLookup("FirstOfval", "NoSelectionRequired", "RowID=" & RowID)

we would do...

SELECT * INTO zzzTemp
FROM NoSelectionRequired

...followed by...

UPDATE 
    Table1 
    INNER JOIN 
    zzzTemp 
        ON Table1.RowID = zzzTemp.RowID 
SET Table1.Final = [zzzTemp].[FirstOfval]

For the second update (Case 2), instead of doing

UPDATE Table1 SET Final = DLookup("Value","Selections","RowID=" & RowID & " AND Selected")
WHERE RowID IN (SELECT RowID FROM Selections)

we would do...

SELECT * INTO zzzTemp
FROM Selections
WHERE Selected

...followed by...

UPDATE 
    Table1 
    INNER JOIN 
    zzzTemp 
        ON Table1.RowID = zzzTemp.RowID 
SET Table1.Final = [zzzTemp].[Value]


回答2:

  1. Solve all three situations, the code below takes care of precedence, empty and possible null values in the content. INTO temp will save the query result into a table named temp.

    SELECT id, var1, var2, 
        IIf(Len(var1 & "") = 0, var2, var1) AS merged
    INTO temp
    FROM myTable;
    
  2. Transfer the records into the source table.

    UPDATE myTable 
        INNER JOIN temp 
          ON [myTable].ID=temp.ID 
    SET myTable.merged= [temp].[merged];
    

The question "Combine values from n-fields with precedence" helped!