need help adding column to one table using functio

2019-07-12 17:56发布

I am trying to add a column "wOBA" to a table "starting_pitcher_stats" in MySQL using sequel-pro. Below is the code for a function that performs arithmetic operations on nine variables in the "starting_pitcher_stats" table. In particular, the function gathers the values for a number of variables, applies different weights (coefficients) to some of them (numerator below) and divides that sum by the addition and subtraction of a few more variables. All of these variables reside in the "starting_pitcher_stats" table. The arithmetic operations are expressed in the below formula (the coefficients are the values that are multiplied by each of the variables in numerator below):

wOBA=(.69*walks_a + .72*HBP + .89*singles_a + 1.27*doubles_a + 1.62*triples_a+ 2.10*HR_a)/(at_bats+walks_a+SF+HBP)

Each weight varies by year. The different weights for each year come from the table "GUTS".

The first dilemma I'm having is getting the correct code for the function. The second is the correct code syntax to use to actually call up this function and populate the new column with correct weighted wOBA values for each game of each year(season) for each "Starting_Pitcher".

The function is created with the code below and is listed as a function "wOBA" in my list of functions and procedures. However, the little wheel/knob next to the function name in sequel pro is greyed-out for some reason. It's not until I find out the correct code to invoke it, that I'll know if there are any errors.

Please ask for any more information that I can provide to clarify anything.

Thank you in advance.

DELIMITER $$
    CREATE FUNCTION wOBA(wOBA DECIMAL(10,3))
    RETURNS DECIMAL(10,3)
    BEGIN
        DECLARE wOBA decimal(10,3);
        SET wOBA = 0;
        SELECT cast((SELECT SUM(weighted_BB) as wBB_sum 
            FROM (
                SELECT g.wBB*SUM(if(e.event_CD=14,1,0)) as weighted_BB 
                FROM events e 
                INNER JOIN GUTS g 
                    ON substring(e.game_ID,4,4)=g.season 
                WHERE PIT_ID=Starting_Pitcher 
                GROUP BY g.season) 
            as walks_a)  
            + (SELECT SUM(weighted_HBP) as wHBP_sum 
            FROM (
                SELECT g.wHBP*SUM(if(e.event_CD=16,1,0)) as weighted_HBP 
                FROM events e 
                INNER JOIN GUTS g 
                    ON substring(e.game_ID,4,4)=g.season 
                WHERE PIT_ID=Starting_Pitcher 
                GROUP BY g.season) 
            as HBP)     
            + (SELECT SUM(weighted_1B) as w1B_sum 
            FROM (
                SELECT g.w1B*SUM(if(e.event_CD=20,1,0)) as weighted_1B 
                FROM events e 
                INNER JOIN GUTS g 
                    ON substring(e.game_ID,4,4)=g.season 
                WHERE PIT_ID=Starting_Pitcher 
                GROUP BY g.season) 
            as singles_a)       
            + (SELECT SUM(weighted_2B) as w2B_sum 
            FROM ( 
                SELECT g.w2B*SUM(if(e.event_CD=21,1,0)) as weighted_2B 
                FROM events e 
                INNER JOIN GUTS g 
                    ON substring(e.game_ID,4,4)=g.season 
                WHERE PIT_ID=Starting_Pitcher 
                GROUP BY g.season) 
            as doubles_a)       
            + (SELECT SUM(weighted_3B) as w3B_sum 
            FROM (
                SELECT g.w3B*SUM(if(e.event_CD=22,1,0)) as weighted_3B 
                FROM events e 
                INNER JOIN GUTS g 
                    ON substring(e.game_ID,4,4)=g.season 
                WHERE PIT_ID=Starting_Pitcher 
                GROUP BY g.season) 
            as triples_a)       
            + (SELECT SUM(weighted_HR) as wHR_sum 
            FROM (
                SELECT g.wHR*SUM(if(e.event_CD=23,1,0)) as weighted_HR 
                FROM events e 
                INNER JOIN GUTS g 
                    ON substring(e.game_ID,4,4)=g.season 
                WHERE PIT_ID=Starting_Pitcher 
                GROUP BY g.season) 
            as HR_a) as decimal(10,3))
            /
            cast(SUM(if(e.ab_fl="T",1,0)) 
                + SUM(if(e.event_cd=14,1,0)) 
                + SUM(if(e.SF_fl="T",1,0)) 
                + SUM(if(e.event_cd=16,1,0)) as unsigned) INTO wOBA 
            FROM events e
            WHERE e.PIT_ID=Starting_Pitcher AND PIT_START_FL = "T"
            LIMIT 1;
        RETURN wOBA;
    END
    $$
    DELIMITER ;

Darwin, Here are two screen shots of the events table. The first is of the structure and the second is some of the content (as not all of the content fits into the shot):

[structure of events table[1]

content of events table

Here are screenshots of the structure and contents of the GUTS table.

enter image description here

enter image description here

Here is a screenshot of the events table structure showing the fields (and their definitions)in the function:

enter image description here

Update:

UPDATE retrosheet.starting_pitcher_stats 
SET starting_pitcher_stats.wOBA =(SELECT
(
   (g.wBB * SUM(IF(e.event_cd = 14, 1, 0)))
   + (g.wHBP * SUM(IF(e.event_cd = 16, 1, 0)))
   + (g.w1B  * SUM(IF(e.event_cd = 20, 1, 0)))
   + (g.w2B  * SUM(IF(e.event_cd = 21, 1, 0)))
   + (g.w3B  * SUM(IF(e.event_cd = 22, 1, 0)))
   + (g.wHR  * SUM(IF(e.event_cd = 23, 1, 0)))
   )
   /
   (
     SUM(IF(e.ab_fl = 'T',   1, 0))
   + SUM(IF(e.event_cd = 14, 1, 0))
   + SUM(IF(e.sf_fl = 'T',   1, 0))
   + SUM(IF(e.event_cd = 16, 1, 0))
  ) AS wOBA
  FROM events AS e, GUTS AS g
  WHERE e.YEAR_ID = g.SEASON_ID
    AND e.PIT_START_FL= 'T'
    AND e.PIT_ID = Starting_Pitcher)

The query just keeps running. I'll keep tweaking it.

update: screenshots of starting_pitcher_stats table enter image description here

enter image description here

Update:

Ok, I'm trying to just create a wOBA column as part of a new table with columns for the other components of wOBA.

But, the query just goes on forever. How can I make the run-time shorter?

DROP TABLE IF EXISTS starting_pitcher_wOBA;
CREATE TABLE starting_pitcher_wOBA 
SELECT
a.YEAR_ID
,
a.GAME_ID
,
a.PIT_ID
,
a.wBB
,
a.wHBP
,
a.w1B
,
a.w2B
,
a.w3B
,
a.wHR
,
a.u_walks_a
,
a.HBP
,
a.singles_a
,
a.doubles_a
,
a.triples_a
,
a.HR_a
,
a.at_bats
,
a.sacrifice_flies_a
,
a.wOBA
FROM
(
SELECT 
g.YEAR_ID
,
h.GAME_ID
,
e.PIT_ID
,
g.wBB
,
g.wHBP
,
g.w1B
,
g.w2B
,
g.w3B
,
g.wHR
,
SUM(IF(e.event_cd = 14, 1, 0)) AS u_walks_a
,
SUM(IF(e.event_cd = 16, 1, 0)) AS HBP
,
SUM(IF(e.event_cd = 20, 1, 0)) AS singles_a
,
SUM(IF(e.event_cd = 21, 1, 0)) AS doubles_a
,
SUM(IF(e.event_cd = 22, 1, 0)) AS triples_a
,
SUM(IF(e.event_cd = 23, 1, 0)) AS HR_a
,
SUM(IF(e.ab_fl = 'T',   1, 0)) AS at_bats
,
SUM(IF(e.sf_fl = 'T',   1, 0)) AS sacrifice_flies_a
,
(
(
   (g.wBB * SUM(IF(e.event_cd = 14, 1, 0))) 
   + (g.wHBP * SUM(IF(e.event_cd = 16, 1, 0))) 
   + (g.w1B  * SUM(IF(e.event_cd = 20, 1, 0))) 
   + (g.w2B  * SUM(IF(e.event_cd = 21, 1, 0))) 
   + (g.w3B  * SUM(IF(e.event_cd = 22, 1, 0))) 
   + (g.wHR  * SUM(IF(e.event_cd = 23, 1, 0))) 
   )
   /
   (
     SUM(IF(e.ab_fl = 'T',   1, 0)) 
   + SUM(IF(e.event_cd = 14, 1, 0)) 
   + SUM(IF(e.sf_fl = 'T',   1, 0)) 
   + SUM(IF(e.event_cd = 16, 1, 0)) 
  ) 
 )  AS wOBA
FROM events AS e, GUTS AS g, game AS h
WHERE e.PIT_START_FL= 'T' 
GROUP BY g.YEAR_ID, h.GAME_ID,e.PIT_ID) AS a
INNER JOIN GUTS AS g
ON 
a.YEAR_ID=g.YEAR_ID
INNER JOIN game AS h
ON
a.GAME_ID = h.GAME_ID
INNER JOIN events AS e
ON
a.PIT_ID = e.PIT_ID

1条回答
Explosion°爆炸
2楼-- · 2019-07-12 18:44

We'll start by cleaning up the query. You should always try to perform your calculations across each row when possible rather than performing multiple vertical subqueries, as this avoids the DBMS making multiple passes over the same table.

SELECT
  (
   ( (g.wbb  * SUM(IF(e.event_cd = 14, 1, 0)))
   + (g.whbp * SUM(IF(e.event_cd = 16, 1, 0)))
   + (g.w1b  * SUM(IF(e.event_cd = 20, 1, 0)))
   + (g.w2b  * SUM(IF(e.event_cd = 21, 1, 0)))
   + (g.w3b  * SUM(IF(e.event_cd = 22, 1, 0)))
   + (g.whr  * SUM(IF(e.event_cd = 23, 1, 0)))
   )
   /
   (
     SUM(IF(e.ab_fl = 'T',   1, 0))
   + SUM(IF(e.event_cd = 14, 1, 0))
   + SUM(IF(e.sf_fl = 'T',   1, 0))
   + SUM(IF(e.event_cd = 16, 1, 0))
   )
  ) AS woba
  FROM events e, guts g
  WHERE e.year_id = g.season_id
    AND e.pit_start_fl = 'T'
    AND e.pit_id = starting_pitcher
  GROUP BY g.season;

Assuming I haven't dropped a comma somewhere, this will return a column woba for each year for the specified starting pitcher.

Note that I've joined the tables on e.year_id instead of SUBSTRING(e.game_ID,4,4); this avoids the overhead of calling SUBSTRING() on each record. This sort of thing seems minor, but it can add up quickly over a large table.

That should be enough to get you started.

查看更多
登录 后发表回答