可以将文章内容翻译成中文,广告屏蔽插件可能会导致该功能失效(如失效，请关闭广告屏蔽插件后再试):

问题:

Main Table:

CREATE TABLE ca_ger.cln_trans_base (
  store_code VARCHAR(25),
  year_week varchar(25),
  division VARCHAR(25) encode lzo,
  market_share_code VARCHAR(25) encode bytedict,
  pon   VARCHAR(25) encode lzo,
  osp_price REAL encode bytedict
)
diststyle key distkey(store_code)
sortkey(store_code,year_week); 

COMMIT;

Second table:

CREATE TABLE ca_ger.divisions(
  store_code VARCHAR(25),
  year_week varchar(25),
  division VARCHAR(25) encode lzo
)
diststyle key distkey(store_code)
sortkey(store_code,year_week); 

COMMIT; 

alter table ca_ger.cln_Trans_base
add column division(varchar 25);

Join clause:

update ca_ger.cln_trans_base 
 set division=b.division 
from ca_Ger.cln_trans_base a 
  join divisions b on a.year_week=b.year_week;

After this, I ran vacuum and analyze. After this, I checked stats table and rows are doubled.

回答1:

In Amazon Redshift,

Update=DELETE+INSERT

Hence your rows size is doubled. As you might be updating all the rows.

After every Update or on scheduled basis, you should be doing full Vaccum as suggested in one of comment.

VACUUM FULL cln_trans_base

It will reduce the size to half. Hope it will help you.

回答2:

Extending @a_horse's suggestion, the UPDATE statement should not repeat the table being updated. Instead, use:

UPDATE ca_ger.cln_trans_base a
SET division = b.division 
FROM divisions b
WHERE a.year_week = b.year_week;

See: PostgreSQL UPDATE Join with A Practical Example

RedShift table rows are duplicated after updating

问题:

回答1:

回答2:

收藏的人(0)

RedShift table rows are duplicated after updating

问题:

回答1:

回答2:

收藏的人(0)

举报内容

检举类型

检举原因

检举说明(必填)

打开微信“扫一扫”，打开网页后点击屏幕右上角分享按钮