-->

How to load the first half records in one file and

Posted 2019-06-03 12:06

Question:

So far I have tried an Expression transformation together with an Aggregator transformation to get the maximum value of the sequence number. The source is a flat file.

Answer 1:

The way you are designing it would require reading the source twice in the mapping: once to get the total number of records (the max sequence, as you called it) and once more to read the detail records and pass them to target1 or target2.

You can simplify it by passing the number of records as a mapping parameter.

Either way, to decide when to route to each target, you can count the records read by keeping a running total in a variable port, incrementing it every time a row passes through the expression and checking it against (record count)/2.
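
Outside of the mapping, the same counting logic can be sketched in shell, which may help when checking what each target should receive. This is only an illustration under assumptions: the function name split_halves and its three file arguments are hypothetical, the total comes from wc -l rather than from a mapping parameter, and awk's NR stands in for the running total kept in the variable port.

```shell
# Hypothetical helper: split_halves SOURCE TARGET1 TARGET2
# Rows whose running count is <= total/2 go to TARGET1, the rest to TARGET2.
split_halves() {
  src=$1
  target1=$2
  target2=$3

  # Total number of records; reading from stdin keeps the file name out of the output.
  total=$(wc -l < "$src")
  half=$((total / 2))

  # Make sure both outputs exist even if one half turns out to be empty.
  : > "$target1"
  : > "$target2"

  # NR is awk's running row counter, playing the role of the variable port.
  awk -v half="$half" -v t1="$target1" -v t2="$target2" '
    NR <= half { print > t1; next }
               { print > t2 }
  ' "$src"
}
```

For example, split_halves source.txt target1.txt target2.txt (hypothetical file names). With an odd number of records the extra row lands in the second file, because the test is count <= total/2 with integer division.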



Answer 2:

If you don't really care about first half and second half and all you need is two output files equal in size, you can:

  1. number the rows (with a rank transformation or a variable port),
  2. then route even and odd rows to two different targets.
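
As a rough illustration of the two steps above, here is the same even/odd routing as a shell sketch. The function name split_alternating and its file arguments are hypothetical, and awk's NR is used as the row number in place of a rank transformation or variable port.

```shell
# Hypothetical helper: split_alternating SOURCE ODD_TARGET EVEN_TARGET
# Odd-numbered rows go to ODD_TARGET, even-numbered rows to EVEN_TARGET.
split_alternating() {
  src=$1
  odd_target=$2
  even_target=$3

  # Create both outputs up front so an empty source still yields two files.
  : > "$odd_target"
  : > "$even_target"

  # NR numbers the rows starting at 1; NR % 2 decides the target.
  awk -v odd="$odd_target" -v even="$even_target" '
    NR % 2 == 1 { print > odd; next }
                { print > even }
  ' "$src"
}
```

The two outputs differ in size by at most one row, and no record count is needed beforehand, so the source is read only once.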


Answer 3:

If you can, write a shell script (assuming your platform is Unix) that runs head on the first file for half of its line count and redirects the output to a third file. To get that number, take the line count from wc, divide it by 2, and pass the result to head. Then run tail on the second file, computing the count with wc in the same way, and append (>>) the output to the third file. These would be pre-session commands, and the third file would be the source file for your session. It would look something like this (untested, but it gets the general idea across):

# wc -l < file prints only the count, without the file name
halfsize=$(wc -l < filename)
halfsize=$((halfsize / 2))
head -n "$halfsize" filename > thirdfile
halfsize=$(wc -l < filename2)
halfsize=$((halfsize / 2))
tail -n "$halfsize" filename2 >> thirdfile


Answer 4:

Prior to writing to the target, keep a running count in an Expression transformation, then connect that expression to a Router. The Router should have two groups:

  1. group1: count1 <= n/2, route it to Target1
  2. group2: count1 > n/2, route it to Target2

Or

MOD(NEXTVAL, 2) will send alternate records to alternate targets. Note, though, that it won't send the first half to the 1st target and the second half to the 2nd target.



Tags: informatica