I have some files in Linux (two, for example) and I need to shuffle their lines together into one file.
For example:
$cat file1
line 1
line 2
line 3
line 4
line 5
line 6
line 7
line 8
and
$cat file2
linea one
linea two
linea three
linea four
linea five
linea six
linea seven
linea eight
After shuffling the two files together, I would like to get something like:
linea eight
line 4
linea five
line 1
linea three
line 8
linea seven
line 5
linea two
linea one
line 2
linea four
line 7
linea six
line 3
line 6
You should use the shuf command =)
cat file1 file2 | shuf
Or with Perl (without -n, which would consume the first line before shuffle sees it):
cat file1 file2 | perl -MList::Util=shuffle -we 'print shuffle <>;'
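Since the question asks for the result in one file, here is a minimal sketch of how the shuf pipeline could be wrapped. The function name shuffle_into and its argument order are made up for illustration, and it assumes shuf and mktemp are available.

```shell
# shuffle_into is a hypothetical helper name, not a standard command.
# Usage: shuffle_into OUTPUT INPUT...
shuffle_into() {
    local out tmp
    out=$1
    shift
    # Write to a temporary file first, so OUTPUT may also be one of the inputs.
    tmp=$(mktemp) || return 1
    if cat -- "$@" | shuf > "$tmp"; then
        mv -- "$tmp" "$out"
    else
        rm -f -- "$tmp"
        return 1
    fi
}
```

For the files in the question this would be called as shuffle_into shuffled.txt file1 file2.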
Sort:
cat file1 file2 | sort -R
Shuf:
cat file1 file2 | shuf
Perl:
cat file1 file2 | perl -MList::Util=shuffle -e 'print shuffle<STDIN>'
BASH:
cat file1 file2 | while IFS= read -r line
do
printf "%06d %s\n" $RANDOM "$line"
done | sort -n | cut -c8-
Awk:
cat file1 file2 | awk 'BEGIN{srand()}{printf "%06d %s\n", rand()*1000000, $0;}' | sort -n | cut -c8-
Just a note to OS X users who use MacPorts: the shuf command is part of coreutils and is installed under the name gshuf.
$ sudo port install coreutils
$ gshuf example.txt # or cat example.txt | gshuf
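Because the command may be called shuf or gshuf depending on the system, a small wrapper can pick whichever one exists. This is only a sketch: shuffle_lines is a made-up name, and the last branch falls back to the awk approach quoted earlier in this thread.

```shell
# shuffle_lines is a hypothetical wrapper name.
# Usage: shuffle_lines [FILE...]   (reads standard input when no file is given)
shuffle_lines() {
    if command -v shuf >/dev/null 2>&1; then
        cat -- "$@" | shuf
    elif command -v gshuf >/dev/null 2>&1; then
        cat -- "$@" | gshuf
    else
        # Fallback: prefix each line with a random 6-digit key, sort on it, strip it.
        cat -- "$@" |
            awk 'BEGIN { srand() } { printf "%06d %s\n", int(rand() * 1000000), $0 }' |
            sort -n | cut -c8-
    fi
}
```

Usage would then be the same on both systems: shuffle_lines file1 file2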
Here's a one-liner that doesn't rely on shuf or sort -R, which I didn't have on my Mac:
while IFS= read -r line; do printf '%s %s\n' "$RANDOM" "$line"; done < my_file | sort -n | cut -d' ' -f2-
This iterates over all the lines in my_file and reprints them in a randomized order.
I would use shuf too.
Another option: GNU sort has
-R, --random-sort
sort by random hash of keys
You could try:
cat file1 file2 | sort -R
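One consequence of "random hash of keys" is that identical lines get the same hash, so they come out next to each other. A small sketch that makes this visible, assuming a sort that supports -R:

```shell
# Four input lines, but only two distinct values.
# uniq -c counts runs of adjacent identical lines.
printf 'a\nb\na\nb\n' | sort -R | uniq -c
```

With sort -R this reports two runs of two lines each, in either order. That makes no difference for the files in the question, where every line is distinct, but for input with repeated lines shuf gives a proper shuffle.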
You don't need a pipe here. sort does this on its own when given the file(s) as arguments. I would just do
sort -R file1
or if you have multiple files
sort -R file1 file2
This worked for me. It employs the Fisher-Yates shuffle.
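randomize()
{
    local arguments=("$@")
    local -a out
    local i=$(( $# - 1 ))    # index of the last element not yet picked
    local j=0 which
    while [[ $i -ge 0 ]] ; do
        which=$(random_range 0 "$i")        # random index in 0..i, inclusive
        out[j]=${arguments[which]}
        arguments[which]=${arguments[i]}    # move the last unpicked element into the gap
        (( i-- ))
        (( j++ ))
    done
    echo "${out[*]}"
}
random_range()
{
    # Print a random integer between $1 and $2, both inclusive.
    local low=$1
    local range=$(( $2 - $1 + 1 ))
    echo $(( low + RANDOM % range ))
}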
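The function above shuffles its arguments and prints them on one line. To apply the same Fisher-Yates idea to the lines of files, one option is to do it in awk, which also avoids the 0..32767 range of $RANDOM. This is a sketch under that assumption, and fy_shuffle is a made-up name.

```shell
# fy_shuffle is a hypothetical helper name.
# Usage: fy_shuffle [FILE...]   (reads standard input when no file is given)
fy_shuffle() {
    awk '
        BEGIN { srand() }            # seed from the time of day
        { line[NR] = $0 }            # keep every input line in memory
        END {
            # Fisher-Yates: walk down from the last line, swapping each one
            # with a randomly chosen line at or before it.
            for (i = NR; i > 1; i--) {
                j = int(rand() * i) + 1      # random index in 1..i
                tmp = line[i]; line[i] = line[j]; line[j] = tmp
            }
            for (i = 1; i <= NR; i++) print line[i]
        }
    ' "$@"
}
```

It is called like the other commands here: fy_shuffle file1 file2. Note that srand() is seeded from the clock, so two runs within the same second can give the same order.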
This is clearly a biased shuffle (about half the time the output will start with the first line), but for some basic randomization with just bash builtins I guess it is fine? It either prints each line right away or holds it back, at random, then prints the held-back lines at the end:
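shuffle() {
    local l tail=
    while IFS= read -r l; do
        if [ $((RANDOM % 2)) = 1 ]; then
            printf '%s\n' "$l"          # print this line right away
        else
            tail+="$l"$'\n'             # hold it back for the end
        fi
    done < "$1"
    printf '%s' "$tail"                 # held-back lines, in their original order
}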
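To put a number on the bias, the coin-flip scheme can be simulated without touching any files. The sketch below is only an illustration (first_line_bias is a made-up name): it models N lines, repeats the procedure TRIALS times, and prints the fraction of runs in which line 1 comes out first.

```shell
# first_line_bias is a hypothetical helper name.
# Usage: first_line_bias N TRIALS
first_line_bias() {
    awk -v n="$1" -v trials="$2" 'BEGIN {
        srand()
        for (t = 0; t < trials; t++) {
            # The first line printed is the first one that wins its coin flip.
            # If none wins, all lines are held back and line 1 still leads.
            first = 1
            for (i = 1; i <= n; i++)
                if (rand() < 0.5) { first = i; break }
            if (first == 1) hits++
        }
        printf "%.3f\n", hits / trials
    }'
}

first_line_bias 8 10000
```

For 8 lines the printed fraction should come out close to 0.5, whereas a uniform shuffle would put line 1 first only 1 time in 8 (0.125).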