I am looking for a way to split a string in bash over a delimiter string, and place the parts in an array.
Simple case:
#!/bin/bash
b="aaaaa/bbbbb/ddd/ffffff"
echo "simple string: $b"
IFS='/' b_split=($b)
echo ;
echo "split"
for i in ${b_split[@]}
do
echo "------ new part ------"
echo "$i"
done
Gives output
simple string: aaaaa/bbbbb/ddd/ffffff
split
------ new part ------
aaaaa
------ new part ------
bbbbb
------ new part ------
ddd
------ new part ------
ffffff
More complex case:
#!/bin/bash
c=$(echo "AA=A"; echo "B=BB"; echo "======="; echo "C==CC"; echo "DD=D"; echo "======="; echo "EEE"; echo "FF";)
echo "more complex string"
echo "$c";
echo ;
echo "split";
IFS='=======' c_split=($c) ;# <---- LINE TO BE CHANGED
for i in ${c_split[@]}
do
echo "------ new part ------"
echo "$i"
done
Gives output:
more complex string
AA=A
B=BB
=======
C==CC
DD=D
=======
EEE
FF
split
------ new part ------
AA
------ new part ------
A
B
------ new part ------
BB
------ new part ------
------ new part ------
------ new part ------
------ new part ------
------ new part ------
------ new part ------
------ new part ------
C
------ new part ------
------ new part ------
CC
DD
------ new part ------
D
------ new part ------
------ new part ------
------ new part ------
------ new part ------
------ new part ------
------ new part ------
------ new part ------
EEE
FF
I would like the second output to be like
------ new part ------
AA=A
B=BB
------ new part ------
C==CC
DD=D
------ new part ------
EEE
FF
I.e. to split the string on a sequence of characters, instead of one. How can I do this?
I am looking for an answer that would only modify this line in the second script:
IFS='=======' c_split=($c) ;# <---- LINE TO BE CHANGED
Added some in the example text because of this comment:
EDIT: I added a suggestion that isn't sensitive to a delimiter appearing in the text. It doesn't use the "one line split" that the OP asked for, but it is how I would do it in bash if I wanted the result in an array.
script.sh (NEW):
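A minimal sketch of a delimiter-insensitive split along these lines, not necessarily the script this answer used. It assumes the separator always sits on a line of its own, as in the question's sample; the names sep, nl, part and line are illustrative.

```bash
#!/bin/bash
c=$'AA=A\nB=BB\n=======\nC==CC\nDD=D\n=======\nEEE\nFF'
sep='======='
nl=$'\n'
c_split=()
part=''
# Read the string line by line; a line equal to the separator closes the current part.
while IFS= read -r line; do
    if [[ $line == "$sep" ]]; then
        c_split+=("$part")
        part=''
    else
        # Join lines of the same part with a newline (none before the first line).
        part+=${part:+$nl}$line
    fi
done <<< "$c"
c_split+=("$part")   # the last part has no closing separator

for i in "${c_split[@]}"; do
    echo "------ new part ------"
    echo "$i"
done
```

Because only whole lines are compared with the separator, a line such as C==CC is left untouched.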
script.sh (OLD, with "one line split"):
(I stole the idea with awk from @Kent and adjusted it a bit)
Output:
I'm not using -e for echo, so that AA=\\nA does not produce a newline.
Following script tested in bash:
the script (named t.sh):
output:
Note the echo statement in that for loop: if you remove the -e option you will see:
Whether to use -e or not depends on your requirement.
Do it with awk:
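One way an awk-based split could look, as a hedged sketch rather than the answer's own command. It assumes the separator is a whole line and that the control character \026 never occurs in the data; sep and ph are illustrative names.

```bash
#!/bin/bash
c=$'AA=A\nB=BB\n=======\nC==CC\nDD=D\n=======\nEEE\nFF'
sep='======='
ph=$'\026'   # placeholder character, assumed absent from the data
# awk prints the placeholder instead of every line equal to the separator;
# read then splits the whole stream on that single character.
IFS=$ph read -r -d '' -a c_split < <(
    awk -v sep="$sep" '$0 == sep { printf "\026"; next } { print }' <<< "$c"
)

for i in "${c_split[@]}"; do
    echo "------ new part ------"
    printf '%s' "$i"   # each part already ends with its own newline
done
```

read returns a non-zero status here because it hits end of input before a NUL delimiter; the array is still filled.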
output:
Here's an approach that doesn't fumble when the data contains literal backslash sequences, spaces and other special characters:
Note that the string is actually split on "=======" as requested, so the line feeds become part of the data (causing extra blank lines when "echo" adds its own).
IFS disambiguation
IFS means Input Field Separators: a list of characters that can be used as separators.
By default, it is set to space, tab and newline (" \t\n"), meaning that any number (greater than zero) of spaces, tabs and/or newlines counts as one separator.
So for a string holding blah, foo=bar and baz with runs of whitespace before, between and after them, leading and trailing separators are ignored and the string contains only 3 parts: blah, foo=bar and baz.
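A small sketch of that default behaviour. The sample string s is made up for illustration, and IFS is reset explicitly in case it was changed earlier.

```bash
#!/bin/bash
IFS=$' \t\n'                       # the default value: space, tab, newline
s=$'   blah  foo=bar \n\t baz   '
parts=($s)                         # unquoted on purpose, so word splitting applies
echo "${#parts[@]}"                # 3
printf '<%s>\n' "${parts[@]}"      # <blah>, <foo=bar>, <baz>, one per line
```

The = inside foo=bar is not a separator, so it stays part of the field.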
Splitting a string using IFS is possible if you know a valid field separator that is not used in your string, for example by replacing every ======= with § and then splitting on §.
But this works only as long as the string does not contain §.
You could use another character, like
IFS=$'\026';c_split=(${c//=======/$'\026'})
but this may still introduce further bugs. You could browse character maps to find a character that is not in your string, but I find this solution a little overkill.
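The placeholder idea can be sketched as follows, assuming \026 never appears in the data. The names ph and old_ifs are illustrative, and set -f is added here so that the unquoted expansion cannot trigger pathname expansion.

```bash
#!/bin/bash
c=$'AA=A\nB=BB\n=======\nC==CC\nDD=D\n=======\nEEE\nFF'
ph=$'\026'        # single-character stand-in for the separator (assumed unused in c)
old_ifs=$IFS
set -f            # disable globbing for the unquoted expansion below
IFS=$ph
c_split=(${c//=======/$ph})
IFS=$old_ifs      # restore the previous field separators
set +f

for i in "${c_split[@]}"; do
    echo "------ new part ------"
    echo "$i"
done
```

The newlines around each ======= remain in the parts, so the output shows extra blank lines.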
Splitting on spaces (or without modifying IFS)
Under bash, we could use this bashism:
In fact, the syntax ${varname// starts a substitution (delimited by /) that replaces all occurrences of / with a space before the result is assigned to the array b_split.
Of course, this still uses IFS and splits the array on spaces. This is not the best way, but it can work in specific cases.
You could even drop unwanted spaces before splitting:
or exchange them...
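A sketch of these three variants under the default IFS. The variable spaced and the underscore stand-in are assumptions made for the example.

```bash
#!/bin/bash
IFS=$' \t\n'                     # default separators
b="aaaaa/bbbbb/ddd/ffffff"
b_split=(${b//\// })             # every / becomes a space, then word splitting applies
printf '%s\n' "${b_split[@]}"    # aaaaa, bbbbb, ddd, ffffff on separate lines

spaced="aa aa/bb bb/dd"
tmp=${spaced// /}                # drop the unwanted spaces first
dropped=(${tmp//\// })           # elements: aaaa, bbbb, dd

tmp=${spaced// /_}               # or swap spaces for a stand-in (assumed absent from the data)
swapped=(${tmp//\// })
swapped=("${swapped[@]//_/ }")   # put the spaces back: "aa aa", "bb bb", "dd"
```

All three rely on the data containing no glob characters such as * or ?, since the expansions are unquoted.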
Splitting line on strings:
So you cannot use IFS for this purpose, but bash does have nice features:
Let's see:
Nota: Leading and trailing newlines are not deleted. If this is needed, you could use a separator that includes the surrounding newlines instead of simply =======.
Or you could rewrite the split loop to keep them out explicitly:
In any case, this matches what the SO question asked for (and its sample).
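A minimal sketch of such a loop, using only parameter expansion. Here the separator is assumed to include its surrounding newlines, so the printed parts carry no stray blank lines.

```bash
#!/bin/bash
c=$'AA=A\nB=BB\n=======\nC==CC\nDD=D\n=======\nEEE\nFF'
mySep=$'\n=======\n'
# As long as the separator still occurs, print what precedes it and keep the rest.
while [[ $c == *"$mySep"* ]]; do
    echo "------ new part ------"
    echo "${c%%"$mySep"*}"      # everything before the first separator
    c=${c#*"$mySep"}            # everything after the first separator
done
echo "------ new part ------"
echo "$c"                       # the last part
```

Quoting "$mySep" inside the expansions makes bash treat it as literal text rather than as a pattern.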
Finally creating an array
This can be done neatly:
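Some explanations:
export -a var defines var as an array. Note that bash does not pass array variables to child processes; declare -a is the documented way to declare an array.
${variablename%string*} and ${variablename%%string*} return the left part of variablename, up to but not including string. A single % cuts at the last occurrence of string (the shortest match is removed) and %% cuts at the first occurrence (the longest match is removed). The full value of variablename is returned if string is not found.
${variablename#*string} does the same in reverse: it returns the right part of variablename, after but not including string. A single # cuts after the first occurrence and ## cuts after the last occurrence.
Nota: in these patterns, the character * is a wildcard meaning any number of any characters.
The command echo "${c%%$'\n'}" echoes variable c without its trailing newline. It removes a single newline; removing every trailing newline needs a loop or an extglob pattern.
So if a variable contains Hello WorldZorGluBHello youZorGluBI'm happy, the expansions above, with ZorGluB as string, return its different parts.
All this is explained in the bash manpage, under Parameter Expansion.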
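The four expansions, applied to that sample value, can be checked with a short sketch like this one (the variable name is illustrative):

```bash
#!/bin/bash
variable="Hello WorldZorGluBHello youZorGluBI'm happy"
echo "${variable%ZorGluB*}"    # Hello WorldZorGluBHello you  (cut at the last ZorGluB)
echo "${variable%%ZorGluB*}"   # Hello World                  (cut at the first ZorGluB)
echo "${variable#*ZorGluB}"    # Hello youZorGluBI'm happy    (after the first ZorGluB)
echo "${variable##*ZorGluB}"   # I'm happy                    (after the last ZorGluB)
```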
Step by step, the splitting loop:
Define the separator, mySep.
Declare c_split as an array.
Loop while variable c contains at least one occurrence of mySep.
Truncate c from the first mySep to the end of the string and assign the result to part.
Remove leading newlines.
Remove trailing newlines and add the result as a new array element to c_split.
Reassign c with the rest of the string, once the left part up to and including mySep is removed.
Done ;-)
After the loop, for the last part still held in c: remove leading newlines, then remove trailing newlines and add the result as a new array element to c_split.
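Put together, the steps above can be sketched as follows. This is a reconstruction under assumptions, not the answer's exact code: nl is an illustrative helper variable, and each trim removes one newline, which is enough for the question's sample.

```bash
#!/bin/bash
c=$'AA=A\nB=BB\n=======\nC==CC\nDD=D\n=======\nEEE\nFF'
mySep='======='                      # the separator
nl=$'\n'
declare -a c_split=()                # declare c_split as an array
while [[ $c == *"$mySep"* ]]; do     # while c still contains mySep
    part=${c%%"$mySep"*}             # keep what precedes the first mySep
    part=${part#"$nl"}               # remove one leading newline
    c_split+=("${part%"$nl"}")       # remove one trailing newline, append to the array
    c=${c#*"$mySep"}                 # continue with what follows the first mySep
done
part=${c#"$nl"}                      # same trimming for the last part
c_split+=("${part%"$nl"}")

for i in "${c_split[@]}"; do
    echo "------ new part ------"
    echo "$i"
done
```

With the question's sample this prints the three parts exactly as requested, and C==CC stays intact because only the full ======= matches.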
Into a function:
Usage:
where the array name is $splitted_array by default and the delimiter is a single space.
You could use:
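A hedged sketch of what such a function might look like. The name split_on_string and the argument order (string, delimiter, array name) are assumptions; only the defaults, a single space and splitted_array, come from the text above.

```bash
#!/bin/bash
# split_on_string STRING [DELIMITER] [ARRAY_NAME]   (hypothetical name and signature)
split_on_string() {
    local _str=$1 _sep=${2:- } _name=${3:-splitted_array}
    local -a _parts=()
    # Refuse anything that is not a plain identifier before handing it to eval.
    [[ $_name =~ ^[A-Za-z_][A-Za-z0-9_]*$ ]] || return 1
    while [[ $_str == *"$_sep"* ]]; do
        _parts+=("${_str%%"$_sep"*}")   # text before the first delimiter
        _str=${_str#*"$_sep"}           # text after the first delimiter
    done
    _parts+=("$_str")
    eval "$_name=(\"\${_parts[@]}\")"   # copy the result into the named array
}

split_on_string "one two three"
printf '%s\n' "${splitted_array[@]}"    # one, two, three on separate lines

c=$'AA=A\nB=BB\n=======\nC==CC\nDD=D\n=======\nEEE\nFF'
split_on_string "$c" $'\n=======\n' c_split
echo "${#c_split[@]}"                   # 3
```

An empty delimiter falls back to the single space, which also avoids an endless loop. The target array must not be named like one of the function's own local variables.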