How to parse a text file using shell [duplicate]

Posted 2019-05-25 09:40

Question:

This question already has an answer here:

  • Code for parsing a key/value in in file from shell script 6 answers

I have a text file 'builds.txt' with sample data as follows. I just want to know the MARX_BUILD_NO number, i.e. 12. How can I achieve this? I tried grep MARX_BUILD builds.txt, which gives MARX_BUILD_NO=12.

AM_BUILD_NO=1500
KJI_BUILD_NO=374
LINE_BUILD_NO=365
MARX_BUILD_NO=12

Answer 1:

You're already part of the way there, so one option would be to pipe your output to cut:

grep MARX_BUILD builds.txt | cut -d= -f2

This splits the line on the = and prints the second field.

Personally, I'd do the whole thing in one go using awk:

awk -F= '/MARX_BUILD/ { print $2 }' builds.txt

This does the same but uses a single tool instead of two.
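One caveat: the pattern /MARX_BUILD/ matches anywhere on the line, so a key that merely contains that text would match as well. A minimal sketch, assuming one KEY=VALUE pair per line, compares the whole first field with the key instead. The temporary file and the OLD_MARX_BUILD_NO line are made up for illustration and are not part of the original data.

```shell
#!/bin/sh
# Recreate some sample data in a temporary file (hypothetical, for illustration).
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
AM_BUILD_NO=1500
OLD_MARX_BUILD_NO=7
MARX_BUILD_NO=12
EOF

# Compare the first field with the key as a string rather than a regex,
# and stop reading the file after the first hit.
value=$(awk -F= '$1 == "MARX_BUILD_NO" { print $2; exit }' "$tmp")
echo "$value"

rm -f "$tmp"
```

With the sample data above this prints 12 and ignores the OLD_MARX_BUILD_NO line.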



Answer 2:

Pure bash only

There are three ways of doing this without external binaries (such as sed, grep, awk or others).

1. Sourcing (like eval, and eval is evil)

If you trust the origin of the file, you could source it:

source builds.txt
echo $KJI_BUILD_NO
374
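If the file's origin is less certain, one option is to check its shape before sourcing it. The sketch below assumes every line must be NAME=digits, which holds for the sample data but is an assumption about your file; the temporary file and the status variable are hypothetical. Note that it uses grep, so it is no longer built-ins only.

```shell
#!/bin/sh
# Recreate the sample data in a temporary file (hypothetical, for illustration).
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
AM_BUILD_NO=1500
KJI_BUILD_NO=374
LINE_BUILD_NO=365
MARX_BUILD_NO=12
EOF

# grep -v selects lines that do NOT look like NAME=digits;
# if any such line exists, refuse to source the file.
if grep -Eqv '^[A-Za-z_][A-Za-z0-9_]*=[0-9]+$' "$tmp"; then
    echo "refusing to source $tmp: unexpected content" >&2
    status=rejected
else
    . "$tmp"
    status=sourced
    echo "$KJI_BUILD_NO"
fi

rm -f "$tmp"
```

This check only guards against lines that are not plain numeric assignments; loosen the pattern with care if your values can contain other characters.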

2. Read whole file into an associated array

This method is more secure, though a little heavier.

declare -A AArray
# Accept only lines whose key is a valid identifier; skip everything else.
while IFS== read -r var val; do
    [[ "$var" =~ ^[A-Za-z_][A-Za-z0-9_]*$ ]] && AArray[$var]=$val
done <builds.txt
echo "${AArray[MARX_BUILD_NO]}"
12

3. Only one field

Like the previous one, this method is more secure than eval. It is also lighter if you only need to access one field in a config file:

ans=$(<builds.txt)
ans=${ans#*LINE_BUILD_NO=}
ans=${ans%%[${IFS}]*}
echo $ans
365

or

field=KJI_BUILD_NO
ans=$(<builds.txt);ans=${ans#*${field}=};ans=${ans%%[${IFS}]*};echo $ans
374
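The parameter expansions above match the key text anywhere in the file, so LINE_BUILD_NO= would also be found inside a longer key. A minimal sketch of an alternative that still runs no external program reads the file line by line with the read built-in and compares the key exactly. The function name get_field, the temporary file, and the OLD_LINE_BUILD_NO line are hypothetical, chosen for illustration.

```shell
#!/bin/sh
# get_field KEY FILE: print the value of KEY from a KEY=VALUE file.
# Hypothetical helper; it uses only shell built-ins inside the function.
# Note: plain sh has no local variables, so key and val are global.
get_field() {
    # "|| [ -n "$key" ]" also handles a last line without a trailing newline.
    while IFS='=' read -r key val || [ -n "$key" ]; do
        if [ "$key" = "$1" ]; then
            printf '%s\n' "$val"
            return 0
        fi
    done < "$2"
    return 1   # key not found
}

# Sample data in a temporary file (hypothetical, for illustration).
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
OLD_LINE_BUILD_NO=1
KJI_BUILD_NO=374
LINE_BUILD_NO=365
EOF

value=$(get_field LINE_BUILD_NO "$tmp")
echo "$value"

# A missing key is reported through the exit status.
if get_field NO_SUCH_KEY "$tmp"; then found=yes; else found=no; fi

rm -f "$tmp"
```

With this sample data the script prints 365, and the lookup of NO_SUCH_KEY fails without printing anything.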

Time comparison

I prefer to avoid forks in order to speed up script execution.

Here are some timings (on my host), ranked by speed:

  1. Sourcing (but only if you trust the source file!)

    time for ((i=1000;i--;));do . builds.txt ;done;echo $MARX_BUILD_NO
    0.044s 
    12
    
  2. Associative array

    declare -A AArray
    time for ((i=1000;i--;)) ;do
        while IFS== read var val ;do
            [[ "$var" =~ ^[a-zA-Z0-9_]*$ ]] && AArray[$var]=$val
          done <builds.txt
      done ;echo ${AArray[MARX_BUILD_NO]}
    0.356s 
    12
    
  3. grep -Po

    time for ((i=1000;i--;));do ans=$(
        grep -Po '^MARX_BUILD_NO=\K\d*$' builds.txt);done;echo $ans
    1.406s 
    12
    
  4. sed

    time for ((i=1000;i--;));do ans=$(
        sed -ne 's/MARX_BUILD_NO=//p' builds.txt);done;echo $ans
    1.454s 
    12
    
  5. awk

    time for ((i=1000;i--;));do ans=$(
        awk -F= '/MARX_BUILD_NO/{print $2}' builds.txt);done;echo $ans
    2.089s 
    12
    
  6. grep | cut

    time for ((i=1000;i--;));do ans=$(
        grep MARX_BUILD_NO builds.txt | cut -d '=' -f2)
      done;echo $ans
    2.292s 
    12
    

Clearly, the last method takes the most time because of the fork and the pipe: two new, separate processes have to be run for each variable assignment.



Answer 3:

If your grep supports PCRE, you can do:

grep -Po '^MARX_BUILD_NO=\K\d*$' builds.txt

Or sed:

sed -n 's/^MARX_BUILD_NO=\([0-9]*\)$/\1/p' builds.txt


Answer 4:

If your grep does not support PCRE, you can just use:

grep MARX_BUILD_NO builds.txt | cut -d '=' -f2


Tags: linux bash sh