Combine two particular lines using sed

Posted 2019-09-15 04:35

I have the following input file, which you might recognize as a Debian Packages file:

Package: nimbox-apexer-sales
Version: 1.0.0-201007241449
Architecture: i386
Maintainer: Ricardo Marimon <rmarimon@nimbox.com>
Installed-Size: 124
Depends: nimbox-apexer-root
Filename: binary/nimbox-apexer-sales_1.0.0-201007241449_i386.deb
Size: 68880
MD5sum: c4538f2913d76b57110ba73d0b87cc16
Section: base
Priority: optional
Description: Sales Application for NiMbox.

Package: nimbox-tomcat
Version: 6.0.26-5
Architecture: i386
Maintainer: Ricardo Marimon <rmarimon@nimbox.com>
Installed-Size: 6144
Depends: sun-java6-jdk
Filename: binary/nimbox-tomcat_6.0.26-5_i386.deb
Size: 5490024
MD5sum: 5f2ccbe6137af2842e1c81bc217444e3
Section: base
Priority: optional
Description: Tomcat Servlet Application Server for NiMbox
 NiMbox requires a servlet application server in order to work.  The current
 NiMbox implementation requires a Tomcat Servlet Application.

The file actually has many of these entries, and I want to produce the following file:

nimbox-apexer-sales 1.0.0-201007241449
nimbox-tomcat 6.0.26-5

The Package and the Version should be separated by a tab so that I can later use cut to extract them. I'm pretty sure this can be done with sed. I went over the sed one-liners, but this is probably a bit more complex. Any ideas?

6 answers
疯言疯语
#2 · 2019-09-15 05:08

Here is a sed version:

  # \t in the replacement is a GNU sed extension; use a literal tab elsewhere
  sed -n -e 's/^Package: \(.*\)/\1/p' \
         -e 's/^Version: \(.*\)/\1/p' filename |
      sed 'N;s/\n/\t/'
叛逆
#3 · 2019-09-15 05:11

Using RPMs, the solution would have been:

rpm -qa --queryformat "%{NAME}\t%{VERSION}\n"

Too bad for the sed challenge.

冷血范
#4 · 2019-09-15 05:12

When working with Debian Packages files, you might find grep-dctrl useful. It's incredibly flexible both in how it lets you limit the data it outputs and in how it formats that output. Instead of trying to parse the Packages file format myself, I'd just ask grep-dctrl to do it for me and print only the bits of information I'm actually interested in:

$ grep-dctrl -n -s Package,Version nimbox /var/lib/apt/lists/..._Packages

That would give you something like:

nimbox-apexer-sales
1.0.0-201007241449

nimbox-tomcat
6.0.26-5

With that, it's only a matter of joining the right lines together, which is easy enough with, for example, perl:

$ ... | perl -0pe 's/(?<!^)\n(?!\n)/ /mg; s/\n\n/\n/g'
nimbox-apexer-sales 1.0.0-201007241449
nimbox-tomcat 6.0.26-5

or any set of other standard UNIX tools you happen to like.

It's certainly possible to go directly from the Packages file format to what you want, but using tools specialized for the job seems like a good idea to me.
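
As a rough sketch of that direct route, awk's paragraph mode (an empty RS) treats each blank-line-separated stanza as one record, so the Package and Version fields can be picked out wherever they appear within the stanza. The temporary sample file and its trimmed contents are assumptions made for illustration only.

```shell
# Hypothetical sample file holding two trimmed stanzas from the question.
sample=$(mktemp)
cat > "$sample" <<'EOF'
Package: nimbox-apexer-sales
Version: 1.0.0-201007241449
Architecture: i386
Description: Sales Application for NiMbox.

Package: nimbox-tomcat
Version: 6.0.26-5
Architecture: i386
Description: Tomcat Servlet Application Server for NiMbox
 NiMbox requires a servlet application server in order to work.
EOF

# RS="" switches awk to paragraph mode (one record per stanza) and
# FS="\n" makes every line of the stanza a separate field.
# "Package: " and "Version: " are both 9 characters long, so the value
# starts at character 10.
result=$(awk 'BEGIN { RS = ""; FS = "\n" }
{
    pkg = ""; ver = ""
    for (i = 1; i <= NF; i++) {
        if ($i ~ /^Package: /) pkg = substr($i, 10)
        if ($i ~ /^Version: /) ver = substr($i, 10)
    }
    if (pkg != "") print pkg "\t" ver
}' "$sample")

printf '%s\n' "$result"
rm -f "$sample"
```

Because the columns are separated by a tab, which is cut's default delimiter, `cut -f1` and `cut -f2` can then pull out the names and versions.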

Melony?
#5 · 2019-09-15 05:19

This might work for you:

sed '/Package:/!d;N;s/^[^ ]* //mg;y/\n/\t/' filename
nimbox-apexer-sales     1.0.0-201007241449
nimbox-tomcat   6.0.26-5

Also, notice that the same information can be gathered from the Filename: line:

sed '/Filename:/!d;s,.*/\([^_]*\)_\([^_]*\).*,\1\t\2,' filename
nimbox-apexer-sales     1.0.0-201007241449
nimbox-tomcat   6.0.26-5

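Both commands rely on GNU sed extensions (the M flag in the first, \t in both), so they may not work with other sed implementations.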
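
Where GNU extensions are not available, one possible portable variant keeps the package name in the hold space and splices in a literal tab produced by printf. This is only a sketch: it assumes every stanza has a Package line somewhere before its Version line, and the temporary sample file is made up for illustration.

```shell
# Hypothetical sample file holding two trimmed stanzas from the question.
sample=$(mktemp)
cat > "$sample" <<'EOF'
Package: nimbox-apexer-sales
Version: 1.0.0-201007241449
Architecture: i386

Package: nimbox-tomcat
Version: 6.0.26-5
Architecture: i386
EOF

# A literal tab, because \t inside a sed script is not portable.
tab=$(printf '\t')

# Package line: strip the label and save the name in the hold space (h).
# Version line: strip the label, append it to the hold space (H), swap the
# two buffers (x), turn the joining newline into a tab, and print (p).
result=$(sed -n \
    -e '/^Package: /{s/^Package: //;h;}' \
    -e '/^Version: /{s/^Version: //;H;x;s/\n/'"$tab"'/;p;}' \
    "$sample")

printf '%s\n' "$result"
rm -f "$sample"
```

Unlike the N-based command, this version does not require the Version line to follow the Package line immediately.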

Deceive 欺骗
#6 · 2019-09-15 05:25

Pure sed solution (using FreeBSD sed on Mac OS X):

# See: 
# http://sed.sourceforge.net/sedfaq3.html#s3.3: ... (6) Relentless ...
# http://sed.sourceforge.net/sed1line.txt: ... # if a line begins with ...

sed -n '/^Package:/{
:a
N
/\nVersion:/!ba
p
}' file |
sed -E -e :a -e $'$!N;s/\\nVersion: */\t/;ta' -e 'P;D' |
sed -e 's/^Package: *//'
女痞
#7 · 2019-09-15 05:27

Assuming that your file name is test.txt:

grep -P '^Package: |^Version:' test.txt  | awk '{ print $2 }' | sed -e 'N;s/\n/ /'

Where:

  1. grep -P '^Package: |^Version:' - selects the lines beginning with 'Package: ' or 'Version:'
  2. awk '{ print $2 }' - prints the second whitespace-separated field, which drops the 'Package:' and 'Version:' labels
  3. sed -e 'N;s/\n/ /' - joins each pair of consecutive lines with a space
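
The join in step 3 can also be done with paste, whose default delimiter is a tab, which is the separator the question asks for. The following is only a sketch: a temporary file stands in for test.txt, its trimmed contents are an assumption, and cut replaces awk for stripping the labels.

```shell
# Hypothetical stand-in for test.txt with two trimmed stanzas.
sample=$(mktemp)
cat > "$sample" <<'EOF'
Package: nimbox-apexer-sales
Version: 1.0.0-201007241449
Architecture: i386

Package: nimbox-tomcat
Version: 6.0.26-5
Architecture: i386
EOF

# 1. keep only the Package and Version lines
# 2. keep the second space-separated field (the value)
# 3. "paste - -" reads two lines at a time from stdin and joins them
#    with a tab
result=$(grep -E '^(Package|Version): ' "$sample" | cut -d' ' -f2 | paste - -)

printf '%s\n' "$result"
rm -f "$sample"
```

Like the original pipeline, this relies on each Package line being followed by its Version line before the next Package line appears.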