I have a number of files, each containing reviews of a hotel, and I would like to write a script that counts the number of reviews per file. An example file name is hotel_73757. The text in each file is laid out as follows:
<Overall Rating>3.5
<Avg. Price>$260
<URL>http://www.tripadvisor.com/ShowUserReviews-g31310-d73757-r23009196-Wyndham_Phoenix-Phoenix_Arizona.html
<Author>TexasSharvi
<Content>the new updo is ... it's great!
<Date>Dec 26, 2008
<No. Reader>-1
<No. Helpful>-1
<Overall>4
<Value>4
<Rooms>4
<Location>4
<Cleanliness>5
<Check in / front desk>5
<Service>-1
<Business service>4
<Author>ChrisLongo
<Content>Just Dirty... Will never stay at any Wyndham hotel again.
<Date>Dec 24, 2008
<No. Reader>1
<No. Helpful>1
<Overall>1
<Value>1
<Rooms>1
<Location>1
<Cleanliness>1
<Check in / front desk>1
<Service>1
<Business service>-1
This then repeats, with a single blank line between reviews; every review has the same fields. I was thinking of counting the number of times "Author" appears in each file. Would this work? Thanks in advance.
You can use grep and wc to count the lines in the file that contain the word 'Author': grep filters out everything except the Author lines, and wc -l counts them.
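As a rough sketch, the pipeline could look like the following. The sample file contents are made up for illustration, and anchoring the pattern as ^<Author> is an assumption added here so that a review whose text happens to mention the word "Author" is not counted twice.

```shell
# Hypothetical sample file containing two reviews separated by a blank line.
sample=$(mktemp)
printf '%s\n' '<Author>TexasSharvi' '<Overall>4' '' \
              '<Author>ChrisLongo' '<Overall>1' > "$sample"

# grep keeps only the lines that start with <Author>; wc -l counts them.
# tr strips the padding that some wc implementations print before the number.
count=$(grep '^<Author>' "$sample" | wc -l | tr -d ' ')
echo "$count"
```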
Just use
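One way to get the count in a single step is grep's -c option, which prints the number of matching lines itself, so no wc is needed. This is a minimal sketch under that assumption; the sample data and the anchored pattern ^<Author> are illustrative choices.

```shell
# Hypothetical sample file containing two reviews separated by a blank line.
sample=$(mktemp)
printf '%s\n' '<Author>TexasSharvi' '<Overall>4' '' \
              '<Author>ChrisLongo' '<Overall>1' > "$sample"

# -c makes grep print the number of matching lines instead of the lines themselves.
count=$(grep -c '^<Author>' "$sample")
echo "$count"
```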
If you really want to make a script of this:
Make it executable with:
And run it with:
or
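Put together, the steps above could look like the following sketch. The script name count_reviews.sh, the sample data, and the anchored pattern ^<Author> are assumptions made for illustration.

```shell
# Work in a scratch directory so nothing real is overwritten.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical sample data: one hotel file with two reviews.
printf '%s\n' '<Author>TexasSharvi' '<Overall>4' '' \
              '<Author>ChrisLongo' '<Overall>1' > hotel_73757

# Write the script (count_reviews.sh is a made-up name).
cat > count_reviews.sh <<'EOF'
#!/bin/sh
# Print "<file>: <count>" for every file given on the command line.
for f in "$@"; do
    printf '%s: %s\n' "$f" "$(grep -c '^<Author>' "$f")"
done
EOF

# Make it executable and run it directly on one file ...
chmod +x count_reviews.sh
./count_reviews.sh hotel_73757

# ... or hand it to the shell explicitly (no chmod needed), here for every hotel file.
result=$(sh count_reviews.sh hotel_*)
echo "$result"
```

Because the script loops over "$@", a glob such as hotel_* reports the count for every file in one run.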