Why isn't this awk script behaving as expected?

Posted 2019-08-04 11:24

I have the following test script

 /^[^a-zA-Z0-9]/  {
    DATEd[$3] = $1
    } 
   END { 
        print "        \"data\": ["
        for (i = 0 ; i <= 5; i ++ ) {
            { print "            [" i ", \"" DATEd[i] "\"],"}
        }
        print "        ]"
}

And I am reading from this text file:

2011-01-22 22:12 P16A22_110114072915 22 1312 75 13.55 1399
2011-01-22 22:12 P16A22_110114072915 22 1312 75 13.55 1399 
2011-01-22 22:12 P16A22_110114072915 22 1312 75 13.55 1399 
2011-01-22 22:12 P16A22_110114072915 22 1312 75 13.55 1399
2011-01-22 22:12 P16A22_110114072915 22 1312 75 13.55 1399 
2011-01-22 22:12 P16A22_110114072915 22 1312 75 13.55 1399

But it doesn't print what I want. I want it to print:

    "data": [
        [0, "2011-01-22"],
        [1, "2011-01-22"],
        [2, "2011-01-22"],
        [3, "2011-01-22"],
        [4, "2011-01-22"],
        [5, "2011-01-22"],
    ]

when in fact it only prints:

"data": [
    [0, ""],
    [1, ""],
    [2, ""],
    [3, ""],
    [4, ""],
    [5, ""],
]

So why does "DATEd[$3] = $1" leave the array entries empty?

Also, how do I check the length of an array? DATEd.length doesn't work in this case.

Thanks

EDIT_______________________________________________

With help from @Fredrik and @geekosaur I have made some progress. Now for some last questions.

1) The script now looks like this

 /[a-zA-Z0-9]/  {
    DATEd[NR-1] = $1
    } 
   END { 
        print "        \"data\": ["

        for (i in DATEd) {
            { print "            [" i ", \"" DATEd[i] "\"],"}
        }
        print "        ]"
}

And it gives the following output:

"data": [
    [4, "2011-01-26"],
    [5, "2011-01-27"],
    [6, "2011-01-28"],
    [0, "2011-01-22"],
    [1, "2011-01-23"],
    [2, "2011-01-24"],
    [3, "2011-01-25"],
]

But I want it to look like this

"data": [
[0, "2011-01-22"],
[1, "2011-01-23"],
[2, "2011-01-24"],
[3, "2011-01-25"],
[4, "2011-01-26"],
[5, "2011-01-27"],
[6, "2011-01-28"]
]

I.e., the rows should be sorted, and the last ',' character before the final closing ']' should be removed. Is this possible to achieve in an easy way? =)

Thanks =)

EDIT 3 Final Outcome_______________________________________

I used a combination of @geekosaur's and @Fredrik's contributions =)

{
    DATEd[NR-1] = $1; len++
}
   END { 
        print "        \"data\": ["

        #for (i in DATEd) {
        for (i = 0 ; i <= len-1; i ++ ) {
            { print "            [" i ", \"" DATEd[i] "\"],"}
        }
        print "        ]"
}

Tags: awk gawk
2 answers
放荡不羁爱自由
#2 · 2019-08-04 11:50

In the absence of an -F option, $3 will be P16A22_110114072915 (or would be if your selector regex were correct). What value do you actually want there? Do you perhaps want NR?

awk is not object oriented, and its array support is, to be kind, lacking. You'll need to track the length of the array yourself. (Just to give you an idea of how limited awk's array support is: you can't assign an array. You have to assign individual indexes or use split().)
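
As a rough illustration of tracking the length yourself, here is a minimal sketch that indexes the array by NR-1, remembers the record count, and uses that count to leave the comma off the last row. The three sample lines and the shell variable names (input, out) are made up for the example, and the indentation of the output is arbitrary.

```shell
# Hypothetical sample input: one date per record in field 1, as in the question.
input='2011-01-22 22:12 P16A22_110114072915 22 1312 75 13.55 1399
2011-01-23 22:12 P16A22_110114072915 22 1312 75 13.55 1399
2011-01-24 22:12 P16A22_110114072915 22 1312 75 13.55 1399'

out=$(printf '%s\n' "$input" | awk '
{
    DATEd[NR - 1] = $1      # zero-based index taken from the record number
    len = NR                # track the element count ourselves
}
END {
    print "\"data\": ["
    for (i = 0; i < len; i++) {
        # Append a comma to every row except the last one.
        sep = (i < len - 1) ? "," : ""
        print "    [" i ", \"" DATEd[i] "\"]" sep
    }
    print "]"
}')

printf '%s\n' "$out"
```

Because the loop counts from 0 to len-1 instead of using for (i in DATEd), the rows come out in input order, which also takes care of the sorting problem.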

Bombasti
#3 · 2019-08-04 12:01

To start with, your regex is wrong: /^[^a-zA-Z0-9]/ matches only lines whose first character is NOT a letter or a digit. None of your lines start that way, so the action never runs and your array DATEd stays empty.

Secondly, your array is indexed not by 0-5 but by the content of $3 (once you fix your regex), so DATEd[0] through DATEd[5] would still be empty.
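
Both points can be checked on a single line of the input. The sketch below is only a demonstration, and the counter names (negated, plain) and the shell variables are invented for it: it counts how often each pattern matches and records what $3 contains.

```shell
line='2011-01-22 22:12 P16A22_110114072915 22 1312 75 13.55 1399'

out=$(printf '%s\n' "$line" | awk '
/^[^a-zA-Z0-9]/ { negated++ }     # first character is NOT a letter or digit
/^[a-zA-Z0-9]/  { plain++ }       # first character IS a letter or digit
{ key = $3 }                      # third whitespace-separated field
END { print negated + 0, plain + 0, key }
')

printf '%s\n' "$out"
```

This prints "0 1 P16A22_110114072915": the negated pattern never matches, the plain one matches once, and the array key would be the P16A22_... string rather than a number.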

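awk arrays have no length property like DATEd.length, and not every awk has a built-in way to get the number of elements (gawk and some other implementations accept length(DATEd)), but it's simple to write a counting function yourself.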

Array example

# i and n are extra parameters, which makes them local to the function
function array_length(a,    i, n) {
    n = 0
    for (i in a) n++
    return n
}

{
    DATEd[NR] = $1
}
END {
    for (i in DATEd) {
        print i, DATEd[i]
    }
    print "Number of items", array_length(DATEd)

    # copy indices
    j = 1
    for (i in DATEd) {
        ind[j] = i    # index value becomes element value
        j++
    }
    n = asort(ind)    # index values are now sorted
    for (i = 1; i <= n; i++)
        print i, DATEd[ind[i]]
}

Gives:

4 2011-01-22
5 2011-01-22
6 2011-01-22
1 2011-01-22
2 2011-01-22
3 2011-01-22
Number of items 6
1 2011-01-22
2 2011-01-22
3 2011-01-22
4 2011-01-22
5 2011-01-22
6 2011-01-22

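See the GNU awk manual for a description of arrays.

To loop through all elements of an array, use this construct (see link above). Note that the order in which the indices are visited is not defined, as the unsorted output above shows: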

 for (var in array)
   body