I have Tweets about companies from various times of day, and I want to group them by day. I have already done this. However, I want each day to run not from 00:00 until 23:59, but from 16:00 until 15:59 of the following day (because of the NYSE opening hours).
Tweets (Negative, Neutral and Positive are the sentiment counts):
Company,Datetime_UTC,Negative,Neutral,Positive,Volume
AXP,2013-06-01 16:00:00+00:00,0,2,0,2
AXP,2013-06-01 17:00:00+00:00,0,2,0,2
AXP,2013-06-02 05:00:00+00:00,0,1,0,1
AXP,2013-06-02 16:00:00+00:00,0,2,0,2
My code:
Tweets$Datetime_UTC <- as.Date(Tweets$Datetime_UTC)
Sent <- aggregate(list(Tweets$Negative, Tweets$Neutral, Tweets$Positive), by=list(Tweets$Company, Tweets$Datetime_UTC), sum)
colnames(Sent) <- c("Company", "Date", "Negative", "Neutral", "Positive")
Sent <- Sent[order(Sent$Company),]
Output of that code:
Company,Date,Negative,Neutral,Positive
AXP,2013-06-01,0,4,0
AXP,2013-06-02,0,3,0
How I'd want it to be (considering that a day should start at 16:00):
Company,Date,Negative,Neutral,Positive
AXP,2013-06-02,0,5,0
AXP,2013-06-03,0,2,0
As you can see, my code almost works. I just want to group by a different time window.
How can I do this? One idea would be to add 8 hours to every Datetime_UTC value, which would change 16:00 into 00:00 of the next day. After that, I could just use my code. Would that be possible?
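A minimal sketch of that +8h idea, using only the sample rows above. It assumes the timestamps are already in UTC and that the day boundary is exactly 16:00 UTC; the column name TradingDate is just a hypothetical label I made up for the shifted date.

```r
# Sample rows copied from the data above
Tweets <- read.csv(text = "Company,Datetime_UTC,Negative,Neutral,Positive,Volume
AXP,2013-06-01 16:00:00+00:00,0,2,0,2
AXP,2013-06-01 17:00:00+00:00,0,2,0,2
AXP,2013-06-02 05:00:00+00:00,0,1,0,1
AXP,2013-06-02 16:00:00+00:00,0,2,0,2", stringsAsFactors = FALSE)

# Parse as UTC date-times; the trailing "+00:00" is not covered by the
# format string, so the offset is assumed (not checked) to be zero
ts <- as.POSIXct(Tweets$Datetime_UTC, format = "%Y-%m-%d %H:%M:%S", tz = "UTC")

# Shift by 8 hours (in seconds) so that 16:00 becomes 00:00 of the next day,
# then keep only the date part. TradingDate is a hypothetical column name.
Tweets$TradingDate <- as.Date(ts + 8 * 3600, tz = "UTC")

# Sum the sentiment counts per company and shifted date
Sent <- aggregate(cbind(Negative, Neutral, Positive) ~ Company + TradingDate,
                  data = Tweets, FUN = sum)
Sent <- Sent[order(Sent$Company, Sent$TradingDate), ]
print(Sent)
```

On the four sample rows this should put the first three tweets under 2013-06-02 and the last one under 2013-06-03, which is the grouping I am after. A fixed 8-hour shift does not account for daylight saving time in New York, so I am not sure it is the right boundary all year round.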
Thanks in advance!! :-)