I have a file, song.txt:
*****
[1]"The snow glows white on the mountain tonight
Not a footprint to be seen."
[2]"A kingdom of isolation,
and it looks like I'm the Queen"
[3]"The wind is howling like this swirling storm inside
Couldn't keep it in;
Heaven knows I've tried"
*****
[4]"Don't let them in,
don't let them see"
[5]"Be the good girl you always have to be
Conceal, don't feel,
don't let them know"
[6]"Well now they know"
*****
I would like to loop over the lyrics and fill in the elements of a list, so that each element of the list contains a character vector in which each element is a word in the song, like:
[1] "The" "snow" "glows" "white" "on" "the" "mountain" "tonight" "Not" "a" "footprint"
"to" "be" "seen." "A" "kingdom" "of" "isolation," "and" "it" "looks" "like" "I'm" "the"
"Queen" "The" "wind" "is" "howling" "like" "this" "swirling" "storm" "inside"
"Couldn't" "keep" "it" "in" "Heaven" "knows" "I've" "tried"
[2] "Don't" "let" "them" "in," "don't" "let" "them" "see" "Be" "the" "good" "girl" "you"
"always" "have" "to" "be" "Conceal," "don't" "feel," "don't" "let" "them" "know"
"Well" "now" "they" "know"
First I made an empty list with words <- vector("list", 2).
I think that I should first put the text into one long character vector and then find where the ***** delimiters start and stop, with:
star="\\*{5}"
pindex = grep(star, page)
After this what should I do?
It sounds like what you want is strsplit, run (effectively) twice. So, starting from "a single long character string separated by ***** and spaces" (which I assume is what you have?):
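# Split the single string into verses; [[1]] pulls the vector of verses out of the list
verses <- strsplit(song, split = "\\*{5}")[[1]]
list_of_vectors <- lapply(verses, function(x) {
  # Split each verse on runs of whitespace (spaces or newlines)
  split_verse <- unlist(strsplit(x, split = "\\s+"))
  # Drop any empty strings, then return the words as a vector
  return(split_verse[split_verse != ""])
})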
The result should be a list with one element per verse, each element being a character vector of the words in that verse. If you're not dealing with a single character string in the read-in object, show us the file and how you're reading it in ;).
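In case it helps, here is a minimal sketch of the whole thing on a shortened sample, assuming the lines get pasted into one string first. The sample lyrics and the names song_lines, song and verses are placeholders for illustration, not something taken from your post.

```r
# Hypothetical stand-in for readLines("song.txt"), shortened for illustration
song_lines <- c("*****",
                "The snow glows white",
                "on the mountain tonight",
                "*****",
                "Don't let them in,",
                "don't let them see",
                "*****")

# Collapse the lines into a single string so strsplit sees one long text
song <- paste(song_lines, collapse = " ")

# First split: one string per verse ([[1]] pulls the vector out of the list)
verses <- strsplit(song, split = "\\*{5}")[[1]]

# Drop pieces that are empty or whitespace-only (e.g. before the first *****)
verses <- verses[nzchar(trimws(verses))]

# Second split: break each verse into words on runs of whitespace
list_of_vectors <- lapply(verses, function(x) {
  words <- strsplit(trimws(x), split = "\\s+")[[1]]
  words[nzchar(words)]
})
```

With this sample, list_of_vectors has two elements, one per verse. Filtering out the empty pieces matters because the file starts with a ***** line, which would otherwise leave an empty "verse" at the front.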
To get it into the format you want, maybe give this a shot. Also, please update your post with more information so we can definitively solve your problem. There are a few areas of your posted question that need some clarification. Hope this helps.
## writeLines(text <- "*****
## The snow glows white on the mountain tonight
## Not a footprint to be seen.
## A kingdom of isolation,
## and it looks like I'm the Queen
## The wind is howling like this swirling storm inside
## Couldn't keep it in;
## Heaven knows I've tried
## *****
## Don't let them in,
## don't let them see
## Be the good girl you always have to be Conceal,
## don't feel,
## don't let them know
## Well now they know
## *****", "song.txt")
> read.song <- readLines("song.txt")
> split.song <- unlist(strsplit(read.song, "\\s"))
> star.index <- grep("\\*{5}", split.song)
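> # lapply (rather than sapply) always returns a list, even if all verses have the same length
> word.index <- lapply(2:length(star.index), function(i){
(star.index[i-1]+1):(star.index[i]-1)
})
> lapply(seq_along(word.index), function(i) split.song[ word.index[[i]] ])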
## [[1]]
## [1] "The" "snow" "glows" "white" "on" "the" "mountain"
## [8] "tonight" "Not" "a" "footprint" "to" "be" "seen."
## [15] "A" "kingdom" "of" "isolation," "and" "it" "looks"
## [22] "like" "I'm" "the" "Queen" "The" "wind" "is"
## [29] "howling" "like" "this" "swirling" "storm" "inside" "Couldn't"
## [36] "keep" "it" "in;" "Heaven" "knows" "I've" "tried"
## [[2]]
## [1] "Don't" "let" "them" "in," "don't" "let" "them" "see" "Be"
## [10] "the" "good" "girl" "you" "always" "have" "to" "be" "Conceal,"
## [19] "don't" "feel," "don't" "let" "them" "know" "Well" "now" "they"
## [28] "know"
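As a possible alternative to building the index ranges by hand, the same grouping could be sketched with cumsum and split. This is only an illustration on a shortened sample: is.star, verse.id and words.by.verse are made-up names, and split.song here is a hand-typed stand-in for the vector built above.

```r
# Hypothetical stand-in for unlist(strsplit(readLines("song.txt"), "\\s"))
split.song <- c("*****", "The", "snow", "glows", "*****",
                "Don't", "let", "them", "in,", "*****")

# TRUE wherever an element is a ***** delimiter
is.star <- grepl("\\*{5}", split.song)

# Running count of delimiters seen so far: 1 within verse one, 2 within verse two, ...
verse.id <- cumsum(is.star)

# Drop the delimiters themselves, then group the remaining words by verse
words.by.verse <- unname(split(split.song[!is.star], verse.id[!is.star]))
```

Here the running count acts as a verse label for every word, so no explicit start and stop positions are needed. It does assume the file begins with a ***** line, as in the example file.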