I read a .txt file into R as follows:
Election_Parties <- readr::read_lines("Election_Parties.txt")
Let's say the following text was in the file:
BOLIVIA
P17-Nationalist Revolutionary Movement-Free Bolivia Movement (Movimiento Nacionalista Revolucionario
P19-Liberty and Justice (Libertad y Justicia [LJ])
P20-Tupak Katari Revolutionary Movement (Movimiento Revolucionario Tupak Katari [MRTK])

COLOMBIA
P1-Democratic Aliance M-19 (Alianza Democratica M-19 [AD-M19])
P2-National Popular Alliance (Alianza Nacional Popular [ANAPO])
P3-Indigenous Authorities of Colombia (Autoridades Indígenas de Colombia)
In words: a new country starts after every empty line. I would like to convert this text file into a data frame in which the country name is one column and the party is another.
Desired output:
Bolivia P17-Nationalist Revolutionary Movement-Free Bolivia Movement (Movimiento Nacionalista
Bolivia P19-Liberty and Justice (Libertad y Justicia [LJ])
Bolivia P20-Tupak Katari Revolutionary Movement (Movimiento Revolucionario Tupak Katari [MRTK])
Colombia P1-Democratic Aliance M-19 (Alianza Democratica M-19 [AD-M19])
Colombia P2-National Popular Alliance (Alianza Nacional Popular [ANAPO])
Colombia P3-Indigenous Authorities of Colombia (Autoridades Indígenas de Colombia)
If possible, I would like the solution to be based on the header.
EDIT: I just realised that every new country starts with P1, so a solution could also be based on that.
If your separator is always "", then once you have your text in a character vector, use that empty string as a delimiter and apply cumsum to separate the lines into groups.
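A minimal sketch of the grouping step, assuming the lines sit in a character vector. The vector `x` below is a hypothetical, shortened copy of the question's data, with an empty string between countries and no leading blank line:

```r
# Hypothetical sample: a few lines from the question, with "" as the blank separator line
x <- c(
  "BOLIVIA",
  "P19-Liberty and Justice (Libertad y Justicia [LJ])",
  "P20-Tupak Katari Revolutionary Movement (Movimiento Revolucionario Tupak Katari [MRTK])",
  "",
  "COLOMBIA",
  "P1-Democratic Aliance M-19 (Alianza Democratica M-19 [AD-M19])",
  "P2-National Popular Alliance (Alianza Nacional Popular [ANAPO])"
)

# Each "" increments the running count, so every line after it lands in the next group.
# The + 1 makes the first block group 1 (assumes the vector does not start with "").
grp <- cumsum(x == "") + 1

# Drop the separator lines themselves, then split the remaining lines by group
keep <- x != ""
groups <- split(x[keep], grp[keep])
groups
```

If the file does start with a blank line, the `+ 1` can be dropped, since the leading separator already pushes the first country into group 1.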
You can see that BOLIVIA goes into group 1 and COLOMBIA goes into group 2.
Then we just apply a function to each group and combine the results into a data frame.
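One way this last step might look, as a sketch under the same assumptions (blank-line separators, the first line of each group is the country header). The helper `to_df` and the column names `country` and `party` are illustrative choices, not anything from the question:

```r
# Hypothetical sample, rebuilt here so the block runs on its own
x <- c(
  "BOLIVIA",
  "P19-Liberty and Justice (Libertad y Justicia [LJ])",
  "P20-Tupak Katari Revolutionary Movement (Movimiento Revolucionario Tupak Katari [MRTK])",
  "",
  "COLOMBIA",
  "P1-Democratic Aliance M-19 (Alianza Democratica M-19 [AD-M19])",
  "P2-National Popular Alliance (Alianza Nacional Popular [ANAPO])"
)
grp <- cumsum(x == "") + 1
keep <- x != ""
groups <- split(x[keep], grp[keep])

# Illustrative helper: first element is the header (country), the rest are parties
to_df <- function(g) {
  header <- g[1]
  # "BOLIVIA" -> "Bolivia"; only the first letter stays upper case,
  # so a multi-word name would need different handling
  country <- paste0(substr(header, 1, 1), tolower(substring(header, 2)))
  data.frame(
    country = rep(country, length(g) - 1),
    party = g[-1],
    stringsAsFactors = FALSE
  )
}

# Build one data frame per group and stack them
result <- do.call(rbind, lapply(groups, to_df))
rownames(result) <- NULL
result
```

The same idea should carry over to the vector returned by `readr::read_lines()`, since it is also a plain character vector.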