Need to create a crosstab with 2 categorical facto

Posted 2020-05-03 10:44

Question:

I have two categorical variables, income level and temporary visa status, and the count for each combination.

All I need is a crosstab from which I can draw a proportional bar chart showing the share of each temporary visa category within each income level.

library(readxl)
Crosstab_Temporary_visas_income <- read_excel("C:/Users/axelp/Documents/RMIT/Semester 2/Data Visualisation/Assignment 3/Crosstab Temporary visas income.xls")

str(Crosstab_Temporary_visas_income)

margin.table(Crosstab_Temporary_visas_income,1) #Row marginals

Error in margin.table(Crosstab_Temporary_visas_income, 1) : 
  'x' is not an array

> str(Crosstab_Temporary_visas_income)
Classes ‘tbl_df’, ‘tbl’ and 'data.frame':   9 obs. of  6 variables:
 $ Income                  : chr  "Negative / nil income" "$1– 299" "$30 - 649" "$650– 999" ...
 $ Temporary Work (Skilled): num  405 2364 6496 19248 41595 ...
 $ Student                 : num  2169 33846 104569 27140 6737 ...
 $ New Zealand Citizen     : num  2446 16045 51337 104133 98986 ...
 $ Working Holiday Maker   : num  515 3670 18119 24476 7869 ...
 $ Other Temporary visa    : num  887 5325 24234 31975 16269 ...
structure(list(...1 = c("0", "$1– 299", "$30 - 649", "$650– 999"
), `Temporary Work (Skilled)` = c(405, 2364, 6496, 19248), Student = c(2169, 
33846, 104569, 27140), `New Zealand Citizen` = c(2446, 16045, 
51337, 104133), `Working Holiday Maker` = c(515, 3670, 18119, 
24476), `Other Temporary visa` = c(887, 5325, 24234, 31975)), row.names = c(NA, 
-4L), class = c("tbl_df", "tbl", "data.frame"))

I used the table() function on the imported csv to create a crosstab, but all I get is more than 6000 matrix slices.

Answer 1:

There was a problem when you read your data in: the row names were read as an ordinary column labeled "...1". read.table() will treat the first column as row names if the header has one fewer field than the number of columns, but read_excel() returns a tibble and does not set row names. Nothing will work until you fix that.
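The row-name rule mentioned above is the one documented for read.table(). A minimal sketch of it, using a made-up two-row extract supplied through the text= argument (the labels "low" and "high" are placeholders, not the real income bands):

```r
# The header line has 2 fields but each data line has 3,
# so read.table() uses the first column as row names.
txt <- "Student Other
low 2169 887
high 33846 5325"

df <- read.table(text = txt, header = TRUE)

rownames(df)   # "low" "high"
ncol(df)       # 2: only the count columns remain

# A data frame is still not an array; a matrix is.
is.array(df)              # FALSE
is.array(as.matrix(df))   # TRUE
```

Even when the row names come in correctly, the data frame still has to be converted with as.matrix() before margin.table() will accept it.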

library(tidyverse)
CTVI <- structure(list(...1 = c("0", "$1– 299", "$30 - 649", "$650– 999"), 
`Temporary Work (Skilled)` = c(405, 2364, 6496, 19248), Student = c(2169, 
33846, 104569, 27140), `New Zealand Citizen` = c(2446, 16045, 
51337, 104133), `Working Holiday Maker` = c(515, 3670, 18119, 
24476), `Other Temporary visa` = c(887, 5325, 24234, 31975)),  
row.names = c(NA, -4L), class = c("tbl_df", "tbl", "data.frame"))

Now we need to delete the first column, use it for the row names, and convert the tibble to a matrix since some table functions such as addmargins and margin.table do not accept tibbles:

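# Build a numeric matrix from the count columns (everything except column 1)
CTVI.mat <- as.matrix(CTVI[, -1])
# Use the first column (the income bands) as the row names
rownames(CTVI.mat) <- CTVI[[1]]
names(dimnames(CTVI.mat)) <- c("Income", "Visa")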

Now we can compute margins or proportions:

margin.table(CTVI.mat, 1)              # row totals (one per income level)
addmargins(CTVI.mat)                   # counts with row and column totals added
round(prop.table(CTVI.mat, 1), 3)      # proportions within each income level
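The question asks for a proportion bar chart, so here is one possible way to draw it in base R from the row proportions. This is only a sketch: it rebuilds the matrix from the four rows shown in the question's dput() output (the name counts is introduced here for illustration), and the axis labels are my own wording.

```r
# Counts from the question's dput() output (first four income bands only).
# matrix() fills column by column, so each line below is one visa category.
counts <- matrix(
  c(405, 2364, 6496, 19248,
    2169, 33846, 104569, 27140,
    2446, 16045, 51337, 104133,
    515, 3670, 18119, 24476,
    887, 5325, 24234, 31975),
  nrow = 4,
  dimnames = list(
    Income = c("0", "$1– 299", "$30 - 649", "$650– 999"),
    Visa = c("Temporary Work (Skilled)", "Student", "New Zealand Citizen",
             "Working Holiday Maker", "Other Temporary visa")
  )
)

# Proportion of each visa category within an income level (rows sum to 1)
props <- prop.table(counts, 1)

# barplot() draws one stacked bar per column, so transpose the matrix
# to get one bar per income level, stacked by visa category.
barplot(t(props), legend.text = TRUE,
        xlab = "Income", ylab = "Proportion of visa holders")
```

Passing beside = TRUE to barplot() would give grouped rather than stacked bars, if that reads better for the assignment.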


Tags: r crosstab