I need to detect data that has "strikethrough" formatting when importing an Excel file in R.
Is there a method to detect it? Both R and Python approaches are welcome.
R-solution
The tidyxl package can help you. Example: test.xlsx, with data in A1:A4 of the first sheet. Below is an Excel screenshot:
I found the method below:
# Assuming rows 1 to 10 of the column all contain the value A, and the 5th A has "strikethrough" formatting
But it doesn't tell me the location (in this case, the row numbers), which makes it hard to know where the "strikethrough" is when there are a lot of results. How can I vectorize the result of the statement?
I present below a small sample program that filters out text with strikethrough applied, using the openpyxl package (I tested it on version 2.5.6 with Python 3.7.0). Sorry it took so long to get back to you.
I tested it on a new workbook with the default worksheets, with the letters a, b, c, d, e in the first five rows of column A, where I had applied strikethrough formatting to b and d. This program filters out the cells in column A that have strikethrough applied to the font, and then prints the cell, row and value of each remaining one. The col_idx property returns the 1-based numeric column value.
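The original program is not reproduced here, so the following is only a minimal sketch of the approach described above, not the exact code from the answer. It assumes openpyxl is installed. The helper names `build_sample_workbook`, `cells_without_strikethrough` and `rows_with_strikethrough` are made up for illustration. The sample workbook is built in memory so that the sketch runs without a file on disk.

```python
from openpyxl import Workbook
from openpyxl.styles import Font


def build_sample_workbook():
    """Hypothetical helper: letters a-e in A1:A5, with b and d struck through."""
    wb = Workbook()
    ws = wb.active
    for row, value in enumerate("abcde", start=1):
        cell = ws.cell(row=row, column=1, value=value)
        if value in ("b", "d"):
            cell.font = Font(strike=True)
    return wb


def cells_without_strikethrough(ws, column=1):
    """Return (coordinate, row, col_idx, value) for non-empty cells in
    `column` whose font does NOT have strikethrough applied."""
    kept = []
    for (cell,) in ws.iter_rows(min_col=column, max_col=column):
        if cell.value is None:
            continue
        # font.strike is None or False when no strikethrough is set
        if not cell.font.strike:
            kept.append((cell.coordinate, cell.row, cell.col_idx, cell.value))
    return kept


def rows_with_strikethrough(ws, column=1):
    """Return the 1-based row numbers in `column` that ARE struck through."""
    return [
        cell.row
        for (cell,) in ws.iter_rows(min_col=column, max_col=column)
        if cell.value is not None and cell.font.strike
    ]


if __name__ == "__main__":
    ws = build_sample_workbook().active
    for coordinate, row, col_idx, value in cells_without_strikethrough(ws):
        print(coordinate, row, col_idx, value)
    print("struck-through rows:", rows_with_strikethrough(ws))
```

To run this against a real file, replace the sample workbook with `openpyxl.load_workbook("test.xlsx")` (the file name is taken from the R example above). `rows_with_strikethrough` gives the row locations asked about in the question.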