remove non-ASCII characters (using Microsoft.Offic

2019-07-15 05:57发布

I am trying to remove all non-ascii characters from excel / csv file. After reading online and searching I found a post which gave me the code xlWorksheet.UsedRange.Replace("[^\\u0000-\\u007F]" to remove the characters but everytime but the characters still exists in file.

Also I get a dialog box stating

We couldn't find anything to replace. Click Options for more ways to search.

FYI: It's possible the data you're trying to replace is in a protected sheet. Excel can't replace data in protected sheets.

Not sure how to proceed further. I have been looking and reading online but have found nothing useful so far.

Thanks for the help.

using System;
using System.Collections.Generic;
using System.Linq;
using System.Runtime.InteropServices;
using System.Text;
using System.Threading.Tasks;
using Excel = Microsoft.Office.Interop.Excel;

namespace ConsoleApplication1
{
    class Program
    {
        static void Main(string[] args)
        {
            Excel.Application xlApp = new Excel.Application();
            Excel.Workbook xlWorkbook = xlApp.Workbooks.Open(@"C:\Users\username\Desktop\Error Records.csv");
            Excel.Worksheet xlWorksheet = xlWorkbook.Sheets[1];
            Excel.Range xlRange = xlWorksheet.UsedRange;

            int lastUsedRow = xlWorksheet.Cells.Find("*", System.Reflection.Missing.Value,
                System.Reflection.Missing.Value, System.Reflection.Missing.Value,
                Excel.XlSearchOrder.xlByRows, Excel.XlSearchDirection.xlPrevious,
                false, System.Reflection.Missing.Value, System.Reflection.Missing.Value).Row;

            int lastUsedColumn = xlWorksheet.Cells.Find("*", System.Reflection.Missing.Value,
                System.Reflection.Missing.Value, System.Reflection.Missing.Value,
                Excel.XlSearchOrder.xlByColumns, Excel.XlSearchDirection.xlPrevious,
                false, System.Reflection.Missing.Value, System.Reflection.Missing.Value).Column;

//            int lastColumnCount = lastUsedColumn;
//;
//            for (int i = 1; i <= lastUsedColumn; i++)
//            {
//                for (int j = 1; j <= lastUsedRow; j++)
//                {
//                    xlWorksheet.Cells[j, (lastColumnCount+1)] = "Testing data 134";
//                }
//            }

            xlWorksheet.Cells[1, (lastUsedColumn + 1)] = "Title";
            xlWorksheet.UsedRange.Replace("[^\\u0000-\\u007F]", string.Empty);

            xlWorkbook.Save();
            //cleanup
            GC.Collect();
            GC.WaitForPendingFinalizers();

            //rule of thumb for releasing com objects:
            //  never use two dots, all COM objects must be referenced and released individually
            //  ex: [somthing].[something].[something] is bad

            //release com objects to fully kill excel process from running in the background
            Marshal.ReleaseComObject(xlRange);
            Marshal.ReleaseComObject(xlWorksheet);

            //close and release
            xlWorkbook.SaveAs("C:\\Users\\username\\Desktop\\Errors_four.csv".Trim(), Excel.XlFileFormat.xlCSV);
            xlWorkbook.Close();
            Marshal.ReleaseComObject(xlWorkbook);

            //quit and release
            xlApp.Quit();
            Marshal.ReleaseComObject(xlApp);

        }
    }
}

标签: c# .net excel
1条回答
Melony?
2楼-- · 2019-07-15 06:14

foreach cell in each range, you can use the following function to replace the current cell string value with cleaned up ascii. I'm not aware of any ascii conversion functions that are native to the excel interop library. I am curious, do you have any examples that you can provide of what you are attempting to convert?

Keep in mind also that there are FUNCTIONS and then there are VALUES in the excel sheets. It isn't clear in your question which you are trying to work with. You mentioned CSV which makes me think that these are purely VALUES operations.

public string ReturnCleanASCII(string s)
{
    StringBuilder sb = new StringBuilder(s.Length);
    foreach(char c in s.ToCharArray())
    {
       if((int)c > 127) // you probably don't want 127 either
          continue;
       if((int)c < 32)  // I bet you don't want control characters 
          continue;
       if(c == ',')
          continue;
       if(c == '"')
          continue;
       sb.Append(c);
    }
    return sb.ToString();
}

Here is an example usage. Keep in mind that you will need to figure out how to index the cells yourself, this example works with cell 1,1 only. Also, two helpful tips: the cells are 1's indexed, and, it may be faster if you call Value2 instead of Value.

// get the value from a cell
string dirty = excelSheet.Cells[1, 1].Value.ToString(); // Value2 may be faster!

// convert to clean ascii
string clean = ReturnCleanASCII(dirty);

// set the cell value
excelSheet.Cells[1, 1].Value = clean;
查看更多
登录 后发表回答