Automating DOC to PDF conversion in C#

Posted 2019-04-09 16:15

Question:

I have about 200 Word documents that I need to convert to PDF.

Obviously I cannot convert them one by one: first, it would take ages, and second, I am sure it is not good practice.

I need to find a way to automate the conversion, since we will need to do this again and again.

I use C#. The solution does not necessarily have to be in C#, but that is preferred.

I have had a look at a few libraries such as PDFCreator, the Office 2007 add-in, iTextSharp, and so forth, but there is no clear answer on the forums.

PDFCreator has a C# sample, but it only works with txt files. The Office 2007 add-in does not have document locking capabilities, which are a must for the automation.

Has anyone implemented such a scenario before? I would like to hear your suggestions.

Thanks in advance

regards

Answer 1:

You can try the method in this blog post:

http://angrez.blogspot.com/2007/06/create-pdf-in-net-using-pdfcreator.html



Answer 2:

I'm doing this to automate the conversion of our doc and docx documents to pdf:

// Requires: using System; using System.IO;
//           using OW = Microsoft.Office.Interop.Word;
private bool ConvertDocument(string file)
{
    object missing = System.Reflection.Missing.Value;

    OW.Application word = null;
    OW.Document doc = null;

    try
    {
        // Start Word hidden so no window appears during the conversion
        word = new OW.Application();
        word.Visible = false;
        word.ScreenUpdating = false;

        object filename = file;

        doc = word.Documents.Open(ref filename, ref missing,
            ref missing, ref missing, ref missing, ref missing, ref missing,
            ref missing, ref missing, ref missing, ref missing, ref missing,
            ref missing, ref missing, ref missing, ref missing);
        doc.Activate();

        // Swap only the extension. string.Replace would also change ".doc"
        // elsewhere in the path and would miss upper-case extensions.
        string pdfFile = Path.ChangeExtension(file, ".pdf");

        doc.ExportAsFixedFormat(pdfFile, OW.WdExportFormat.wdExportFormatPDF, false, OW.WdExportOptimizeFor.wdExportOptimizeForPrint,
            OW.WdExportRange.wdExportAllDocument, 1, 1, OW.WdExportItem.wdExportDocumentContent, true, true, OW.WdExportCreateBookmarks.wdExportCreateNoBookmarks,
            true, true, false, ref missing);
    }
    catch (Exception)
    {
        // Any failure is reported to the caller as false; the finally block still runs
        return false;
    }
    finally
    {
        // Always close the document and quit Word so no WINWORD.EXE process is left behind
        if (doc != null)
        {
            object saveChanges = OW.WdSaveOptions.wdDoNotSaveChanges;
            ((OW._Document)doc).Close(ref saveChanges, ref missing, ref missing);
            doc = null;
        }

        if (word != null)
        {
            ((OW._Application)word).Quit(ref missing, ref missing, ref missing);
            word = null;
        }
    }

    return true;
}

where OW is an alias for Microsoft.Office.Interop.Word.
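
Since the question is about roughly 200 files, a routine like the one above still needs something to drive it over a folder. Below is a minimal sketch of such a driver, not a definitive implementation. `BatchConverter`, `ConvertFolder`, `IsWordDocument` and the `convert` delegate are hypothetical names made up for this example; the delegate stands in for whatever conversion routine you use, so the folder logic does not depend on Word itself.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

public static class BatchConverter
{
    // Converts every .doc/.docx file in 'folder' and returns the source
    // files that could not be converted.
    // 'convert' is a hypothetical delegate: (sourcePath, targetPdfPath) -> success.
    public static List<string> ConvertFolder(string folder, Func<string, string, bool> convert)
    {
        var failed = new List<string>();

        // Take a snapshot of the matching files before converting anything.
        List<string> sources = Directory.GetFiles(folder)
            .Where(IsWordDocument)
            .OrderBy(f => f, StringComparer.OrdinalIgnoreCase)
            .ToList();

        foreach (string source in sources)
        {
            // Only the extension changes; the rest of the path is left alone.
            string target = Path.ChangeExtension(source, ".pdf");

            bool ok;
            try
            {
                ok = convert(source, target);
            }
            catch (Exception)
            {
                // One bad document should not stop the whole batch.
                ok = false;
            }

            if (!ok)
                failed.Add(source);
        }

        return failed;
    }

    // True for .doc and .docx files (case-insensitive), skipping the
    // "~$" owner files Word creates next to documents that are open.
    public static bool IsWordDocument(string path)
    {
        string name = Path.GetFileName(path);
        string ext = Path.GetExtension(path);

        return !name.StartsWith("~$", StringComparison.Ordinal)
            && (ext.Equals(".doc", StringComparison.OrdinalIgnoreCase)
                || ext.Equals(".docx", StringComparison.OrdinalIgnoreCase));
    }
}
```

From inside the class that holds ConvertDocument, a call could look like `BatchConverter.ConvertFolder(@"C:\docs", (src, dst) => ConvertDocument(src))`, where the folder path is just a placeholder. Starting Word once and reusing it for every file, rather than once per document, should also cut the run time noticeably.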



Answer 3:

Have you checked this MSDN article?


Edit:

Note that this "How To" sample will not work as-is because:

  1. For some reason it overwrites the program parameters (ConvertDocCS.exe [sourceDoc] [targetDoc] [targetFormat]) on lines #77, #81 and #82.
  2. I converted the project to VS 2010 and had to re-reference Microsoft.Office.Core. It's a COM reference called Microsoft Office 12.0 Object Library.
  3. The sample does not accept a relative path.

I'm sure you will manage to overcome those obstacles :)


One last thing. If you are working with .NET 4, you don't need to pass all those annoying Missing.Value arguments, thanks to the wonder of optional parameters.
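
To illustrate the optional-parameter point, here is a rough sketch. It assumes C# 4 or later, a project reference to Microsoft.Office.Interop.Word, and Word installed on the machine. `WordToPdf`, `Convert` and `GetPdfPath` are names made up for this example. Paths are made absolute first, on the assumption that Word, running in its own process, may not resolve a relative path against your application's working directory.

```csharp
using System;
using System.IO;
using OW = Microsoft.Office.Interop.Word;

public static class WordToPdf
{
    // Hypothetical helper: absolute path of the PDF written next to the source file.
    public static string GetPdfPath(string sourcePath)
    {
        return Path.ChangeExtension(Path.GetFullPath(sourcePath), ".pdf");
    }

    public static void Convert(string sourcePath)
    {
        string source = Path.GetFullPath(sourcePath);
        string target = GetPdfPath(sourcePath);

        OW.Application word = new OW.Application();
        OW.Document doc = null;
        try
        {
            // Named arguments: every parameter not listed keeps its default,
            // so no Missing.Value placeholders and no 'ref' keywords are needed.
            doc = word.Documents.Open(FileName: source, ReadOnly: true);

            // Only the two required arguments are passed; the rest are optional.
            doc.ExportAsFixedFormat(target, OW.WdExportFormat.wdExportFormatPDF);
        }
        finally
        {
            // The casts avoid the ambiguity between the Close/Quit methods
            // and the events of the same name.
            if (doc != null)
                ((OW._Document)doc).Close(SaveChanges: OW.WdSaveOptions.wdDoNotSaveChanges);
            ((OW._Application)word).Quit();
        }
    }
}
```

Unlike the longer version, this sketch lets exceptions propagate to the caller instead of returning a bool, which makes failures easier to diagnose in a batch run.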



Answer 4:

You may try Aspose.Words for .NET to convert DOC files to PDF. It can be used in any .NET application with C# or VB.NET like any other .NET assembly. It also works on any Windows OS, on both 32-bit and 64-bit systems.

Disclosure: I work as a developer evangelist at Aspose.



Answer 5:

As HuBeZa said, if Word is installed on your workstation, you can use Word Automation to open your files one by one and save them as PDF. All you need to do is reference the COM component "Microsoft Word Object Library" and work with the classes in that assembly.

Execution will probably be a bit slow, but your conversions will be automated.



Answer 6:

You can also set fonts as part of the automation. In my solution I applied a single font to every generated document, which saved me from opening each template and setting the font manually for every tag, heading, and so on. The code uses the Open XML SDK:

// Requires the Open XML SDK, plus: using System; using System.Collections.Generic; using System.Linq;
//   using DocumentFormat.OpenXml.Packaging; using DocumentFormat.OpenXml.Wordprocessing;
// 'input' is the .docx file to modify; 'true' opens it for editing.
using (WordprocessingDocument wordProcessingDocument = WordprocessingDocument.Open(input, true))
{
    // Get all top-level elements of the document body
    List<DocumentFormat.OpenXml.OpenXmlElement> elements =
        wordProcessingDocument.MainDocumentPart.Document.Body.ToList();

    // Set the font on every run-properties element found under each of them
    foreach (var itm in elements)
    {
        try
        {
            List<RunProperties> list_runProperties =
                itm.Descendants<RunProperties>().ToList();
            foreach (var item in list_runProperties)
            {
                if (item.RunFonts == null)
                    item.RunFonts = new RunFonts();

                item.RunFonts.Ascii = "Courier New";
                item.RunFonts.ComplexScript = "Courier New";
                item.RunFonts.HighAnsi = "Courier New";
                item.RunFonts.Hint = FontTypeHintValues.ComplexScript;
            }
        }
        catch (Exception)
        {
            // Ignore the failure and continue with the other elements in the document
        }
    }
    wordProcessingDocument.MainDocumentPart.Document.Save();
}


Answer 7:

I think the straight answer to this is no! But it is possible through a workaround: use ImageMagick or a similar library to see whether it can render your Word documents as images, then use those images with iTextSharp to create the PDF.