Calculating Molecular Weight Using Excel

Posted 2020-03-27 05:02

Question:

I have come across a bit of a problem here. I have a spreadsheet with about 9,000 organic compounds and I am trying to compute the molecular weight of all of them.

Normally, this would be easy: multiply the number of atoms of each element in the molecular formula by that element's atomic weight and add the results. The problem is that the spreadsheet lists the molecular formulas as strings.

For example, the molecular formula for "acetonitrile" is listed in a column as: C2H3N.

What I would like to do is write a function that scans that cell's contents and says, "Okay, every time I come across something that is text, look at the numbers immediately following it until you hit another text and then stop. Then, take that number and multiply by that particular element's atomic weight" (I will take care of the summation of the weights later because I feel that is the easy part).

Is this possible with Excel's built-in functions, or do I have to use VBA (which I really don't have experience with)? Any help here would be greatly appreciated.

Answer 1:

While your request is marginally possible through some pretty complex (and CPU-intensive) formulas using nothing but native Excel functions, a VBA User Defined Function (UDF) would be vastly more appropriate. I'm not a chemist, so please excuse the additions to your single sample I've provided, as they were stolen shamelessly from an Internet page. TBH, I'm not even sure if I have half of the terminology correct.

     

Step 1 - Create a table of atomic weights and name it

You are going to require some form of cross-reference to retrieve the atomic weights from the elements' periodic symbols. Here is what I scraped together. I'll supply a link to the full table of data in a sample workbook below.

     

With that on a worksheet named Element Data, go to Formulas ► Defined Names ► Name Manager and give the cross-reference matrix a defined name. The UDF code in Step 2 looks the table up as tblPeriodic, so use that name.

     

Here I've used a formula (=OFFSET('Element Data'!$A$1,0,0,COUNTA( 'Element Data'!$A:$A),6)) to define the range but the size of the data is fairly static so a cell range reference should be more than sufficient.
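If you would rather create the defined name from code than through the Name Manager, something along these lines should work. Treat it as a sketch: it assumes the worksheet is called Element Data and the name is tblPeriodic, as in this answer, and the procedure name is made up for illustration.

```vba
Public Sub CreatePeriodicTableName()
    ' Hypothetical helper, not part of the original answer.
    ' Defines (or redefines) the workbook-level name tblPeriodic using the
    ' same dynamic OFFSET formula shown above.
    ThisWorkbook.Names.Add _
        Name:="tblPeriodic", _
        RefersTo:="=OFFSET('Element Data'!$A$1,0,0,COUNTA('Element Data'!$A:$A),6)"
End Sub
```

Run it once from the VBE (F5 with the cursor inside the procedure). Adding a name that already exists replaces its definition.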

Step 2 - Add the code for a User Defined Function

Tap Alt+F11 and when the VBE opens, immediately use the pull-down menus to Insert ► Module (Alt+I+M). Paste the following into the new pane titled something like Book1 - Module1 (Code).

Public Function udf_Molecular_Weight(sCMPND As String) As Double
    Dim sTMP As String, i As Long, sEL As String, sSB As String
    Dim dAW As Double, dAWEIGHT As Double, dSUB As Long
    sTMP = sCMPND: dAWEIGHT = 0: sSB = "0": sEL = vbNullString
    'outer loop: strip one element symbol (plus its count) per pass
    Do While CBool(Len(sTMP))
        sSB = "0": sEL = vbNullString
        'a lowercase second character (code > 96) means a two-letter symbol
        If Asc(Mid(sTMP, Application.Min(2, Len(sTMP)), 1)) > 96 Then
            sEL = Left(sTMP, 2)
        Else
            sEL = Left(sTMP, 1)
        End If
        sTMP = Right(sTMP, Len(sTMP) - Len(sEL))
        'inner loop: collect the digits that follow the symbol
        Do While IsNumeric(Left(sTMP, 1))
            sSB = sSB & Int(Left(sTMP, 1))
            sTMP = Right(sTMP, Len(sTMP) - 1)
        Loop
        'no digits leaves sSB as "0"; Not CBool(0) is True (-1), so 0 - (-1) = 1
        'Debug.Print sEL & ":" & (Int(sSB) - (Not CBool(Int(sSB))))
        'column 6 of tblPeriodic holds the weight; multiply it by the count
        dAWEIGHT = dAWEIGHT + Application.VLookup(sEL, ThisWorkbook.Names("tblPeriodic").RefersToRange, 6, False) * (Int(sSB) - (Not CBool(Int(sSB))))
    Loop
    udf_Molecular_Weight = dAWEIGHT
End Function

Public Function udf_Styled_Formula_Alt(sCMPND As String) As String
    Dim sb As Long, sCOMPOUND As String
    sCOMPOUND = sCMPND
    For sb = 0 To 9
        sCOMPOUND = Replace(sCOMPOUND, sb, ChrW(8320 + sb))
    Next sb
    udf_Styled_Formula_Alt = sCOMPOUND
End Function

Public Function udf_Unstyled_Formula_Alt(sCMPND As String) As String
    Dim sb As Long, sCOMPOUND As String
    sCOMPOUND = sCMPND
    For sb = 0 To 9
        sCOMPOUND = Replace(sCOMPOUND, ChrW(8320 + sb), sb)
    Next sb
    udf_Unstyled_Formula_Alt = sCOMPOUND
End Function

Only the first of those is pertinent to your posted question. The latter two stylize the compound's chemical formula with Unicode subscript characters and reverse the process.

When you have completed the paste, tap Alt+Q to return to your worksheet. These UDFs can now be used within your workbook just like any native Excel function. The syntax is as simple as I could muster.

=udf_Molecular_Weight(<single cell with compound formula in plain text>)

For your sample compound (in the data image above) this would be,

=udf_Molecular_Weight(B2)

... or,

=udf_Molecular_Weight("C2H3N")

With 9000+ of these, I suspect you'll use the former method. Fill down as necessary. While this UDF is vastly more efficient than convoluted array formulas using INDIRECT and other native worksheet functions, it is not magic. Test the formula on a few hundred rows before committing to the 9000+ so you know what to expect. The other two UDFs work in much the same fashion should you choose to put them to use.
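One rough way to run that trial is to time the UDF from a macro. The sketch below is not part of the original answer. It assumes the formulas sit in B2:B301 of the active sheet and that udf_Molecular_Weight is already in a module; the procedure name and row range are placeholders. It times calls made from VBA, so worksheet recalculation will add some overhead on top.

```vba
Public Sub TimeMolecularWeightUdf()
    ' Hypothetical timing check, not part of the original answer.
    ' Assumes compound formulas are in B2:B301 of the active sheet.
    Const FIRST_ROW As Long = 2, LAST_ROW As Long = 301
    Dim r As Long, nFailed As Long
    Dim dWeight As Double, tStart As Single

    tStart = Timer
    For r = FIRST_ROW To LAST_ROW
        On Error Resume Next    ' an unrecognized symbol makes the lookup fail
        dWeight = udf_Molecular_Weight(CStr(ActiveSheet.Cells(r, "B").Value2))
        If Err.Number <> 0 Then nFailed = nFailed + 1: Err.Clear
        On Error GoTo 0
    Next r

    ' Report elapsed seconds and how many rows could not be evaluated.
    Debug.Print (LAST_ROW - FIRST_ROW + 1) & " rows in " & _
                Format(Timer - tStart, "0.000") & " s; " & nFailed & " failed"
End Sub
```

The result appears in the Immediate window (Ctrl+G in the VBE). A non-zero failure count usually points at symbols missing from the lookup table or at formulas containing characters the parser does not handle, such as parentheses.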

BRIEF EXPLANATION:

By 'variable declarations', I'm guessing you actually mean 'variable assignments'. I tend to write fairly tight code, and I've stacked what others would spread over four lines into a single line by separating the assignments with colons. I turn this,

sTMP = sCMPND
dAWEIGHT = 0
sSB = "0"
sEL = vbNullString

... into this,

sTMP = sCMPND: dAWEIGHT = 0: sSB = "0": sEL = vbNullString

The variables need to be reset before reentering the loops but it's a mundane task so I simply cram all four assignments into a single line.

The two Do While ... Loop blocks crawl through the string passed into the function from left to right. Each pass through the outer loop truncates the string from the left: it first strips off the symbol of an element (one or two characters), and the inner loop then strips off any digits that follow, which give the number of atoms of that element in the compound. The inner loop deals exclusively with digits and stops when it reaches an alphabetic character or runs out of string. If no digits follow a symbol, the count defaults to 1. After an element and its count have been collected, that element's contribution is calculated by multiplying the count by the atomic weight retrieved with a VLOOKUP against the weight table, and the result is added to a running total. Eventually there is nothing left to truncate (length = 0); at that point CBool(Len(sTMP)) becomes False and the outer loop ends. The running total is then returned as the result of the function.
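To watch the parsing happen without involving the lookup table, here is a small trace routine. It is an illustrative sketch rather than part of the UDF: the procedure name is made up, the sample formula is hard-coded, and it prints each symbol with its count instead of looking up a weight.

```vba
Public Sub TraceFormulaParse()
    ' Illustrative only: mirrors the two loops in udf_Molecular_Weight but
    ' prints each element/count pair to the Immediate window.
    Dim sTMP As String, sEL As String, sSB As String
    sTMP = "C2H3N"
    Do While Len(sTMP) > 0
        sSB = "0"
        ' A lowercase second character (code above 96) means a two-letter symbol.
        If Asc(Mid(sTMP, Application.Min(2, Len(sTMP)), 1)) > 96 Then
            sEL = Left(sTMP, 2)
        Else
            sEL = Left(sTMP, 1)
        End If
        sTMP = Mid(sTMP, Len(sEL) + 1)      ' drop the symbol
        Do While IsNumeric(Left(sTMP, 1))   ' collect the digits that follow
            sSB = sSB & Left(sTMP, 1)
            sTMP = Mid(sTMP, 2)
        Loop
        ' With no digits sSB is still "0"; Not CBool(0) is True, which is -1
        ' in arithmetic, so 0 - (-1) gives the implied count of 1.
        Debug.Print sEL, CLng(sSB) - (Not CBool(CLng(sSB)))
    Loop
End Sub
```

For C2H3N this prints the pairs C 2, H 3 and N 1. Taking the standard atomic weights C = 12.011, H = 1.008 and N = 14.007, the total is 2 × 12.011 + 3 × 1.008 + 1 × 14.007 = 41.053.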



Answer 2:

@Jeeped has a wonderful VBA solution to this. I posted a non-VBA solution to a related question at How to count up elements in excel. It's very easy to extend to this problem.

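Place each element in a separate column, with its symbol in row 2 and its atomic mass above it in row 1 (B1:B2 for the first element, C1:C2 for the next, and so on). The compounds' formulas go in column A, starting at A3.

This formula, entered in B3, calculates the weight contributed by each element to the molecule: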

=B$1*
 MAX(IFERROR(IF(FIND(B$2&ROW($2:$100),$A3),ROW($2:$100),0),0),
     IFERROR(IF(FIND(B$2&CHAR(ROW($65:$90)),$A3&"Z"),1,0),0)
 )

Enter as an array formula: Ctrl + Shift + Enter.

The total molecular weight would be the sum of the weights.

Example: