Internet Explorer VBA - Retrieve text from Class b

Posted 2019-06-10 03:51

Question:

I am working on an Excel VBA project to extract some information from a webpage and bring it into an Excel workbook. This is a screenshot of the webpage I am working with:

What I am looking to do is extract text based on two criteria: Name and post date. For example, I have the name Kaelan and the post date of 11/16/2016. I want to extract the amount of $365.

This is an example of the HTML code:

<td class="tdCamperFamilyLedgerTableColumnPostDate tdBorderTop" id="tdCamperFamilyLedgerTableColumnPostDate_CamperFamilyLedgerRowControl_14816465">
   <div class="divListTableBodyCell" id="tdColumnPostDateCell">
      <table class="tblListTableBodyCell">
         <tr>
            <td>
               <div class="divListTableBodyLabel">11/16/2016</div>
            </td>
         </tr>
      </table>
   </div>
</td>
<td class="tdCamperFamilyLedgerTableColumnAmount tdBorderTop" id="tdCamperFamilyLedgerTableColumnAmount_CamperFamilyLedgerRowControl_14816465">
   <div class="divListTableBodyCell" id="tdColumnAmountCell">
      <table class="tblListTableBodyCell">
         <tr>
            <td>
               <div class="divListTableBodyLabel">$ 365.00</div>
            </td>
         </tr>
      </table>
   </div>
</td>

The names are IDs in the HTML that are unique, and I am able to extract them. The rest of the information (post date, effective date, quantity, and amount) shares a single class name across the entire table: divListTableBodyLabel.

Now I know I can use ie.document.getElementsByClassName("divListTableBodyLabel") with an index at the end to pick out the nth instance of that class. The problem is that n is dynamic in my case. Is there a way that I can search for the date (11/16/2016) and then use that as my starting point to move a fixed number of spots over (2 or 3) to retrieve the amount of $365?
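The date-then-offset idea could be sketched roughly as follows. This is only a sketch under assumptions: AmountByOffset is a made-up name, doc is taken to be the late-bound ie.document of a page that has finished loading, the page is assumed to render in a document mode where getElementsByClassName is available (IE9 or later), and the amount is assumed to sit three labels after the post date (post date, effective date, quantity, amount).

```vba
' Sketch only: returns the text of the label three positions after the first
' label whose text equals postDate, or "" when no label matches.
' Assumption: the amount label always follows the post date label by 3.
Function AmountByOffset(ByVal doc As Object, ByVal postDate As String) As String
    Const OFFSET_TO_AMOUNT As Long = 3
    Dim labels As Object
    Dim i As Long

    Set labels = doc.getElementsByClassName("divListTableBodyLabel")

    ' Stop early enough that i + OFFSET_TO_AMOUNT is still a valid index.
    For i = 0 To labels.Length - 1 - OFFSET_TO_AMOUNT
        If Trim$(labels.Item(i).innerText) = postDate Then
            AmountByOffset = Trim$(labels.Item(i + OFFSET_TO_AMOUNT).innerText)
            Exit Function
        End If
    Next i
End Function
```

Two weaknesses of this sketch: it takes the first matching date anywhere on the page, regardless of whose name it sits under, and an effective date equal to the searched date would match as well, which would shift the offset onto the wrong cell.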

Is there a way I can limit my extract/search only to a portion of the HTML below the name Kaelan?

I am working my way through learning HTML and VBA, but I wanted to know whether what I am seeking to do is even possible, or whether there might be a better way to tackle the problem.

EDIT

This is the HTML code that lies above the code I posted previously:

<div class="familyLedgerAmountCategory" id="id_4541278">
    <table>
        <tr>
            <td class="tdCategoryRow">
                <div class="cmFloatLeft divExpandToggle expanded" id="divCategoryToggle_id_4541278"></div>
                <div class="cmFloatLeft" id="divCategoryLabel_id_4541278" style="width: 430px;">
                    Kaelan
                </div><span style="margin-left: 5px;">$ 465.00</span>
            </td>
        </tr>
        <tbody>
            <tr class="trListTableBody LedgerExisting" id="CamperFamilyLedgerRowControl_14816465">
                <td class="tdCamperFamilyLedgerTableColumnDescription tdBorderTop" id="tdCamperFamilyLedgerTableColumnDescription_CamperFamilyLedgerRowControl_14816465">
                    <div class="divListTableBodyCell" id="tdColumnDescriptionCell">
                        <table class="tblListTableBodyCell">
                            <tr>
                                <td>
                                    <div class="divListTableBodyLabel">
                                        <a class="aColumnDescriptionCell" id="aColumnDescriptionCell_CamperFamilyLedgerRowControl_14816465" name="aColumnDescriptionCell_CamperFamilyLedgerRowControl_14816465" target="_self" title="Click to view details">2017 Super Early Bird Teen Camp - Tuition</a>
                                    </div>
                                </td>
                            </tr>
                        </table>
                    </div>
                </td>
                <td class="tdCamperFamilyLedgerTableColumnPostDate tdBorderTop" id="tdCamperFamilyLedgerTableColumnPostDate_CamperFamilyLedgerRowControl_14816465">
                    <div class="divListTableBodyCell" id="tdColumnPostDateCell">
                        <table class="tblListTableBodyCell">
                            <tr>
                                <td>
                                    <div class="divListTableBodyLabel">
                                        11/16/2016
                                    </div>
                                </td>
                            </tr>
                        </table>
                    </div>
                </td>
                <td class="tdCamperFamilyLedgerTableColumnEffective tdBorderTop" id="tdCamperFamilyLedgerTableColumnEffective_CamperFamilyLedgerRowControl_14816465">
                    <div class="divListTableBodyCell" id="tdColumnEffectiveCell">
                        <table class="tblListTableBodyCell">
                            <tr>
                                <td>
                                    <div class="divListTableBodyLabel">
                                        11/15/2016
                                    </div>
                                </td>
                            </tr>
                        </table>
                    </div>
                </td>
                <td class="tdCamperFamilyLedgerTableColumnQty tdBorderTop" id="tdCamperFamilyLedgerTableColumnQty_CamperFamilyLedgerRowControl_14816465">
                    <div class="divListTableBodyCell" id="tdColumnQtyCell">
                        <table class="tblListTableBodyCell">
                            <tr>
                                <td>
                                    <div class="divListTableBodyLabel">
                                        1
                                    </div>
                                </td>
                            </tr>
                        </table>
                    </div>
                </td>
                <td class="tdCamperFamilyLedgerTableColumnAmount tdBorderTop" id="tdCamperFamilyLedgerTableColumnAmount_CamperFamilyLedgerRowControl_14816465">
                    <div class="divListTableBodyCell" id="tdColumnAmountCell">
                        <table class="tblListTableBodyCell">
                            <tr>
                                <td>
                                    <div class="divListTableBodyLabel">
                                        $ 365.00
                                    </div>
                                </td>
                            </tr>
                        </table>
                    </div>
                </td>
                <td class="tdCamperFamilyLedgerTableColumnAction tdBorderTop" id="tdCamperFamilyLedgerTableColumnAction_CamperFamilyLedgerRowControl_14816465"></td>
            </tr>
        </tbody>
    </table>
</div>

So I know I can go up the chain to extract specific items, but the problem in this situation is that each row of the table has a unique ID, such as id="tdCamperFamilyLedgerTableColumnPostDate_CamperFamilyLedgerRowControl_14816465", that I have no way of knowing in advance. From what I've seen, the numeric part is a sequential number that runs across the whole site based on posting date, so I can't predict it. I know the ID for a name such as Kaelan, but each row below it in the HTML carries an ID that I cannot know. This is why I feel I need a dynamic starting point, unless there is another way to think about or tackle this problem?
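One possible way around the unknown row IDs is to lean on the class names instead, which are the same for every row. The sketch below is hedged accordingly: AmountForNameAndDate is a made-up name, doc is assumed to be the late-bound ie.document of a fully loaded page, the known ID for a name is assumed to be the one on the familyLedgerAmountCategory div (id_4541278 for Kaelan in the HTML above), and querySelector / querySelectorAll are assumed to be available in the page's document mode (IE8 standards mode or later).

```vba
' Sketch only: searches the ledger rows inside one name's container for a
' given post date and returns that row's amount text, or "" if none is found.
' Assumption: categoryId is the id of the div wrapping that name's rows.
Function AmountForNameAndDate(ByVal doc As Object, _
                              ByVal categoryId As String, _
                              ByVal postDate As String) As String
    Dim ledgerRows As Object
    Dim dateLabel As Object
    Dim amountLabel As Object
    Dim i As Long

    ' Only the outer ledger rows carry the trListTableBody class; the rows of
    ' the nested tblListTableBodyCell tables do not, so they are skipped.
    Set ledgerRows = doc.querySelectorAll("#" & categoryId & " tr.trListTableBody")

    ' Index loop rather than For Each, which is reported to be unreliable
    ' on querySelectorAll results in VBA.
    For i = 0 To ledgerRows.Length - 1
        Set dateLabel = ledgerRows.Item(i).querySelector( _
            "td.tdCamperFamilyLedgerTableColumnPostDate .divListTableBodyLabel")
        Set amountLabel = ledgerRows.Item(i).querySelector( _
            "td.tdCamperFamilyLedgerTableColumnAmount .divListTableBodyLabel")

        If Not dateLabel Is Nothing And Not amountLabel Is Nothing Then
            If Trim$(dateLabel.innerText) = postDate Then
                AmountForNameAndDate = Trim$(amountLabel.innerText)
                Exit Function
            End If
        End If
    Next i
End Function
```

A call might look like Debug.Print AmountForNameAndDate(ie.document, "id_4541278", "11/16/2016"), where ie is an InternetExplorer.Application instance that has already navigated to the page. Because each cell is found by its column class within its own row, the row's numeric ID never needs to be known, and the search stays inside the block for one name.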