Extract and save information of an xml file with p

Published 2020-05-09 22:43

Question:

I'm a Python novice working with an XML structure that looks like this:

<!-- ====================================================================== -->

    <person id="10007071">
        <attributes>
            <attribute name="age" class="java.lang.Integer" >31</attribute>
            <attribute name="bikeAvailability" class="java.lang.String" >none</attribute>
            <attribute name="carAvailability" class="java.lang.String" >all</attribute>
            <attribute name="censusId" class="java.lang.Integer" >3676634</attribute>
            <attribute name="employed" class="java.lang.Boolean" >true</attribute>
            <attribute name="hasLicense" class="java.lang.String" >yes</attribute>
            <attribute name="htsId" class="java.lang.Long" >1156680400001</attribute>
            <attribute name="isOutside" class="java.lang.Boolean" >true</attribute>
            <attribute name="isPassenger" class="java.lang.Boolean" >false</attribute>
            <attribute name="ptSubscription" class="java.lang.Boolean" >false</attribute>
            <attribute name="sex" class="java.lang.String" >m</attribute>
        </attributes>
        <plan score="-0.525" selected="yes">
            <activity type="outside" link="398700" facility="outside_15" x="653054.0233505964" y="6857528.792600333" end_time="06:58:53" >
            </activity>
            <leg mode="car" dep_time="06:58:53" trav_time="00:11:55">
                <route type="links" start_link="398700" end_link="255203" trav_time="00:11:55" distance="12314.30498323443" vehicleRefId="10007071">398700 398731 506155 506168 398730 517874 279323 284251 660231 129607 129599 139064 641998 641663 159806 170160 85864 635804 572378 435246 190032 526059 525761 525778 525779 450362 63873 63870 63871 350067 350066 85890 202345 202323 202322 85868 569745 569762 535571 535243 616420 7195 584893 205956 205957 205958 536023 150529 150530 392831 392832 392833 37140 476291 107074 107075 74149 74150 74151 74152 646460 646461 646462 190088 190089 190090 276937 276938 276939 276940 276941 276942 477763 270067 132825 277662 277663 181902 181923 132840 132838 132836 132834 245291 245289 245287 245285 666635 666637 666638 666639 666640 666641 344713 344711 344709 344707 344705 142088 149714 149716 251612 251610 251608 251606 428868 223363 223365 149718 283093 259788 428828 81196 260062 614779 614781 614783 614785 255201 255202 255203</route>
            </leg>
            <activity type="work" link="255203" facility="43250" x="652768.9" y="6863857.8" start_time="07:02:50" end_time="19:22:50" >
                <attributes>
                    <attribute name="innerParis" class="java.lang.Boolean" >true</attribute>
                </attributes>
            </activity>
            <leg mode="car" dep_time="19:22:50" trav_time="00:13:13">
                <route type="links" start_link="255203" end_link="398730" trav_time="00:13:13" distance="9083.291323242862" vehicleRefId="10007071">255203 640528 343439 24347 674168 531169 531167 531165 531163 531161 531159 531157 531155 414406 414407 268416 490715 233459 283092 149717 223364 264530 241890 196912 196910 391392 260834 409045 409046 598185 145996 368783 368785 368787 368789 525236 497882 538200 538202 385480 385487 164061 144907 443455 385499 385500 385501 76440 85934 85935 171962 85949 66249 66250 294493 203666 626505 626506 626507 620017 202848 610048 594253 594254 294494 484736 165207 675329 255383 293919 494873 215203 494882 494884 250728 134511 134509 537157 376845 376843 376841 376839 592779 178715 412036 412037 369862 581948 204682 210451 159662 170159 159663 641997 641996 139065 129610 557816 525663 46435 46436 46426 46421 284422 506155 506168 398730</route>
            </leg>
            <activity type="outside" link="398730" facility="outside_10" x="653013.2075560454" y="6857532.214432823" end_time="19:31:40" >
            </activity>
        </plan>

    </person>

The entire file is huge (2.5 GB, with many more person ids), which is why I need to parse it incrementally; so far I have been using iterparse. What I want is a data frame that shows each person id together with the trav_time of all of that person's legs (summed up, if possible). I'm struggling to access this information for each person id.

I've tried several approaches; as far as I understand, the following two come closest to a possible solution.

first:

tree = ET.iterparse(gzip.open('V0_1pm/output_plans.xml.gz', 'r'))
traveltimes = defaultdict(list)
for xml_event, elem in tree:
        for person in elem:
            for plan in person:
                for leg in plan:
                    if leg.tag == "trav_time":
                        traveltimes[elem.attrib["trav_time"]]
                    elem.clear()
traveltimes = pd.DataFrame.from_dict(traveltimes, orient='index')                          
traveltimes

second:

tree = ET.iterparse(gzip.open('V0_1pm/output_plans.xml.gz', 'r'))
traveltimes = defaultdict(list)
for xml_event, elem in tree:
    attributes = elem.attrib
    if elem.tag == "trav_time":
            traveltimes[attributes["trav_time"]]
    elem.clear()
traveltimes = pd.DataFrame.from_dict(traveltimes, orient='index')                        
traveltimes

Thank you very much for your help and tips!

Update

More of the XML, to help replicate the data structure:

    <person id="10002042">
        <attributes>
            <attribute name="age" class="java.lang.Integer" >86</attribute>
            <attribute name="bikeAvailability" class="java.lang.String" >none</attribute>
            <attribute name="carAvailability" class="java.lang.String" >some</attribute>
            <attribute name="censusId" class="java.lang.Integer" >3674945</attribute>
            <attribute name="employed" class="java.lang.Boolean" >false</attribute>
            <attribute name="hasLicense" class="java.lang.String" >yes</attribute>
            <attribute name="htsId" class="java.lang.Long" >2601700100002</attribute>
            <attribute name="isOutside" class="java.lang.Boolean" >true</attribute>
            <attribute name="isPassenger" class="java.lang.Boolean" >true</attribute>
            <attribute name="ptSubscription" class="java.lang.Boolean" >false</attribute>
            <attribute name="sex" class="java.lang.String" >f</attribute>
        </attributes>
        <plan score="-0.13749999999999998" selected="yes">
            <activity type="outside" link="284251" facility="outside_1" x="653218.0059491959" y="6857536.564730054" end_time="09:49:38" >
            </activity>
            <leg mode="car_passenger" dep_time="09:49:38" trav_time="00:02:36">
                <route type="links" start_link="284251" end_link="63873" trav_time="00:02:36" distance="3117.285137236383" vehicleRefId="null">284251 660231 129607 129599 139064 641998 641663 159806 170160 85864 635804 572378 435246 190032 526059 525761 525778 525779 450362 63873</route>
            </leg>
            <activity type="outside" link="63873" facility="outside_2" x="656055.3097541996" y="6859009.979613776" end_time="09:52:18" >
            </activity>
            <leg mode="outside" dep_time="09:52:18" trav_time="00:00:00">
                <route type="generic" start_link="63873" end_link="85890" trav_time="00:00:00" distance="746.7439307235369"></route>
            </leg>
            <activity type="outside" link="85890" facility="outside_3" x="656635.5166858744" y="6859480.071535116" end_time="09:53:00" >
            </activity>
            <leg mode="car_passenger" dep_time="09:53:00" trav_time="00:01:21">
                <route type="links" start_link="85890" end_link="47652" trav_time="00:01:21" distance="1499.4956773327315" vehicleRefId="null">85890 202345 202323 202322 85868 569745 569762 535571 535243 616420 7195 408601 47652</route>
            </leg>
            <activity type="outside" link="47652" facility="outside_4" x="657143.7893766644" y="6860882.64702696" end_time="10:41:02" >
            </activity>
            <leg mode="outside" dep_time="10:41:02" trav_time="00:00:00">
                <route type="generic" start_link="47652" end_link="466140" trav_time="00:00:00" distance="16.659217552989976"></route>
            </leg>
            <activity type="outside" link="466140" facility="outside_5" x="657155.3197720037" y="6860894.671149082" end_time="10:43:55" >
            </activity>
            <leg mode="car_passenger" dep_time="10:43:55" trav_time="00:01:32">
                <route type="links" start_link="466140" end_link="85887" trav_time="00:01:32" distance="1841.175613889593" vehicleRefId="null">466140 666788 205956 205957 205958 315381 584891 7193 150557 535291 535555 569763 569764 569744 202426 202425 202424 535572 85887</route>
            </leg>
            <activity type="outside" link="85887" facility="outside_6" x="656620.921626125" y="6859492.595666251" end_time="10:45:38" >
            </activity>
            <leg mode="outside" dep_time="10:45:38" trav_time="00:00:00">
                <route type="generic" start_link="85887" end_link="63872" trav_time="00:00:00" distance="744.9330931635377"></route>
            </leg>
            <activity type="outside" link="63872" facility="outside_7" x="656043.6710628852" y="6859021.737831518" end_time="10:46:13" >
            </activity>
            <leg mode="car_passenger" dep_time="10:46:13" trav_time="00:02:37">
                <route type="links" start_link="63872" end_link="46435" trav_time="00:02:37" distance="3138.4720080186116" vehicleRefId="null">63872 63869 332997 332998 85873 525752 525750 525764 435247 635803 572374 572375 210451 159662 170159 159663 641997 641996 139065 129610 557816 525663 46435</route>
            </leg>
            <activity type="outside" link="46435" facility="outside_8" x="653338.6697731011" y="6857579.601421991" end_time="10:48:56" >
            </activity>
            <leg mode="outside" dep_time="10:48:56" trav_time="00:00:00">
                <route type="generic" start_link="46435" end_link="46426" trav_time="00:00:00" distance="187.1198640488319"></route>
            </leg>
            <activity type="outside" link="46426" facility="outside_9" x="653160.1865588573" y="6857523.409022551" end_time="10:49:17" >
            </activity>
            <leg mode="car_passenger" dep_time="10:49:17" trav_time="00:00:04">
                <route type="links" start_link="46426" end_link="398730" trav_time="00:00:04" distance="131.48553148334906" vehicleRefId="null">46426 46421 284422 506155 506168 398730</route>
            </leg>
            <activity type="outside" link="398730" facility="outside_10" x="653013.2075560454" y="6857532.214432823" end_time="10:49:27" >
            </activity>
        </plan>

    </person>

<!-- ====================================================================== -->

    <person id="10002043">
        <attributes>
            <attribute name="age" class="java.lang.Integer" >90</attribute>
            <attribute name="bikeAvailability" class="java.lang.String" >none</attribute>
            <attribute name="carAvailability" class="java.lang.String" >some</attribute>
            <attribute name="censusId" class="java.lang.Integer" >3674946</attribute>
            <attribute name="employed" class="java.lang.Boolean" >false</attribute>
            <attribute name="hasLicense" class="java.lang.String" >yes</attribute>
            <attribute name="htsId" class="java.lang.Long" >2400810100001</attribute>
            <attribute name="isOutside" class="java.lang.Boolean" >true</attribute>
            <attribute name="isPassenger" class="java.lang.Boolean" >false</attribute>
            <attribute name="ptSubscription" class="java.lang.Boolean" >false</attribute>
            <attribute name="sex" class="java.lang.String" >m</attribute>
        </attributes>
        <plan score="-0.2636111111111111" selected="yes">
            <activity type="outside" link="284251" facility="outside_1" x="653218.0059491959" y="6857536.564730054" end_time="08:29:24" >
            </activity>
            <leg mode="car" dep_time="08:29:24" trav_time="00:02:54">
                <route type="links" start_link="284251" end_link="63873" trav_time="00:02:54" distance="3117.285137236383" vehicleRefId="10002043">284251 660231 129607 129599 139064 641998 641663 159806 170160 85864 635804 572378 435246 190032 526059 525761 525778 525779 450362 63873</route>
            </leg>
            <activity type="outside" link="63873" facility="outside_2" x="656055.3097541996" y="6859009.979613776" end_time="08:32:04" >
            </activity>
            <leg mode="outside" dep_time="08:32:04" trav_time="00:00:00">
                <route type="generic" start_link="63873" end_link="85890" trav_time="00:00:00" distance="746.7439307235369"></route>
            </leg>
            <activity type="outside" link="85890" facility="outside_3" x="656635.5166858744" y="6859480.071535116" end_time="08:32:46" >
            </activity>
            <leg mode="car" dep_time="08:32:46" trav_time="00:01:40">
                <route type="links" start_link="85890" end_link="47652" trav_time="00:01:40" distance="1499.4956773327315" vehicleRefId="10002043">85890 202345 202323 202322 85868 569745 569762 535571 535243 616420 7195 408601 47652</route>
            </leg>
            <activity type="outside" link="47652" facility="outside_4" x="657143.7893766644" y="6860882.64702696" end_time="09:35:48" >
            </activity>
            <leg mode="outside" dep_time="09:35:48" trav_time="00:00:00">
                <route type="generic" start_link="47652" end_link="466140" trav_time="00:00:00" distance="16.659217552989976"></route>
            </leg>
            <activity type="outside" link="466140" facility="outside_5" x="657155.3197720037" y="6860894.671149082" end_time="09:42:26" >
            </activity>
            <leg mode="car" dep_time="09:42:26" trav_time="00:02:00">
                <route type="links" start_link="466140" end_link="85887" trav_time="00:02:00" distance="1841.175613889593" vehicleRefId="10002043">466140 666788 205956 205957 205958 315381 584891 7193 150557 535291 535555 569763 569764 569744 202426 202425 202424 535572 85887</route>
            </leg>
            <activity type="outside" link="85887" facility="outside_6" x="656620.921626125" y="6859492.595666251" end_time="09:44:09" >
            </activity>
            <leg mode="outside" dep_time="09:44:09" trav_time="00:00:00">
                <route type="generic" start_link="85887" end_link="63872" trav_time="00:00:00" distance="744.9330931635377"></route>
            </leg>
            <activity type="outside" link="63872" facility="outside_7" x="656043.6710628852" y="6859021.737831518" end_time="09:44:44" >
            </activity>
            <leg mode="car" dep_time="09:44:44" trav_time="00:03:00">
                <route type="links" start_link="63872" end_link="46435" trav_time="00:03:00" distance="3138.4720080186116" vehicleRefId="10002043">63872 63869 332997 332998 85873 525752 525750 525764 435247 635803 572374 572375 210451 159662 170159 159663 641997 641996 139065 129610 557816 525663 46435</route>
            </leg>
            <activity type="outside" link="46435" facility="outside_8" x="653338.6697731011" y="6857579.601421991" end_time="09:47:28" >
            </activity>
            <leg mode="outside" dep_time="09:47:28" trav_time="00:00:00">
                <route type="generic" start_link="46435" end_link="46426" trav_time="00:00:00" distance="187.1198640488319"></route>
            </leg>
            <activity type="outside" link="46426" facility="outside_9" x="653160.1865588573" y="6857523.409022551" end_time="09:47:49" >
            </activity>
            <leg mode="car" dep_time="09:47:49" trav_time="00:00:14">
                <route type="links" start_link="46426" end_link="398730" trav_time="00:00:14" distance="131.48553148334906" vehicleRefId="10002043">46426 46421 284422 506155 506168 398730</route>
            </leg>
            <activity type="outside" link="398730" facility="outside_10" x="653013.2075560454" y="6857532.214432823" end_time="09:55:48" >
            </activity>
            <leg mode="outside" dep_time="09:55:48" trav_time="00:00:00">
                <route type="generic" start_link="398730" end_link="284251" trav_time="00:00:00" distance="204.84459212547162"></route>
            </leg>
            <activity type="outside" link="284251" facility="outside_1" x="653218.0059491959" y="6857536.564730054" end_time="09:59:24" >
            </activity>
            <leg mode="car" dep_time="09:59:24" trav_time="00:02:07">
                <route type="links" start_link="284251" end_link="525753" trav_time="00:02:07" distance="2349.4934769631172" vehicleRefId="10002043">284251 660231 129607 129599 139064 641998 641663 159806 170160 85864 635804 572378 435246 362748 643661 525753</route>
            </leg>
            <activity type="outside" link="525753" facility="outside_11" x="655306.9611509901" y="6858641.834279304" end_time="10:35:48" >
            </activity>
            <leg mode="outside" dep_time="10:35:48" trav_time="00:00:00">
                <route type="generic" start_link="525753" end_link="133164" trav_time="00:00:00" distance="70.96782044637413"></route>
            </leg>
            <activity type="outside" link="133164" facility="outside_12" x="655356.203591104" y="6858692.93822857" end_time="10:44:25" >
            </activity>
            <leg mode="car" dep_time="10:44:25" trav_time="00:02:45">
                <route type="links" start_link="133164" end_link="46435" trav_time="00:02:45" distance="2594.925451303471" vehicleRefId="10002043">133164 133165 525784 525781 159395 582076 84099 84100 525760 435247 635803 572374 572375 210451 159662 170159 159663 641997 641996 139065 129610 557816 525663 46435</route>
            </leg>
            <activity type="outside" link="46435" facility="outside_8" x="653338.6697731011" y="6857579.601421991" end_time="10:46:48" >
            </activity>
            <leg mode="outside" dep_time="10:46:48" trav_time="00:00:00">
                <route type="generic" start_link="46435" end_link="46426" trav_time="00:00:00" distance="187.1198640488319"></route>
            </leg>
            <activity type="outside" link="46426" facility="outside_9" x="653160.1865588573" y="6857523.409022551" end_time="10:47:09" >
            </activity>
            <leg mode="car" dep_time="10:47:09" trav_time="00:00:14">
                <route type="links" start_link="46426" end_link="398730" trav_time="00:00:14" distance="131.48553148334906" vehicleRefId="10002043">46426 46421 284422 506155 506168 398730</route>
            </leg>
            <activity type="outside" link="398730" facility="outside_10" x="653013.2075560454" y="6857532.214432823" end_time="10:47:19" >
            </activity>
        </plan>

    </person>

<!-- ====================================================================== -->

    <person id="10004136">
        <attributes>
            <attribute name="age" class="java.lang.Integer" >41</attribute>
            <attribute name="bikeAvailability" class="java.lang.String" >none</attribute>
            <attribute name="carAvailability" class="java.lang.String" >some</attribute>
            <attribute name="censusId" class="java.lang.Integer" >3675631</attribute>
            <attribute name="employed" class="java.lang.Boolean" >false</attribute>
            <attribute name="hasLicense" class="java.lang.String" >yes</attribute>
            <attribute name="htsId" class="java.lang.Long" >2403610200001</attribute>
            <attribute name="isOutside" class="java.lang.Boolean" >true</attribute>
            <attribute name="isPassenger" class="java.lang.Boolean" >false</attribute>
            <attribute name="ptSubscription" class="java.lang.Boolean" >false</attribute>
            <attribute name="sex" class="java.lang.String" >f</attribute>
        </attributes>
        <plan score="-0.0375" selected="yes">
            <activity type="outside" link="284251" facility="outside_1" x="653218.0059491959" y="6857536.564730054" end_time="19:22:27" >
            </activity>
            <leg mode="car" dep_time="19:22:27" trav_time="00:02:07">
                <route type="links" start_link="284251" end_link="525753" trav_time="00:02:07" distance="2349.4934769631172" vehicleRefId="10004136">284251 660231 129607 129599 139064 641998 641663 159806 170160 85864 635804 572378 435246 362748 643661 525753</route>
            </leg>
            <activity type="outside" link="525753" facility="outside_11" x="655306.9611509901" y="6858641.834279304" end_time="19:24:31" >
            </activity>
        </plan>

    </person>
<!-- ====================================================================== -->

    <person id="10004137">
        <attributes>
            <attribute name="age" class="java.lang.Integer" >53</attribute>
            <attribute name="bikeAvailability" class="java.lang.String" >none</attribute>
            <attribute name="carAvailability" class="java.lang.String" >some</attribute>
            <attribute name="censusId" class="java.lang.Integer" >3675632</attribute>
            <attribute name="employed" class="java.lang.Boolean" >true</attribute>
            <attribute name="hasLicense" class="java.lang.String" >yes</attribute>
            <attribute name="htsId" class="java.lang.Long" >1157470400001</attribute>
            <attribute name="isOutside" class="java.lang.Boolean" >true</attribute>
            <attribute name="isPassenger" class="java.lang.Boolean" >true</attribute>
            <attribute name="ptSubscription" class="java.lang.Boolean" >true</attribute>
            <attribute name="sex" class="java.lang.String" >m</attribute>
        </attributes>
        <plan score="-1.518611111111111" selected="yes">
            <activity type="outside" link="31240" facility="outside_13" x="652838.038196341" y="6858295.183610428" end_time="07:34:00" >
            </activity>
            <leg mode="access_walk" dep_time="07:34:00" trav_time="00:00:39">
                <route type="generic" start_link="31240" end_link="pt_StopPoint:59298" trav_time="00:00:39" distance="46.250835788845635"></route>
            </leg>
            <activity type="pt interaction" link="31240" x="652838.038196341" y="6858295.183610428" max_dur="00:00:00" >
            </activity>
            <leg mode="pt" dep_time="07:34:39" trav_time="00:02:21">
                <route type="enriched_pt" start_link="pt_StopPoint:59298" end_link="pt_StopPoint:59666" trav_time="00:02:21" distance="515.6409073075592">{"inVehicleTime":120.0,"transferTime":21.0,"accessStopIndex":26,"egressStopindex":27,"transitRouteId":"93517783-1_287780","transitLineId":"100110007:7","departureId":"93517632-1_287842_06:58:00"}</route>
            </leg>
            <activity type="pt interaction" link="31240" x="652838.038196341" y="6858295.183610428" max_dur="00:00:00" >
            </activity>
            <leg mode="egress_walk" dep_time="07:37:00" trav_time="00:08:29">
                <route type="generic" start_link="pt_StopPoint:59666" end_link="508756" trav_time="00:08:29" distance="610.543587585534"></route>
            </leg>
            <activity type="outside" link="508756" facility="outside_14" x="652601.8490830011" y="6857663.731302492" end_time="07:53:26" >
            </activity>
            <leg mode="access_walk" dep_time="07:53:26" trav_time="00:08:29">
                <route type="generic" start_link="508756" end_link="pt_StopPoint:59666" trav_time="00:08:29" distance="610.543587585534"></route>
            </leg>
            <activity type="pt interaction" link="508756" x="652601.8490830011" y="6857663.731302492" max_dur="00:00:00" >
            </activity>
            <leg mode="pt" dep_time="08:01:55" trav_time="00:24:05">
                <route type="enriched_pt" start_link="pt_StopPoint:59666" end_link="pt_StopPoint:59209" trav_time="00:24:05" distance="7410.255050348954">{"inVehicleTime":1260.0,"transferTime":185.0,"accessStopIndex":3,"egressStopindex":17,"transitRouteId":"93517741-1_288723","transitLineId":"100110007:7","departureId":"93517701-1_288827_08:01:00"}</route>
            </leg>
            <activity type="pt interaction" link="508756" x="652601.8490830011" y="6857663.731302492" max_dur="00:00:00" >
            </activity>
            <leg mode="transit_walk" dep_time="08:26:00" trav_time="00:01:05">
                <route type="generic" start_link="pt_StopPoint:59209" end_link="pt_StopPoint:59212" trav_time="00:01:05" distance="78.60144797794317"></route>
            </leg>
            <activity type="pt interaction" link="pt_StopPoint:59209" x="651042.0886563308" y="6863599.716479325" max_dur="00:00:00" >
            </activity>
            <leg mode="pt" dep_time="08:27:05" trav_time="00:08:54">
                <route type="enriched_pt" start_link="pt_StopPoint:59212" end_link="pt_StopPoint:59470" trav_time="00:08:54" distance="2841.5271228126094">{"inVehicleTime":420.0,"transferTime":114.498793351715,"accessStopIndex":17,"egressStopindex":22,"transitRouteId":"95331274-1_267292","transitLineId":"100110008:8","departureId":"95331302-1_267323_08:07:00"}</route>
            </leg>
            <activity type="pt interaction" link="pt_StopPoint:59209" x="651042.0886563308" y="6863599.716479325" max_dur="00:00:00" >
            </activity>
            <leg mode="egress_walk" dep_time="08:36:00" trav_time="00:03:05">
                <route type="generic" start_link="pt_StopPoint:59470" end_link="269385" trav_time="00:03:05" distance="221.08599197383575"></route>
            </leg>
            <activity type="work" link="269385" facility="22974" x="649200.4" y="6861852.6" start_time="07:38:40" end_time="16:38:40" >
                <attributes>
                    <attribute name="innerParis" class="java.lang.Boolean" >true</attribute>
                </attributes>
            </activity>
            <leg mode="access_walk" dep_time="16:38:40" trav_time="00:03:05">
                <route type="generic" start_link="269385" end_link="pt_StopPoint:59470" trav_time="00:03:05" distance="221.08599197383575"></route>
            </leg>
            <activity type="pt interaction" link="269385" x="649200.4" y="6861852.6" max_dur="00:00:00" >
            </activity>
            <leg mode="pt" dep_time="16:41:45" trav_time="00:09:15">
                <route type="enriched_pt" start_link="pt_StopPoint:59470" end_link="pt_StopPoint:59212" trav_time="00:09:15" distance="2841.5271228126094">{"inVehicleTime":420.0,"transferTime":135.0,"accessStopIndex":6,"egressStopindex":11,"transitRouteId":"95305985-1_264552","transitLineId":"100110008:8","departureId":"95305925-1_264577_16:36:00"}</route>
            </leg>
            <activity type="pt interaction" link="269385" x="649200.4" y="6861852.6" max_dur="00:00:00" >
            </activity>
            <leg mode="transit_walk" dep_time="16:51:00" trav_time="00:01:05">
                <route type="generic" start_link="pt_StopPoint:59212" end_link="pt_StopPoint:59209" trav_time="00:01:05" distance="78.60144797794317"></route>
            </leg>
            <activity type="pt interaction" link="pt_StopPoint:59212" x="650982.2282691017" y="6863608.229197035" max_dur="00:00:00" >
            </activity>
            <leg mode="pt" dep_time="16:52:05" trav_time="00:19:54">
                <route type="enriched_pt" start_link="pt_StopPoint:59209" end_link="pt_StopPoint:59298" trav_time="00:19:54" distance="6894.614143041396">{"inVehicleTime":1140.0,"transferTime":54.498793351711356,"accessStopIndex":13,"egressStopindex":26,"transitRouteId":"93518107-1_287714","transitLineId":"100110007:7","departureId":"93518059-1_287550_16:35:00"}</route>
            </leg>
            <activity type="pt interaction" link="pt_StopPoint:59212" x="650982.2282691017" y="6863608.229197035" max_dur="00:00:00" >
            </activity>
            <leg mode="egress_walk" dep_time="17:12:00" trav_time="00:00:39">
                <route type="generic" start_link="pt_StopPoint:59298" end_link="31240" trav_time="00:00:39" distance="46.250835788845635"></route>
            </leg>
            <activity type="outside" link="31240" facility="outside_13" x="652838.038196341" y="6858295.183610428" end_time="17:14:00" >
            </activity>
        </plan>

    </person>

Answer 1:

Try changing your for loop to the following, and see if it works:

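for xml_event, elem in tree:
    if elem.tag == 'person':
        items = list(elem)
        target = items[1]  # second child of <person>: the <plan> element
        if target.attrib['selected'] == 'yes':
            traveltimes[elem.attrib["id"]]  # defaultdict: the lookup registers the person id as a key
            legs = list(target)
            for leg in legs:
                if leg.tag == 'leg':
                    traveltimes[leg.attrib["trav_time"]]  # registers each leg's trav_time as a key
        elem.clear()  # free the finished <person> subtree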


traveltimes = pd.DataFrame.from_dict(traveltimes, orient='index')                        
traveltimes

My output, from your XML above:

10007071

00:11:55

00:13:13
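
The loop above lists the person id and the individual trav_time values as index entries, but it does not yet add them up per person. A minimal sketch of one way to do the summing follows; it is not the original answer's method. It assumes that trav_time is always an "HH:MM:SS" string, that only the plan marked selected="yes" should count, and that the root element is called population (the root is not shown in the question, so that name is a guess used only for the sample). The helper names to_seconds and total_travel_seconds are made up for illustration.

```python
import io
import xml.etree.ElementTree as ET
from collections import defaultdict


def to_seconds(hms):
    """Convert an 'HH:MM:SS' string such as '00:11:55' to seconds."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def total_travel_seconds(source):
    """Map each person id to the summed trav_time (in seconds) of its legs."""
    totals = defaultdict(int)
    # Only 'end' events are needed: a <person> is complete once its end tag is read.
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag != "person":
            continue  # leave children alone until their parent <person> is done
        person_id = elem.get("id")
        totals[person_id] += 0  # keep persons that have no legs at all
        for plan in elem.iter("plan"):
            if plan.get("selected") != "yes":
                continue  # assumption: only the selected plan counts
            for leg in plan.iter("leg"):
                totals[person_id] += to_seconds(leg.get("trav_time", "00:00:00"))
        elem.clear()  # release the finished person's subtree
    return dict(totals)


# Tiny stand-in for the real file, using the two legs of person 10007071.
# The <population> root name is assumed, not taken from the question.
sample = b"""<population>
  <person id="10007071">
    <plan score="-0.525" selected="yes">
      <leg mode="car" dep_time="06:58:53" trav_time="00:11:55"/>
      <leg mode="car" dep_time="19:22:50" trav_time="00:13:13"/>
    </plan>
  </person>
</population>"""

totals = total_travel_seconds(io.BytesIO(sample))
print(totals)  # {'10007071': 1508}
```

For the real file, the same function should accept gzip.open('V0_1pm/output_plans.xml.gz', 'rb') in place of the BytesIO object. The resulting dict can then be turned into a data frame with something like pd.DataFrame(list(totals.items()), columns=["person_id", "trav_time_s"]), where the column names are arbitrary. Clearing only finished person elements matters here: calling elem.clear() on every event, as in the two attempts in the question, wipes the leg attributes before the person is processed.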