Array summation: Calculating mean of

2019-09-08 14:40发布

问题:

I have a sorted array (as below), what methods can I use to sum equivalent target responses and therefore calculate a mean at each value? e.g. sum all values corresponding to target = 15.26 (not the same values in other trials).

I considered looping through to look for similar numbers in the 3rd column but surely there is a simpler solution.

Many thanks, Newbie

     response   target
103    4.556049   15.260
55    10.549498   15.260
31    18.497221   15.260
130   13.275155   15.260
93     6.621331   15.260
35     7.653972   15.260
149   15.808224   15.260
84     4.556049   15.260
113    8.996922   15.260
152   18.282948   15.260
162   14.606229   15.260
60     5.588690   15.260
57     7.653972   15.260
159   20.902759   15.260
23    11.645647   15.260
40    21.385003   25.367
76    19.298551   25.367
71    17.724806   25.367
70    11.639967   25.367
178   36.359849   25.367
65    16.947743   25.367
32    25.811419   25.367
52    27.309781   25.367
26    27.160049   25.367
179   34.706590   25.367
54    22.157935   25.367
119   13.888544   25.367
123   39.826426   25.367
147   36.674288   25.367
151   33.022869   25.367
175   46.078035   39.536
4     34.567184   39.536
45    34.130893   39.536
8     51.218523   39.536
42    35.367997   39.536
78    26.210535   39.536
157   43.627062   39.536
116   22.879751   39.536
102   25.996262   39.536
135   45.632451   39.536
126   34.580608   39.536
125   40.390764   39.536
30    35.767629   39.536
86    33.775664   39.536
94    30.904922   39.536
88    68.897857   59.655
177   64.219992   59.655
171   63.937565   59.655
74    54.867877   59.655
163   39.521796   59.655
75    68.286534   59.655
148   70.700332   59.655
115   47.631128   59.655
167   55.704317   59.655
80    51.786992   59.655
0     54.931901   59.655
12    46.967656   59.655
10    62.339037   59.655
3     64.174040   59.655
43    66.948747   59.655
44    75.237758   89.387
101   82.690846   89.387
27    74.046606   89.387
105   94.395834   89.387
108   63.940663   89.387
112  111.211880   89.387
161   70.394671   89.387
29   116.047222   89.387
164   86.483221   89.387
96    83.612994   89.387
51    90.062871   89.387
49    99.448547   89.387
120   64.238064   89.387
121   99.623064   89.387
136   87.784865   89.387
53    99.587954  119.710
90    99.497598  119.710
155  117.134593  119.710
2    118.382540  119.710
87   123.984619  119.710
173  126.473800  119.710
124  128.213801  119.710
62   104.233807  119.710
142  151.525160  119.710
77    84.349268  119.710
46   128.496744  119.710
137  135.726266  119.710
6    106.980116  119.710
109  135.305464  119.710
56   146.565384  119.710
21   149.950898  155.140
144  138.947073  155.140
132  157.788645  155.140
25   138.816444  155.140
98   159.238989  155.140
97   136.179079  155.140
18   160.264919  155.140
92   106.939843  155.140
50   133.825173  155.140
156  160.650610  155.140
169  164.086207  155.140
7    120.081751  155.140
82   144.995253  155.140
73   148.619307  155.140
160  155.345932  155.140
154  286.343698  241.970
20   238.666653  241.970
17   243.265521  241.970
61   233.941803  241.970
67   225.647113  241.970
134  238.871632  241.970
141  257.964136  241.970
39   237.710944  241.970
106  267.179426  241.970
158  288.864375  241.970
104  219.470369  241.970
38   221.280073  241.970
36   216.673977  241.970
128  255.494058  241.970
91   222.512530  241.970
9    248.174697  281.250
143  339.346073  281.250
165  319.828122  281.250
166  339.152453  281.250
172  311.936161  281.250
14   229.413155  281.250
153  362.308915  281.250
117  334.014030  281.250
99   266.162791  281.250
85   307.998184  281.250
118  322.768051  281.250
22   247.992436  281.250
100  282.320528  281.250
24   308.043620  281.250
48   277.614265  281.250
89   324.674307  312.340
34   319.110436  312.340
83   294.367320  312.340
107  256.297453  312.340
5    330.217008  312.340
127  394.634200  312.340
66   335.137544  312.340
63   303.852646  312.340
16   336.398915  312.340
133  401.600397  312.340
176  336.454678  312.340
122  367.271789  312.340
1    330.173121  312.340
140  389.322293  312.340
33   306.170925  312.340
170  463.588300  365.130
68   354.929661  365.130
174  511.082051  365.130
41   407.971277  365.130
81   352.324308  365.130
19   455.697372  365.130
95   347.397060  365.130
13   374.191002  365.130
15   471.887121  365.130
146  420.940734  365.130
114  365.869462  365.130
138  466.096069  365.130
11   421.345013  365.130
139  451.122771  365.130
111  358.154084  365.130
129  513.113772  415.210
131  465.486811  415.210
58   498.471436  415.210
59   480.860257  415.210
64   435.301676  415.210
37   401.883341  415.210
28   520.876652  415.210
69   402.135305  415.210
145  514.131956  415.210
47   506.972655  415.210
72   402.655756  415.210
79   402.615483  415.210
150  519.844011  415.210
168  504.783972  415.210
110  435.998192  415.210


    mean_plain = []
for i in range(0,11):
    mean_plain.append([a[i][1].response.mean(),a[i][0]])

Stuck here, so close but not quite in the right format (Ideally I want original columns target and response

回答1:

The easiest thing would be if you had the data in a database. If the data already comes from a db or you could put it in db it would be convenient to do something like:

select target, count(*), avg(col1), avg(response)
from your_table
group by target

If you don't have access to db you could for example try https://www.sqlite.org/ if you are eager to learn the basics of db-usage.



回答2:

Continuing from UlfR thoughts using groupby, this works nicely

 a=list(data.groupby('target'))
 a[0].response.mean()

Just have to loop it through and store values now