Triple Inner join with over 10.000 rows and asp-ca

2019-07-26 08:19发布

问题:

My ASP Classic application are fetching over 10.000 rows via a triple inner join. After this, it is going through each line, calculating some stuff and putting data in a multidimensional array with 17 coloumns.

All this are being fetched through a jQuery AJAX when pushing the search button.

Logically this takes a while, 5+ minutes actually.

It all works fine. BUT... While doing this, the whole system freezes, not only for the user doing the calculations, but also for all the other users using the system in other ways. How can I optimize this?!

A small relevant code snippet:

dim userArray()

sql = "select first.*, second.fullname, second.info, third.inputs from first inner join second on first.userid = second.id inner join third on first.info = third.id where convert(varchar, first.period, 112) between '20020115' and '20120115' order by second.fullname, first.userid"
set rs = conn.execute(sql)

if rs.eof then
  response.write("Nothing were found in the period")
else
  counter = 0
  redim userArray(17, 1000)
  do until rs.eof
    response.flush
    counter = counter + 1

  ' A LOT of calculations and putting in array...

    rs.movenext
  loop

  for i = 1 to counter
    ' Doing all the response.writes to the user...
  next
end if

回答1:

Lets analyse this whilst also bearing mind the SQL has an ORDER BY Clause:-

do until rs.eof 
  response.flush 
  counter = counter + 1 

  ' A LOT of calculations and putting in array... 

  rs.movenext 
loop

Note the Response.Flush, first thing I would do is get rid of that. You will probably need to increase the ASP Response Buffering Limit (in IIS manager). Flush is sending the generated content so far to the client, it waits for the client to acknowledge receipt of all the packets sent before it completes. That is where I'm gonna guess 90% of the 5+ minutes is being spent.

Now "A LOT calculations". VBScript is not know for its peformance. This code may well take some time. In some cases some calculations can be done much better by SQL than in script so that is one option. Another would be to build some COM compiled component to do complex work (although some accounting needs to made for marshalling which can wipe out benefits). However it may be unavoidable that you need to do these calcs in VBScript.

Now rs.movenext. This loop means you hold the connection and rowset open for pretty much all the time the processing is required. That is while the servers is sending bytes over the network to the client and while VBScript is crunching numbers. A much better approach would be suck up all the rowset quickly and disconnect from the DB, then crunch numbers and finally dump the buffer to the client.

Consider using a disconnected recordset (you specify a client side static cursor) or even the simple GetRows method of the recordset object that dumps the whole rowset into a 2-dimensional array. This will mean that you maintain locks on the various tables for the smallest time possible.



回答2:

I see you already use response.flush() to flush data to the browser intermittedly during the process, but if you're using AJAX, the call must first complete before your AJAX callback function is called, so I think response.flush is not going to be of any use there. You might try calling the AJAX url directly, and put in a response.write() in the loop to see what happens (and what the speed is)

To get even more information, you could add a timer before the query, after the query and inside the loop and response.write the time that has passed since starting the script. This will give you a very good idea where the delay is happening: http://www.codefixer.com/codesnippets/vbscript_timer_function.asp

You say the machine freezes, is that the client PC with the browser, or the server where IIS runs? If a bucketload of data is being sent I have seen browsers hang and not update until it's done.

Try to add WITH NOLOCK after each table name in your query to select without locking the database for writing. This might give you some data that was overwritten during the execution of your query, but that's usually not a problem.

Also, indexes. Try changing the clustered index on the fields you use in your WHERE statement, or add some regular indexes if you need your clustered index. Optimizing your indexes will speed things up considerably if you haven't done so.

HTH,

Erik



回答3:

I'd refactor that code like this:

sql = "select first.*, second.fullname, second.info, third.inputs from first inner join second on first.userid = second.id inner join third on first.info = third.id where convert(varchar, first.period, 112) between '20020115' and '20120115' order by second.fullname, first.userid"
Set rs = conn.Execute(sql)
If NOT rs.EOF Then
    aRecords = rs.GetRows() ' retrieve your records and assign them to an array '
End If
rs.Close ' record set is now released, so it shouldn't lock up your database anymore

If IsArray(aRecords) Then ' just a small sanity check '
    iCount = UBound(aRecords, 2)
    For iCounter = 0 To iCount
        ' Your calculations and your Response.Writes '
    Next
    Erase aRecords
Else
    ' no result found '
End If


Update

You can assign the records to variables in your For loop, e.g.

id = aRecords(0, iCounter)

Then you only need to refer to id when ever you need it. You're right though in that if what you're selecting is dynamic (i.e. column's positions can shift), then this approach can produce problems for when you're trying to assign records to variables further along the line. To assign fullname from your second table, for instance, you'd have to know how many columns are being retrieved from first.*.