The company I work for decided to move its entire stack to Heroku. The main motivation was its ease of use: no sysadmin, no cry. But I still have some questions about it...
I'm running load and stress tests against both the application platform and the Postgres service. I'm using blitz as a Heroku add-on, hitting the site with between 1 and 250 concurrent users. I got some very interesting results and need help evaluating them.
The Test Stack:
Application specifications
There is nothing special about it.
- Rails 4.0.4
- Unicorn
- database.yml set up to connect to Heroku Postgres.
- Not using cache.
Database
It's a Standard Tengu plan (Heroku's naming conventions will kill me one day :), properly connected to the application.
Heroku configs
I applied everything in unicorn.rb as described in the "Deploying Rails Applications With Unicorn" article. I have 2 regular web dynos.
WEB_CONCURRENCY : 2
DB_POOL : 5
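As a rough sketch of what this configuration allows: each Unicorn worker is a single-threaded process that serves one request at a time, so the number of requests in flight is bounded by dynos times workers. The connection figure is only an upper bound, assuming every worker could fill its pool. The helper names below are mine, for illustration only.

```ruby
# Back-of-the-envelope capacity for the configuration above.
# Assumption: WEB_CONCURRENCY and DB_POOL are identical on every dyno.

WEB_CONCURRENCY = 2 # Unicorn worker processes per dyno
DB_POOL         = 5 # ActiveRecord connection pool size per worker process

# Requests that can be in flight at once: one per Unicorn worker.
def concurrent_requests(dynos, workers_per_dyno = WEB_CONCURRENCY)
  dynos * workers_per_dyno
end

# Upper bound on Postgres connections if every worker filled its pool.
def max_db_connections(dynos, workers_per_dyno = WEB_CONCURRENCY, pool = DB_POOL)
  dynos * workers_per_dyno * pool
end

# The dyno counts used in the scenarios of this post.
[2, 4, 10].each do |dynos|
  puts format("%2d dynos: %2d concurrent requests, at most %3d DB connections",
              dynos, concurrent_requests(dynos), max_db_connections(dynos))
end
```

With 2 dynos this gives 4 requests in flight at once, which is the figure to keep in mind when reading the results.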
Data
- episodes table: ~100,000 rows
- episode_urls table: ~300,000 rows
- episode_images table: ~75,000 rows
Code
episodes_controller.rb
def index
  # Episodes of channel 1, capped at 100, with image and URLs eager-loaded
  @episodes = Episode.joins(:program)
                     .where(programs: { channel_id: 1 })
                     .limit(100)
                     .includes(:episode_image, :episode_urls)
end
episodes/index.html.erb
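<% @episodes.each do |t| %>
  <%# Show the thumbnail only when the episode has an image %>
  <% if t.episode_image.present? %>
    <li><%= image_tag(t.episode_image.image(:thumb)) %></li>
  <% end %>
  <%# Show the first URL's path when the episode has at least one URL %>
  <li><%= t.episode_urls.first.mas_path if t.episode_urls.first.present? %></li>
  <li><%= t.title %></li>
<% end %>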
Scenario #1:
Web dynos : 2
Duration : 30 seconds
Timeout : 8000 ms
Start users : 10
End users : 10
Result:
HITS 100.00% (218)
ERRORS 0.00% (0)
TIMEOUTS 0.00% (0)
This rush generated 218 successful hits in 30.00 seconds and we transferred 6.04 MB of data in and out of your app. The average hit rate of 7.27/second translates to about 627,840 hits/day.
Scenario #2:
Web dynos : 2
Duration : 30 seconds
Timeout : 8000 ms
Start users : 20
End users : 20
Result:
HITS 100.00% (365)
ERRORS 0.00% (0)
TIMEOUTS 0.00% (0)
This rush generated 365 successful hits in 30.00 seconds and we transferred 10.12 MB of data in and out of your app. The average hit rate of 12.17/second translates to about 1,051,200 hits/day. The average response time was 622 ms.
Scenario #3:
Web dynos : 2
Duration : 30 seconds
Timeout : 8000 ms
Start users : 50
End users : 50
Result:
HITS 100.00% (371)
ERRORS 0.00% (0)
TIMEOUTS 0.00% (0)
This rush generated 371 successful hits in 30.00 seconds and we transferred 10.29 MB of data in and out of your app. The average hit rate of 12.37/second translates to about 1,068,480 hits/day. The average response time was 2,631 ms.
Scenario #4:
Web dynos : 4
Duration : 30 seconds
Timeout : 8000 ms
Start users : 50
End users : 50
Result:
HITS 100.00% (484)
ERRORS 0.00% (0)
TIMEOUTS 0.00% (0)
This rush generated 484 successful hits in 30.00 seconds and we transferred 13.43 MB of data in and out of your app. The average hit rate of 16.13/second translates to about 1,393,920 hits/day. The average response time was 1,856 ms.
Scenario #5:
Web dynos : 4
Duration : 30 seconds
Timeout : 8000 ms
Start users : 150
End users : 150
Result:
HITS 71.22% (386)
ERRORS 0.00% (0)
TIMEOUTS 28.78% (156)
This rush generated 386 successful hits in 30.00 seconds and we transferred 10.76 MB of data in and out of your app. The average hit rate of 12.87/second translates to about 1,111,680 hits/day. The average response time was 5,446 ms.
Scenario #6:
Web dynos : 10
Duration : 30 seconds
Timeout : 8000 ms
Start users : 150
End users : 150
Result:
HITS 73.79% (428)
ERRORS 0.17% (1)
TIMEOUTS 26.03% (151)
This rush generated 428 successful hits in 30.00 seconds and we transferred 11.92 MB of data in and out of your app. The average hit rate of 14.27/second translates to about 1,232,640 hits/day. The average response time was 4,793 ms. You've got bigger problems, though: 26.21% of the users during this rush experienced timeouts or errors!
General Summary:
- The hit rate stays at roughly 12 to 16 hits per second, even when 150 users are sending requests to the application.
- Increasing the number of web dynos does not help handle more requests.
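To make the summary concrete, here is a small sketch that recomputes the hit rate and the hits/day projection from the successful-hit counts in the rush summaries above, then divides the rate by the number of Unicorn workers. It assumes WEB_CONCURRENCY stayed at 2 in every scenario; the Scenario struct and its field names exist only for this illustration.

```ruby
# Recompute blitz's figures from the scenario results above.
# Assumption: WEB_CONCURRENCY was 2 in every scenario.
WORKERS_PER_DYNO = 2
DURATION_SECONDS = 30.0
SECONDS_PER_DAY  = 86_400

Scenario = Struct.new(:number, :dynos, :users, :hits) do
  # Successful hits per second over the 30-second rush.
  def hit_rate
    hits / DURATION_SECONDS
  end

  # The "hits/day" figure is the hit rate extrapolated to 24 hours.
  def hits_per_day
    (hit_rate * SECONDS_PER_DAY).round
  end

  # Hit rate divided by the number of Unicorn workers serving requests.
  def rate_per_worker
    hit_rate / (dynos * WORKERS_PER_DYNO)
  end
end

# number, dynos, users, successful hits (taken from the rush summaries)
SCENARIOS = [
  Scenario.new(1, 2, 10, 218),
  Scenario.new(2, 2, 20, 365),
  Scenario.new(3, 2, 50, 371),
  Scenario.new(4, 4, 50, 484),
  Scenario.new(5, 4, 150, 386),
  Scenario.new(6, 10, 150, 428)
]

SCENARIOS.each do |s|
  puts format("#%d: %2d dynos, %3d users -> %5.2f hits/s, %.2f hits/s per worker, %d hits/day",
              s.number, s.dynos, s.users, s.hit_rate, s.rate_per_worker, s.hits_per_day)
end
```

Under that assumption, the rate per worker is about 1.8 hits/s in scenario 1 and about 2.0 in scenario 4, but only about 0.7 in scenario 6 with 10 dynos.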
Questions:
When I use caching with memcached (the MemCachier add-on from Heroku), even 2 web dynos can handle more than 180 hits per second. I'm just trying to understand what the dynos and the Postgres service can do without a cache, so that I can learn how to tune them. How do I do that?
Standard Tengu is said to support 200 concurrent connections. So why does it never reach that number?
If having a production-level database and adding web dynos won't help my app scale, what's the point of using Heroku?
Probably the most important question: What am I doing wrong? :)
Thank you for even reading this crazy question!