In summary, if you want MRI 2.2 compatibility you must use JRuby 9.0.x and if you want MRI 2.3 compatibility you must use JRuby 9.1.x series. The most current release being 9.1.2.0. Anything prior to that you can refer to this table of versions from Heroku's documentation.
There are important recommendations as well:
Ideally you should use Rails 4.2. Try to be at least above 4.0, and you can turn on
config.threadsafe!
by default in the "config/environments/production.rb" file. To understand this subject, refer to Tenderlove's excellent explanation.If you're deploying to Heroku, don't bother trying the free or 1X dyno, which only gives you 512Mb of RAM. While it is enough for most small Rails applications (even with 2 or 3 Puma concurrent workers), I found that even the smallest apps can easily go above that. So you must consider at least the 2X dyno. In any server configuration, always consider more than 1Gb of RAM.
Gems
There are several gems with C extensions that just won't work. Some of them have drop-in replacements, some don't. You should refer to their Wiki for a list of cases.
In my small sample application - which is nothing more than a content website backed by a Postgresql database, ActiveAdmin to manage content, RMagick + Paperclip (yes, it's an old app) to handle image upload and resizing, there was not a lot to change. The important bits of my "Gemfile" end up looking like this:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 |
source 'https://rubygems.org' ruby '2.3.0', :engine => 'jruby', :engine_version => '9.1.1.0' # ruby '2.3.1' gem 'rails', '~> 4.2.6' gem 'devise' gem 'haml' gem 'puma' gem 'rack-attack' gem 'rack-timeout' gem 'rakismet' # Database gem 'pg', platforms: :ruby gem 'activerecord-jdbcpostgresql-adapter', platforms: :jruby # Cache gem 'dalli' gem "actionpack-action_caching", github: "rails/actionpack-action_caching" # Admin gem 'activeadmin', github: 'activeadmin' gem 'active_skin' # Assets gem 'therubyracer', platforms: :ruby gem 'therubyrhino', platforms: :jruby gem 'asset_sync' gem 'jquery-ui-rails' gem 'sass-rails' gem 'uglifier', '>= 1.3.0' gem 'coffee-rails', '~> 4.0.0' gem 'compass-rails' gem 'sprockets', '~>2.11.0' # Image Processing gem 'paperclip' gem 'fog' gem 'rmagick', platforms: :ruby gem 'rmagick4j', platforms: :jruby group :test do gem 'shoulda-matchers', require: false end group :test, :development do gem "sqlite3", platforms: :ruby gem "activerecord-jdbcsqlite3-adapter", platforms: :jruby # Pretty printed test output gem 'turn', require: false gem 'jasmine' gem 'pry-rails' gem 'rspec' gem 'rspec-rails' gem 'capybara' gem 'poltergeist' gem 'database_cleaner', '< 1.1.0' gem 'letter_opener' gem 'dotenv-rails' end group :production do gem 'rails_12factor' gem 'rack-cache', require: 'rack/cache' gem 'memcachier' gem 'newrelic_rpm' end |
Notice how I paired gems for the :ruby
and :jruby
platforms. After doing this change and bundle install
everything, I ran my Rspec suite and - fortunatelly - they all passed on the first run without any further changes! Your mileage will vary depending on the complexity of your application, so have your tests ready.
Puma
In the case of Puma, the configuration is a bit trickier, mine looks like this:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 |
web_concurrency = Integer(ENV['WEB_CONCURRENCY'] || 1) if web_concurrency > 1 workers web_concurrency preload_app! end threads_count = Integer(ENV['RAILS_MAX_THREADS'] || 5) threads threads_count, threads_count rackup DefaultRackup port ENV['PORT'] || 3000 environment ENV['RACK_ENV'] || 'development' on_worker_boot do # Worker specific setup for Rails 4.1+ # See: https://devcenter.heroku.com/articles/deploying-rails-applications-with-the-puma-web-server#on-worker-boot ActiveRecord::Base.establish_connection end |
It's a bit different than you might find in other documentations. The important part is to turn off workers and preload_app when loading over JRuby. It will complain and crash. On my original MRI deploy I am using a small 1X dyno and I can leave WEB_CONCURRENCY=2
and RAILS_MAX_THREADS=5
but on the JRuby deploy I set it to WEB_CONCURRENCY=1
(to turn off workers) and RAILS_MAX_THREADS=16
(because I am assuming JRuby can handle more multithreading than MRI).
Another important bit, most people still assume that MRI can't take advantage of native parallel threads at all because of the dared GIL (Global Interpreter Lock), but this is not entirely true. MRI Ruby can parallelize threads on I/O waits. So, if a part of your app is waiting for database to process and return rows, for example, another thread can take over and do something else, in parallel. It's not totally multi-threaded, but it can do some concurrency so setting a small amount of threads for Puma might help a bit.
Do not forget to set the Pool size to at least the same number of RAILS_MAX_THREADS
. You can either use the config/database.yml
for Rails 4.1+ or add an initializer. Follow Heroku's documentation on how to do so.
Initial Benchmarks
So, I was able to successfully deploy a secondary JRuby version of my original MRI-based Rails app.
The original app is still in a Hobby Dyno, sized at 512Mb of RAM. The secondary app needed a Standard 1X Dyno, sized at 1Gb of RAM.
The app itself is super simple and as I'm using caching - as you always should! -, the response time is very low, on the tens of milliseconds.
I tried to use the Boom tool to benchmark the requests. I did a lot of warming up on the JRuby version, running the benchmarks multiple times and even using Loader.io for added pressure.
I am running this test:
1 |
$ boom -n 200 -c 50 https://foo-my-site/ |
The MRI version performs like this:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 |
Summary: Total: 16.4254 secs Slowest: 9.0785 secs Fastest: 0.8362 secs Average: 2.6551 secs Requests/sec: 12.1763 Total data: 28837306 bytes Size/request: 144186 bytes Status code distribution: [200] 200 responses Response time histogram: 0.836 [1] | 1.660 [57] |∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎ 2.485 [57] |∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎ 3.309 [33] |∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎ 4.133 [22] |∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎ 4.957 [16] |∎∎∎∎∎∎∎∎∎∎∎ 5.782 [6] |∎∎∎∎ 6.606 [3] |∎∎ 7.430 [1] | 8.254 [3] |∎∎ 9.079 [1] | Latency distribution: 10% in 1.2391 secs 25% in 1.5910 secs 50% in 2.1974 secs 75% in 3.4327 secs 90% in 4.5580 secs 95% in 5.6727 secs 99% in 8.1567 secs |
And the JRuby version performs like this:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 |
Summary: Total: 15.5784 secs Slowest: 7.4106 secs Fastest: 0.5770 secs Average: 2.3224 secs Requests/sec: 12.8383 Total data: 28848475 bytes Size/request: 144242 bytes Status code distribution: [200] 200 responses Response time histogram: 0.577 [1] | 1.260 [23] |∎∎∎∎∎∎∎∎∎∎∎∎∎∎ 1.944 [62] |∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎ 2.627 [51] |∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎ 3.310 [24] |∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎ 3.994 [26] |∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎ 4.677 [8] |∎∎∎∎∎ 5.361 [1] | 6.044 [0] | 6.727 [1] | 7.411 [3] |∎ Latency distribution: 10% in 1.1599 secs 25% in 1.5154 secs 50% in 2.0781 secs 75% in 2.8909 secs 90% in 3.7409 secs 95% in 4.2556 secs 99% in 6.7685 secs |
In general, I'd say that they are around the same. As this is not a particularly CPU-intensive processing, and most of the time is spent going through the Rails stack and hitting Memcachier to pull back the same content all the time, maybe it's only fair to expect similar results.
On the other hand, I'm not sure I'm using the tool in the best way possible. The log says something like this for every request:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 |
source=rack-timeout id=c8ad5f0c-b5c1-47ec-b88b-3fc597ab01dc wait=29ms timeout=20000ms state=ready Started GET "/" for XXX.35.10.XXX at 2016-06-03 18:54:34 +0000 Processing by HomeController#home as HTML Read fragment views/radiant-XXXX-XXXXX.herokuapp.com/en (6.0ms) Completed 200 OK in 8ms (ActiveRecord: 0.0ms) source=rack-timeout id=c8ad5f0c-b5c1-47ec-b88b-3fc597ab01dc wait=29ms timeout=20000ms service=19ms state=completed source=rack-timeout id=a5389dc4-9a1a-46b7-a1e5-53f334ca0941 wait=35ms timeout=20000ms state=ready Started GET "/" for XXX.35.10.XXX at 2016-06-03 18:54:36 +0000 Processing by HomeController#home as HTML Read fragment views/radiant-XXXX-XXXXX.herokuapp.com/en (6.0ms) Completed 200 OK in 9ms (ActiveRecord: 0.0ms) source=rack-timeout id=a5389dc4-9a1a-46b7-a1e5-53f334ca0941 wait=35ms timeout=20000ms service=21ms state=completed at=info method=GET path="/" host=radiant-XXXX-XXXXX.herokuapp.com request_id=a5389dc4-9a1a-46b7-a1e5-53f334ca0941 fwd="XXX.35.10.XXX" dyno=web.1 connect=1ms service=38ms status=200 bytes=144608 |
The times reported by the Boom tool are much larger (2 seconds?) than the processing times in the logs (10ms?). Even considering some overhead for the router and so on, still it's a big difference, I wonder if it's being queued for too long because the app is not being able to respond more of the concurrent requests.
The amount of requests divided by the number of concurrent connections will bring the overall performance and throughput down if you increase it too much, I wasn't able to go too much above 14 rpm with this configuration, though.
If you have more experience with http benchmarking and you can spot something wrong I am doing here, please let me know in the comments section below.
Conclusion
JRuby continues to evolve, and you might benefit if you already have a large set of Dynos of large servers around. I wouldn't recommend it for small to medium applications.
I've seen many orders of magnitude improvements in specific use cases (I believe it was a very high traffic API endpoint). This particular case I tested is probably not its sweet spot and changing from MRI to JRuby didn't give me too much advantage, so in this case I would recommend sticking to MRI.
Startup time is still an issue. There is an entry in their Wiki giving some recommendations, but even in the Heroku deploy I ended up having R10 errors (Boot Timeout) every once in a while for this small app.
I didn't try increasing the dynos to the Performance tier introduced last year. I would bet that JRuby would be better at those and more able to leverage the extra power of having from 2.5GB up to 14GB if you have really big traffic (on the order of thousands or tens of thousands of requests per minute). With MRI the recommendation would be using small-ish dynos (2X or at most Performance-M dynos) and scale horizontally. With JRuby you could have less dynos with larger sizes (Performance-L, for example). Again, depends on your case.
Don't take the benchmarks above as "facts" to generalize everywhere, they are just there to give you a notion of an specific use case of mine. Your mileage will vary, so you must test it out yourself.
Another use case (that I did not test) is not to just "port" an MRI app to JRuby but leverage JRuby's unique strenghts as this post from Heroku explains, in the case of using JRuby with Ratpack, for example.
All in all, JRuby is still a great project to experiment. MRI itself came a long way as well and 2.3.1 is not bad at all. Most of the time it's down to your entire architecture choices, not just the language. If you didn't try it yet, you definitely should. It "just works".