includes is the recommended way to load associations of your records eagerly in Rails. In fact, the
Ruby on Rails guide for eager loading
only mentions includes. However, there are other ways, and I want to argue that you should
avoid includes.
Here's why:
- includes makes it easy to introduce an odd bug
- preload takes the same arguments as includes, but can't introduce the bug
- When needed, eager_load also takes the same arguments, can introduce the bug, but makes it explicit
What bug? Let's show the introduction of an unexpected bug:
Say you have these records:
p = Post.create!
p.comments.create!(content: 'hello', spam: true)
p.comments.create!(content: 'world')
You want posts that have at least one comment marked as spam.
# distinct because joins can make duplicates
# I dislike using joins for this.
posts = Post.joins(:comments).distinct
.where(comments: {spam: true})
How many comments are in this array?
posts.first.comments.to_a.size
2, nothing special there.
You will need to display the comments of the posts, so you includes them.
How many comments are in the array?
posts.includes(:comments).first.comments.to_a
Only 1, the one with `spam = true`
What's the count?
posts.includes(:comments).first.comments.count
What about size?
posts.includes(:comments).first.comments.size
To be clear, the joins isn't needed for this to happen.
It was just to show a progression.
Post.includes(:comments)
.where(comments: {spam: true})
.first.comments.size
# => 1
Personally, I find that unexpected.
Imagine if the post is passed to a view or a helper, which uses the comments. It
would only print the comments that matched the condition. Now if you try to debug from that helper,
you would see that post.comments only has 1 comment.
Hopefully, you know that this is how includes (and eager_load, see below) behaves, so you may look
up where the post comes from and figure it out. Good luck otherwise.
This is considered a feature. The consequences of putting conditions on eager loaded associations are not in the guide, but they are in the middle of this section of the documentation.
I sometimes see this called "conditional eager loading", and the bug is doing it accidentally.
I consider the whole feature a maintenance burden. At least the guide doesn't recommend using it.
TL;DR: my recommendations
Things will get more technical in the next sections, so my condensed and straightforward recommendations are:
- For new code, if you are only doing eager loading, use preload instead of includes. It does the same eager loading as includes, and takes the same arguments, but it ignores the conditions in the query. With it, things work as expected:
posts = Post.joins(:comments).distinct
.where(comments: {spam: true})
posts.preload(:comments).first.comments.to_a.size #=> 2
posts.preload(:comments).first.comments.count #=> 2
posts.preload(:comments).first.comments.size #=> 2
- If you need to order by an association, then eager_load is basically the only simple way to do so.
- If you need a condition (a where) which uses an association, avoid includes, joins and eager_load.
Instead, I recommend my gem: activerecord_where_assoc. Here's an introduction to it.
It's made for this purpose, and will support many more use cases, such as:
- Recursive associations (parent/child)
- Polymorphic belongs_to
- Negative conditions (ex: posts without comments marked as spam)
- Multiple conditions on different records of the same association
Alternatively, there's another gem for this: where_exists
# Same as before, posts that have at least one comment marked as spam
Post.where_assoc_exists(:comments, spam: true)
- If includes seems to work somewhere that preload doesn't, you're probably doing a condition on an association or ordering by an association. See the previous points for this.
- For existing code, you can't mindlessly change all includes to preload, because some of it may rely on includes adding a JOIN to the query (the eager_load way), which happens when the query refers to the table of the included associations. So while it would be better to change everything to preload and sometimes eager_load, every such change must be tested.
- If you see an includes with a references, then that's just a call to eager_load. At this point, just use eager_load to make your code shorter.
So don't risk includes doing the wrong thing. preload means simple eager loading without the booby trap;
you should use it. Treat eager_load as a warning sign that this could be doing conditional eager loading and
be careful around it.
Down the rabbit hole
If you want to understand why I make those recommendations, we'll have to get technical...
Eager loading means loading associations of multiple records before they are needed. This is done to reduce the number of queries executed, making execution faster.
There are actually 3 methods for eager loading in Rails:
- preload: Executes one extra query per association being eager loaded. Same as includes usually does.
- eager_load: Adds a JOIN to the SQL query and loads the association without doing an extra query. This also enables adding conditions on the table, which is the cause of the conditional eager loading bug from the introduction.
- includes: Picks between preload and eager_load based on whether the query contains a reference to the table of an association that was passed to includes. This can be from where or from joins.
You may also specify an association with references to force the eager_load path, which is needed when your conditions are specified with a String instead of a Hash (which, again, causes conditional eager loading).
So out of the 3 methods, only one of them cannot trigger conditional eager loading: preload. It only does
full eager loading, always the same way.
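To make the difference between the two strategies concrete, here is a rough plain-Ruby simulation. It does not use Rails at all: the arrays of hashes stand in for table rows, and simulate_eager_load and simulate_preload are hypothetical helper names made up for this sketch, not ActiveRecord methods. It also uses an inner join where eager_load really uses a LEFT OUTER JOIN, which makes no difference once a condition on comments is applied.

```ruby
# Plain-Ruby simulation, no Rails required. Hashes stand in for table rows.
POSTS = [{ id: 1 }].freeze
COMMENTS = [
  { id: 1, post_id: 1, content: "hello", spam: true },
  { id: 2, post_id: 1, content: "world", spam: false }
].freeze

# eager_load style (hypothetical helper): a single joined query.
# The WHERE filters the joined rows, and each post's comments are
# rebuilt from whatever rows survive the filter.
def simulate_eager_load(posts, comments, &condition)
  joined = posts.flat_map do |post|
    comments.select { |c| c[:post_id] == post[:id] }.map { |c| [post, c] }
  end
  surviving = joined.select { |_post, comment| condition.call(comment) }
  surviving.group_by { |post, _comment| post[:id] }.map do |_id, pairs|
    pairs.first.first.merge(comments: pairs.map(&:last))
  end
end

# preload style (hypothetical helper): the main query selects the posts,
# then a second query loads ALL comments of those posts, ignoring the condition.
def simulate_preload(posts, comments, &condition)
  selected = posts.select do |post|
    comments.any? { |c| c[:post_id] == post[:id] && condition.call(c) }
  end
  ids = selected.map { |post| post[:id] }
  by_post = comments.select { |c| ids.include?(c[:post_id]) }
                    .group_by { |c| c[:post_id] }
  selected.map { |post| post.merge(comments: by_post.fetch(post[:id], [])) }
end

spam = ->(comment) { comment[:spam] }
eager = simulate_eager_load(POSTS, COMMENTS, &spam)
pre = simulate_preload(POSTS, COMMENTS, &spam)

puts eager.first[:comments].size # prints 1: only the row that passed the WHERE
puts pre.first[:comments].size   # prints 2: the full association
```

Both helpers return the same post. Only the contents of its comments differ, which is why the bug is so easy to miss.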
When is eager_load needed?
The main reason to use eager_load, that I have no alternative for, is ordering by an association's field.
# Ordering posts by created_at of last comment
Post.eager_load(:comments).order("comments.created_at DESC")
Maybe some use it to reduce the number of queries when they do eager loading. I don't think it really saves much, and there is a risk of slowing things down by making queries that are heavier.
Some may use it to actually do conditional eager loading. I still heavily disagree with that use case.
I've had to edit code that used this "feature" once...
You look at a method and it looks wrong; it can't be doing what it should be doing. It's using every
project.users, not just those we want! When I did an interactive console there (binding.pry
or byebug), I saw that users were missing from project.users.
Since I knew of this "feature", I started looking and, as expected, a condition on an includes was found...
3 method calls away from where the association was used, not a single comment to explain what is going on anywhere.
You should avoid code that looks wrong. Code that uses conditional eager loading looks wrong. In our case the overall module was already something that we wanted to rewrite from scratch, so this was just another reason to do so.
Other than ordering, I mostly see eager_load used to do a condition (a where) which uses an association. Let's dig into these.
where on an association with eager_load
It's a somewhat frequent need and there are many questions about this on Stack Overflow.
The bug from the introduction, accidentally doing conditional eager loading, started with such a need: "I want the posts that have comments marked as spam".
You may see a recommendation to use includes, and then have a condition on its table. This actually
uses the eager_load path.
It looks like this:
# Please stop doing this :(
Post.includes(:comments).where(comments: {spam: true})
# which is equivalent to this; don't do this either
Post.eager_load(:comments).where(comments: {spam: true})
Again, this does conditional eager loading, which isn't what we asked for.
To be clear, the where on an association with includes / eager_load can be safe. But only if
the association is a belongs_to. When it is, there are only 2 possibilities: either load the
record and the associated belongs_to records, or don't load either. No conditional eager loading is possible.
But even when it's safe, there are risks:
- Using includes / eager_load increases the chance for a mistake, where you or someone else just adds another association to the existing eager loading call.
- Every time a reader sees includes / eager_load, he may wonder if it is safe, or if there could be accidental conditional eager loading.
And as a tool, this isn't so great:
- If you don't need the associated records, then eager loading them is wasteful.
- Doesn't handle recursive associations (ex: parent/children)
- Doesn't compose well
- Looks potentially wrong when you know of the conditional eager loading "feature"
where on an association with joins
The next option is to use joins. It also has downsides.
It looks like this:
# Please stop doing this :(
Post.joins(:comments).where(comments: {spam: true})
# and stop doing this
Post.joins(:comments).distinct.where(comments: {spam: true})
Using joins like this is better than includes / eager_load since at least, there is no risk of conditionally
loading an association. But there are still problems with it:
- Doesn't handle recursive associations (ex: parent/children)
- Requires a distinct to avoid duplicated records when used with has_many associations.
This can be unexpected if you're doing a more complex query than just fetching records.
- Doesn't compose well
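The duplication problem can be seen without a database. The following is a small plain-Ruby simulation of an inner join, with made-up rows as data. It only sketches what the SQL does, not how ActiveRecord implements it.

```ruby
# Plain-Ruby simulation of INNER JOIN + WHERE. Hashes stand in for table rows.
posts = [{ id: 1, title: "first" }, { id: 2, title: "second" }]
comments = [
  { post_id: 1, spam: true },
  { post_id: 1, spam: true },
  { post_id: 2, spam: false }
]

# INNER JOIN: one output row per matching (post, comment) pair.
joined = posts.product(comments).select { |post, comment| comment[:post_id] == post[:id] }

# WHERE comments.spam = true, then keep only the post columns.
matching = joined.select { |_post, comment| comment[:spam] }.map(&:first)

puts matching.map { |post| post[:id] }.inspect      # prints [1, 1]: post 1 appears twice
puts matching.uniq.map { |post| post[:id] }.inspect # prints [1]: what distinct restores
puts matching.size                                  # prints 2, although only 1 post matches
```

The last line is the kind of surprise mentioned above: anything computed over the joined rows, such as a count, works on (post, comment) pairs rather than on posts.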
where on an association with Arel
Truth is, this need for a where on an association isn't something that ActiveRecord supports well. So
leaving the ActiveRecord-only solutions behind, you can do an actual EXISTS query with Arel. EXISTS is
the SQL tool that is meant to do this type of condition, not JOIN.
It looks like this:
# An OK way, but error prone
Post.where(Comment.where("posts.id = comments.post_id").where(spam: true).arel.exists)
This composes much better with other tools because all it does is add a single WHERE clause to the query.
It works as you would expect with or, with not, and with other conditions on the same association.
But there still are new downsides:
- You must manually write the condition to link the posts to the comments. It's easy to forget it, and I've seen StackOverflow answers that forgot to do so.
You won't get any error for forgetting, your query will just be wrong, which may not even be obvious if all you have is a little test data. Bonus: This can get extra tedious for polymorphic associations, where you also need to do this check: foos.owner_type = #{Bar.base_class.name}.
- If a condition was given when defining the association, you must also manually rewrite it.
- Only the models are named in the code, not the association of interest. This makes the intent less clear, especially when non-trivial associations exist.
- Extra work to handle recursive associations (ex: parent/children)
- Quite a bit longer to write, and this is a short example.
Other than writing the whole condition manually, which would have all the problems of the Arel way, but be more verbose and more error-prone, I think we're out of built-in ways.
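To illustrate why EXISTS fits this need, and what happens when the link condition is forgotten, here is a plain-Ruby simulation of the semantics. The data is made up and no Rails is involved. Each post is tested once against a subquery, modelled here with any?.

```ruby
# Plain-Ruby simulation of EXISTS semantics. Hashes stand in for table rows.
posts = [{ id: 1 }, { id: 2 }, { id: 3 }]
comments = [
  { post_id: 1, spam: true },
  { post_id: 1, spam: true },
  { post_id: 2, spam: false }
]

# WHERE EXISTS (... posts.id = comments.post_id AND spam):
# each post is kept or dropped exactly once, so no duplicates appear.
with_spam = posts.select do |post|
  comments.any? { |c| c[:post_id] == post[:id] && c[:spam] }
end

# WHERE NOT EXISTS (...): the negation is just as direct.
without_spam = posts.reject do |post|
  comments.any? { |c| c[:post_id] == post[:id] && c[:spam] }
end

# Forgetting the link condition: the subquery no longer depends on the post,
# so it is true for every post as soon as any spam comment exists anywhere.
forgot_link = posts.select { |_post| comments.any? { |c| c[:spam] } }

puts with_spam.map { |post| post[:id] }.inspect    # prints [1]
puts without_spam.map { |post| post[:id] }.inspect # prints [2, 3]
puts forgot_link.map { |post| post[:id] }.inspect  # prints [1, 2, 3]: silently wrong
```

With only one post in the test data, the forgot_link version would return the same result as the correct one, which is how this mistake slips through.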
where on an association with activerecord_where_assoc
What I recommend for conditions based on associations is a gem I made just for this purpose: activerecord_where_assoc. It looks like this:
# Please consider doing this:
Post.where_assoc_exists(:comments, spam: true)
# Or using a scope such as is_spam:
Post.where_assoc_exists(:comments) { is_spam }
The query it generates is the same as the Arel example, with the same benefits and more. See for yourself:
- It just adds a single where condition, so it composes well and works with or and with other conditions on the same association.
- Handles recursive associations automatically (ex: parent/children)
- Handles polymorphic belongs_to (includes and joins would simply refuse)
- Easy to do a NOT of the condition (i.e., where no comment is marked as spam) with where_assoc_not_exists.
- Composes with other such queries, even on the same association, even with negations
- Unlike Arel, this uses the association's name, so the intent is clearer.
So if you need to do this kind of condition, here are some references for my gem:
- Introduction to activerecord_where_assoc.
- The problems of the other ways of doing such conditions.
- Multiple example usages.
There's simply no way I could find to use builtin tools to have this query be clear, succinct and not booby trapped. Either live with the booby traps, write your own methods to do this cleanly, or use one of the gems written for this purpose:
Seriously, try any of them, it's liberating how simple this once complex task becomes.
But includes is everywhere
It is! Let's explore the reasons I can think of.
includes is the "smart" function out of the 3, it will pick the "right" strategy when needed.
Marketing-wise, this sounds like a good thing... Until you learn that the alternate path, eager_load, is
not always what you want and it can cause bugs due to conditional eager loading.
For a long time, includes (and eager_load) were the only way to do a LEFT JOIN
The method left_joins was added in Rails 5.0. Before that, if you wanted one, you had to either do
includes / eager_load, or write the whole "LEFT JOIN" yourself like this: joins("LEFT JOIN
comments ON comments.post_id = posts.id"). The includes shortcut was often suggested.
includes has always been recommended, so most are familiar with it, and most recommend it.
Everything is against preload, even its documentation makes preload
sound like an alias for includes, and the Rails guide only mentions includes for eager loading data.
I think not enough people were both harmed by includes and aware that you can just specify preload and eager_load for that knowledge to spread.
Recap
Conditional eager loading:
- includes and eager_load can accidentally eager load only part of an association, a good source of bugs.
- Doing conditional eager loading voluntarily can be a maintenance burden
- If you do want conditional eager loading, using eager_load makes it a bit more obvious.
Conditions based on associations:
- Using includes and eager_load for conditions based on associations can do conditional eager loading at the same time; you will get bitten by the bugs it can cause.
- If you don't need to load the association, eager loading it is wasteful
- Using specialized gems to do conditions based on associations is safer, clearer and easier.
Order based on association:
- includes and eager_load are the only simple way.
- Using eager_load is explicit about the use case, and you don't need to also call references.
Regular old eager loading:
- Just use preload
If you want to run the examples from this post, here is a self-contained ruby script.