Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Please provide a way to preload ancestry #440

Open
fsateler opened this issue Apr 23, 2019 · 10 comments
Open

Please provide a way to preload ancestry #440

fsateler opened this issue Apr 23, 2019 · 10 comments

Comments

@fsateler
Copy link
Contributor

Suppose I have these models:

class Area < ApplicationRecord
  has_ancestry orphan_strategy: :restrict
  has_many :people

  def full_name
    # there is a name column
    (ancestors.map(&:name) + [name]).join " - "
  end
end

class Person < ApplicationRecord
  belongs_to :area
end

So each person belongs to an area, which are arranged as a tree. Now if I want to print a list with a set of people and the area they belong to, I get N+1 queries:

data = people.preload(:area).map{|p| [p.name, p.area.full_name]}
  Area Load (0.5ms)  SELECT "areas".* FROM "areas" WHERE "areas"."id" IN (991) ORDER BY COALESCE("areas"."ancestry", '') ASC
  Area Load (0.5ms)  SELECT "areas".* FROM "areas" WHERE "areas"."id" IN (991) ORDER BY COALESCE("areas"."ancestry", '') ASC
  Area Load (0.4ms)  SELECT "areas".* FROM "areas" WHERE "areas"."id" IN (991) ORDER BY COALESCE("areas"."ancestry", '') ASC
  Area Load (0.5ms)  SELECT "areas".* FROM "areas" WHERE "areas"."id" IN (991) ORDER BY COALESCE("areas"."ancestry", '') ASC
  Area Load (0.4ms)  SELECT "areas".* FROM "areas" WHERE "areas"."id" IN (991) ORDER BY COALESCE("areas"."ancestry", '') ASC
  Area Load (0.4ms)  SELECT "areas".* FROM "areas" WHERE "areas"."id" IN (991) ORDER BY COALESCE("areas"."ancestry", '') ASC
  Area Load (0.5ms)  SELECT "areas".* FROM "areas" WHERE "areas"."id" IN (991) ORDER BY COALESCE("areas"."ancestry", '') ASC
  Area Load (0.6ms)  SELECT "areas".* FROM "areas" WHERE "areas"."id" IN (991) ORDER BY COALESCE("areas"."ancestry", '') ASC
  Area Load (0.6ms)  SELECT "areas".* FROM "areas" WHERE "areas"."id" IN (991) ORDER BY COALESCE("areas"."ancestry", '') ASC
  Area Load (0.5ms)  SELECT "areas".* FROM "areas" WHERE "areas"."id" IN (991) ORDER BY COALESCE("areas"."ancestry", '') ASC
  Area Load (0.5ms)  SELECT "areas".* FROM "areas" WHERE "areas"."id" IN (991) ORDER BY COALESCE("areas"."ancestry", '') ASC
  Area Load (0.4ms)  SELECT "areas".* FROM "areas" WHERE "areas"."id" IN (991) ORDER BY COALESCE("areas"."ancestry", '') ASC
  Area Load (0.3ms)  SELECT "areas".* FROM "areas" WHERE "areas"."id" IN (991) ORDER BY COALESCE("areas"."ancestry", '') ASC
  Area Load (0.3ms)  SELECT "areas".* FROM "areas" WHERE "areas"."id" IN (991) ORDER BY COALESCE("areas"."ancestry", '') ASC
  Area Load (0.4ms)  SELECT "areas".* FROM "areas" WHERE "areas"."id" IN (1129) ORDER BY COALESCE("areas"."ancestry", '') ASC

It would be great to have a way to tell rails to preload the ancestors of a set of records.

@Morred
Copy link

Morred commented Jun 18, 2019

Not sure if this is exactly the same as the original issue stated above, since it concerns descendants and not ancestors, but seems to me that the concept is the same. I'm happy to open a separate issue though, if that's better.

I'm trying to paginate a set of sibling nodes in an ancestry tree and return their nested descendants together with them.

Say I have a tree like this:
Screen Shot 2019-06-18 at 17 15 02

And what I want to return is the part of the tree boxed in here:
Screen Shot 2019-06-18 at 17 15 11

Now of course I can find the subset of A's children that I want to return, and then for each of them get the descendants and run #arrange_serializable. Looking at a realistic pagination size (i.e. at least 10-20 items per page), that's going to give me a bunch of N+1 queries.

I could also get all the descendants of A in one query, #arrange_serializable them and then pick out the part of the tree that I want to return. That seems quite inefficient as in reality, A has thousands of descendants, so I'd rather not load them all into Ruby and then iterate through them.

So it would be great if I could get the children of A that I need, and preload all their descendants in the same query, so I don't have to do it one by one later (ideally without having to write custom SQL).

@alexanderadam
Copy link

Duplicate of #140

@kbrock
Copy link
Collaborator

kbrock commented Jul 17, 2019

@Morred Do you have nodes C and D?

I think that falls under wanting Model.children_of(models) to be a single query and not N+1.
(not sure the exact issue there - but yes, that would be great to have as well)

@kbrock
Copy link
Collaborator

kbrock commented Sep 6, 2019

I'm looking into improving Model.children_of(model) into not requiring loading C and D.
Hopefully I'll get a good solution this go around

@Morred
Copy link

Morred commented Sep 9, 2019

@kbrock As far as I can see in the readme, there's only Node.children_of(node), so if there's actually a Node.children_of(several_nodes), it would be a great first step to have that in the docs as well :)

Otherwise, yes, having a Node.children_of(several_nodes) that just loads them all at once would be what I'm looking for. Right now I'm doing it manually, one query to find the direct children I want, and then this (replacing the real model name with Example):

def children_with_descendants(child_ids)
  query = child_ids.map do |child_id|
    "(examples.id = '#{child_id}' OR examples.ancestry ILIKE '%#{child_id}%')"
  end.join(' OR ')

  Example.where(query)
end

If that seems in any way useful, I'm happy to dive into the code and attempt a PR, or otherwise I'll let you have a go at it!

@slim1979
Copy link

slim1979 commented Sep 4, 2020

I guess, for now best way to prevent N+1 query is to add associasion like this:

   has_many :child_models, class_name: 'Model', foreign_key: 'parent_id'

You can include child_models in your request with

   .includes(:child_models)

@kbrock
Copy link
Collaborator

kbrock commented Dec 8, 2020

parent_id is not in the database. So at the current moment, this is not possible.

I have looked into providing ways of getting the parent_id and ancestor_ids in sql form, but it was very unweildy.

Currently there is no way of getting Model.children_of(ids)
Also of note, if you add pagination, getting children of multiple nodes is very tricky depending upon your sorting requirements. But I think it is tricky in probably a majority of the cases.

@woto
Copy link

woto commented Dec 15, 2020

Hi all. Here is my solution to get parent in one SQL query:

# == Schema Information
#
# Table name: teams
#
#  id                                       :uuid             not null, primary key
#  ancestry                                 :text
#  name                                     :string
class Team < ApplicationRecord
  has_ancestry(orphan_strategy: :restrict, primary_key_format: %r{\A[\w\-]+(/[\w\-]+)*\z})

  def self.with_parents
    joins("INNER JOIN teams AS parents on UUID(SUBSTR(teams.ancestry, STRPOS(teams.ancestry, '/') + 1)) = parents.id")
  end
end

Data:

root1 = Team.create!(company_id: company_id, name: 'root1', id: '361da83d-5c20-4112-b1f6-2ccda3eafe05')
t1 = Team.create!(company_id: company_id, name: 't1', parent: root1, id: '5b243c45-d589-4ea0-8446-0d8b25839009')

And finally request

[25] pry(main)> Team.where(id: '5b243c45-d589-4ea0-8446-0d8b25839009').select('teams.id AS id, parents.id AS parent_id').with_parents.first.attributes
=> {"id"=>"5b243c45-d589-4ea0-8446-0d8b25839009", "parent_id"=>"361da83d-5c20-4112-b1f6-2ccda3eafe05"}

Which issues one SQL query:

2020-12-15 21:36:35.959934 D [23584:12260] ActiveRecord::Base --   Team Load (3.6ms)  SELECT teams.id AS id, parents.id AS parent_id FROM "teams" INNER JOIN teams AS parents on UUID(SUBSTR(teams.ancestry, STRPOS(teams.ancestry, '/') + 1)) = parents.id WHERE "teams"."id" = $1 ORDER BY "teams"."id" ASC LIMIT $2  [["id", "5b243c45-d589-4ea0-8446-0d8b25839009"], ["LIMIT", 1]]

I've searched how to make it with includes(...) but as I understood it's impossible(?)

But later decided to use simpler and most obvious way (because we do not have much data) like this:

teams = Team.where(id: [])
parent_ids = teams.map(&:parent_id)
parents = Team.find(parent_ids)

which issues two SQL queries.
And later in presenter restore relation between them like:

...
found = parents.find { |parent| team.parent_id == parent.id }
...

Still interested if is there a way to create custom relations has_one, belongs_to, has_many with custom SQL?!

@kbrock
Copy link
Collaborator

kbrock commented Mar 11, 2023

rails 1 and 2 used to allow us to specify find sql.
But I am actively trying to find a way to define relations using arel.

I know this doesn't address your problem, but you may be able to reduce data load on teams.
teams = Team.where(id: []).select(:id, :ancestry).map(&:parent_id)

I'm trying to remove our use of to_node and fixing this specific use case.

@kbrock
Copy link
Collaborator

kbrock commented Nov 6, 2023

of note, we do have methods for children_sql, I hope this may provide some use for now.
If you can find a way to solve this with those, that would be great and useful to us all

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

6 participants