DISQUS

sean cribbs: Modeling a Tree in a Document Database - sean cribbs :: digital renaissance man

  • Professor Sideburns · 2 months ago
    Since when is requiring an index a con? Trying to do this with one hand tied behind your back? :-D
  • Sean Cribbs · 2 months ago
    I'm just saying that document databases have different notions about how an index should be defined, and I was trying to reason about how to model this problem in the most idiomatic fashion for the datastore. Most of these patterns wouldn't work in CouchDB unless you have a view defined, period. Well... they'd "work", but you'd be loading up a lot of documents just to find a few that you need.
  • mlmilleratmit · 1 month ago
    We've worked this out for customers, and the best solution I've seen in production is for a single document to represent a parent-child edge:

    {
    "node" : "some_id",
    "parent": "parent_id",
    "mpath": [parent_id, that_nodes_parent,....,root]
    }

    The material path is the killer, because then a mapreduce view in couch will allow you to emit for each node in the material path to get aggregate counts for any sub-tree in the graph (assumes acyclic).
  • jnunemaker · 3 weeks ago
    Yep, we are doing something similar. We store parent_id and parent_ids which is an array in mongo of all the ancestors up the tree. The benefit with mongo is you can index that and query on it to get all descendants as well.
  • mlmilleratmit · 3 weeks ago
    Exactly, I was a bit too short with my comment. We just build an index using:

    for (node in doc.mpath) {
    emit( node, <something_to_aggregate>);
    }

    and then you've got a fully traversable index to query with start/endkey. Augmented with a reduce function you can also then get summary statistics (e.g. total descendant count, tree level, etc). We actually do a bunch of stuff in a single view by using complex keys:

    emit( ["descendant_count", node], 1);
    emit( ["daughters_at_depth", node_level], 1);
    ...

    This is such a natural pattern that we should likely build it into couch as a default tree view
  • Hogan Long · 19 hours ago
    Make a heap.

    { "id:a","id:b", "id:c", "id:d", "null", "null", "null" }
    Working with heaps is easy.
    -- For those that don't know, child-right is index*2, child-left is index*2+1, parent is floor(index/2)
    (if it gets to big, make heap of heaps... say 255 nodes or something)
  • Sean Cribbs · 16 hours ago
    Hogan, can you elaborate? You can only have one value per key in an JSON object, so I don't think I understand your solution.