Discussion:
[arangodb-google] Is it ok to use arangodb on a large dense graph
Victor Genin
2018-03-22 14:07:57 UTC
We want to present our data in a graph and thought about using one of the
graph databases. During our vendor investigation, one of the experts
suggested that a graph database won't be efficient on a dense graph and
that we'd be better off with a columnar-based db like Cassandra. Here is
what they wrote:

I gave your use case some thought. Given that your graph is very dense
(number of relationships = number of nodes squared) and that you seem to
need only few-hop traversals from a particular node along different
relationships, I’d actually recommend you also try out a columnar database.

Graph databases tend to work well when you have sparse graphs (num of
relationships << num of nodes ^ 2) and with deep traversals - from 4-5 hops
to hundreds of hops. If I understood your use-case correctly, a columnar
database should generally outperform graphs there.

Our use case will probably end up with nodes connected to tens of millions
of other nodes, with about 30% overlap between different nodes, so in a
way it's probably a dense graph. Overall there will probably be a few
billion nodes.
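
To put rough numbers on "dense", here is a back-of-the-envelope sketch in
Python. The concrete figures are assumptions standing in for "a few billion"
and "tens of millions", and it pessimistically assumes every node has that
many neighbours, which is an upper bound rather than our real distribution.

```python
def density(num_nodes: int, avg_degree: float) -> float:
    """Fraction of all possible undirected edges that are actually present."""
    num_edges = num_nodes * avg_degree / 2
    possible_edges = num_nodes * (num_nodes - 1) / 2
    return num_edges / possible_edges


# Hypothetical figures, not measured data.
NUM_NODES = 3_000_000_000    # stand-in for "a few billion nodes"
AVG_DEGREE = 20_000_000      # stand-in for "tens of millions" of neighbours

d = density(NUM_NODES, AVG_DEGREE)
total_edges = NUM_NODES * AVG_DEGREE // 2

print(f"density ~ {d:.4f}, total edges ~ {total_edges:.2e}")
```

Under these assumptions the density comes out well below 1% of the n^2
maximum, so the graph is not dense in the strict "relationships = nodes
squared" sense. The real difficulty is the per-node fan-out: every one-hop
expansion touches tens of millions of edges.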


Looking at the Neo4j source code, I found a reference to an isDense flag on
nodes that is used to switch the processing logic, but I'm not sure what it
does. I also wonder whether it was added as an edge-case patch and won't
work well if most of the nodes in the graph are dense.


Does anyone have experience with graph databases on dense graphs, and
should they be considered in such cases? I saw a few issues over the years
on both the Titan and Neo4j threads, which seemed to be taken into
consideration by the development teams. But, again, I'm not sure whether
the result was an edge-case patch or a real scalable solution.


As described here
<https://www.datastax.com/dev/blog/a-solution-to-the-supernode-problem>,
a "supernode" is a vertex with a disproportionately high number of
incident edges. While supernodes are *rare* in natural graphs...


In our case, supernodes are not so rare.


All opinions are appreciated!
--
You received this message because you are subscribed to the Google Groups "ArangoDB" group.
To unsubscribe from this group and stop receiving emails from it, send an email to arangodb+***@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.
CoDEmanX
2018-03-22 18:31:25 UTC
Hi Victor,

Can you describe which queries you want to run against your dataset?

Thanks,
Simran