[arangodb-google] Request database optimisation

Killian Janod

2018-11-16 10:02:40 UTC

Hi Simran, Thank's a lot for answering my questions.
I'm sorry I'have been slow to answer back.

There was a copy/paste mistake in the Gist. That's why there was an
undefined gfi_s. The Gist has been update to reflect better the query.
https://gist.github.com/Waateur/6c8c0b2c40d0dfa08cecfe780275cd9f

The collect bellow usually return under 5 results
COLLECT code=b.code, category_id=b.category_id INTO code_group

You are right about this filter FILTER LENGTH(a) == 1 is used to remove b
documents that have no match in a.

About the data distribution this is what i have in mind
A contain 31,025,557 docs and B 20,785,230 doc.
90% of A is link to one or more doc in B
60% of A is link to more than one B. ( mostly 3 and almost never over 5 )
20% of B is link to more than A. ( usually 2 or 3 )
These kind of duplicate are differentiated by the delivery string which is
a "date of upload" information.

As for example of documents, they were added to the gist.
I don't know if any other information can be useful.

Best,
Killian

Hi Killian,
it would be helpful to see your exact index definitions, some example
documents and to know a bit about the data distribution.
How many documents end up in one group here on average?
COLLECT code=b.code, category_id=b.category_id INTO code_group
What do you actually want to return here? "gfi_s" is not defined in the
FOR a_tmp IN A
...
RETURN gfi_s
What is this for? The sub-query is limited to one result anyway. Is this
for the case of no match or a missing attribute (gfi_s, see above)?
FILTER LENGTH(a) == 1
Best, Simran
--
You received this message because you are subscribed to the Google Groups "ArangoDB" group.
To unsubscribe from this group and stop receiving emails from it, send an
For more options, visit https://groups.google.com/d/optout.

--
Killian Janod
Datascientist @ iSmart / Kware
***@kware.fr
***@ismart.fr <***@kware.fr>
+33 (0) 6 61 33 34 76

--
You received this message because you are subscribed to the Google Groups "ArangoDB" group.
To unsubscribe from this group and stop receiving emails from it, send an email to arangodb+***@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.