Discussion:
[arangodb-google] Creating Edges/Relationships
Nigel Budden
2018-05-25 01:23:40 UTC
Hi All,
I'm trying to get to know ArangoDB and am using the MovieLens 20M data set
to play with.

I've imported Movies.csv into the *Movies* collection, Genome_Tags.csv into
*Tags*, and Tags.csv into *TagLinks*.

I've created an empty edge collection called *MyTags* and I'm trying to
create links between *Movies* and *Tags* based on what is found in the
*TagLinks* collection.

I'm attempting the following Query, however being new to AQL I have a
feeling (given the crazy long query running time so far) that I've got
something wrong:

FOR tl IN TagLinks
  FOR m IN Movies
    FOR t IN Tags
      FILTER m.movieId == tl.movieId AND LOWER(t.tag) == LOWER(tl.tag)
      INSERT { _from: m._id, _to: t._id } IN MyTags

Thanks for any insight into my conundrum.
--
You received this message because you are subscribed to the Google Groups "ArangoDB" group.
To unsubscribe from this group and stop receiving emails from it, send an email to arangodb+***@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.
Nigel Budden
2018-05-25 19:17:22 UTC
Hi Nigel,
Did you define any indexes? For movies.csv, it would probably make sense
to use movieId as _key. The primary index covers that attribute, which
spares you an additional index.
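To illustrate why a key lookup helps, here is a minimal pure-Python sketch. It is a toy model, not ArangoDB code: the dict stands in for the primary index, and the function names and sample documents are made up for the example. It assumes movieId is stored as the string _key.

```python
def find_by_scan(movies, movie_id):
    """Model of a full collection scan: inspect documents one by one."""
    for doc in movies:
        if doc["movieId"] == movie_id:
            return doc
    return None


def build_primary_index(movies):
    """Toy primary index: maps _key to document, assuming _key = str(movieId)."""
    return {str(doc["movieId"]): doc for doc in movies}


# Hypothetical sample documents, not taken from the real data set.
movies = [{"movieId": i, "title": f"Movie {i}"} for i in range(1, 6)]
primary = build_primary_index(movies)

# Both approaches find the same document; the index does it in one lookup.
same_doc = find_by_scan(movies, 3) is primary["3"]
```

The scan touches up to every document per lookup, while the dict lookup does not depend on the collection size. That difference is multiplied by every row of the outer loops.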
Another way would be to assume that tags are written in the same case, or
to update them to be all lowercase, for instance. You can then get rid of
the LOWER() function calls and create an index on movieId and tag in the
TagLinks collection.
FILTER m.movieId == tl.movieId AND t.tag == tl.tag
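The idea can be sketched in plain Python. This is only a model under stated assumptions, not ArangoDB's implementation: a dict keyed by (movieId, tag) stands in for a non-unique hash index on TagLinks, tags are lowercased once up front, and build_edges and the sample documents are hypothetical.

```python
from collections import defaultdict


def build_edges(movies, tags, tag_links):
    """Probe a toy (movieId, tag) index instead of scanning TagLinks."""
    # Stand-in for a non-unique hash index on [movieId, tag];
    # the tag is lowercased once here, so no per-comparison LOWER() is needed.
    index = defaultdict(list)
    for tl in tag_links:
        index[(tl["movieId"], tl["tag"].lower())].append(tl)

    # Normalize the Tags side once as well.
    normalized_tags = [dict(t, tag=t["tag"].lower()) for t in tags]

    edges = []
    for m in movies:
        for t in normalized_tags:
            # Plain equality probe; one edge per matching TagLinks row.
            for _ in index.get((m["movieId"], t["tag"]), []):
                edges.append({"_from": m["_id"], "_to": t["_id"]})
    return edges


# Hypothetical sample data for illustration only.
movies = [{"_id": "Movies/1", "movieId": 1}, {"_id": "Movies/2", "movieId": 2}]
tags = [{"_id": "Tags/10", "tag": "Noir"}]
tag_links = [
    {"movieId": 1, "tag": "noir"},
    {"movieId": 1, "tag": "NOIR"},
    {"movieId": 3, "tag": "noir"},
]
edges = build_edges(movies, tags, tag_links)
```

As in the original query, each matching TagLinks row yields its own edge, so a movie tagged twice with the same tag gets two edges.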
Id  NodeType                 Est.       Comment
1   SingletonNode            1          * ROOT
3   EnumerateCollectionNode  10000      - FOR m IN Movies /* full collection scan */
4   EnumerateCollectionNode  100000000  - FOR t IN Tags /* full collection scan */
m : Movies, t : Tags */
9   IndexNode                100000000  - FOR tl IN TagLinks /* hash index scan, scan only */
8   InsertNode               0          - INSERT #6 IN MyTags

By  Type  Collection  Unique  Sparse  Selectivity  Fields                Ranges
9   hash  TagLinks    false   false   91.46 %      [ `movieId`, `tag` ]  ((m.`movieId` == tl.`movieId`) && (t.`tag` == tl.`tag`))
Hi Simran,
I did a bit of tweaking of my indexes as you suggested, along with fixing
the casing, and that did the job. Thanks for the suggestion.