I had a little trouble initially grokking riak-client’s options for link walking. There is a wiki page about it on their GitHub and some API docs, but I still couldn’t get my mind around the options. Here is a mini walkthrough that might be helpful if you are having the same problem.
Fire up IRB or your REPL of choice, load riak-client, then connect to your local riak server or wherever you want.
I created four objects to toy with, a through d. Note that, at least when using protocol buffers, riak-client doesn’t let you leave data blank, so I set it to the simplest JSON doc I could. All we need are the riak links.
Then I chained them together with some simple links. This part is pretty easy, thanks to the #to_link method on riak-client’s RObject, e.g. a.links << b.to_link("foo").
So you can get to c from a through the tag “foo” but you can’t get all the way to d, as that is tagged with “bar”.
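To make the shape of that data concrete, here is a toy model of the four linked objects in plain Ruby. It does not use riak-client or talk to a server: toy_links and reachable are hypothetical names made up for this sketch, and only the keys a through d and the tags foo and bar come from the setup above.

```ruby
# Toy stand-in for the four objects (not riak-client code).
# Each key maps to its outgoing links as [target_key, tag] pairs.
def toy_links
  {
    "a" => [["b", "foo"]],
    "b" => [["c", "foo"]],
    "c" => [["d", "bar"]],
    "d" => []
  }
end

# Every key reachable from `start` when only links tagged `tag` are followed.
def reachable(start, tag)
  found = []
  frontier = [start]
  until frontier.empty?
    key = frontier.shift
    toy_links.fetch(key, []).each do |target, link_tag|
      next unless link_tag == tag      # ignore links with a different tag
      next if found.include?(target)   # do not revisit a key
      found << target
      frontier << target
    end
  end
  found
end

p reachable("a", "foo") # b and c; the link from c to d is tagged "bar"
```

Following only foo from a stops at c, which is the situation described above.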
Last part of setup: store all your objects by calling #store on each one.
There is a shortcut #walk method on RObject, though you can also call #walk from the client itself. I like the shortcut, and at least one use of it is reasonably clear: a.walk("test", "_", true).
This means: walk the links on RObject a, restricting results to the bucket test but following all tags (the _ character is the wildcard in riak’s scheme). The true means to return the results of this leg of the walk, which was one of the parts that confused me at first but makes more sense later.
The return value makes sense: it contains b but not c, which is one more link away. It only walks the first link, in other words. But why is the result an array of arrays? It makes it seem like the return could be two-dimensional, but how to achieve that wasn’t really obvious at first.
Note that the API docs do make clear there is an alternate syntax here, using a hash with the keys bucket, tag and keep, which is nice because it’s more explicit.
Seeing that syntax and reading the source for the #normalize method on WalkSpec had things making more sense to me. To walk two links out from your current node, just pass it two hashes, both of which will be turned into WalkSpecs.
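The real normalization lives in riak-client’s WalkSpec. What follows is only a toy re-creation of the idea as described above, where hashes or flat bucket, tag, keep triples each become one spec per leg. toy_normalize and the plain-hash spec format are hypothetical and are not the library’s API.

```ruby
# Toy version of the normalization idea (not riak-client's implementation).
# Accepts hashes with :bucket, :tag and :keep, or flat bucket, tag, keep
# triples, and returns one spec hash per walk leg.
def toy_normalize(*params)
  params = params.dup
  specs = []
  until params.empty?
    head = params.shift
    if head.is_a?(Hash)
      specs << { bucket: head[:bucket], tag: head[:tag], keep: !!head[:keep] }
    else
      # A non-hash argument starts a bucket, tag, keep triple.
      raise ArgumentError, "expected bucket, tag, keep" if params.size < 2
      tag, keep = params.shift(2)
      specs << { bucket: head, tag: tag, keep: !!keep }
    end
  end
  specs
end

# One leg written as a triple, then two legs written as hashes.
p toy_normalize("test", "_", true)
p toy_normalize({ bucket: "test", tag: "_", keep: false },
                { bucket: "test", tag: "_", keep: true })
```

Either way of writing a leg ends up as the same kind of spec, so the number of legs in the walk is simply the number of specs you pass.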
Now the keep: true adds up, as does the double array. If the first walk had returned more than one result, then the second level of the walk would now be branching, and two dimensions of results would be returned. If we wanted to get to c from a but not return b, we set keep to false on the first leg and true on the second.
And if we want to get to d, we add a third leg.
But that only works because we have allowed for any tag. Remember that we tagged the link to d with bar, so if we restrict the last leg to the tag foo, the walk no longer reaches d.
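The keep flag and the tag filter can be tried out without a server in a small toy model. This is a sketch under assumptions, not riak-client: toy_links and toy_walk are hypothetical names, the bucket is left out because everything lives in one bucket, and returning one inner array per kept leg is my reading of the nested results described above. What the real client returns for a leg with no matches is not something this sketch claims.

```ruby
# Toy link graph: a -> b (foo), b -> c (foo), c -> d (bar).
def toy_links
  {
    "a" => [["b", "foo"]],
    "b" => [["c", "foo"]],
    "c" => [["d", "bar"]],
    "d" => []
  }
end

# Follow one leg per spec, starting from `start`. Each spec is a hash with
# :tag ("_" matches any tag) and :keep. Returns one inner array per kept leg.
def toy_walk(start, *specs)
  current = [start]
  kept = []
  specs.each do |spec|
    next_keys = current.flat_map do |key|
      toy_links.fetch(key, [])
               .select { |_target, tag| spec[:tag] == "_" || spec[:tag] == tag }
               .map { |target, _tag| target }
    end
    current = next_keys.uniq
    kept << current if spec[:keep]
  end
  kept
end

any  = { tag: "_", keep: true }   # follow any tag, return this leg
skip = { tag: "_", keep: false }  # follow any tag, do not return this leg

p toy_walk("a", any)                                      # => [["b"]]
p toy_walk("a", skip, any)                                # => [["c"]]
p toy_walk("a", skip, skip, any)                          # => [["d"]]
p toy_walk("a", skip, skip, { tag: "foo", keep: true })   # => [[]]
```

In this model, keeping both legs of a two-leg walk gives two inner arrays, one per leg, which is where the second dimension comes from.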
I might write a larger test script and test data set to really play with multiple levels of walking and those nested results, but overall the logic of #walk is a lot clearer to me now.