Dropping elements left and right: fun with ruby arrays
Let’s say we have an array of hashes, that represents a time series:
DATA_POINTS = [
{ date: '2016-01-01', value: nil },
{ date: '2016-02-01', value: nil },
{ date: '2016-03-01', value: 100 },
{ date: '2016-04-01', value: nil },
{ date: '2016-05-01', value: 300 },
{ date: '2016-06-01', value: nil },
{ date: '2016-07-01', value: nil }
]
What we want to do with it is to remove all elements with a nil
value from the array’s beginning and end, similar to what String#strip
does for strings.
Ruby arrays have a bunch of neat methods, one of them is Array#drop_while, it drops elements from the beginning of an array until the passed in block evaluates to false
or nil
. Let’s try it on our array:
DATA_POINTS.drop_while {|element| element[:value].nil?}
=> [
{ date: '2016-03-01', value: 100 },
{ date: '2016-04-01', value: nil },
{ date: '2016-05-01', value: 300 },
{ date: '2016-06-01', value: nil },
{ date: '2016-07-01', value: nil }
]
That’s half the work done! Now we only need to somehow remove the empty elements at the end of the array. The simplest thing that comes to mind is reversing the array and applying drop_while
again:
no_value = proc {|element| element[:value].nil?}
DATA_POINTS.drop_while(&no_value).reverse.drop_while(&no_value).reverse
=> [
{ date: '2016-03-01', value: 100 },
{ date: '2016-04-01', value: nil },
{ date: '2016-05-01', value: 300 }
]
That gets us to where we want to be, but the implementation feels a bit ugly (and reversing the array twice does not feel very optimal).
Warning: What follows is not something you would usually use in production code, but we’re just having fun, right? ?
In JS-land lodash has dropRightWhile, which drops elements from the end of an array. We can implement it in ruby using refinements:
module ArrayExtensions
refine Array do
def drop_right_while
self.reverse.drop_while { |el| yield(el) }.reverse
end
end
end
Now that we can drop elements from the end of an array we can finally implement what we initially wanted – a method to drop elements from both sides, let’s call it Array#strip
:
module ArrayExtensions
refine Array do
def strip
self.drop_while { |el| yield(el) }.drop_right_while { |el| yield(el) }
end
end
end
And here is how we would use it:
class MyClass
using ArrayExtensions
def strip_data_points
DATA_POINTS.strip {|element| element[:value].nil?}
end
end
MyClass.new.strip_data_points
=> [
{ date: '2016-03-01', value: 100 },
{ date: '2016-04-01', value: nil },
{ date: '2016-05-01', value: 300 }
]
Getting rid of reverse
Our drop_right_while
implementation still seems a bit inefficient with all the reverse
s. Let’s see if we can do better! The most straightforward implementation seems like a more optimal solution: we iterate from the end of the array while the block evaluates to a truthy value and then return the array without unwanted elements:
module ArrayExtensions
refine Array do
def drop_right_while
elements_to_drop = 0
for index in (length - 1).downto(0)
if yield(self[index])
elements_to_drop += 1
else
break
end
end
last_index = length - elements_to_drop - 1
self[0..last_index]
end
end
end
This is a bunch more code, but it should be faster, right? Time to benchmark!
Using the awesome benchmark-ips gem we compare the one-liner (drop_right_while_slow
) with our new implementation (drop_right_while_fast
). We get the following results for an array of 10 000 elements, 20% of which need to be dropped (the full benchmark script can be found here):
Calculating -------------------------------------
drop_right_while_slow 355.000 i/100ms
drop_right_while_fast 352.000 i/100ms
-------------------------------------------------
drop_right_while_slow 3.721k (± 4.4%) i/s - 18.815k
drop_right_while_fast 3.526k (± 2.4%) i/s - 17.952k
Comparison:
drop_right_while_slow: 3720.9 i/s
drop_right_while_fast: 3526.4 i/s - 1.06x slower
Wait.. what? Our “fast” version is 6% slower?
That can’t be right! So we do a bunch more benchmarks (we measure iterations per second for the percentage of elements we need to drop, more is better):
It turns out that the “fast” version is only faster if an array has less than ~15% of empty elements at the end! This is not that surprising when you think about it, both reverse
and drop_while
are implemented in C and our “fast” version of drop_right_while
is pure Ruby.
This is a good reminder that you should not do any optimisation work without testing your assumptions. Always be benchmarking! 😉