Dropping elements left and right: fun with ruby arrays

Mikhail Topolskiy April 14, 2016

#

Let’s say we have an array of hashes, that represents a time series:

DATA_POINTS = [
  { date: '2016-01-01', value: nil },
  { date: '2016-02-01', value: nil },
  { date: '2016-03-01', value: 100 },
  { date: '2016-04-01', value: nil },
  { date: '2016-05-01', value: 300 },
  { date: '2016-06-01', value: nil },
  { date: '2016-07-01', value: nil }
]

What we want to do with it is to remove all elements with a nil value from the array’s beginning and end, similar to what String#strip does for strings.

Ruby arrays have a bunch of neat methods, one of them is Array#drop_while, it drops elements from the beginning of an array until the passed in block evaluates to false or nil. Let’s try it on our array:

DATA_POINTS.drop_while {|element| element[:value].nil?}
=> [
      { date: '2016-03-01', value: 100 },
      { date: '2016-04-01', value: nil },
      { date: '2016-05-01', value: 300 },
      { date: '2016-06-01', value: nil },
      { date: '2016-07-01', value: nil }
   ]

That’s half the work done! Now we only need to somehow remove the empty elements at the end of the array. The simplest thing that comes to mind is reversing the array and applying drop_while again:

no_value = proc {|element| element[:value].nil?}
DATA_POINTS.drop_while(&no_value).reverse.drop_while(&no_value).reverse
=> [
      { date: '2016-03-01', value: 100 },
      { date: '2016-04-01', value: nil },
      { date: '2016-05-01', value: 300 }
   ]

That gets us to where we want to be, but the implementation feels a bit ugly (and reversing the array twice does not feel very optimal).

Warning: What follows is not something you would usually use in production code, but we’re just having fun, right? 😃

In JS-land lodash has dropRightWhile, which drops elements from the end of an array. We can implement it in ruby using refinements:

module ArrayExtensions
  refine Array do
    def drop_right_while
      self.reverse.drop_while { |el| yield(el) }.reverse
    end
  end
end

Now that we can drop elements from the end of an array we can finally implement what we initially wanted – a method to drop elements from both sides, let’s call it Array#strip:

module ArrayExtensions
  refine Array do
    def strip
      self.drop_while { |el| yield(el) }.drop_right_while { |el| yield(el) }
    end
  end
end

And here is how we would use it:

class MyClass
  using ArrayExtensions

  def strip_data_points
    DATA_POINTS.strip {|element| element[:value].nil?}
  end
end

MyClass.new.strip_data_points
=> [
      { date: '2016-03-01', value: 100 },
      { date: '2016-04-01', value: nil },
      { date: '2016-05-01', value: 300 }
   ]

Getting rid of reverse

Our drop_right_while implementation still seems a bit inefficient with all the reverses. Let’s see if we can do better! The most straightforward implementation seems like a more optimal solution: we iterate from the end of the array while the block evaluates to a truthy value and then return the array without unwanted elements:

module ArrayExtensions
  refine Array do
    def drop_right_while
      elements_to_drop = 0
      for index in (length - 1).downto(0)
        if yield(self[index])
          elements_to_drop += 1
        else
          break
        end
      end
      last_index = length - elements_to_drop - 1

      self[0..last_index]
    end
  end
end

This is a bunch more code, but it should be faster, right? Time to benchmark!
Using the awesome benchmark-ips gem we compare the one-liner (drop_right_while_slow) with our new implementation (drop_right_while_fast). We get the following results for an array of 10 000 elements, 20% of which need to be dropped (the full benchmark script can be found here):

Calculating -------------------------------------
drop_right_while_slow  355.000  i/100ms
drop_right_while_fast  352.000  i/100ms
-------------------------------------------------
drop_right_while_slow  3.721k (± 4.4%) i/s -     18.815k
drop_right_while_fast  3.526k (± 2.4%) i/s -     17.952k

Comparison:
drop_right_while_slow:     3720.9 i/s
drop_right_while_fast:     3526.4 i/s - 1.06x slower

Wait.. what? Our “fast” version is 6% slower?

That can’t be right! So we do a bunch more benchmarks (we measure iterations per second for the percentage of elements we need to drop, more is better):

It turns out that the “fast” version is only faster if an array has less than ~15% of empty elements at the end! This is not that surprising when you think about it, both reverse and drop_while are implemented in C and our “fast” version of drop_right_while is pure Ruby.

This is a good reminder that you should not do any optimisation work without testing your assumptions. Always be benchmarking! 😉

Subscribe to the Visible Newsletter

Every week we share the best resources we've come across & what's new with Visible!