Basic Object

Deep thoughts on web programming

Frozen Strings, Symbols, and Garbage Collection in Ruby

I was perusing the paperclip gem’s source code when I came upon the following line (lib/paperclip/interpolations.rb, line 178):

("%09d".freeze % id).scan(/\d{3}/).join("/".freeze)

It’s a useful bit of code for saving attachments to a filesystem, but what struck me immediately was the preponderance of #freeze calls. What the hell is #freeze anyway?

Well, apparently #freeze is relatively common in mature Ruby code. It’s a method of the Object class, and “it prevents further modifications” of an object.
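To make that concrete, here is a minimal sketch of what “prevents further modifications” looks like in practice. The variable names and the extra Jedi are just illustrations, and String.new is used only to guarantee the strings start out unfrozen.

```ruby
name = String.new("Kenobi")    # String.new always returns an unfrozen string
frozen_before = name.frozen?   # => false
name.freeze
frozen_after = name.frozen?    # => true

# Mutating a frozen object raises. FrozenError (Ruby 2.5+) is a subclass of
# RuntimeError, so rescuing RuntimeError also covers older Rubies.
error = begin
  name << " the Wise"
rescue RuntimeError => e
  e
end

# Freezing is shallow: a frozen array still holds mutable elements.
masters = [String.new("Yoda"), String.new("Windu")].freeze
masters.first << "!"           # allowed, the element itself isn't frozen
masters.first                  # => "Yoda!"
```

Note the last part: #freeze protects the object you call it on, not the objects it refers to.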

There are a couple of reasons you might want to use it.

1. Making an Object Immutable

Ruby constants are really just variables that you shouldn’t change. Ruby will warn you if you change a constant, but it won’t raise an exception. So, if you really want to have Ruby enforce the constancy of a constant, you can freeze that object.

JEDI_MASTER = "Kenobi".freeze
JEDI_MASTER.prepend("Rey ") # => FrozenError in Ruby 2.5+ (RuntimeError before that): can't modify frozen String

Remember that freezing an object does not prevent the variable (or constant) that refers to it from being reassigned to another object. The following code, for example, runs without raising an exception; Ruby only prints an “already initialized constant” warning:

JEDI_MASTER = "Kenobi".freeze
JEDI_MASTER = "Rey Kenobi"

You can extend this to create a whole class of objects that are constant from birth: just call #freeze at the end of the class’s initialize method. This might come in handy if you’re trying to write Ruby in a functional style, or if you need to prevent a programmatically important string (such as a hash key) from being modified by some other code.
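As a rough sketch of the “constant from birth” idea, the class below freezes itself at the end of initialize. The Jedi class, its attributes, and the promote method are all hypothetical names made up for this example.

```ruby
class Jedi
  attr_reader :name, :rank

  def initialize(name, rank)
    # Freeze private copies so the caller's strings are left untouched.
    @name = name.dup.freeze
    @rank = rank.dup.freeze
    freeze # after this, no instance variable can be reassigned
  end

  # "Changing" an immutable object means building a new one.
  def promote(new_rank)
    Jedi.new(name, new_rank)
  end
end

padawan = Jedi.new("Rey", "Padawan")
knight  = padawan.promote("Knight")

padawan.frozen? # => true
padawan.rank    # => "Padawan"
knight.rank     # => "Knight"
```

Because freezing is shallow, the instance variables are frozen individually here; calling freeze on the object alone would still leave the strings it holds mutable.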

2. Performance

As great as this is, this is not why the paperclip developers used #freeze twice in that line at the beginning of this post.

They used it to optimize for performance. Because a frozen object can’t be modified, Ruby can safely share one copy of it: as of Ruby 2.1, a frozen string literal is only instantiated a single time and then retained to be reused in the future [1]. Symbols, fixnums, bignums, and floats are all frozen by default. Strings, however, aren’t, and it’s pretty common for the same string literal - such as the format string in the above example - to be evaluated many, many times in an application.
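You can check the “frozen by default” claim yourself with a quick sketch like this one. The sample values are arbitrary; note that Fixnum and Bignum were merged into Integer in Ruby 2.4, so on a modern Ruby the two integers below share a class.

```ruby
values = [:jedi, 42, 2**100, 3.14]          # symbol, small integer, big integer, float
frozen_flags = values.map(&:frozen?)         # => [true, true, true, true]

same_symbol = :jedi.equal?(:jedi)            # => true, one object per symbol

fresh_string = String.new("jedi")            # explicitly built strings start unfrozen
string_frozen = fresh_string.frozen?         # => false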

Therefore, freezing a string literal that you don’t ever intend to modify (such as "/" and the format string "%09d" in the example above) in a method that gets called multiple times can result in substantial memory savings and faster performance. Ruby will allocate those string objects the first time the method is called and then just reference those existing string objects on every subsequent call.
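Here is a small sketch of that effect. Only the expression inside id_partition comes from the paperclip line quoted above; the method names are hypothetical, and the comment on plain_separator assumes frozen string literals have not been switched on for the file.

```ruby
# Turns a numeric id into a three-level directory path.
def id_partition(id)
  ("%09d".freeze % id).scan(/\d{3}/).join("/".freeze)
end

def frozen_separator
  "/".freeze # Ruby 2.1+ hands back the same String object on every call
end

def plain_separator
  "/" # normally a brand-new String object on every call
end

path = id_partition(1234)                                    # => "000/001/234"
shared = frozen_separator.equal?(frozen_separator)           # => true
distinct = !plain_separator.equal?(plain_separator)          # usually true
```

In a method that runs once per attachment, per request, the unfrozen version creates garbage on every call, while the frozen version keeps pointing at the same two objects.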

An Immutable Future

For these and other reasons, strings are immutable in several other modern languages, such as Python, Java, C#, and Go, and Ruby has been moving in the direction of freezing strings, too. Starting with Ruby 2.2, strings used as hash keys are frozen by default. Freezing all string literals by default was planned for Ruby 3.0, but Ruby 3.0 shipped without that change, so it remains opt-in. You can see what life is like with frozen string literals right now with the frozen_string_literal pragma (a "# frozen_string_literal: true" comment at the top of a file), which is available in Ruby 2.3 or later.
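A minimal sketch of a file using the pragma might look like the following. The method names are made up for illustration, and the behavior described in the comments assumes the magic comment sits in the leading comment section of the file, which is where Ruby looks for it.

```ruby
# frozen_string_literal: true

# With the magic comment active, every string literal in this file is frozen.
def mutate_literal
  title = "Jedi"
  title << " Master" # raises FrozenError (RuntimeError before Ruby 2.5)
rescue RuntimeError => e
  "rejected: #{e.class}"
end

# When you really do need a mutable string, ask for an unfrozen copy.
def mutable_copy
  title = +"Jedi" # unary plus returns an unfrozen string
  title << " Master"
  title
end

mutable_copy # => "Jedi Master"
```

String#+@ (the unary plus) and String#dup are the usual escape hatches when a file opts in to frozen literals but one particular string has to be built up in place.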

With frozen strings, one can take advantage of the wide array of methods available to String objects without incurring the performance penalty of instantiating loads of identical objects all the time. You can also make use of external libraries, in the form of gems, middleware, or whatever, that might force you to use strings in certain places where you might normally prefer symbols. Symbols, for example, are usually used for hash keys, but some software might expect strings as hash keys. Frozen strings bring many of the benefits of using symbols while maintaining compatibility.
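JSON parsing is one everyday case of software that hands you string keys. The sketch below uses Ruby's bundled json library; the payload and variable names are invented for the example.

```ruby
require "json"

name_key = "name".freeze # one shared key object instead of a new string per lookup

payload = JSON.parse('{"name": "Rey", "rank": "Padawan"}')
by_string = payload[name_key] # => "Rey"
by_symbol = payload[:name]    # => nil, the parsed keys are strings, not symbols

# Hash#[]= stores its own frozen copy of an unfrozen string key.
key = String.new("rank")
lookup = {}
lookup[key] = "Knight"
stored_key = lookup.keys.first
stored_key.frozen?     # => true
stored_key.equal?(key) # => false, the hash kept a copy
key.frozen?            # => false, the original is untouched
```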

Frozen strings are also a good choice for creating objects based on user input. As Richard Schneeman points out in his excellent post about string keys in Ruby 2.0, creating symbols based on user input can lead to a particular sort of denial of service attack and other potential security problems. The JSON gem suffered from this vulnerability before Ruby 2.0.0, and Mr. Schneeman has some other examples of this vulnerability in his post about symbol garbage collection in Ruby 2.2. Ruby now garbage collects dynamically generated symbols, so this vulnerability isn’t an issue in the latest versions of Ruby, but older versions are common enough that this is still a real concern.
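As a sketch of the safer pattern, the hypothetical tally_tags method below keys a hash on user-supplied strings and never calls #to_sym on them. The input data is made up.

```ruby
# Counts tags submitted by users, keyed by normalized strings rather than symbols.
def tally_tags(raw_tags)
  counts = Hash.new(0)
  raw_tags.each do |tag|
    counts[tag.strip.downcase] += 1 # the hash stores a frozen copy of each new key
  end
  counts
end

counts = tally_tags(["Ruby ", "ruby", "GC"])
counts["ruby"]              # => 2
counts["gc"]                # => 1
counts.keys.all?(&:frozen?) # => true
```

No symbols are created from the input, so nothing depends on whether the running Ruby can garbage collect symbols.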

Even if Ruby eventually freezes string literals by default, you can get the benefit today by freezing them manually. Embrace it. It’s the low-hanging fruit of Ruby optimization. So, when you’ve written an application or a gem or anything in Ruby and you’re looking to speed it up or reduce its memory footprint, look to #freeze. [2]

  1. This is not strictly true in newer versions of Ruby - dynamically generated symbols (created using #to_sym) are garbage collected.

  2. Mr. Schneeman has written another pretty comprehensive post about #freeze and memory usage that focuses specifically on memory optimization and suggests several tools for profiling your Ruby code for memory usage.