
Prevent Logging Sensitive Data with Rails Parameter Filters

This article explains why you shouldn't log confidential or user-identifiable information and how to filter it using parameter filtering in Rails. We'll also do a deep dive into the Rails source code to learn exactly how Rails implements parameter filters.

He (Paul Allen) and Bill (Gates) would go “dumpster diving” in C-Cubed’s garbage to find discarded printouts with source code for the machine’s operating system — an area Paul saw as the “Holy Grail” of software, which he knew he could use to unlock the machine’s potential. - Futurist

Most applications have sensitive data they'd like to keep out of reach from prying eyes. It can be something as obvious as passwords and credit card numbers, or something less obvious but equally important, like Stripe API keys.

Exposing this information in log files is a very common way to leak sensitive data accidentally. Developers often like to log everything for debugging or troubleshooting purposes. Additionally, frameworks like Ruby on Rails log a lot of information at various stages of the request lifecycle. If you're not careful, it's very easy to leak any sensitive data in the log files.

Here's an example where the application logs out the user's password along with other information after they complete and submit the user registration form:

Exposing Password in Log

If some sneaky intruder gets their hands on your log files, all they gotta do is sift through the logs, and bam! Your application's and your users' private information is at risk. To prevent this, Rails lets you filter certain sensitive information from leaking into the logs. This article shows you how.


Why You Shouldn't Log Sensitive Stuff

💡
It's very easy to accidentally leak confidential information through logs.

Imagine a user is making a payment on your web application. Once they enter their credit card details and hit submit, the website sends all this information in plain text to your server.

Upon receiving this request, the application logs information about this request, such as the user's IP address, user id, name, form data such as payment information, and much more.

These log files are handy when debugging the application, but in production, you may not want every bit of information to be stored in the log file.

Here are three reasons not to log sensitive information.

  1. It makes an attacker's life easy. Since most logs are stored in plain text files for readability, if your servers are compromised and an attacker gets access to the log files, they can just search the file for anything they need.
  2. Data Privacy & Curious Employees. It doesn't have to be hackers. It could just be a curious employee with access to the logs who might be interested in looking into customers' sensitive data. Even if the employee may not have bad intentions, it can land you in a data-privacy hot zone for exposing user-identifiable personal information.
  3. It leaks data to third-party apps. Many applications use error tracking or performance monitoring systems such as AppSignal, New Relic, or Sentry. When a request hits your application, these tools can capture the user-submitted data and send it to their servers. This data could contain user-identifiable information.

For these reasons, you should prevent sensitive information from leaking into the log files.

Here are some possible solutions to address this problem.

  1. Manual Code Review: A basic, but tedious way is to review your codebase to find all the places where you're logging information and filter it to log only essential information, stripping out everything else.
  2. Automate: A better solution is to make your logger check for the presence of specific sensitive attributes on the data and replace them with `***`. This is better, but it's still a lot of work.
  3. Configure: The best solution is to build the filtering once and drive it from configuration. That way, you list the sensitive attributes in one location (the configuration file) and don't have to worry about them at every logging call.
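To make the second approach concrete, here is a minimal plain-Ruby sketch of a helper that masks sensitive keys before anything is logged. It is only an illustration, not how Rails does it: the `SENSITIVE_KEYS` list and the `mask_sensitive` helper are hypothetical names invented for this example.

```ruby
require "logger"
require "stringio"

# Hypothetical list of keys whose values must never reach the log.
SENSITIVE_KEYS = %w[password credit_card token].freeze

# Returns a copy of +data+ with sensitive values replaced by "***".
# Nested hashes are masked recursively.
def mask_sensitive(data)
  data.each_with_object({}) do |(key, value), masked|
    masked[key] =
      if SENSITIVE_KEYS.include?(key.to_s)
        "***"
      elsif value.is_a?(Hash)
        mask_sensitive(value)
      else
        value
      end
  end
end

# Log to an in-memory buffer so the example is easy to inspect.
buffer = StringIO.new
logger = Logger.new(buffer)

params = { "name" => "Akshay", "password" => "hunter2", "card" => { "token" => "tok_123" } }
logger.info("Parameters: #{mask_sensitive(params)}")

puts buffer.string
```

Every call site still has to remember to call the helper, which is exactly the work the configuration-based approach removes.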

If you're using Ruby on Rails, you're in luck, because that's what Rails does for you via parameter filtering.

Let's understand how it works.

How to Filter Logs in Rails with Parameter Filter

Ruby on Rails makes it very easy to filter sensitive information from log files by providing a central location where you can configure this information.

For example, to hide parameters such as password, secret, and token, add the following to the filter_parameter_logging initializer under the config/initializers directory.

# config/initializers/filter_parameter_logging.rb

Rails.application.config.filter_parameters += [
  :password, :secret, :token, :_key, :crypt, :salt, :certificate, :otp, :ssn
]

This tells Rails to replace the values of the listed parameters with the mask [FILTERED] wherever they appear in the logs, so the real values are never written.
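One detail worth knowing: symbol and string filters are matched as case-insensitive substrings of the parameter name, so `:_key` also covers names like `api_key`. The following plain-Ruby sketch imitates that matching rule without loading Rails. The `filtered?` helper and the sample key names are hypothetical, chosen only for illustration.

```ruby
# Filters as they might appear in the initializer.
FILTERS = [:password, :secret, :token, :_key, :ssn].freeze

# Turn each filter into a case-insensitive regexp that matches anywhere in the key.
PATTERNS = FILTERS.map { |f| Regexp.new(Regexp.escape(f.to_s), Regexp::IGNORECASE) }.freeze

# Hypothetical helper: true when a parameter name would be masked.
def filtered?(key)
  PATTERNS.any? { |pattern| pattern.match?(key.to_s) }
end

%w[password password_confirmation api_key Authenticity_Token name email].each do |key|
  puts "#{key}: #{filtered?(key) ? '[FILTERED]' : 'logged as-is'}"
end
```

The flip side is that a broad filter can hide more than you intended; `:token`, for instance, would also mask a parameter named `tokenizer`.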

Let's reconsider our previous example. This time, when the user submits the registration form, Rails will still log the user's id and name, but it will replace the value of the password parameter with [FILTERED], so the real value won't appear in the logs.

Here's what the logs look like after we use parameter filtering to filter out the password and authentication tokens. Notice that the actual values are replaced with the [FILTERED] mask.

Parameter Filtering in Rails

Selectively Filter Attributes with the Same Name

Here's a tricky situation. Let's say you have an attribute named number for both Post and CreditCard models. The Post number is public and need not be hidden, but the Credit Card number is very sensitive information you want to hide.

Simply adding number to the filter_parameters array will hide all numbers, which is not what you want. You still want to access Post numbers for debugging.

You can solve this problem by scoping the number attribute to the CreditCard model.

config.filter_parameters += [ "credit_card.number" ]

Now Rails will only filter the credit card numbers without affecting the post numbers.
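To see why the dotted form leaves post numbers alone, here is a rough plain-Ruby sketch of the idea: build the dotted path of each nested key and compare it against the filter. The `filter_by_path` helper is a hypothetical name for this example and is far simpler than the real implementation.

```ruby
MASK = "[FILTERED]"

# Hypothetical helper: masks values whose dotted path (e.g. "credit_card.number")
# contains one of +deep_filters+. Returns a new hash; the input is not modified.
def filter_by_path(params, deep_filters, parents = [])
  params.each_with_object({}) do |(key, value), result|
    path = (parents + [key.to_s]).join(".")
    result[key] =
      if deep_filters.any? { |filter| path.include?(filter) }
        MASK
      elsif value.is_a?(Hash)
        filter_by_path(value, deep_filters, parents + [key.to_s])
      else
        value
      end
  end
end

params = {
  "post"        => { "number" => 42 },
  "credit_card" => { "number" => "4242 4242 4242 4242" }
}

filtered = filter_by_path(params, ["credit_card.number"])
puts filtered.inspect
```

Only the value reached through `credit_card` and then `number` is masked; the `number` under `post` has the path `post.number`, which does not match.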

Additionally, you can add a lambda to the filter_parameters array, giving you more control and flexibility while filtering. Rails calls it with each key and value, and you can modify the value in place. The following example retains the credit card number format while replacing the digits with '*'.

# config/initializers/filter_parameter_logging.rb

# Wrap the lambda in an array: `+=` on an array raises a TypeError for a bare lambda.
Rails.application.config.filter_parameters += [
  ->(key, value) { value.gsub!(/\d/, "*") if /credit_card/.match?(key) }
]

Pretty cool, right? But wait, it gets even better!

Filter Attributes on ActiveRecord Models

You can use the filter_attributes method on your ActiveRecord models to tell Rails which columns should not be exposed.

class User < ApplicationRecord
  self.filter_attributes += [ :secret, :token ]
end

Now, when you inspect a user, Rails will hide the values of the above attributes:

> User.first
#=> #<User id: 1, name: "Akshay", secret: [FILTERED], token: [FILTERED]>

If you're curious about how Rails implements this feature, keep reading.

How Rails Implements Parameter Filtering

Before reading this, I encourage you to check my previous article, A Brief Introduction to Rails Initializers. Understanding how initializers work in general will help you better understand the filter_parameter_logging initializer.

A Brief Introduction to Rails Initializers: Why, What, and How
At first glance, Rails initializers seem complex, but they’re solving a simple, but important problem: run some code after framework and gems are loaded, to initialize the application. This post covers the basics of initializers, including what they are, how they work, and how Rails implements them.

Let's revisit our code to define sensitive attributes in the configuration. We added the following code in the filter_parameter_logging initializer under the config/initializers directory.

config.filter_parameters += [ "password" ]

The ActiveRecord::Railtie class (a fancy word for a class that initializes ActiveRecord) grabs the filter_parameters array from the config, and pushes it to the filter_attributes array.

# activerecord/lib/active_record/railtie.rb

module ActiveRecord
  class Railtie < Rails::Railtie
    initializer "active_record.set_filter_attributes" do
      ActiveSupport.on_load(:active_record) do
        self.filter_attributes += Rails.application.config.filter_parameters
      end
    end
  end
end

The filter_attributes array is defined in the ActiveRecord::Core module, which is included in ActiveRecord::Base. That class is the superclass of ApplicationRecord, from which all your Active Record models inherit.

# activerecord/lib/active_record/core.rb

module ActiveRecord
  module Core
    extend ActiveSupport::Concern
    
    included do
      self.filter_attributes = []
    end
  end
end

The ActiveRecord::Core module is a concern and adds the filter_attributes array on the ActiveRecord::Base class.
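If the `included do ... end` block looks like magic, the same effect can be approximated in plain Ruby with the `self.included` hook. This is a simplified stand-in rather than the real ActiveSupport::Concern machinery, and `Filterable` and `Record` are made-up names for this sketch.

```ruby
# Simplified stand-in for a concern with an `included do ... end` block.
module Filterable
  # Ruby calls this hook whenever the module is included into a class.
  def self.included(base)
    base.extend(ClassMethods)
    base.filter_attributes = [] # runs once, at include time
  end

  module ClassMethods
    # Stores the list in a class-level instance variable on the including class.
    attr_accessor :filter_attributes
  end
end

class Record
  include Filterable
end

Record.filter_attributes += [:password]
puts Record.filter_attributes.inspect # [:password]
```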

Don't really understand concerns? Check out this article to learn more about concerns and how they work.

Concerns in Rails: Everything You Need to Know
Concerns are an important concept in Rails that can be confusing to understand for those new to Rails as well as seasoned practitioners. This post explains why we need concerns, how they work, and how to use them to simplify your code.

Okay, so far we've seen how Rails added the filter_attributes array to your models' superclass. Now let's inspect how Rails uses this filter_attributes array for filtering sensitive information.

All the methods dealing with this array are included under the ClassMethods module. Hence they become class methods on the ActiveRecord::Base class.

module ActiveRecord
  module Core
    extend ActiveSupport::Concern

    included do
      self.filter_attributes = []
    end
    
    module ClassMethods
      def filter_attributes 
      end

      def filter_attributes=(filter_attributes)
      end

      def inspection_filter
      end
    end
  end
end

Let's understand how they work in two simple steps:

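Step 1: Set filter_attributes, which lists the columns that shouldn't be exposed. The setter stores the list in the class-level instance variable @filter_attributes and clears any previously built @inspection_filter, so the filter is rebuilt with the new list.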

def filter_attributes=(filter_attributes)
  @inspection_filter = nil
  @filter_attributes = filter_attributes
end

Step 2: The inspection_filter method creates an instance of the ActiveSupport::ParameterFilter class, which is responsible for filtering the parameters. The mask is set to [FILTERED] by default.

Here's the simplified implementation of this method.

def inspection_filter
  @inspection_filter ||= begin
    mask = InspectionMask.new(ActiveSupport::ParameterFilter::FILTERED)
    ActiveSupport::ParameterFilter.new(@filter_attributes, mask: mask)
  end
end
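The interplay between the two steps, building the filter lazily and throwing it away whenever the list changes, can be tried out in isolation. The sketch below uses a hypothetical `FilterStub` struct in place of ActiveSupport::ParameterFilter so it runs without Rails; `Model` is likewise a made-up class.

```ruby
# Stand-in for the real filter object; it only remembers the filters it was built with.
FilterStub = Struct.new(:filters)

class Model
  class << self
    attr_reader :filter_attributes

    # Assigning a new list discards the cached filter object.
    def filter_attributes=(filters)
      @inspection_filter = nil
      @filter_attributes = filters
    end

    # Build the filter on first use and cache it for later calls.
    def inspection_filter
      @inspection_filter ||= FilterStub.new(@filter_attributes)
    end
  end
end

Model.filter_attributes = [:password]
first = Model.inspection_filter
same  = Model.inspection_filter

Model.filter_attributes += [:token] # calls the setter, which resets the cache
rebuilt = Model.inspection_filter

puts first.equal?(same)      # true: the cached object is reused
puts first.equal?(rebuilt)   # false: the setter invalidated the cache
puts rebuilt.filters.inspect # [:password, :token]
```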

Whenever Rails needs to filter sensitive information, it uses the inspection_filter method. There are only three places in the entire Rails codebase where inspection_filter is used.

Here's the highly simplified version of these methods.

# activerecord/lib/active_record/core.rb
def pretty_print(pp)
  value = inspection_filter.filter_param(attr_name, value)
end

# activerecord/lib/active_record/log_subscriber.rb
def filter(name, value)
  ActiveRecord::Base.inspection_filter.filter_param(name, value)
end

# activerecord/lib/active_record/attribute_methods.rb
def format_for_inspect(name, value)
  inspection_filter.filter_param(name, value)
end

Whenever you try to pretty print, log, or inspect an ActiveRecord model, Rails will use the inspection_filter to hide the value of the sensitive attributes.
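The effect on inspect can be imitated with a plain Ruby object. The sketch below is not an Active Record model; `FakeUser` and its hard-coded `FILTERED_ATTRIBUTES` list are assumptions made purely for illustration.

```ruby
# Plain-Ruby stand-in for a model; not an ActiveRecord class.
class FakeUser
  FILTERED_ATTRIBUTES = %i[secret token].freeze

  def initialize(attributes)
    @attributes = attributes
  end

  # Build an inspect string, masking any attribute listed above.
  def inspect
    pairs = @attributes.map do |name, value|
      shown = FILTERED_ATTRIBUTES.include?(name) ? "[FILTERED]" : value.inspect
      "#{name}: #{shown}"
    end
    "#<#{self.class.name} #{pairs.join(', ')}>"
  end
end

user = FakeUser.new(id: 1, name: "Akshay", secret: "s3cr3t", token: "abc123")
puts user.inspect
# prints: #<FakeUser id: 1, name: "Akshay", secret: [FILTERED], token: [FILTERED]>
```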

We're not done yet. So far, we've been looking at the infrastructure code, i.e., how Rails defines and uses the filtered array of attributes. Now let's try to understand the core of the solution, the ActiveSupport::ParameterFilter class.

How ParameterFilter Class Works

The ActiveSupport::ParameterFilter class replaces the values of specified keys in a hash-like object when a key matches one of the predefined filters.

💡
Matching based on nested keys is possible by using dot notation, e.g. "credit_card.number".

To work with it, initialize an instance of this class with an array of keys and then call either filter(params) or filter_param(key, value) on this instance. These methods replace the sensitive values with a mask, which is [FILTERED] by default.

  • The filter(params) method replaces all values corresponding to filters with the mask.
  • The filter_param(key, value) method replaces the value with the mask if the key matches one of the filters provided during initialization.

Here's the external API of the ParameterFilter class.

module ActiveSupport
  class ParameterFilter
    # default mask
    FILTERED = "[FILTERED]"
    
    def initialize(filters = [], mask: FILTERED)
    end
    
    # replace all values corresponding to filters with the mask
    def filter(params)
    end
    
    # replace the value with mask if key belongs to filters 
    def filter_param(key, value)
    end
  end
end

The following test shows how it works:

test "parameter filter" do
  parameter_filter = ActiveSupport::ParameterFilter.new(['foo'])
    
  assert_equal ({ "foo" => "[FILTERED]", "bar" => "baz" }), parameter_filter.filter({ "foo" => "bar", "bar" => "baz" })
  assert_equal "[FILTERED]", parameter_filter.filter_param("foo", "bar")
end

You might be surprised to find that the ParameterFilter class can also accept a lambda as one of the filters. During filtering, the lambda is called with each key-value pair, and you can modify the value in place.

This example replaces the credit card numbers while preserving their format.

test "parameter filter with block" do
  filter_params = []
  filter_params << ->(key, value) { value.gsub!(/\d/, "*") if /credit_card/.match?(key) }
    
  parameter_filter = ActiveSupport::ParameterFilter.new(filter_params)

  assert_equal ({ "credit_card" => "**** ****" }), parameter_filter.filter({ "credit_card" => "9999 9999" })
  assert_equal "*****", parameter_filter.filter_param("credit_card", "12345")
end

Internally, the ParameterFilter class uses the CompiledFilter class to filter the values with the mask. It accepts the filters and a mask to build a list of regular expressions that replace the values corresponding to filters with the provided mask.
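To get a feel for what such a compiled filter has to do, here is a toy re-implementation in plain Ruby. It is a sketch under simplifying assumptions, not the actual Rails code: `TinyParameterFilter` is a made-up name, and it ignores arrays and dotted keys that the real class handles.

```ruby
# Toy re-implementation for illustration only; the real class handles more cases.
class TinyParameterFilter
  FILTERED = "[FILTERED]"

  def initialize(filters = [], mask: FILTERED)
    @mask = mask
    # Callable filters are kept as-is; everything else becomes a regexp.
    @blocks, others = filters.partition { |f| f.respond_to?(:call) }
    @regexps = others.map do |f|
      f.is_a?(Regexp) ? f : Regexp.new(Regexp.escape(f.to_s), Regexp::IGNORECASE)
    end
  end

  # Returns a filtered copy of +params+.
  def filter(params)
    params.to_h { |key, value| [key, filter_param(key, value)] }
  end

  # Returns the mask for matching keys, recurses into nested hashes,
  # and otherwise lets each lambda adjust a copy of the value.
  def filter_param(key, value)
    return @mask if @regexps.any? { |regexp| regexp.match?(key.to_s) }
    return filter(value) if value.is_a?(Hash)

    copy = value.dup # lambdas mutate a copy, never the caller's object
    @blocks.each { |block| block.call(key.to_s, copy) }
    copy
  end
end

filter = TinyParameterFilter.new([
  :password,
  ->(key, value) { value.gsub!(/\d/, "*") if /credit_card/.match?(key) }
])

params = { "name" => "Akshay", "password" => "hunter2", "credit_card" => "9999 9999" }
result = filter.filter(params)
puts result.inspect
```

Splitting the filters up front means the per-key work is just a few regexp checks, followed by the lambdas only when no regexp matched.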

And that's how Rails implements the parameter filtering feature to hide sensitive information. If you haven't used it, give it a try!

Summary

  • You should prevent sensitive, user-identifiable information from leaking into the log files. The log files are typically stored in plain text, and it's very easy to reveal confidential data accidentally.
  • The parameter filtering feature in Rails allows you to filter sensitive information from the logs.

If you'd like to read similar deep-dives into the Rails codebase, check out the Rails Internals tag on the blog. I love reading the Rails source code and frequently write new articles that show how Rails implements a certain feature.

Rails Internals - Write Software, Well
Gain a better and deeper understanding of the Ruby on Rails framework by exploring how it works behind the scenes. Each post in this series takes a feature in Rails and shows how it’s implemented behind the scenes.

That's a wrap. I hope you found this article helpful and you learned something new.

As always, if you have any questions or feedback, didn't understand something, or found a mistake, please leave a comment below or send me an email. I reply to all emails I get from developers, and I look forward to hearing from you.

If you'd like to receive future articles directly in your email, please subscribe to my blog. If you're already a subscriber, thank you.