While programming, it's easy to get by with a superficial understanding of a lot of things, and to fool yourself into thinking that you are programming when you are blindly copy-pasting Stack Overflow answers and/or using framework libraries.

I am guilty of this, too. Many times a day I come across a new concept while programming; that's what makes programming fun. At these moments, the temptation is to just use that nice framework method to get the job done, or to copy the answer from a Stack Overflow question. But that robs you of the chance to find a better implementation or to discover bugs, if there are any, in that SO answer. It also doesn't build your understanding of the concept. The next time you run into a similar problem, you have to repeat the process.

Base64 encoding was one of those topics that had been bugging me for a while. I often came across Base64 data, such as a Base64-encoded image or URL, and had no idea whatsoever what that meant, or why it was even used. Finally, I decided to do some research to fill that knowledge gap. Here's a brief summary of the what, why, and how of Base64 encoding.

Basically, Base64 encoding takes binary data and converts it into text, specifically ASCII text. The resulting text contains only the letters A-Z and a-z, the digits 0-9, and the symbols '+' and '/'. As there are 26 letters in the alphabet, we have 26 + 26 + 10 + 2 = 64 characters, hence the name Base64.
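
To make the alphabet concrete, here is a minimal sketch of it as a lookup string. The class name Base64Alphabet and the helper ToChar are names I made up for illustration; they are not part of the .NET framework.

```csharp
using System;

public static class Base64Alphabet
{
    // Standard Base64 alphabet:
    // index 0-25 = A-Z, 26-51 = a-z, 52-61 = 0-9, 62 = '+', 63 = '/'.
    public const string Chars =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZ" +
        "abcdefghijklmnopqrstuvwxyz" +
        "0123456789+/";

    // Maps a 6-bit value (0-63) to its Base64 character.
    public static char ToChar(int value)
    {
        if (value < 0 || value > 63)
            throw new ArgumentOutOfRangeException(
                nameof(value), "A Base64 digit holds a value from 0 to 63.");
        return Chars[value];
    }

    public static void Main()
    {
        Console.WriteLine(Chars.Length);  // 64
        Console.WriteLine(ToChar(0));     // A
        Console.WriteLine(ToChar(26));    // a
        Console.WriteLine(ToChar(63));    // /
    }
}
```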

As there are only 64 characters available to encode into, each one can be represented using only 6 bits, because 2^6 = 64. Every Base64 digit represents 6 bits of data. Now, there are 8 bits in a byte, and the least common multiple of 8 and 6 is 24. So 24 bits, or 3 bytes, can be represented using four 6-bit Base64 digits. For example, the text
My name is Akshay
can be encoded as a Base64 string 
TXkgbmFtZSBpcyBBa3NoYXk=
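
The 3-bytes-to-4-digits ratio also tells you how long the output will be. Here is a small sketch that checks this against the example above, using the framework's Convert.ToBase64String. The class Base64Length and the helper EncodedLength are hypothetical names of my own, and the formula assumes padded output, where a partial final group is still written as 4 characters.

```csharp
using System;
using System.Text;

public static class Base64Length
{
    // Every 3 input bytes become 4 Base64 characters. A partial final
    // group is still written as 4 characters (filled out with '=').
    public static int EncodedLength(int byteCount)
    {
        return 4 * ((byteCount + 2) / 3);
    }

    public static void Main()
    {
        byte[] bytes = Encoding.ASCII.GetBytes("My name is Akshay");
        string encoded = Convert.ToBase64String(bytes);

        Console.WriteLine(bytes.Length);                 // 17
        Console.WriteLine(EncodedLength(bytes.Length));  // 24
        Console.WriteLine(encoded);                      // TXkgbmFtZSBpcyBBa3NoYXk=
    }
}
```

The 17 input bytes form five full 3-byte groups plus two leftover bytes, which is why the result is 24 characters long and ends in a single '='.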
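
The next question is why we do this encoding at all. The most obvious use case is transmitting binary data over a channel that is designed to handle only text. Base64 can also be used to pass data in URLs when that data includes characters that are not URL-friendly. Note, though, that '+' and '/' have their own meaning in URLs, so a URL-safe variant of Base64 that uses '-' and '_' instead is normally used there.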
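
For the URL case, one common approach is to take the standard encoding and swap the two symbols that have special meaning in URLs: '+' becomes '-' and '/' becomes '_', with the trailing '=' padding dropped. The sketch below does that by hand on top of Convert.ToBase64String. The class Base64UrlSketch and its two methods are names I invented for illustration, not framework APIs.

```csharp
using System;

public static class Base64UrlSketch
{
    // Encodes with the standard alphabet, then swaps in the URL-safe
    // characters and strips the '=' padding.
    public static string ToBase64Url(byte[] bytes)
    {
        return Convert.ToBase64String(bytes)
            .Replace('+', '-')
            .Replace('/', '_')
            .TrimEnd('=');
    }

    // Reverses the swap and restores the padding before decoding.
    public static byte[] FromBase64Url(string text)
    {
        string s = text.Replace('-', '+').Replace('_', '/');
        switch (s.Length % 4)
        {
            case 2: s += "=="; break;
            case 3: s += "="; break;
        }
        return Convert.FromBase64String(s);
    }

    public static void Main()
    {
        byte[] data = { 0xFB, 0xFF };
        Console.WriteLine(Convert.ToBase64String(data));  // +/8=
        Console.WriteLine(ToBase64Url(data));             // -_8
    }
}
```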

That is all well and good, but how do you actually convert data to Base64? Here's a simple algorithm that converts some text into Base64:
  1. Convert the text to its binary representation. 
  2. Divide the bits into groups of 6 bits each.
  3. Convert each group to a decimal number from 0-63. It cannot be greater than 63 as there are only 6 bits in each group.
  4. Convert this decimal number to the equivalent Base64 character using the Base64 alphabet.
That's it. You have a Base64-encoded string. If the input length is not a multiple of 3 bytes, the final group is filled out with zero bits and the output is padded with '=' (two leftover bytes) or '==' (one leftover byte) so that its length is a multiple of 4.
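
The steps above can be sketched by hand in C#. This is an illustration under my own assumptions, not how the framework implements it: the class ManualBase64 and its Encode method are hypothetical names, and instead of building a literal string of bits it packs each 3-byte group into an integer and reads the four 6-bit values with shifts and masks.

```csharp
using System;
using System.Text;

public static class ManualBase64
{
    private const string Alphabet =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

    public static string Encode(byte[] data)
    {
        var result = new StringBuilder();

        for (int i = 0; i < data.Length; i += 3)
        {
            // Number of real bytes in this group (1, 2, or 3).
            int remaining = Math.Min(3, data.Length - i);

            // Steps 1-2: pack up to 3 bytes into 24 bits.
            // Missing bytes are left as zero bits.
            int block = data[i] << 16;
            if (remaining > 1) block |= data[i + 1] << 8;
            if (remaining > 2) block |= data[i + 2];

            // Steps 3-4: read four 6-bit values (0-63) and look each one up.
            // Positions that hold no input bits at all are written as '='.
            result.Append(Alphabet[(block >> 18) & 0x3F]);
            result.Append(Alphabet[(block >> 12) & 0x3F]);
            result.Append(remaining > 1 ? Alphabet[(block >> 6) & 0x3F] : '=');
            result.Append(remaining > 2 ? Alphabet[block & 0x3F] : '=');
        }

        return result.ToString();
    }

    public static void Main()
    {
        Console.WriteLine(Encode(Encoding.ASCII.GetBytes("Akshay")));
        // QWtzaGF5
        Console.WriteLine(Encode(Encoding.ASCII.GetBytes("My name is Akshay")));
        // TXkgbmFtZSBpcyBBa3NoYXk=
    }
}
```

In real code you would call Convert.ToBase64String rather than rolling your own; the point of the sketch is only to show that nothing more than the four steps is happening.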

Here's an example that converts my name "Akshay" to its Base64 equivalent string. 

  • Convert "Akshay" to binary, which looks like:
01000001 01101011 01110011 01101000 01100001 01111001
  • Divide the bits into groups of 6 bits each, instead of 8 bits
010000 010110 101101 110011 011010 000110 000101 111001
  • Convert each group to a decimal number
16 22 45 51 26 6 5 57
  • Now use the Base64 alphabet to convert each group to its Base64 representation
QWtzaGF5
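
If you want to reproduce the intermediate values from this walkthrough yourself, here is a small sketch that prints the bit string and the 6-bit numbers. The class Base64Trace and its helpers are hypothetical names, and ToSixBitValues assumes the number of bits is a multiple of 6, which holds for "Akshay" (48 bits) but not for every input.

```csharp
using System;
using System.Linq;
using System.Text;

public static class Base64Trace
{
    // Returns the bits of the ASCII text as one string of '0' and '1'.
    public static string ToBits(string text)
    {
        return string.Concat(
            Encoding.ASCII.GetBytes(text)
                .Select(b => Convert.ToString(b, 2).PadLeft(8, '0')));
    }

    // Splits the bit string into 6-bit groups and converts each to a number.
    // Assumes bits.Length is a multiple of 6; leftover bits are ignored.
    public static int[] ToSixBitValues(string bits)
    {
        return Enumerable.Range(0, bits.Length / 6)
            .Select(i => Convert.ToInt32(bits.Substring(i * 6, 6), 2))
            .ToArray();
    }

    public static void Main()
    {
        string bits = ToBits("Akshay");
        Console.WriteLine(bits);
        Console.WriteLine(string.Join(" ", ToSixBitValues(bits)));
        // 16 22 45 51 26 6 5 57
    }
}
```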

Here is a C# program that takes some text as input and converts it into a Base64-encoded string.
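using System;
using System.Text;

public static class Base64Example
{
    // Encodes text as Base64. UTF-8 is used instead of ASCII so that
    // non-ASCII characters are preserved rather than replaced with '?'.
    public static string ToBase64(string value)
    {
        byte[] bytes = Encoding.UTF8.GetBytes(value);
        return Convert.ToBase64String(bytes);
    }
}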
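
Going the other way is just as short. Here is a minimal sketch of decoding with the framework's Convert.FromBase64String. The class Base64RoundTrip and the method FromBase64 are names I picked for illustration, and the sketch assumes the original text was ASCII or UTF-8 (UTF-8 decoding handles plain ASCII as well).

```csharp
using System;
using System.Text;

public static class Base64RoundTrip
{
    // Decodes a Base64 string back into text.
    // Throws FormatException if the input is not valid Base64.
    public static string FromBase64(string base64)
    {
        byte[] bytes = Convert.FromBase64String(base64);
        return Encoding.UTF8.GetString(bytes);
    }

    public static void Main()
    {
        Console.WriteLine(FromBase64("QWtzaGF5"));                  // Akshay
        Console.WriteLine(FromBase64("TXkgbmFtZSBpcyBBa3NoYXk="));  // My name is Akshay
    }
}
```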

For more information, see RFC 4648, which describes the Base64 encoding in detail. Also, next time you meet me, don't hesitate to call me "QWtzaGF5" instead of Akshay :)