Generating Data URIs with Haskell

To make one of my projects distributable as a single script, I needed base64-encoded data URIs for the image assets it depends on. Most of the tools I found for generating data URIs were web services, but I wanted a command line tool so I wouldn’t have to upload my assets. I decided to build one in Haskell.

You can get the binary and see the full source on GitHub: https://github.com/cgag/data-uri

A data URI is essentially a MIME type followed by the asset’s data, which is usually base64-encoded. The asset is typically an image.
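To make the shape concrete, here is a minimal sketch that assembles a data URI from a MIME type and a payload that has already been base64-encoded. The names mkDataUri and example are hypothetical and not part of the tool.

```haskell
-- A base64 data URI has the shape:  data:<mime type>;base64,<payload>
-- mkDataUri is a hypothetical helper; the payload must already be base64 text.
mkDataUri :: String -> String -> String
mkDataUri mime payload = "data:" ++ mime ++ ";base64," ++ payload

-- "SGVsbG8=" is the base64 encoding of the ASCII text "Hello".
example :: String
example = mkDataUri "text/plain" "SGVsbG8="
-- example == "data:text/plain;base64,SGVsbG8="
```

Loading this in GHCi and evaluating example should show the full URI as a single string.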

There are nice libraries for both figuring out MIME types from file extensions and doing base64 encoding. Using them together was easy; the only tedious part was converting between ByteStrings, Text, and Strings.
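For reference, the conversions involved look roughly like the sketch below. It assumes the text and bytestring packages; the wrapper names (stringToText and so on) are hypothetical and exist only to put a type signature next to each library call.

```haskell
import qualified Data.ByteString.Char8 as BC
import qualified Data.Text as T
import qualified Data.Text.Encoding as TE

-- String <-> Text
stringToText :: String -> T.Text
stringToText = T.pack

textToString :: T.Text -> String
textToString = T.unpack

-- Text <-> ByteString goes through an explicit encoding (UTF-8 here).
textToBytes :: T.Text -> BC.ByteString
textToBytes = TE.encodeUtf8

bytesToText :: BC.ByteString -> T.Text
bytesToText = TE.decodeUtf8   -- throws on input that is not valid UTF-8

-- String <-> ByteString via Char8. Each Char is truncated to 8 bits,
-- so this is only safe for ASCII/Latin-1 text such as MIME types or base64.
stringToBytes :: String -> BC.ByteString
stringToBytes = BC.pack

bytesToString :: BC.ByteString -> String
bytesToString = BC.unpack
```

In this tool the Char8 route is enough, since both MIME types and base64 output are plain ASCII.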

I’m coming from dynamic languages like Clojure and Ruby, so I’m used to more documentation and usage examples than you typically find on Hackage, but I was surprised by how clear it was to just compose functions based on their types.

The meat of this tool lives in the fromPath function, which takes a file path as a String and returns an IO DataUri (DataUri is a type alias for String) containing the data URI.

fromPath :: String -> IO DataUri
fromPath path = do
  let mime = defaultMimeLookup $ T.pack path
  file <- B.readFile path
  let encoded = encode64 file
  let rawUri = dataUri mime encoded
  let uri = if isImg mime
              then toImgTag rawUri
              else rawUri
  return uri

This determines the MIME type from the file path using the mime-types library, then reads the file and base64 encodes its contents. dataUri combines the two pieces into a data URI. For convenience, we then check whether the MIME type is an image, and if so wrap the URI in an <img> tag.
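To illustrate the idea behind an extension-based lookup, here is a simplified stand-in written against base only. It is not how mime-types is implemented: mimeTable, extension, and lookupMime are hypothetical names, the table is deliberately tiny, and the application/octet-stream fallback is an assumption chosen as a generic binary type.

```haskell
import Data.Char (toLower)
import Data.Maybe (fromMaybe)

-- Hypothetical, deliberately tiny table; a real library ships a much larger one.
mimeTable :: [(String, String)]
mimeTable =
  [ ("png",  "image/png")
  , ("jpg",  "image/jpeg")
  , ("jpeg", "image/jpeg")
  , ("gif",  "image/gif")
  , ("css",  "text/css")
  ]

-- Text after the last '.', lower-cased; empty if the name contains no dot.
extension :: FilePath -> String
extension path = case break (== '.') (reverse path) of
  (revExt, _ : _) -> map toLower (reverse revExt)
  _               -> ""

-- Look the extension up in the table, falling back to a generic binary type.
lookupMime :: FilePath -> String
lookupMime path =
  fromMaybe "application/octet-stream" (lookup (extension path) mimeTable)
```

With this sketch, lookupMime "logo.PNG" gives "image/png", while a name without an extension such as "README" falls through to the default.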

The helper functions encode64, dataUri, and isImg are all short and simple.

encode64 just delegates to Data.ByteString.Base64.encode then unpacks the ByteString to a String:

encode64 :: B.ByteString -> String
encode64 = BC.unpack . Base64.encode 

dataUri takes a MIME type and a base64 string and combines the two into a formatted string. MimeType is a ByteString, so show mime renders it as a quoted string literal; we filter the quotes out before doing any interpolation.

dataUri :: MimeType -> Base64 -> DataUri
dataUri mime s = 
  let m = filter (/='"') $ show mime 
  in printf "data:%s;base64,%s" m s
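An alternative to show plus filter is to unpack the ByteString directly, which never introduces the quotes in the first place. The sketch below is one possible variant, not the version in the repository. It declares its own type aliases so that it stands alone, on the assumption that MimeType is a ByteString as in mime-types; dataUri', quoted, and unquoted are hypothetical names.

```haskell
import qualified Data.ByteString.Char8 as BC
import Text.Printf (printf)

-- Local aliases so the sketch is self-contained (assumed to mirror the post).
type MimeType = BC.ByteString
type Base64   = String
type DataUri  = String

-- Variant of dataUri that unpacks the ByteString instead of using show + filter.
dataUri' :: MimeType -> Base64 -> DataUri
dataUri' mime s = printf "data:%s;base64,%s" (BC.unpack mime) s

-- show renders the ByteString as a string literal, quotes included.
quoted :: String
quoted = show (BC.pack "image/png")

-- BC.unpack yields just the characters.
unquoted :: String
unquoted = BC.unpack (BC.pack "image/png")
```

Here quoted is the eleven-character string that starts and ends with a double quote, while unquoted is plain image/png.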

isImg returns True if a MIME type is an image. MIME type strings are formatted as type/subtype, for example image/jpeg. We split on ‘/’ and check whether the first component is “image”.

isImg :: MimeType -> Bool
isImg mime = 
  let mimeSuperType = head $ BC.split '/' mime
  in  mimeSuperType == BC.pack "image"
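One thing to watch with head is that it is partial: as far as I can tell, BC.split returns an empty list for an empty ByteString, in which case head would throw. A prefix check sidesteps that. The sketch below is a possible alternative under that assumption; isImage is a hypothetical name, and MimeType is redeclared locally so the block stands alone.

```haskell
import qualified Data.ByteString as B
import qualified Data.ByteString.Char8 as BC

-- Local alias so the sketch is self-contained (a ByteString, as in mime-types).
type MimeType = B.ByteString

-- True when the MIME type starts with "image/"; total, so it is safe on empty input.
isImage :: MimeType -> Bool
isImage mime = BC.pack "image/" `B.isPrefixOf` mime
```

This also rejects a bare "image" with no subtype, which the split-based version would accept.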

That’s pretty much it. Here’s what it looks like all together with imports, a main function, and a few additional helpers:

import System.Environment (getArgs, getProgName)
import qualified Data.Text as T
import Text.Printf (printf)

import Network.Mime (MimeType, defaultMimeLookup)
import qualified Data.ByteString as B
import qualified Data.ByteString.Char8 as BC
import qualified Data.ByteString.Base64 as Base64

type Base64  = String
type DataUri = String

encode64 :: B.ByteString -> String
encode64 = BC.unpack . Base64.encode 

dataUri :: MimeType -> Base64 -> DataUri
dataUri mime s = 
  let m = filter (/='"') $ show mime 
  in printf "data:%s;base64,%s" m s

fromPath :: String -> IO DataUri
fromPath path = do
  let mime = defaultMimeLookup $ T.pack path
  file <- B.readFile path
  let encoded = encode64 file
  let rawUri = dataUri mime encoded
  let uri = if isImg mime
            then toImgTag rawUri
            else rawUri
  return uri
  
usage :: String -> String
usage progname = printf "Usage: ./%s file1 <file2> <file3..>" progname

toImgTag :: DataUri -> String
toImgTag d = printf "<img src=\"%s\" />" d

isImg :: MimeType -> Bool
isImg mime = 
  let mimeSuperType = head $ BC.split '/' mime
  in  mimeSuperType == BC.pack "image"

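main :: IO ()
main = do
  filePaths <- getArgs
  progName  <- getProgName
  if null filePaths
    -- No arguments: print the usage message.
    then putStrLn (usage progName)
    else do
      dataUris <- mapM fromPath filePaths
      -- mapM_ discards the list of unit results that mapM would build.
      mapM_ putStrLn dataUris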