In September, I worked on improving Pika’s image performance. I’ve had a long career now (25 years) doing mostly web-programming tasks, yet somehow I’ve never set up a CDN myself. I suppose the “management years,” right as my prior organization was getting bigger, contributed to missing out on that experience. In any case, the work was overdue on Pika and it was time to tackle it.
Through a bit of help from online articles and online friends, I’ve gotten it mostly figured out. Here is Pika’s setup.
The tools
Since we started Good Enough with lots of AWS credits, Amazon has got us a bit locked in with their services. And since, remember, I have no past experience setting these things up, well, I tallied-ho with Amazon’s CloudFront for the CDN and S3, which we were already using, for storage. Through this process I had a lot of “grass is greener” feelings toward Cloudflare and Cloudflare R2, but I’ll save that dalliance for another day.
I started thinking about the many background jobs I was going to need to orchestrate for creating the various tuned images (resizing, removing Exif data, compression, etc.). Through that research I ran into John Nunemaker’s “Imgproxy is Amazing” blog post. I reached out to confirm that he is still using imgproxy, and, boy howdy, is he ever. Thanks to Nunes for sharing many details about how he has configured both imgproxy and CloudFront!
The flow
When someone’s browser requests an uncached image from a Pika blog post, here’s how an image request flows through all of these systems:
        ┌────────────┐
        │            │
        │   Reader   │
        │  requests  │
        │   image    │
        │            │
        └───┬────────┘
            │    ▲
            │    │
            ▼    │
┌────────────────┴───────────┐
│                            │
│  Regional CloudFront node  │
│                            │
└─────┬──────────────────────┘
      │                  ▲
      │ CloudFront       │ caches
      │ in regions       │ and at
      │ shield           │
      ▼                  │
┌────────────────────────┴───┐
│                            │
│  CloudFront Origin Shield  │
│                            │
└─────┬──────────────────────┘
      │                  ▲
      │ imgproxy         │ strips
      │ Exif,            │ resizes,
      │ and              │ compresses
      ▼                  │
┌────────────────────────┴───┐
│                            │
│          imgproxy          │
│                            │
└─────┬──────────────────────┘
      │                  ▲
      │                  │
      │                  │
      │                  │
      ▼                  │
┌────────────────────────┴───┐
│                            │
│             S3             │
│                            │
└────────────────────────────┘
The configuration details (as of today)
Let’s start one step above Rails with the imgproxy setup.
imgproxy
We deploy our services at Render.com. This is the full contents of the Dockerfile we use to deploy our Pika imgproxy web service instance:
FROM ghcr.io/imgproxy/imgproxy:latest
To configure imgproxy I am using environment variables to the max. Here are the environment variables I’m currently using:
IMGPROXY_TTL=30758400: Feeling pretty confident here and setting the TTL to roughly 1 year (30758400 seconds is 356 days). Attaching images to rich text fields in Rails should never really re-use an existing image or its URL, so cache invalidation happens as a matter of course.
IMGPROXY_FALLBACK_IMAGE_DATA=R0lGODlhAQABAIAAAAAAAP///yH5BAEAAAAALAAAAAABAAEAAAIBRAA7: This is a 1x1 transparent GIF fallback image in case imgproxy cannot retrieve the requested image.
IMGPROXY_FALLBACK_IMAGE_TTL=120: I set the TTL for the fallback image to be much less than our system TTL set above. I don’t want system hiccups to lead to broken images. Well, not for more than 2 minutes, anyway!
IMGPROXY_FORMAT_QUALITY=jpeg=90,png=90,webp=79,avif=63,jxl=77: Setting mild compression for all images. I am very cautious about over-compressing anything in Pika, and the default quality of 80 was too extreme for me. The webp, avif, and jxl formats are not currently used in Pika, but I added them here to match the defaults that imgproxy uses for IMGPROXY_FORMAT_QUALITY. The gif format is also not being used, as you’ll see below.
IMGPROXY_STRIP_COLOR_PROFILE=false: Related to the above, I want Pika to be as color-accurate as possible.
IMGPROXY_MAX_SRC_RESOLUTION=75: Did you know there is such a thing as image bombs? Neither did I! imgproxy can protect you from them. The value is in megapixels.
IMGPROXY_ALLOW_SECURITY_OPTIONS=true: This is required to allow the use of the IMGPROXY_MAX_SRC_RESOLUTION envar.
IMGPROXY_USE_S3=true: This allows imgproxy to grab images directly from S3. Very clever as it saves a trip through our Rails servers! You will also need to set up the following envars: IMGPROXY_S3_REGION, AWS_ACCESS_KEY_ID, and AWS_SECRET_ACCESS_KEY. The downside with this technique is that the URLs no longer end with image extensions like .jpg, which has caused some problems with third-party services. I do wonder if it has been worth saving that trip through our Rails servers. 🤔
IMGPROXY_ALLOW_ORIGIN=https://pika.page: I’m actually not sure if this is needed since we never hit our Rails app when loading an image.
IMGPROXY_USE_LAST_MODIFIED=true: Given what I wrote about TTL above, I don’t think this is necessary, but it just feels right.
IMGPROXY_SENTRY_DSN: Set this to enable error reporting to Sentry.
IMGPROXY_TIMEOUT=15: I’m not sure why I increased this from the default of 10.
IMGPROXY_READ_REQUEST_TIMEOUT=15: Ditto.
IMGPROXY_KEY & IMGPROXY_SALT need to be set as well, of course.
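Two of the values above are easy to sanity-check in plain Ruby. This is only a minimal sketch and not part of Pika's code: it converts the TTL into days and decodes the fallback image data to confirm it is a 1x1 GIF. The constant and variable names are made up for illustration.

```ruby
# Sanity checks for two imgproxy settings (illustrative; names are hypothetical).
TTL_SECONDS = 30_758_400
FALLBACK_IMAGE_DATA = "R0lGODlhAQABAIAAAAAAAP///yH5BAEAAAAALAAAAAABAAEAAAIBRAA7"

SECONDS_PER_DAY = 24 * 60 * 60

# Whole days covered by the TTL.
ttl_days = TTL_SECONDS / SECONDS_PER_DAY

# "m" is Ruby's built-in Base64 directive, so no extra require is needed.
gif_bytes = FALLBACK_IMAGE_DATA.unpack1("m")

# A GIF file starts with a 6-byte signature, followed by the width and
# height as 16-bit little-endian integers.
signature = gif_bytes.byteslice(0, 6)
width, height = gif_bytes.byteslice(6, 4).unpack("v2")

puts "TTL covers #{ttl_days} days"
puts "Fallback image: #{signature}, #{width}x#{height}, #{gif_bytes.bytesize} bytes"
```

If a full 365-day TTL is what you want, the equivalent value would be 31536000 seconds.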
CloudFront
Here’s how we have CloudFront configured. I’m only mentioning the settings that we changed from the default.
Main Distribution settings:
Alternate domain name: cdn.u.pika.page
Custom SSL certificate: Requested through the interface that CloudFront offers inline
Origin:
Origin domain: u.pika.page
Enable origin shield: Yes, setting the Origin Shield region to be the best match for our other server locations
Behaviors:
Compress objects automatically: No
Allowed HTTP methods: GET, HEAD, OPTIONS
Cache HTTP methods: checked OPTIONS
Cache key and origin requests: checked Legacy cache settings (I don’t love that we are on this Legacy option, but I could never get the other option to work)
Logging: Added a log destination because I’m not sure how you troubleshoot without it!
DNS
Here’s how I have DNS configured for CloudFront and imgproxy:
Our CDN requests go to cdn.u.pika.page (yes, I can already tell that that should have been cdn1.u.pika.page)
The CDN requests our imgproxy origin at u.pika.page (yes, I should have gone with u1.pika.page)
At dnsimple I pointed u.pika.page to our imgproxy origin according to Render’s instructions
I also added a CNAME record to point cdn.u.pika.page to the Distribution domain name provided by CloudFront
As mentioned above, the SSL certificate for cdn.u.pika.page was acquired via CloudFront’s interface, which required a DNS record to be set up at dnsimple during setup for certificate validation
The Rails setup
Pika is configured to upload images to S3. This is a pretty straightforward setup that is written about in many other places.
I’m using the imgproxy gem to help build URLs for images. (There is also an imgproxy-rails gem, but it didn’t play well with our setup.) Here’s our imgproxy.yml configuration file:
default: &default
  # ERB tags are needed so the credentials are evaluated rather than read as literal strings
  key: <%= Rails.application.credentials.dig(:imgproxy, :key) %>
  salt: <%= Rails.application.credentials.dig(:imgproxy, :salt) %>

development:
  <<: *default
  endpoint: <%= ENV['IMGPROXY_FREE_CDN'] %>

test:

production:
  <<: *default
  endpoint: <%= ENV['IMGPROXY_FREE_CDN'] %>
  use_s3_urls: true
The IMGPROXY_FREE_CDN envar is set to https://cdn.u.pika.page, which is actually the CloudFront CDN URL. Also note use_s3_urls: true for the production environment. This ensures the URLs generated by the imgproxy gem point imgproxy directly at S3.
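The gem hides the signing step, but it helps to see roughly what the key and salt are for. The sketch below follows imgproxy's documented URL-signing scheme (HMAC-SHA256 over the salt plus the path, encoded as URL-safe Base64 without padding). The key, salt, bucket name, and path are all made-up values, and the path is only one plausible shape; the URLs the gem actually generates for Pika may look different.

```ruby
require "openssl"

# Sign an imgproxy path: HMAC-SHA256 over salt + path, keyed with the key,
# then URL-safe Base64 with the padding removed. Key and salt are hex strings.
def sign_imgproxy_path(path, hex_key:, hex_salt:)
  key  = [hex_key].pack("H*")
  salt = [hex_salt].pack("H*")
  digest = OpenSSL::HMAC.digest("sha256", key, salt + path)
  [digest].pack("m0").tr("+/", "-_").delete("=")
end

# Hypothetical values for illustration only; never hard-code real secrets.
hex_key  = "a1b2c3d4" * 8
hex_salt = "0f1e2d3c" * 8
endpoint = "https://cdn.u.pika.page"

# Hypothetical path: fit within 100x100, source given as a plain S3 URL.
path = "/rs:fit:100:100/plain/s3://example-bucket/avatars/abc123"

signature = sign_imgproxy_path(path, hex_key: hex_key, hex_salt: hex_salt)
signed_url = "#{endpoint}/#{signature}#{path}"
puts signed_url
```

Because the signature covers the whole path, nobody can change the resize options or the source image without knowing the key and salt, which is what keeps the imgproxy server from being used as an open image resizer.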
The simplest images we serve are site avatars, which can be used in the headings of a blog as well as social share images. Rendering the imgproxy/CDN URL is pretty easy for this example. Here’s what we have in our User model:
has_one_attached :avatar

# Returns the imgproxy/CDN URL for the avatar at the requested size.
def avatar_url(variant = :small)
  variant_options =
    case variant
    when :small
      { height: "100", width: "100" }
    when :medium
      { height: "300", width: "300" }
    end

  avatar.imgproxy_url(variant_options)
end
Rich text is a whole different beast in Rails. In our case, we have already heavily overridden the _blob.html.erb file, and our CDN updates fit right in there. Along the way I decided not to serve GIF files from imgproxy, so you’ll see some reference to that in the code as well. Processing animated images can get complicated, and I decided to leave that thinking for another day.
Further, for local development I wanted to support accessing a local imgproxy instance, but not break if it isn’t available. So you’ll see mention of an imgproxy? method, which is supported by inclusion of this module in ApplicationHelper and User:
module ImgproxyDetector
  # True in production, or in development when an imgproxy endpoint is configured.
  def imgproxy?
    return @imgproxy if defined?(@imgproxy)

    @imgproxy =
      (Rails.env.production? || (Rails.env.development? && Rails.application.config_for(:imgproxy).endpoint.present?))
  end
end
Here’s the simplified imgproxy/CDN-related code from our _blob.html.erb file:
<figure class="attachment attachment--<%= blob.representable? ? "preview" : "file" %> attachment--<%= blob.filename.extension %>">
  <% if blob.representable? %>
    <%
      if blob.content_type == 'image/gif' # don't use imgproxy URLs for GIFs in case they are animated
        img_src_url = url_for(blob)
      elsif imgproxy?
        img_src_url = blob.imgproxy_url(height: "1400", width: "1800")
      else
        # Fallback when imgproxy is unavailable. resize_to_limit takes [width, height],
        # so this matches the imgproxy dimensions above.
        img_src_url = url_for(blob.variant(resize_to_limit: [1800, 1400], saver: { quality: 90 }))
      end
    %>
    <%= image_tag img_src_url %>
  <% end %>
  <figcaption class="attachment__caption">
    <% if caption = blob.try(:caption) %>
      <%= caption %>
    <% else %>
      <span class="attachment__name"><%= blob.filename %></span>
      <span class="attachment__size"><%= number_to_human_size blob.byte_size %></span>
    <% end %>
  </figcaption>
</figure>
imgproxy itself is much more performant than a Rails server, but you can’t get around the fact that image processing is a resource-heavy process. In order to avoid flooding our imgproxy server with an unpredictable number of requests the first time an image-heavy post is loaded, I decided that it would be best to warm the cache as soon as possible. So in the end I wasn’t able to avoid background jobs in our image processing stack. When a new post is created or an existing post has its images edited, a background job is created to query the CDN URL for each blob in the post. I’ll leave this code as an exercise for the reader.
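For anyone who wants a starting point on that exercise, here is one rough sketch of the warming step in plain Ruby. It is not Pika's implementation: the CdnCacheWarmer class, its method names, and the injectable fetcher are all hypothetical. In a Rails app something like this would presumably be called from a background job's perform method with the CDN URLs for a post's blobs.

```ruby
require "net/http"
require "uri"

# Hypothetical cache warmer: requests each CDN URL once so that CloudFront and
# imgproxy do their work before readers arrive.
class CdnCacheWarmer
  # The fetcher is injectable so the class can be exercised without network access.
  DEFAULT_FETCHER = ->(url) { Net::HTTP.get_response(URI(url)).code.to_i }

  def initialize(fetcher: DEFAULT_FETCHER)
    @fetcher = fetcher
  end

  # Returns a hash of url => HTTP status, or nil when the request raised.
  def warm(urls)
    urls.uniq.each_with_object({}) do |url, results|
      results[url] = begin
        @fetcher.call(url)
      rescue StandardError
        nil # a failed warm-up is not fatal; the CDN will fetch on first view
      end
    end
  end
end

# Usage with a stand-in fetcher (no real requests are made here).
fake_fetcher = ->(url) { url.include?("missing") ? raise("boom") : 200 }
warmer = CdnCacheWarmer.new(fetcher: fake_fetcher)
results = warmer.warm(["https://cdn.example.test/a", "https://cdn.example.test/a", "https://cdn.example.test/missing"])
puts results.inspect
```

Requesting the URLs one at a time, as this sketch does, also keeps the warm-up itself from becoming the flood of requests it is meant to prevent.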
Above you may remember that I mentioned the security concern of image bombing. While imgproxy protects us from that, I wanted to avoid folks uploading such images in the first place. So I added a validation to check image resolutions, which means I also didn’t manage to avoid doing any image processing on our Rails server. Here is a simplified version of how I do that for rich text image attachments:
# post.rb
has_rich_text :body

validate -> { acceptable_image_attachments(:body) }

def acceptable_image_attachments(attr)
  return true if self.send(attr).body.blank?

  self.send(attr).body.attachables.each do |attachment|
    next unless attachment.is_a?(ActiveStorage::Blob)

    if image_resolution_over_limit?(attachment)
      errors.add(attr, image_resolution_error_message_for(attachment.filename))
    end
  end
end

# Megapixels = width * height / 1,000,000
def image_resolution_over_limit?(blob)
  width, height = blob_dimensions(blob)
  (width.to_f * height.to_f) / 1_000_000.0 > Rails.application.config.x.image_resolution_limit.to_f
end

# Dimensions live in the blob metadata once it has been analyzed; analyze on demand if missing
def blob_dimensions(blob)
  width = blob.metadata["width"]
  height = blob.metadata["height"]

  if width.nil? || height.nil?
    blob.analyze
    width = blob.metadata["width"]
    height = blob.metadata["height"]
  end

  [width, height]
end

# application.rb
config.x.image_resolution_limit = 75 # in megapixels
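To get a feel for what a 75-megapixel limit allows, here is the same arithmetic as a tiny standalone sketch. The megapixels helper, the LIMIT_MP constant, and the sample dimensions are illustrative and not taken from Pika.

```ruby
# Megapixels are width * height / 1,000,000. nil dimensions become 0.0 via to_f,
# so an image whose dimensions are unknown counts as 0 MP in this calculation.
def megapixels(width, height)
  (width.to_f * height.to_f) / 1_000_000.0
end

LIMIT_MP = 75 # mirrors the limit configured above

samples = [[4032, 3024], [8660, 8660], [12_000, 8_000], [nil, nil]]

samples.each do |width, height|
  mp = megapixels(width, height)
  verdict = mp > LIMIT_MP ? "over the limit" : "allowed"
  puts format("%sx%s = %.1f MP (%s)", width.inspect, height.inspect, mp, verdict)
end
```

One thing the nil case makes visible: with this style of check, an attachment whose dimensions cannot be determined is treated as under the limit, so imgproxy's own IMGPROXY_MAX_SRC_RESOLUTION setting remains the backstop.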
Local testing is pretty easy once you get it all set up. Well, if you’re familiar with Docker. (I’m really not, but I got it set up, and doing that setup is another exercise I’ll leave to you, dear reader.) Our test code does not use imgproxy, but our development environment sure can. As mentioned above, we have a repo for Pika’s imgproxy that is a very simple Dockerfile.
I have Docker and OrbStack installed locally to make things work
dotenv is installed to manage my local envars
In my .env file I have IMGPROXY_FREE_CDN = "http://localhost:7777"
I have foreman installed to handle Procfile applications
Then I run
foreman start -f Procfile_imgproxy.dev
Here’s my Procfile_imgproxy.dev file, which is in my main Rails app:
imgproxy: docker run --rm --name pika-imgproxy -p 7777:8080 --add-host=pika.test:host-gateway -e IMGPROXY_ENABLE_INSECURE_MODE=true -e IMGPROXY_ALLOW_PRIVATE_NETWORKS=true -e IMGPROXY_ALLOW_LOOPBACK_NETWORKS=true -e IMGPROXY_ALLOW_ORIGIN=http://pika.test -e IMGPROXY_FALLBACK_IMAGE_DATA=R0lGODlhAQABAIAAAAAAAP///yH5BAEAAAAALAAAAAABAAEAAAIBRAA7 -e IMGPROXY_FALLBACK_IMAGE_TTL=120 -e IMGPROXY_FORMAT_QUALITY=jpeg=90,png=90,webp=79,avif=63,jxl=77 -e IMGPROXY_STRIP_COLOR_PROFILE=false -e IMGPROXY_TTL=604800 -e IMGPROXY_USE_LAST_MODIFIED=true ghcr.io/imgproxy/imgproxy:latest
With this all running, you can see imgproxy in action in your local development environment!
The future
We’re hoping to ride with this setup for quite a while. Down the road we’ll probably look into tuning GIFs, and I may look into ways to implement WebP and AVIF while still keeping colors and performance to our liking. During implementation I did not have good luck making those formats work well.
And, as an admittedly novice CDN implementor, maybe others will read this blog post and have some ideas about how I could improve this setup. Happy to hear them!