Web performance analysis can be split into three major categories:
- Page load time
- App run-time performance (“jank,” responsiveness)
- Memory consumption
While run-time performance and memory consumption sometimes overlap (for example, garbage-collection pauses cause jank and are also tied to memory leaks), each category has enough sub-categories to merit its own place on the list above.
In this article, I’d like to focus on an aspect of the second category — run-time performance. For the sake of clarity, here’s my definition: run-time performance is the performance of the app’s visual elements while a user interacts with them. So, if a user clicks a button and has to wait a second before something happens — that’s a run-time performance problem.
If you have a better name for this kind of performance, be sure to let me know in the comments below. I confess I did not give too much thought to the name as I fully understand what I mean 90% of the time :)
To further drill down, I’ll be focusing on a sub-category of run-time performance I call response time (or app responsiveness).
By the end of this article you will know:
- What makes an app responsive/not responsive
- How to measure your app’s responsiveness
- How to measure your functions’ running time
- How to track down and handle time consuming methods
The Theory
Most of our apps run on a 60Hz rendering cycle: the screen displays 60 frames per second, which gives us 16.667 milliseconds (“ms”) per frame, and our app is limited by this time frame. These 16.667 ms are shared between our JavaScript (“JS”) code, our HTML markup, our CSS styles, and their parsing. Together, they make up the Critical Rendering Path (“CRP”).
Longer frames result in visible delays in animation and responsiveness. These are the two main symptoms of poor run-time performance:
- Animation jank — your animation moves as if it’s had three too many shots of tequila and is trying to climb the Rocky Mountains. This happens when you have frames longer than 16.667 ms. Here’s a nice game that demonstrates jank: http://jakearchibald.github.io/jank-invaders/
- Responsiveness — while your JS code is running (or at any point during the CRP), the app is stuck. It’s not just your code that stops: the user can’t click, scroll, type, etc. In some implementations, even a GIF’s animation freezes. That’s NOT a good thing… While it’s not always as noticeable as animation jank, you generally don’t want your user to wait more than 100 ms for something to happen (see http://theixdlibrary.com/pdf/Miller1968.pdf and, of course, the ultimate source of truth, Google).
To summarize the above: animations need our code and styling to finish rendering in less than 16.667 ms. When it comes to responsiveness, we’d like our app’s responses to user input to take no longer (and preferably much less) than 100 ms.
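To make the 100 ms budget concrete, here’s a minimal sketch of how you might time a handler from code with performance.now(). The names (timeHandler, performSearch) are placeholders for illustration, not code from this article’s app:

```javascript
// Sketch: wrap a handler and compare its synchronous run time against
// the 100 ms responsiveness budget. All names here are hypothetical.
function timeHandler(label, fn) {
  return function (...args) {
    const start = performance.now();
    const result = fn.apply(this, args);
    const elapsed = performance.now() - start;
    if (elapsed > 100) {
      console.warn(`${label} took ${elapsed.toFixed(1)} ms (budget: 100 ms)`);
    }
    return result;
  };
}

// Hypothetical usage: time a click handler.
// button.addEventListener('click', timeHandler('search', performSearch));
```

Note that this only catches the synchronous cost of your own handler; work the browser does afterwards (layout, paint) is invisible to it, which is why the DevTools measurement below the fold is still needed.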
How can we tell there IS a problem? How can we tell where the problem is? And if there is a problem, how do we fix it so our app will respond to user input in less than 100 ms? All this, and more, coming right up!
“Live” Use Case
Of the questions in the previous paragraph, the first is one of the most important, and sometimes the most befuddling. Do we actually have a problem? Is the client complaining about something tangible? Do we expect a problem when scaling our app later on?
This is a big question, arguably deserving of its own post, and it can be answered in many ways. But, to make a very long story short: optimization is a time consuming, and usually never-ending, process. Before starting to optimize, make sure you have a good reason to do it.
Now — assuming we have a good reason (a customer complains about unresponsiveness, our boss wants us to check a scenario they suspect is problematic, or we expect the app to handle more data in the future) — we can move on to finding the problem.
The ticket
Ok — you got a ticket from support. A client called in saying their users wait a long time for results when using your super smart search bar. You load your local dev environment, try it out, and… it loads really fast!
Checking more thoroughly, you find out that the client who called gets more than 800 search results back from the server. Your testing data set had around 50. That means this client has roughly 750 more search results than you, each fighting for a piece of real estate in the DOM tree.
Being so familiar with your code, and being able to put two and two together, you figure the problem must be somewhere in the search-results processing.
But how do you find the part of code that causes the issue?
DevTools to the rescue!
As mentioned in the Theory section, we are talking about frame rate — meaning, we would like to see how long it takes frames to render. Luckily, Chrome DevTools has a performance tool that does exactly that — it tells us how long it takes each frame to render. It does even more — it tells us what function(s) run in each frame, and how much each function costs us!
Excited at my new toy, I go ahead and start the monitoring process:
- Load my app in Chrome
- Prepare the search bar to be visible
- Open DevTools (there are many ways to do this — F12; right click -> Inspect Element; and more)
- Go to the Performance tab
- Start recording
- Search using the super smart search bar
- Wait for results to return
- Clear the super smart search bar
- Search again
- Wait for results to return
- Clear the super smart search bar
- Stop recording
…And when the dust settles, I have something that looks like this:
[Screenshot: DevTools Performance recording of the two searches]
While, in this case, it is easy to tell when I did the searches themselves (the colorful parts in the top panel), there are plenty of ways to find out what happened at a given point in time. Chrome tells us where it finds problems in performance and marks them in red. Let’s focus on one such part:
[Screenshot: a problematic section of the recording, marked in red]
Here, I’ve highlighted a problematic time range on the chart at the top. From this bird’s-eye view, it’s evident the process took more than 100 ms — 114.9 ms, to be exact. And that was for around 50 to 100 results (depending on my query). If the client’s users have 800, it might take more than a second to load. If the end user’s machine is a bit slow, or has background processes consuming resources, it might take even longer. That’s NOT good. Can we improve it? Let’s see.
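As a side note, timings like the 114.9 ms above can also be captured from code with the User Timing API, and the resulting measures show up on the same DevTools Performance timeline. A minimal sketch — the mark names are made up for illustration:

```javascript
// Sketch: bracket the suspect operation with User Timing marks.
// The mark/measure names here are made up, not from the article's app.
performance.mark('search-start');
// ... run the search here, e.g. performSearch(query) ...
performance.mark('search-end');

const measure = performance.measure('search', 'search-start', 'search-end');
console.log(`search took ${measure.duration.toFixed(1)} ms`);
```

This is handy for tracking a known-suspect operation across many recordings, or for logging timings from real users’ machines.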
Clicking on the long frame allows us to see what happened in its CRP. After clicking, I select the Call Tree tab in the bottom panel, and I can see the following picture:
[Screenshot: the Call Tree tab for the long frame]
Observing the bottom panel reveals that an event took up most of the frame time, which means the problem lies somewhere inside that event. All that’s left is to drill down until we find a method whose optimization can significantly decrease the run time:
[Screenshot: drilling down into performSearch in the Call Tree]
Drilling down, we find that, as expected, performSearch takes the most time to run, but while its total time is 106.8 ms, its self time is 0 ms. This means that one or several of its “child” methods are at fault, so we continue to go deeper until we reach a fork.
Drilling deeper yet, we get to the methods removeLoadingbar and appendSearchResults. These two are the last methods in the chain that are “my code.” Taking a look at removeLoadingbar, we get the following picture:
[Screenshot: removeLoadingbar’s call tree, showing the 3rd-party call]
Since our own code here is the widget (which holds the search bar), we can see that a 3rd-party method (in the maketutorial_lib.js file) is taking 50.5 ms to run. What is this method doing?
I can’t really show you the code, but what happens is this: removeLoadingbar is calling a method that eventually calls this reinitialize method. Looking at the code, it seems that reinitialize is a call to a 3rd-party library that sets the scroll pane. It is called on almost every change to the widget search results area. This means that this method is not really taking that long to run — it is just called too many times…
In this case, I’ll use a technique called “debounce”. If you’re not familiar with it, here’s a link to my favorite implementation. In essence, it delays the execution of a method if the method is called several times in a certain time frame. So, if I set a function to debounce for 200 ms, it means that every call to the function would start the timer anew, and only after 200 ms without another function call would it fire.
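Since I can’t link the exact implementation here, below is a minimal sketch of the mechanics described above — not the implementation I linked, just an illustration of the technique:

```javascript
// Minimal debounce sketch: every call resets the timer, and `fn` only
// fires after `wait` ms pass without another call.
function debounce(fn, wait) {
  let timer = null;
  return function (...args) {
    clearTimeout(timer);
    timer = setTimeout(() => {
      timer = null;
      fn.apply(this, args);
    }, wait);
  };
}

// Hypothetical usage for the case at hand:
// const debouncedReinitialize = debounce(reinitialize, 50);
```

Production implementations (such as Lodash’s _.debounce) add options like leading/trailing-edge firing, but the core idea is the timer reset shown here.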
In this case, I’ve set the debounce to 50 ms, because I don’t want the user to wait that long. This means that during my search process, which currently takes more than 100 ms, this method will fire only once.
Ready for the results?
[Screenshot: the recording after the debounce fix]
Our removeLoadingbar method now takes only 0.5 ms because its calls to reinitialize are debounced. Moreover, the calls inside appendSearchResults are also debounced, so we eventually get only one call to the method, reducing our total appendSearchResults time to 25 ms and our total search event to 60.6 ms. This is way below our 100 ms goal!
Conclusion
Let’s recap what we’ve learned:
- We saw that an app’s responsiveness can be measured by looking at frame times
- We measured our frame times using Chrome’s DevTools by recording a “suspicious” event
- We investigated the events by looking at our functions’ run time costs
- We eventually tracked down the wasteful code
- We improved the code using the debounce technique
NOTE: when using the debounce technique, remember that your code is now asynchronous, because debounce works like a timeout. If you have code that relies on your debounced function returning an immediate result, you can compensate by adding a “force” parameter to the function; the force parameter bypasses the delay and makes the function return a value immediately.
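One way to sketch that “force” escape hatch — the shape and names below are assumptions for illustration, not the article’s actual code:

```javascript
// Sketch of a debounce with a "force" escape hatch: calling
// `debounced.force(...)` cancels any pending timer and runs `fn`
// synchronously, returning its value. Names are hypothetical.
function debounceWithForce(fn, wait) {
  let timer = null;
  const debounced = function (...args) {
    clearTimeout(timer);
    timer = setTimeout(() => {
      timer = null;
      fn.apply(this, args);
    }, wait);
  };
  debounced.force = function (...args) {
    clearTimeout(timer); // drop the pending delayed call
    timer = null;
    return fn.apply(this, args); // run now and return the value
  };
  return debounced;
}
```

Callers that can tolerate the delay use the debounced function as usual; callers that need an immediate result call force() and accept the extra work.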
With a simple change, we’ve managed to bring our app’s response time well below the 100 ms threshold. Of course, more data might cause a bigger delay, and there’s always room for improvement. Nonetheless, our app already handles large result sets better than before — after just a few minutes’ work!