A checklist for investigating slowness in web applications

More often than not I find myself dealing with legacy web applications of all kinds (mostly Python and PHP). Sometimes these websites and applications show signs of slowness, or they suddenly become slow once migrated to a new environment.

Most issues are low-hanging fruit and don't require complex instrumentation. In all these cases it's important to have a simple checklist to investigate and solve these issues quickly. Here's mine!

1. Check DNS resolution and outgoing HTTP requests

This is old but gold. DNS resolution problems are frequent and subtle.

Sometimes a web application becomes too slow without any clear cause. Before setting up more serious instrumentation, first check whether the application makes DNS requests to the outside.

As you might know, a machine needs to make a DNS request to resolve the remote host (unless it's given as an IP address) before the HTTP request can start.

If the DNS server isn't responding, or if it's too slow, the application can become slow or totally unresponsive.

To investigate DNS resolution issues you can use tcpdump on the host machine:

tcpdump port 53

Once the capture is in place you can try visiting the affected page, or making the appropriate request to trigger the issue. Your output should roughly be along these lines:

12:48:59.492154 IP some-host.local.44630 > resolver1.opendns.com.domain: 46758+ A? request-domain.com. (30)
12:48:59.492161 IP some-host.local.44630 > resolver1.opendns.com.domain: 35512+ AAAA? request-domain.com. (30)

Here request-domain.com is the domain for which the local host some-host.local is requesting DNS info from resolver1.opendns.com.

In some cases the DNS response could be slow or unreliable and the application becomes inexplicably slow.
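If you prefer to measure the lookup from inside the application's runtime, here is a minimal Python sketch that times a single resolution with the standard library. The function name time_dns_lookup and its resolver parameter are my own for illustration (the parameter only exists so the function can be tried without network access); treat it as a rough probe, not a replacement for the capture above.

```python
import socket
import time


def time_dns_lookup(hostname, port=80, resolver=socket.getaddrinfo):
    """Return (elapsed_seconds, addresses) for one lookup of `hostname`.

    `resolver` is injectable only so the function can be exercised
    without network access; by default it is socket.getaddrinfo.
    """
    start = time.perf_counter()
    try:
        results = resolver(hostname, port, proto=socket.IPPROTO_TCP)
    except socket.gaierror:
        results = []  # resolution failed; the elapsed time is still useful
    elapsed = time.perf_counter() - start
    # Each result is (family, type, proto, canonname, sockaddr);
    # sockaddr[0] is the IP address for both IPv4 and IPv6.
    addresses = sorted({item[4][0] for item in results})
    return elapsed, addresses
```

Calling time_dns_lookup("request-domain.com") a few times in a row on the affected host should give a feel for how long resolution takes there. A lookup that takes seconds, or an empty address list, points at the resolver rather than at the application code.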

In a recent case I had an IPv6-enabled host where some piece of code was making HTTP and DNS requests. Due to the way glibc makes DNS requests, the host was showing terrible slowness.

Takeaway: always check if the code makes outgoing requests. Check if DNS resolution works as expected on the target host.

2. Check and offload I/O blocking, synchronous operations

Most programming languages execute code synchronously by default. Take Python or PHP for example.

What this means in practice is that any I/O blocking operation made from a view (view in the MVC or MVT paradigm) or from any piece of code in response to some user interaction can block the application until the operation has completed.

By I/O blocking operations I mean:

  • interactions with external systems over the network.
  • interactions with the filesystem.
  • delayed tasks.

One day I took over a Python project which was terribly slow. Upon further investigation I found out each view was making an HTTP request to an IP-checking API. Each of these calls was taking two to three seconds to run.

If your application is slow, or some specific URL takes too long to complete, check if there are blocking operations launched from the view.
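One cheap way to spot such calls without a profiler is to wrap the suspects in a timing decorator. The sketch below is only an illustration: warn_if_slow, the "slow_calls" logger name, and check_ip are hypothetical names, and the sleep stands in for a slow external call.

```python
import functools
import logging
import time

logger = logging.getLogger("slow_calls")


def warn_if_slow(threshold_seconds=0.5):
    """Log a warning when the wrapped callable takes longer than the threshold."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            start = time.perf_counter()
            try:
                return func(*args, **kwargs)
            finally:
                # Runs whether the call returned or raised.
                elapsed = time.perf_counter() - start
                if elapsed > threshold_seconds:
                    logger.warning("%s took %.2fs", func.__name__, elapsed)
        return wrapper
    return decorator


@warn_if_slow(threshold_seconds=0.05)
def check_ip(ip_address):
    # Hypothetical stand-in for a slow call to an external IP-checking API.
    time.sleep(0.1)
    return {"ip": ip_address, "ok": True}
```

Decorating the functions a view calls, and then loading the slow page once, is usually enough to see which call eats the time.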

Once these I/O blocking operations have been identified, offload them to a task queue. There are task queues for any programming language. Python for example has rq, or Celery. For Django there is Django Q.
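To show the shape of the idea without pulling in any of those libraries, here is a sketch that uses only the standard library: an in-process queue and one worker thread. This is not how rq, Celery, or Django Q work internally (they use an external broker and separate worker processes), and view and check_ip are hypothetical names. The point is only that the view enqueues the job and returns right away.

```python
import queue
import threading

task_queue = queue.Queue()
results = []


def worker():
    """Run queued jobs one by one, outside the request/response cycle."""
    while True:
        job = task_queue.get()
        if job is None:  # sentinel: stop the worker
            task_queue.task_done()
            break
        func, args = job
        try:
            results.append(func(*args))
        except Exception as exc:  # keep the worker alive if a job fails
            results.append(exc)
        finally:
            task_queue.task_done()


def check_ip(ip_address):
    # Hypothetical stand-in for a slow call to an external API.
    return f"checked {ip_address}"


def view(ip_address):
    """Enqueue the slow work and answer immediately."""
    task_queue.put((check_ip, (ip_address,)))
    return "202 Accepted"


# Daemon thread so the worker never keeps the process alive on exit.
threading.Thread(target=worker, daemon=True).start()
```

In a real project I would still reach for a proper task queue, since an in-process queue loses its jobs whenever the application restarts.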

Takeaway: offload I/O blocking commands to a task queue.

3. Check database connection and performance

If the application is still slow and there are no signs of outgoing HTTP requests your next step should be checking the database.

In particular, two of the most effective tweaks for MySQL and MariaDB are:

  • skip-name-resolve in the configuration.
  • slow queries measurement.

skip-name-resolve ensures that no DNS resolution is made for client hostnames.

In other words, when a client from some-host.local connects to the database instance, MySQL and MariaDB by default perform a DNS lookup on the client's address to resolve its hostname.

This is unnecessary most of the time and can have a dramatic performance impact. skip-name-resolve can solve the issue.
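As a rough sketch, the option goes under the [mysqld] section of the server configuration. The location of that file varies by distribution, so I'm not assuming a path here:

```ini
[mysqld]
# Do not resolve client addresses to hostnames on connect.
skip-name-resolve
```

One caveat to keep in mind: with name resolution disabled, the server matches accounts by IP address (or localhost) only, so users granted access by hostname may need to be redefined before you restart.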

If the database is still slow you can check if there are slow queries: to activate slow query logging you can follow this handy guide.
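For reference, a minimal configuration for the slow query log could look like the fragment below. The log file path and the one-second threshold are assumptions of mine, so adjust both to your system:

```ini
[mysqld]
slow_query_log = 1
# Assumed path; pick a location the database user can write to.
slow_query_log_file = /var/log/mysql/slow.log
# Log queries that run longer than this many seconds (the default is 10).
long_query_time = 1
```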

If skip-name-resolve doesn't help, or no slow queries are showing in the log, check if the database performs well enough with a benchmark. This means launching the following query from the database console:

SELECT BENCHMARK(1000000,ENCODE('hello','goodbye'));

If the query takes too long then you likely have some performance issue on the system: slow disks or some database misconfiguration.

Other than these checks, when migrating to a new environment always check if the application is pointing to the right database.

It's easy to forget to update the database configuration to point to a new host if you're in a rush.

Takeaway: when migrating legacy applications to new environments always check if the code is pointing to the new database. If this doesn't help, run a quick benchmark. Measure slow queries and add skip-name-resolve to the database configuration as well.

4. Check PHP's configuration and timeouts

In most web frameworks there are utilities for outputting static resources into the HTML markup.

One such example is CakePHP which has utilities for inserting <script> tags into the template:

$this->Html->script('script-to-load.js');

One day after migrating a legacy website to a new machine I noticed a strange behaviour when the website tried to load a bunch of scripts:

net::ERR_CONTENT_LENGTH_MISMATCH 200

At first, thinking the bundle was too large, I tried to minify and split it up (it was an old, non-minified jQuery app) and to defer the loading with <script defer> and <script async>, but nothing helped.

In the end, the problem was a lower value for PHP's max_execution_time. Increasing it solved the issue.
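For illustration, this is the php.ini directive in question. The value of 60 is only an example and not the one I used on that machine; PHP's default is 30 seconds:

```ini
; php.ini
; Maximum time in seconds a script may run before PHP terminates it.
max_execution_time = 60
```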

There are also situations where a larger value for max_execution_time isn't enough, and the application keeps timing out.

In these cases always check step 1 (Check DNS resolution and outgoing HTTP requests) to see if there's some outgoing request taking too long.

Takeaway: when migrating legacy PHP applications to a new environment always check the PHP configuration, and tweak it as needed.

5. Check any external system

In general, any external system connected to the application is a potential source of issues, especially if it must be reached over the network.

If you have ruled out every other possible issue but the application is still slow, check if it's trying to reach some external system. These could be:

  • session storages.
  • cache storages.
  • search engines (Elasticsearch and friends).
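A quick first check for any of these is to time a plain TCP connection from the application host. Below is a small Python sketch using only the standard library. The names check_tcp and check_all are mine, and the hosts and ports you pass in depend entirely on your setup. A successful connection only proves the port is reachable, not that the service behind it is healthy.

```python
import socket
import time


def check_tcp(host, port, timeout=2.0):
    """Return seconds taken to open a TCP connection, or None on failure."""
    start = time.perf_counter()
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return time.perf_counter() - start
    except OSError:
        # Covers refused connections, timeouts, and failed name resolution.
        return None


def check_all(services, timeout=2.0):
    """Map each service name to its connect time (None if unreachable)."""
    return {
        name: check_tcp(host, port, timeout)
        for name, (host, port) in services.items()
    }
```

You would call check_all with a dictionary mapping a label to a (host, port) pair for each session store, cache, or search engine the application talks to. Any entry that comes back as None, or close to the timeout, deserves a closer look.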

Thanks for reading!

Valentino Gagliardi

Hi! I'm Valentino! I'm a freelance consultant with a wealth of experience in the IT industry. I spent the last years as a frontend consultant, providing advice and help, coaching and training on JavaScript, testing, and software development. Let's get in touch!