Seven ways to make sure your waiting room works first time

When you install CrowdHandler and set up a waiting room, we want you to get it right first time.

We’re pleased to say we’ve seen plenty of people install CrowdHandler straight into prod, in the middle of a load emergency, to see it save the day.

But, with the best will in the world... mistakes happen. So, here’s a heads-up. Seven traps we’ve seen customers fall into – with advice on avoiding them to ensure a smoother setup.

Think you've prepared...?

Say you're going live with a new product, service or sale.

You know your site is likely to have demand or load issues. Perhaps you are anticipating high demand because you've got an extremely limited product, or a high load problem because you know an unprecedented number of visitors will arrive at your site at the same time.

So you know you're going to need a waiting room. In fact, you're building the waiting room in from the get-go. It's going to be part of the user experience. You know exactly how it's going to work and how the queue will be incorporated into the solution you're building. It'll be fully tested. There's no way it can go wrong.

...Is there?

To make sure, let's go through the most common CrowdHandler custom integration issues and how to avoid them.

1. Limited functional testing

We often see clients testing their carefully crafted custom integration with a small group of internal stakeholders: five or ten people who will make some standard user journeys, and check that users are being queued and redirected to the site correctly.

However, this type of testing doesn't give you a realistic idea of how the site operates with an active queue. After all, once those few users have been through the waiting room process and allowed onto the site, there is no queue. But a lot of integration issues only tend to reveal themselves when the queue activates on a busy site.

When you test, you really need to check that the users are retaining their promoted status (usually indicated by a CrowdHandler cookie) on the target site. Rather than just have your test users filter out of the waiting room and onto the site, you should test whether the CrowdHandler session is sticking.

So, as a minimum, you should:

Set the rate to 0, which will cause the queue to kick in
Raise the rate temporarily to allow some users to get through
Set the rate back to 0 again (again creating an active queue)
Then check that the users who have been promoted onto the site can still complete functional user journeys.

Setting the rate back to 0 after allowing some users through is vital because, at this point, if a user’s CrowdHandler session is broken, they will be sent back to the queue. However, if you keep the rate high, those users won't be sent to the queue, and you won't know that your integration has a problem.
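The sequence above can be modelled with a toy, in-memory waiting room. This is an illustrative sketch only: `ToyWaitingRoom`, `journey_survives`, and the token names are hypothetical and are not the CrowdHandler API. It assumes a visitor with a promoted session always reaches the site, and that a visitor without one is admitted only while the rate allows. It shows why the final rate-0 step matters: a lost session is only noticeable when the queue is active.

```python
class ToyWaitingRoom:
    """Hypothetical in-memory model of a waiting room (not the CrowdHandler API)."""

    def __init__(self, rate: int) -> None:
        self.rate = rate                  # visitors admitted per minute
        self.admitted_this_minute = 0
        self.promoted: set = set()        # tokens that hold a promoted session

    def start_new_minute(self) -> None:
        self.admitted_this_minute = 0

    def check(self, token: str) -> str:
        """Return "site" if the visitor may proceed, otherwise "queue"."""
        if token in self.promoted:
            return "site"
        if self.admitted_this_minute < self.rate:
            self.admitted_this_minute += 1
            self.promoted.add(token)
            return "site"
        return "queue"


def journey_survives(final_rate: int, cookie_sticks: bool) -> bool:
    """Walk one visitor through the test steps and report whether they stay on the site."""
    room = ToyWaitingRoom(rate=0)
    assert room.check("visitor-1") == "queue"   # step 1: rate 0, the queue is active
    room.rate = 5                                # step 2: raise the rate temporarily
    assert room.check("visitor-1") == "site"    # the visitor is promoted
    room.rate = final_rate                       # step 3: set the final rate
    room.start_new_minute()
    # Step 4: the visitor continues their journey. A broken integration loses
    # the cookie, so the visitor looks like a brand-new arrival.
    token = "visitor-1" if cookie_sticks else "visitor-1-without-cookie"
    return room.check(token) == "site"


print(journey_survives(final_rate=0, cookie_sticks=True))     # True
print(journey_survives(final_rate=0, cookie_sticks=False))    # False: the problem is exposed
print(journey_survives(final_rate=100, cookie_sticks=False))  # True: the problem is hidden
```

In the last case the integration is just as broken as in the second, but the high rate lets the visitor straight back in, so the test passes for the wrong reason.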

2. Prod does not reflect Dev

Sure, everyone would love to say that their staging and development environments are a true reflection of live... but, in our experience, it's rarely the case.

One of the most common issues we’ve seen is that a client will spend days, or weeks, testing on their development or staging environment, then they'll apply the same configuration to the production site an hour before they go live and expect it to work identically. By the time they realise it won’t, it’s too late.

This happens because the things that tend to be different between live and dev are exactly the types of things that would catch out a waiting room. A lot of the protections people have to have in place on their production site – for example, having the site behind a CDN, running a firewall, or bot protection – aren't usually needed on staging, so some of your network routing or configuration might be different.

So do make sure you look out for differences in:

Network routing (particularly relevant to CDN and DNS implementations)
Bot protection or firewall infrastructure or configurations that do not match live

To make sure you don't hit any big issues at a late stage, we recommend going live on production at least 24 hours before you need to. Once you have completed functional testing in a dev or stage environment, scope it to an area of prod that isn't going to be a problem for you – a specific protected URL pattern that’s unlikely to inconvenience real users – and do some small scale testing in the live environment. You could even do a little production test in the middle of the night.

Just... don't leave it until the last minute.

3. IP blocking

Custom integrations send the user's IP to the CrowdHandler API, so that we can check it's not on a banned list. However, it's not uncommon for sites to be set up in a way that accidentally sends us some proxy server's IP instead of the user's IP… and this erroneously triggers CrowdHandler’s IP blocking.

This only tends to be an issue if you're hand-rolling your integration (if you're using one of our integrations, we've taken care of all that) but it is quite easy to get wrong, so take care.
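As a rough illustration of how a hand-rolled integration might pick the right address, here is a minimal sketch. It assumes your proxies append to the common `X-Forwarded-For` header and that you know which networks your own proxies connect from; `TRUSTED_PROXIES`, `is_trusted`, and `client_ip` are hypothetical names, and the network range is a placeholder. Malformed header values are not handled here.

```python
import ipaddress
from typing import Optional

# Assumption: the networks your own proxies, load balancers, or CDN connect from.
TRUSTED_PROXIES = [ipaddress.ip_network("10.0.0.0/8")]


def is_trusted(ip: str) -> bool:
    """True if the address belongs to one of our own proxy networks."""
    address = ipaddress.ip_address(ip)
    return any(address in network for network in TRUSTED_PROXIES)


def client_ip(remote_addr: str, x_forwarded_for: Optional[str]) -> str:
    """Pick the address to report as the visitor's IP.

    The header is only believed when the direct connection comes from a
    trusted proxy. The chain is then read from the nearest hop outwards,
    and the first address that is not one of our proxies is returned.
    """
    if not x_forwarded_for or not is_trusted(remote_addr):
        return remote_addr
    hops = [hop.strip() for hop in x_forwarded_for.split(",") if hop.strip()]
    for hop in reversed(hops):
        if not is_trusted(hop):
            return hop
    # Every hop is one of ours, so fall back to the originating address.
    return hops[0] if hops else remote_addr


# The request arrived via our proxy at 10.0.0.5; the visitor is 203.0.113.7.
print(client_ip("10.0.0.5", "203.0.113.7, 10.0.0.9"))  # 203.0.113.7
```

Reading the chain from the right means an address that a visitor forges at the left of the header is ignored. Sending `remote_addr` without this check is the mistake described above: every visitor would appear to come from the proxy at 10.0.0.5.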

4. Setting the rate too high

Most people think their site can handle a lot more traffic than it can. Which is why, in our experience, customers often set the ingress rate way too high.

If this is your first go-live, you don’t know how much traffic to expect and – honestly – you don’t yet know how much traffic you can handle. Don't be fooled by your site's ability to handle thousands of requests per second; this will not translate to the number of actual human beings you can funnel in from your queue.

Sites that can easily handle more than 60 users per minute (i.e. one new session per second) are pretty rare, so you should probably be starting at 30 or less, unless you really know the load profile of your site.

In short: it’s better to start low. Then, if you're pleasantly surprised, it's easy to raise the rate. Recovering a dead application can be trickier.
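One rough way to sanity-check a starting rate is Little's law: the number of people on the site at once is approximately the arrival rate multiplied by the average time each person stays. The sketch below applies that arithmetic. The function names and the figures (a 10-minute average visit, a ceiling of 450 concurrent users) are illustrative assumptions, not measurements or CrowdHandler recommendations.

```python
def concurrent_sessions(rate_per_minute: float, avg_session_minutes: float) -> float:
    """Little's law: steady-state sessions = arrival rate x average session length."""
    return rate_per_minute * avg_session_minutes


def max_rate_per_minute(max_concurrent: float, avg_session_minutes: float) -> float:
    """The same relationship rearranged: the rate that keeps you under a ceiling."""
    return max_concurrent / avg_session_minutes


# Assumed figures for illustration only.
print(concurrent_sessions(rate_per_minute=60, avg_session_minutes=10))  # 600
print(concurrent_sessions(rate_per_minute=30, avg_session_minutes=10))  # 300
print(max_rate_per_minute(max_concurrent=450, avg_session_minutes=15))  # 30.0
```

Under these assumptions, a rate of 60 per minute settles at around 600 people browsing at once, which is a very different question from how many requests per second the servers can answer.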

5. URL Exclusions

If you're using the DNS integration, the CDN integration, or your own server-side integration with a front controller, it will check every single URL to decide whether the user should be sent to the queue.

However, certain URLs are likely to be called by third party applications which should not be sent to a queue – so, these URLs need to be excluded.

An obvious example of this is a payment gateway. A user completes a transaction with, say, PayPal or Stripe, and the payment gateway sends back an order confirmation receipt, hitting a URL on your server. If you haven't excluded that URL, the CrowdHandler integration (which is checking every single URL, remember) then steps in and says the payment gateway needs to be in a queue. The order will never complete, and the user won't be able to check out.

This is another reason to do full functional testing, well in advance. Set your rate to 0, to ensure the queue is active, and test complete user journeys, which include those third-party apps that may rely on your protected domain.
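For a hand-rolled server-side integration, an exclusion check in the front controller might look something like the sketch below. The paths in `EXCLUDED_PATTERNS` and the function name `should_check_queue` are hypothetical; substitute the URLs your payment gateway and other third parties actually call. Hosted integrations will have their own way of configuring exclusions, which this does not represent.

```python
import re
from urllib.parse import urlsplit

# Hypothetical callback paths; replace with the URLs your third parties call.
EXCLUDED_PATTERNS = [
    re.compile(r"^/webhooks/"),
    re.compile(r"^/payment/callback$"),
]


def should_check_queue(url: str) -> bool:
    """Return False for URLs that must bypass the waiting room check."""
    path = urlsplit(url).path  # ignore any query string when matching
    return not any(pattern.search(path) for pattern in EXCLUDED_PATTERNS)


print(should_check_queue("/products/42"))                 # True
print(should_check_queue("/payment/callback?order=123"))  # False
print(should_check_queue("/webhooks/stripe"))             # False
```

Keeping the patterns anchored and specific matters: an overly broad exclusion is a way around the queue for anyone who finds it.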

6. Over-simplifying the template

One of the main reasons people like to use CrowdHandler is custom templates, and we've seen people do some really creative, exciting things with the freedom that templates offer.

The default, or “starting”, template includes different messages and states for several scenarios, some of which are more obvious than others. These include the room not being open yet, the room being full to capacity, the user being blocked for suspicious behaviour, and the user not being allocated a queue position yet.

However, when people customise the template, it’s common for them only to consider the ‘happy path’, and ignore – or, worse, strip out completely – some of the less obvious error messages and states. Then, when a user hits one of these conditions, they are presented with a largely unhelpful UI.

It's important to understand that users need to see feedback, even if that feedback is telling them something you don't feel is positive, like their current position, the progress bar, or the estimated wait time. In our experience, hiding these elements is counterproductive. Without them, users will start to behave unpredictably, or react badly to the lack of information. You may lose sales, or even damage your reputation.

7. Setting extreme session timeouts

Every web application uses the concept of a session timeout, and CrowdHandler is no different.

CrowdHandler’s session timeout defaults to 15 minutes, which is a relatively standard session length for a web application. All this means is that, once a user has clicked on something, then, for the next 15 minutes, CrowdHandler will presume they are still there.

The user keeps their session alive by clicking on pages, but a 15-minute timeout means they could take 15 minutes away to fix a cup of coffee, and come back without finding themselves back in the queue. When they make another click, that revives the session, so they still have plenty of time to browse.

It’s worth remembering that the number of sessions is only a very simplified proxy for load on your app. When people are away making coffee, they're not putting any load on your application.

However, we find that some customers set the session length either way too low or way too high.

Either way, it’s counterproductive. Too low and you might end up sending people back to the queue in the middle of their visit. Too high, and you could find yourself managing a very long queue, because users’ sessions are staying active for a long time after they’ve left.
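The trade-off can be illustrated with a minimal sketch of a sliding timeout, where each click resets the clock. The function `session_is_active` and the example times are hypothetical, and the exact expiry rule CrowdHandler applies is not stated here, so treat this as a model of the idea rather than its implementation.

```python
from datetime import datetime, timedelta


def session_is_active(last_activity: datetime, now: datetime, timeout: timedelta) -> bool:
    """A session counts as active until `timeout` has passed since the last click."""
    return now - last_activity < timeout


now = datetime(2024, 1, 1, 12, 0)
coffee_break = now - timedelta(minutes=10)  # visitor stepped away 10 minutes ago
left_site = now - timedelta(hours=2)        # visitor closed the tab 2 hours ago

# Default-style timeout: the coffee drinker keeps their place.
print(session_is_active(coffee_break, now, timedelta(minutes=15)))  # True
# Too low: the same visitor is sent back to the queue mid-visit.
print(session_is_active(coffee_break, now, timedelta(minutes=5)))   # False
# Too high: someone who left hours ago still occupies a slot.
print(session_is_active(left_site, now, timedelta(hours=4)))        # True
print(session_is_active(left_site, now, timedelta(minutes=15)))     # False
```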

Ready to get it right first time?

Anticipating high load or high demand, and ready to install a waiting room?

Sign up for free to try us out.
