Load Testing Your App

php[detroit] 2018

Ian Littman / @iansltx

follow along at https://ian.im/loaddet18

In this presentation we will learn...

  • ...the difference between a smoke test, a load test, a stress test, and a spike test
  • ...when it makes sense to test
  • ...how to better match your load test with (anticipated) reality for more useful results
  • ...what bottlenecks to look for when stress testing
  • ...that a bunch of free, open-source utilities exist to load test your application
  • ...how to use a couple of them (time permitting; repo's up)

In this presentation we won't learn...

  • ...about every load test application out there
  • ...how to set up clustered load testing
  • ...how to simulate far-end users
    • Slow connections tie up server/load balancer resources for longer
    • Solutions for slow connections (e.g. compression) may affect system capacity elsewhere
  • ...how to do deep application profiling, e.g. Blackfire
  • ...about single-user load testing (e.g. running an import with a larger data set than usual)

#IFNDEF

Load Test

  • <= peak traffic
  • Your system shouldn't break
  • If it does, it's a stress test

Stress Test

  • Trying to break your system
  • Surfaces bottlenecks
  • Increase traffic above peak or decrease available resources
  • Capacity Test is a subset

Soak Test

  • Extended test duration
  • Watch behavior on ramp down as well as ramp up
  • Memory leaks
  • Disk space exhaustion (logs!)
  • Filled caches

Spike Test

  • Stress test with quick ramp-up
  • Woot.com at midnight
  • TV ad "go online"
  • System comes back online after downtime
  • Everyone hits your API via on-the-hour cron jobs
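
In practice, much of the difference between a load or soak profile and a spike is how quickly the virtual-user count climbs. A minimal sketch of a linear ramp schedule follows; the `target_users` helper and every number passed to it are illustrative assumptions, not settings from any particular tool.

```python
def target_users(t_s, peak_users, ramp_up_s, hold_s, ramp_down_s):
    """Target virtual-user count at t_s seconds into the test.

    Linear ramp up, hold at peak, then linear ramp down.
    """
    if t_s < 0:
        return 0
    if t_s < ramp_up_s:
        return round(peak_users * t_s / ramp_up_s)
    if t_s < ramp_up_s + hold_s:
        return peak_users
    end_s = ramp_up_s + hold_s + ramp_down_s
    if t_s < end_s:
        return round(peak_users * (end_s - t_s) / ramp_down_s)
    return 0


# Gentle profile: ten minutes to reach 500 users (made-up numbers).
gentle = target_users(300, peak_users=500, ramp_up_s=600, hold_s=1200, ramp_down_s=600)

# Spike profile: the same peak reached in ten seconds.
spike = target_users(5, peak_users=500, ramp_up_s=10, hold_s=60, ramp_down_s=10)
```

Halfway through either ramp the target is half the peak; the spike simply gets there 60 times faster, which is what stresses autoscaling and cold caches.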

Smoke Test

  • An initial test to confirm the system operates properly without a large amount of generated load
  • May be integration tests in your existing test suite
  • May be your load test script, turned down to one (thorough) iteration and one Virtual User
  • Do this before you load test

When?

  • When your application performance may change
    • Adding/removing features
    • Refactoring
    • Infrastructure changes
  • When your load profile may change
    • Initial app launch
    • Feature launch
    • Marketing pushes/promotions

What are your metrics?

  • Speed - response latency
  • Scalability - throughput, resource utilization
  • Stability - % failed calls/transactions/flows
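
As a rough sketch of how these three metrics could be derived from raw results, assume each request is recorded as a (latency in seconds, succeeded) pair. The `summarize` helper and its record format are hypothetical, not the output format of any specific load test tool.

```python
from statistics import mean


def summarize(results, duration_s):
    """results: list of (latency_s, ok) tuples collected over duration_s seconds."""
    if not results or duration_s <= 0:
        raise ValueError("need at least one result and a positive duration")
    latencies = [latency for latency, _ in results]
    failures = sum(1 for _, ok in results if not ok)
    return {
        "mean_latency_s": mean(latencies),                # speed
        "throughput_rps": len(results) / duration_s,      # scalability
        "failure_pct": 100.0 * failures / len(results),   # stability
    }
```

Resource utilization is not covered here; that usually comes from the servers under test rather than from the load generator.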

How should I test?

Accurately.

What should I test?

  • Flows, not just single endpoints
  • Frequently used
  • Performance intensive
  • Business critical

Concurrent Requests != Concurrent Users

  • Think Time
  • API client concurrency
  • Caching (client-side or otherwise)
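
To put numbers on the think-time point: in a simple closed model (an assumption, and one that ignores caching and API clients that issue parallel calls), each user repeats a cycle of one request followed by one think pause. Little's Law then gives the number of requests in flight. The function name and the example figures below are illustrative.

```python
def in_flight_requests(users, response_time_s, think_time_s):
    """Average concurrent requests generated by `users` concurrent users.

    Each user loops: send a request, wait for the response, think, repeat.
    """
    cycle_s = response_time_s + think_time_s
    if cycle_s <= 0:
        raise ValueError("response time plus think time must be positive")
    throughput_rps = users / cycle_s          # requests per second overall
    return throughput_rps * response_time_s   # Little's Law: L = lambda * W


# 1,000 concurrent users, 200 ms responses, 9.8 s of think time per step
concurrent = in_flight_requests(1000, 0.2, 9.8)
```

With these made-up numbers, a thousand concurrent users translate to only about twenty concurrent requests, so a test configured with 1,000 back-to-back request loops would overstate the load by a wide margin.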

Oversimplification...It's a trap!

  • No starting data in database
  • No parameterization
  • No abandonment at each step in the process
  • No input errors
  • No think times
  • Static think times
  • Uniformly distributed think times
  • Assuming you have one type of user
  • Assuming that a distribution is normal
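
A minimal sketch of avoiding a few of these traps: think times drawn from a right-skewed (log-normal) distribution instead of a fixed or uniform one, plus a chance of abandonment after each step. The flow, the continue probabilities, and the distribution parameters are all invented for illustration; real values should come from your own logs and analytics.

```python
import math
import random

# Hypothetical flow: (step name, probability the user continues to the next step)
FLOW = [("browse", 0.6), ("add_to_cart", 0.5), ("checkout", 0.8), ("pay", 1.0)]


def think_time_s(rng, median_s=5.0, sigma=0.8):
    """Sample a think time from a log-normal distribution with the given median."""
    return rng.lognormvariate(math.log(median_s), sigma)


def simulate_user(rng):
    """Return the (step, think_time_s) pairs one simulated user goes through."""
    visited = []
    for step, p_continue in FLOW:
        visited.append((step, think_time_s(rng)))
        if rng.random() >= p_continue:
            break  # the user abandons the flow after this step
    return visited


rng = random.Random(42)  # seeded so a test run is repeatable
journeys = [simulate_user(rng) for _ in range(1000)]
```

Different user types can be modeled the same way by giving each type its own flow and parameters, then mixing them in the proportions you expect.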

Vary Your Testing

  • Worst Case: heavier endpoints get refreshed more often
  • Anticipated Case
  • Best Case: validation failures + think time

Keep it real

  • Run your APM (e.g. New Relic, Tideways) on your load test env
    • Better profiling info
    • You'll have the same perf hit as production
  • Is your environment code-ified?
    • Easier to copy envs
    • Cheaper to set up an env for an hour to run a load test
  • Decide whether testing from near your env is accurate enough
  • Use logs/analytics to figure out how long your users actually spend on each step
  • Test autoscaling/load-shedding facilities

Aggregate your metrics responsibly

  • Average
  • Median (~50th percentile)
  • 90th, 95th, 99th percentile
  • Standard Deviation
  • Distribution of results
  • Explain your outliers
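
One way to compute these aggregates with Python's standard library (3.8+) is sketched below. The sample latencies are fabricated purely to show how a couple of outliers pull the average away from the median.

```python
from statistics import mean, median, quantiles, stdev


def aggregate(latencies_ms):
    """Summarize a list of latencies (needs at least two samples)."""
    cuts = quantiles(latencies_ms, n=100)  # 99 cut points; cuts[k - 1] is the k-th percentile
    return {
        "mean": mean(latencies_ms),
        "median": median(latencies_ms),
        "p90": cuts[89],
        "p95": cuts[94],
        "p99": cuts[98],
        "stdev": stdev(latencies_ms),
    }


# Illustrative data: 98 fast responses and two slow outliers
sample = [100] * 98 + [2000, 5000]
stats = aggregate(sample)
```

Here the median and the 95th percentile stay at the typical value while the mean and the 99th percentile are inflated by the two slow requests, which is why a single number rarely tells the whole story.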

Bottlenecks

  • Web Server
    • FPM workers/Apache processes
    • CPU + RAM utilization
    • Network utilization
    • Disk utilization
  • Load balancer
    • Network utilization/warmup
    • Connection count
  • External Services
    • Rate limits (natural or artificial)
    • Latency
    • Network egress
  • Queues
    • Per-job spin-up latency
    • Worker count
    • CPU + RAM utilization
      • Workers
      • Broker
    • Queue depth
  • Caches
    • Thundering herd
    • Churning due to cache evictions

Bottleneck Gotchas

  • Just because a request is heavy doesn't mean it's the biggest source of load
  • As a system reaches capacity you'll see nonlinear performance degradation
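
Both gotchas can be illustrated with some back-of-the-envelope arithmetic. The sketch below assumes a textbook single-server queue (M/M/1) for the degradation curve, which is a simplification rather than a model of any real stack, and the endpoint names and numbers are made up.

```python
def load_share(endpoints):
    """endpoints: {name: (requests_per_s, cost_s_per_request)}.

    Returns each endpoint's fraction of total busy time.
    """
    busy = {name: rps * cost for name, (rps, cost) in endpoints.items()}
    total = sum(busy.values())
    if total <= 0:
        raise ValueError("total load must be positive")
    return {name: b / total for name, b in busy.items()}


def mm1_response_time_s(service_time_s, utilization):
    """Mean response time of an M/M/1 queue at the given utilization (0 <= u < 1)."""
    if not 0 <= utilization < 1:
        raise ValueError("utilization must be in [0, 1)")
    return service_time_s / (1.0 - utilization)


# A heavy report hit once per second vs. a light listing hit 200 times per second
shares = load_share({"report": (1, 2.0), "list": (200, 0.05)})

# Response time for a 50 ms service time as utilization climbs
curve = {u: mm1_response_time_s(0.05, u) for u in (0.5, 0.8, 0.9, 0.95, 0.99)}
```

In this example the light endpoint accounts for most of the busy time because of its volume, and under the queueing assumption response time roughly doubles between 50% and 75% utilization, then climbs far faster as utilization approaches 100%.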

A Challengr Appears

This will be our system under test

Load Test Tools We'll Look At


* I've used this on a project significantly more real than Challengr, so that's a big reason we're looking at it today.

More Tools

  • Tsung
    • Erlang (efficient, high volume from a single box)
    • Flexible (not just HTTP)
    • XML based config
  • The Grinder
    • Java-based
    • Java, Jython or Clojure scripts

Even More Tools!

  • Artillery.io
    • Node-based
    • Simple stuff in Yaml, can switch to JS (including npm)
  • Molotov (by Mozilla)
    • Python 3.5+, uses async IO via coroutines
  • Locust
    • Python based
    • Can be run clustered
  • Wrk2
    • Built in C
    • Scriptable via Lua

Thanks! Questions?
