Don't Wait; Generate! - MidwestPHP 2020

Ian Littman

Ian Littman / @iansltx

follow along at https://ian.im/gen20mw

We'll Answer

  • What's a Generator?
  • What's a Coroutine?
  • Why async/non-blocking I/O && event loops?
  • Why not?
  • What a Generator + Coroutine based app looks like
    (with real code, time permitting)

This is a generator

function gRange($min, $max, $step): \Generator
{
    for ($i = $min; $i < $max; $i += $step) {
        yield $i;
    }
}

foreach (gRange(12, 82, 5) as $n) {
    echo "$n\n"; // 12, 17, 22, etc.
}

What's going on here?

  • Resumable function
  • Uses yield rather than (or in addition to) return
  • Incremental, iterable results
  • Behaves a bit like an Iterator
  • Values and Exceptions can be sent/thrown in
    (more on this in a moment)
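To make the "behaves a bit like an Iterator" point concrete, here is a minimal sketch that drives the same gRange() generator by hand instead of with foreach. The generator is repeated so the snippet is self-contained; the bounds are picked only for illustration.

```php
<?php
// Same generator as in the slide above, repeated so this sketch stands alone.
function gRange($min, $max, $step): \Generator
{
    for ($i = $min; $i < $max; $i += $step) {
        yield $i;
    }
}

$gen = gRange(12, 30, 5);

// Generator implements Iterator, so the usual methods are available.
$seen = [];
while ($gen->valid()) {
    $seen[] = $gen->current(); // value handed to the most recent yield
    $gen->next();              // resume the function until the next yield
}

echo implode(', ', $seen), "\n"; // 12, 17, 22, 27
```

This is roughly what foreach does for you behind the scenes.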

<spooky>This is a generator</spooky>

function spooky(): \Generator
{
    while (true) {
        yield random_int(0, 1) ? 'O' : 'o';
    }
}

foreach (spooky() as $chr) {
    echo $chr; usleep(100000);
}

When can I use it?

  • PHP 5.5+, but 7.0+ gets you...
    • return
    • yield from
    • Throwables
  • Was on HHVM and predecessors before PHP
  • ES2015 (aka ES6) via function*
  • C# (iterators via yield return since 2.0; async/await since .NET 4.5)
  • Python 2.2+
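A short sketch of two of the PHP 7.0+ additions listed above, return and yield from. The function names inner() and outer() are made up for this example.

```php
<?php
function inner(): \Generator
{
    yield 1;
    yield 2;
    return 'inner done'; // PHP 7.0+: a generator may return a final value
}

function outer(): \Generator
{
    // yield from passes inner()'s values through to our caller,
    // then evaluates to inner()'s return value.
    $result = yield from inner();
    yield 3;
    return $result;
}

$gen = outer();
$values = [];
foreach ($gen as $v) {
    $values[] = $v; // collect values only; keys can repeat with yield from
}
$returned = $gen->getReturn(); // only valid once the generator has finished
```

Here $values ends up as 1, 2, 3 and $returned holds the string returned by inner().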

Before we look at our next example...

Calling a function that contains yield does not execute its body. You get back a Generator object, and the body runs only when you call methods on that Generator (or iterate over it).

Also, "return" means something different for a Generator than for a normal function: it ends the generator, and the caller reads the returned value via getReturn().
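A small sketch of that laziness, assuming a hypothetical lazy() generator that records when its body actually runs. The ArrayObject is only there to make the side effects observable.

```php
<?php
function lazy(\ArrayObject $log): \Generator
{
    $log[] = 'body started';
    yield 'first';
    $log[] = 'body finished';
}

$log = new \ArrayObject();

$gen = lazy($log);
$countAfterCall = count($log);    // nothing has run yet

$first = $gen->current();         // runs the body up to the first yield
$countAfterCurrent = count($log);

$gen->next();                     // resumes after the yield, runs to the end
$countAfterNext = count($log);
$finished = !$gen->valid();
```

The log stays empty until current() is called, which is the surprise this slide is warning about.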

Generator

function gen($arg1)
{
    $b = yield $arg1 + 1;
    $d = yield $b + 2;
    return $d + 2;
}

Parent

$g = gen(1);
$a = $g->current();    // $a = 2
$c = $g->send($a + 1); // generator: $b = 3; parent: $c = 5
$e = $g->send($c + 1); // generator: $d = 6; parent: $e = null
echo $g->getReturn();  // 8

We just did a coroutine.

  • Yield: pause execution until the caller resumes it (e.g. via send())
  • Yield from: delegate to another generator until it has nothing else to yield
  • Return: finish and hand back a final value (read via getReturn())
  • Cooperative multitasking!
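Exceptions can travel into a coroutine as well as values. The sketch below uses a hypothetical worker() coroutine to show that throw() surfaces the exception at the paused yield, where an ordinary try/catch can handle it.

```php
<?php
function worker(): \Generator
{
    $handled = [];
    while (count($handled) < 2) {
        try {
            $input = yield;              // pause until the caller resumes us
            $handled[] = "got $input";
        } catch (\RuntimeException $e) {
            // An exception passed to throw() is raised at the paused yield.
            $handled[] = 'error: ' . $e->getMessage();
        }
    }
    return $handled;
}

$g = worker();
$g->current();                            // run to the first yield
$g->send('hello');                        // resumes with $input = 'hello'
$g->throw(new \RuntimeException('boom')); // resumes by throwing instead
$result = $g->getReturn();
```

After these calls $result contains one entry for the sent value and one for the caught exception.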

Standard PHP-FPM Request Model

Client (or Load Balancer)

Web Server (nginx or Apache)

FastCGI Daemon (php-fpm) + Workers

HTTP

FastCGI

Pros

  • Common
  • Multicore
  • Shared-nothing (safe)
  • Fast for static resources
  • Library support
    • Request/Response wrappers
    • Database engines
    • ...basically anything else
  • Don't worry (much) about blocking the thread

Cons

  • No in-request parallelism*
  • Blocking I/O
  • Not memory-efficient
  • Startup penalty on every req
  • Not 12-factor
    • Process manager (runit)
    • nginx
    • php-fpm + workers

* Ignoring curl_multi and wrappers (e.g. Guzzle)

DEMO TIME: NGINX + FPM + SLIM APP

Blocking I/O is drive-thru
Async is delivery

Event Loops!

T_PARADIGM_SHIFT

  • Async I/O
  • For compute-heavy operations...
    • Don't do in-process if you can avoid it
    • Don't do all at once if it must be in-process
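One way to read "don't do all at once" is to slice compute-heavy work and yield between slices. The sketch below is a toy under that assumption: sumInChunks() and the round-robin driver are hypothetical stand-ins, not part of any event loop library.

```php
<?php
// Hypothetical CPU-bound task: sum 1..$n in slices, yielding between slices.
function sumInChunks(int $n, int $chunkSize): \Generator
{
    $total = 0;
    for ($start = 1; $start <= $n; $start += $chunkSize) {
        $end = min($n, $start + $chunkSize - 1);
        for ($i = $start; $i <= $end; $i++) {
            $total += $i;
        }
        yield; // hand control back so other tasks get a turn
    }
    return $total;
}

// Toy round-robin driver standing in for a real event loop.
$tasks = ['a' => sumInChunks(10, 3), 'b' => sumInChunks(100, 25)];
$order = [];
$results = [];
while ($tasks !== []) {
    foreach ($tasks as $name => $task) {
        if ($task->valid()) {
            $order[] = $name; // record which task got this turn
            $task->next();    // run one more slice
        }
        if (!$task->valid()) {
            $results[$name] = $task->getReturn();
            unset($tasks[$name]);
        }
    }
}
```

Neither task monopolizes the process: the driver alternates between them, and each still produces its full result.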

Callbacks/Promises

  • Very common (used in e.g. ReactPHP), but...
  • Hard to follow execution flow
    • Error callback convention (vs. Exceptions)
    • Messages only at function borders
    • Callback Pyramid of Doom

Why Generators?

  • Easier to follow
  • Cleaner error handling
  • Pass control inside functions
  • Can still do async!

With generator-based concurrency, you're trading flexibility in defining concurrency for clarity in defining your task.

Generators in an Event Loop

  1. Run until blocking I/O
  2. Yield promise representing blocking I/O
  3. Event loop skips coroutine until promise is resolved
     (inside a coroutine, use yield from to call another coroutine and wait for it to complete)
  4. Event loop send()s promise result to coroutine
  5. Repeat from 1 until coroutine is complete (return)
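The five steps above can be sketched with a toy scheduler. Everything here is hypothetical and heavily simplified: ToyPromise, ToyLoop and delay() are made up for illustration, "I/O" is simulated by counting loop ticks, and there is no error handling. Real libraries such as amphp do considerably more.

```php
<?php
// Hypothetical minimal promise, resolved later by the loop.
final class ToyPromise
{
    public $resolved = false;
    public $value;

    public function resolve($value): void
    {
        $this->resolved = true;
        $this->value = $value;
    }
}

final class ToyLoop
{
    private $timers = [];  // entries: [ticks remaining, promise, value]
    private $waiting = []; // entries: [coroutine, promise it yielded]

    // Simulated non-blocking I/O: resolves after $ticks loop iterations.
    public function delay(int $ticks, $value): ToyPromise
    {
        $promise = new ToyPromise();
        $this->timers[] = [$ticks, $promise, $value];
        return $promise;
    }

    public function spawn(\Generator $co): void
    {
        $promise = $co->current();                  // 1. run until "blocking" I/O
        if ($co->valid()) {
            $this->waiting[] = [$co, $promise];     // 2. coroutine yielded a promise
        }
    }

    public function run(): void
    {
        while ($this->waiting !== []) {
            foreach ($this->timers as $i => $timer) {
                if (--$this->timers[$i][0] <= 0) {  // simulated I/O completes
                    $timer[1]->resolve($timer[2]);
                    unset($this->timers[$i]);
                }
            }
            $current = $this->waiting;
            $this->waiting = [];
            foreach ($current as [$co, $promise]) {
                if (!$promise->resolved) {
                    $this->waiting[] = [$co, $promise]; // 3. skip until resolved
                    continue;
                }
                $next = $co->send($promise->value);     // 4. send result in
                if ($co->valid()) {
                    $this->waiting[] = [$co, $next];    // 5. repeat until return
                }
            }
        }
    }
}

$loop = new ToyLoop();
$log = [];

// Reads like sequential code, but each yield lets the other task run.
$task = function (string $name, int $ticks) use ($loop, &$log): \Generator {
    $log[] = "$name: start";
    $a = yield $loop->delay($ticks, "$name-1");
    $log[] = "$name: got $a";
    $b = yield $loop->delay($ticks, "$name-2");
    $log[] = "$name: got $b";
    return [$a, $b];
};

$slow = $task('slow', 3);
$fast = $task('fast', 1);
$loop->spawn($slow);
$loop->spawn($fast);
$loop->run();
```

In this run the fast task finishes both of its waits before the slow task gets its first result, even though the slow task was started first.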

Modern JS

  • async () => somePromise()
  • const foo = await somePromise()

PHP

  • fn(): \Generator => yield somePromise();
  • $foo = yield somePromise();

Event Loop Extensions

Amphp http-server Request Model

Client (or Load Balancer)

Application Server (AMPHP)

HTTP

var_dump(is_12_factor()); // bool(true)

Pros

  • No per-request bootstrap time
  • Fewer moving parts (12F app)
  • Async execution
  • Generator based (!pyramid)
  • Async database access
  • Lower memory use per request
  • Fast!

Cons

  • A bit fragile*
  • Requires port match
  • Single-threaded**
  • Plenty to refactor

 

* Throwables to the rescue!
** amphp/cluster to the rescue!
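On the "Throwables to the rescue" footnote: in a long-running server, one uncaught error would take down the process for every client. A minimal sketch of the idea follows, using hypothetical handle() and safeHandle() functions rather than amphp's actual request handler API.

```php
<?php
// Hypothetical request handler; '/boom' triggers an engine-level error.
function handle(string $path): string
{
    if ($path === '/boom') {
        return (string) intdiv(1, 0); // throws DivisionByZeroError (an \Error)
    }
    return "200 OK: $path";
}

function safeHandle(string $path): string
{
    try {
        return handle($path);
    } catch (\Throwable $e) {
        // \Throwable covers both \Exception and \Error (PHP 7.0+),
        // so the process can answer with a 500 and keep serving.
        return '500 Internal Server Error: ' . get_class($e);
    }
}

$responses = array_map('safeHandle', ['/a', '/boom', '/b']);
```

The failing request gets an error response, and the requests around it are unaffected.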

Benchmarks!

  • System under test
    • Vultr 4GB high frequency server (2 vCPUs), Dallas
    • Ubuntu 19.10 + Docker, HTTP-only
    • Current docker-compose setups from GitHub
  • Load generation system
    • Vultr 1GB high frequency server, same data center
    • Siege 4.0.4 on Ubuntu 19.10
    • Benchmarked a few times to allow for warm-up
    • Hitting raffle entrants URL w/ cookie, one "blank" raffle
    • siege -c 15 -t 30S -b <url> <cookie header> 

Benchmarks! Nginx + PHP-FPM (7.4)

Transactions:                3616 hits
Availability:              100.00 %
Elapsed time:               29.19 secs
Data transferred:            0.14 MB
Response time:               0.12 secs
Transaction rate:          123.88 trans/sec
Throughput:                  0.00 MB/sec
Concurrency:                14.96
Successful transactions:     3616
Failed transactions:            0
Longest transaction:         0.25
Shortest transaction:        0.02

 

~28MB RAM, ~190% CPU

Benchmarks! AMPHP (7.4, 2 workers)

Transactions:                8767 hits
Availability:              100.00 %
Elapsed time:               29.24 secs
Data transferred:            0.33 MB
Response time:               0.05 secs
Transaction rate:          299.83 trans/sec
Throughput:                  0.01 MB/sec
Concurrency:                14.99
Successful transactions:     8767
Failed transactions:            0
Longest transaction:         0.16
Shortest transaction:        0.00

           

~30MB RAM, ~80% CPU
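For a rough comparison of the two runs, the arithmetic can be spelled out as below. The hit counts and elapsed times come from the Siege output above; the CPU figures are the approximate values from the slides, so the per-CPU ratio is only a ballpark.

```php
<?php
// Figures copied from the two benchmark slides; cpu is the approximate
// utilization shown there (190% and 80%), expressed in cores.
$fpm = ['hits' => 3616, 'secs' => 29.19, 'cpu' => 1.9];
$amp = ['hits' => 8767, 'secs' => 29.24, 'cpu' => 0.8];

$rate = function (array $run): float {
    return $run['hits'] / $run['secs']; // transactions per second
};

$fpmRate = $rate($fpm);
$ampRate = $rate($amp);

// Raw throughput ratio: roughly 2.4x in favor of AMPHP.
$speedup = $ampRate / $fpmRate;

// Throughput per unit of CPU used: roughly 5.7x, given the approximate CPU numbers.
$perCpu = ($ampRate / $amp['cpu']) / ($fpmRate / $fpm['cpu']);

printf("%.2f vs %.2f trans/sec, %.1fx raw, %.1fx per CPU\n",
    $fpmRate, $ampRate, $speedup, $perCpu);
```

The caveats on the next slide still apply to both ratios.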

Benchmark Caveats

  • In favor of amphp
    • Relatively low concurrency
    • Didn't turn opcache revalidation completely off on FPM
    • Failing requests didn't automatically disqualify
  • In favor of nginx + fpm
    • Very little I/O (a couple very quick DB calls)
    • Tiny I/O latency (DB was on-server)

Thanks! Questions?