FP @ FB - NFHN Reader

Bad actors on the internet

No, not that kind.

Bad actors on the internet

Fighting spam is an arms race.

Bad actors on the internet

Fighting spam is an arms race.

Bad actors on the internet

The attack landscape is constantly evolving:

Fake accounts, account cloning
Social engineering
Hacked accounts posting spam
Malicious browser extensions

…and many more

Why does spam appear to be “solved”?

Massive automation
Fast engineering triage of breaking attacks
Continuous deployment of code to respond

Fighting spam at Facebook

Sigma is a rule execution engine

Every interaction on FB has associated rules

Posting a status update
Liking a post
Sending a message

Sigma evaluates each interaction to identify and block malicious acts

An example policy

“A user who is less than a week old posts a photo tagging >= 5 non-friends”

History

“A user who is less than a week old posts a photo tagging >= 5 non-friends”

The old Sigma rule language, FXL:

If (AgeInHours(Account) < 168 &&
    Length(Difference(Tagged(Post),
                      Friends(Account))) >= 5)
Then [BlockAction, LogRequest]
Else []

History: FXL

Pluses:

Pure functions
Batched data fetches

Minuses became an increasing drag:

Home-grown language
Simple, non-extensible type system
Interpreted runtime
Too much C++ written

Purely functional and strongly typed

Policies can’t inadvertently interact with each other
Rule code can’t crash Sigma
Policies become easy to test in isolation

Automatically batch and overlap data fetches

Policies fetch data from many other systems

Must use concurrency wherever possible

Concurrency must be implicit

Engineers writing policies can concentrate on fighting spam
Avoids clutter from efficiency-related details
Code is easier to understand and modify

Fast turnaround

Push code to production in minutes

Allows new or updated policies to go live quickly

Support for interactive development

Spam fighters need to experiment, test code, and get feedback quickly

Our desired endpoint

x and y are Facebook users
We want to compute the number of friends that x and y have in common

Our ideal for expressiveness:

length (intersect (friendsOf x) (friendsOf y))

Some requirements

We have many varied data sources

Synchronous vs async
Batch vs one-shot requests
Pooled vs per-request connections

Must minimise network roundtrips

Overlap accesses to different services
One “multi-get” access when a single service is used
Cache repeated requests for same/similar data

Need to abstract all this away

A terrible hack

{-# LANGUAGE NoImplicitPrelude #-}
module HackyPrelude where

length :: IO [a] -> IO Int
intersect :: Eq a => IO [a] -> IO [a] -> IO [a]
friendsOf :: UserID -> IO [UserID]

This approach “works”, but it’s awful.

What’s wrong?

Everything gets lifted into IO:

length :: IO [a] -> IO Int

No more pure, safe code :-(

No concurrency or batching when we execute two friendsOf actions:

length (intersect (friendsOf x) (friendsOf y))

Explicit concurrency

What if we use MVar to express this?

length (intersect (friendsOf x) (friendsOf y))

Can’t see what’s happening for all the MVar machinery!

do
  m1 <- newEmptyMVar
  m2 <- newEmptyMVar
  forkIO (friendsOf x >>= putMVar m1)
  forkIO (friendsOf y >>= putMVar m2)
  fx <- takeMVar m1
  fy <- takeMVar m2
  return (length (intersect fx fy))

The central problem

We need something where:

We can reorder an expression to optimize data fetching
No side effects (other than fetching)
No clutter from concurrency machinery

What do we want?

Inspect a rule to find all data fetches, without executing them
Re-organize fetches to use batching, concurrency, and caching
Execute fetches, wait for results
Resume execution once results come in,
inspecting and re-organizing the next round of fetches

Static analysis

We want to be able to inspect the structure of code without executing it

Monads and static analysis

Let’s think briefly of a monad m as the “structure” of a computation.

Look at the flipped version of bind:

(=<<) :: Monad m => (a -> m b) -> m a -> m b

It transforms an m a into an m b.

To do so, it takes its instructions from a -> m b.

That function can use a to decide what m b it returns.

Thus we can’t analyse this statically (at least not easily):

Won’t know what structure (m b) will be returned until the code is run.

Functors, applicatives, and static analysis

Compare this:

(=<<) :: Monad m       => (a -> m b) -> m a -> m b

With these rather similar type signatures:

(<$>) :: Functor f     =>   (a -> b) -> f a -> f b
(<*>) :: Applicative f => f (a -> b) -> f a -> f b

What’s the crucial difference?

Neither Functor nor Applicative can affect the “structure” f that gets returned.

They can only change b inside f.

Therefore they’re friendly to static analysis!

Enter Haxl

newtype Haxl a = Haxl { unHaxl :: IO (Result a) }

data Result a = Done a
              | Blocked (Haxl a)

instance Monad Haxl where
  return a = Haxl (return (Done a))

  m >>= k = Haxl $ do
    a <- unHaxl m
    case a of
      Done a    -> unHaxl (k a)
      Blocked r -> return (Blocked (r >>= k))

What’s going on?

data Result a = Done a
              | Blocked (Haxl a)

A computation has either

Completed successfully
Blocked on a pending data fetch

What is the `Haxl` monad?

Alternate names for this construction:

Resumption monad, concurrency monad
Modern name: free monad (ours has a few bells and whistles)

What?

We are separating the representation of the computation from the way it will be run
Haxl and Result give us an abstract syntax tree (AST) that we can manipulate before we execute the code

Problem?

countCommonFriends :: UserID -> UserID -> Haxl Int

countCommonFriends x y = do
  fx <- friendsOf x
  fy <- friendsOf y
  return (length (intersect fx fy))

We run out of AST to explore as soon as we hit the first friendsOf, thanks to use of >>= (via do desugaring).

Can’t see the second one, so no concurrent execution.

Applicatives to the rescue!

Can we rewrite our function to make more of the fetches statically visible?

Yes!

countCommonFriends x y =
  length <$> (intersect <$> friendsOf x <*> friendsOf y)

A little bit of sequencing

instance Applicative Haxl where
  pure = return
  Haxl f <*> Haxl a = Haxl $ do
    r <- f
    case r of
    Done f' -> do
      ra <- a
      case ra of
        Done a'    -> pure (Done    (f' a'))
        Blocked a' -> pure (Blocked (f' <$> a'))
    Blocked f' -> do
      ra <- a
      case ra of
        Done a'    -> pure (Blocked (f' <*> pure a'))
        Blocked a' -> pure (Blocked (f' <*> a'))

Simplifying coding

Two complementary approaches:

Alternate Haxl-friendly prelude
Fancy language support (ApplicativeDo)

Fancy language support

{-# LANGUAGE ApplicativeDo #-}

Facebook-developed language extension

When ApplicativeDo is turned on:

GHC will use a different method for desugaring do-notation
Attempts to use the Applicative operator <*> as far as possible, along with fmap and join
Makes it possible to use do-notation for types that are Applicative but not Monad

Example:

do
  x <- a
  y <- b
  return (f x y)

Translates to:

f <$> a <*> b

Why does caching matter?

Consider our example fragment of code:

length (intersect (friendsOf x) (friendsOf y))

What if this was executed as two fetches?

And what if x == y?
If the two queries returned different results, we’d get bogus behaviour
For this application, caching matters for performance and correctness

Data sources

Several “core data” systems at Facebook

Memcache
TAO
MySQL

Other data sources also involved in Sigma

Machine learning systems

Architecture

Haxl core doesn’t know about data sources
Data sources are hot-pluggable (!)

A data source interacts with Haxl core in 3 ways

Persistent state (e.g. threads, connection pools)
Issuing data fetch requests
Fetching the data

Implementation

-- Core abstraction
class DataSource req where
  {- ... -}

-- An example data source
data ExampleReq a where
  CountAardvarks :: String -> ExampleReq Int
  ListWombats :: Id -> ExampleReq [Id]
  deriving Typeable

-- Fetch data generically
dataFetch :: DataSource req => req a -> Haxl a

Refinement: blocked computations

dataFetch :: DataSource req => req a -> Haxl a

Haxl core has to manage requests submitted with dataFetch

Once an entire round of fetching has stopped making progress, we retrieve the pending requests

class DataSource req where
  fetch :: [BlockedFetch req] -> IO ()

data BlockedFetch req =
  forall a . BlockedFetch (req a) (MVar a)

Refinement: acquiring pending requests

Remember: Haxl core is agnostic to data sources

We use dynamic typing (Typeable) to manage them

-- hack to support parameterised types
class Eq1 req where
  eq1 :: req a -> req a -> Bool

class (Typeable1 req, Hashable1 req, Eq1 req) =>
      DataSource req where
  data DataState req

  fetch :: DataState req
      -> [BlockedFetch req]
      -> IO ()

DataState is an associated type

Is that it?

A lot of other demanding and intricate work went into making Haxl run effectively at scale.

As of June 2015, Haxl-powered Sigma was handling over a million RPS.

For many more juicy details, see the article on code.facebook.com.

Code is open sourced on Hackage as haxl.

Haxl inspired Twitter’s Stitch project.

Dynamic types at scale

Facebook is famous for having been built in PHP.

As our code base grew to millions of lines in size:

PHP’s dynamic typing became increasingly troublesome
It became harder to understand and change code
“What type is this variable? I have no idea!”

We observed the same growing pains in Javascript.

Gradual typing

In between static and dynamic type systems lies the interesting middle ground of gradual types.

Some variables are given explicit types, which can be checked statically
Others have types checked at runtime

Languages that start all-dynamic can acquire gradual typing

Racket, Clojure

…as can languages that start all-static!

Gradual typing of PHP

At Facebook, we developed Hack: hacklang.org

Addresses the productivity and reliability problems of PHP
(lots of PHP!)
A gradually typed language
Add type annotations to existing code incrementally
IDE support: type-sensitive autocomplete, live typechecking and errors

Hackification

Well over 90% of Facebook PHP code is now statically typed via Hack.

Much of our code was always statically typeable!

Some code is still too tricky to statically type.

Tends to be isolated.

Type system ergonomics

PHP programmers rightly value rapid feedback.

Fast iteration: “Save my file, hit reload in the browser, see what went wrong, fix it.”

With Hack, this short cycle time had to be preserved.

Implications are huge:

Type checking algorithms must be fast and mostly incremental
A persistent daemon maintains type system state
Filesystem change notifications lead to “instant” type checking
Typical typechecking time is 200ms (outliers up to 1000ms) for millions of lines of code

Implementing Hack

The Hack type checker is built in OCaml.

OCaml had, until recently, zero support for multiple CPUs.

But multiple CPUs can speed up type checking, and we care greatly about speed …

Workaround: multi-process architecture using shared memory and lock-free concurrent data structures

Gradually typed Javascript: Flow

Following the success of Hack at Facebook, we started work on the same problem space of types in Javascript.

The result is the Flow type checker: flowtype.org

Similar ergonomics:

Heavily based on automatic inference
Fast incremental checks

Philosophical differences

TypeScript

Pessimistically assumes most JS code is statically untypeable
Only typechecks code that is explicitly annotated
Designed mainly for IDE tooling; unsound whenever convenient

Flow

Optimistically assumes that most JS is statically typable
Uses type inference to fill in missing annotations
Designed to find errors; takes soundness far more seriously

Flow vs TypeScript: pragmatics

Somewhat similar ideas, different tradeoffs

Pros for TS:

Really slick Visual Studio integration
Support for typing of third-party js (but all those .ts files are a pain)
One-stop type checking and transformation

Pros for Flow:

Path-sensitive type narrowing
Better type checking of idiomatic js
Really fast
Gets non-nullability right (new TS support is unsound)

Summary

React

Highly influential FP-based framework:

Focus:

Building large, understandable UI-heavy applications

FP heritage:

Immutable data, mostly pure functions
DOM updates abstracted and made efficient
Declarative “here’s how my UI should look at an instant in time” feel
Focus on composable components
Flow for type safety

Mobile development

iOS and Android development are complex and expensive

Huge quirky APIs take ages to master
Investment in one platform is not transferable to the other
Client-side “business logic” locked up in ObjC or Java (or worse, C++)

React Native

Similar playbook to React:

Declarative, self-contained UI components
Rapid feedback during development

But:

Integrates with platform native components
Uses native gesture recognition
Allows proper “look and feel”, not typical “write once, run anywhere” clunkiness

Fetching network data for modern apps

REST isn’t a great model in practice.

Consider the GitHub API.

Here’s a response to a “fetch me all comments on an issue” request.

[
  {
    "id": 1,
    "url": "https://api.github.com/repos/octocat/Hello-World/issues/comments/1",
    "html_url": "https://github.com/octocat/Hello-World/issues/1347#issuecomment-1",
    "body": "Me too",
    "user": {
      "login": "octocat",
      "id": 1,
      "avatar_url": "https://github.com/images/error/octocat_happy.gif",
      "gravatar_id": "",
      "url": "https://api.github.com/users/octocat",
      "html_url": "https://github.com/octocat",
      "followers_url": "https://api.github.com/users/octocat/followers",
      "following_url": "https://api.github.com/users/octocat/following{/other_user}",
      "gists_url": "https://api.github.com/users/octocat/gists{/gist_id}",
      "starred_url": "https://api.github.com/users/octocat/starred{/owner}{/repo}",
      "subscriptions_url": "https://api.github.com/users/octocat/subscriptions",
      "organizations_url": "https://api.github.com/users/octocat/orgs",
      "repos_url": "https://api.github.com/users/octocat/repos",
      "events_url": "https://api.github.com/users/octocat/events{/privacy}",
      "received_events_url": "https://api.github.com/users/octocat/received_events",
      "type": "User",
      "site_admin": false
    },
    "created_at": "2011-04-14T16:00:49Z",
    "updated_at": "2011-04-14T16:00:49Z"
  }
]

Problems with REST

Kitchen-sink responses with a bunch of information you probably don’t need, because maybe someone sometimes will need some of them
Multiple fetches: data included by reference in a response require costly additional roundtrips, concurrency management, and error handling
Custom endpoint proliferation (and duplication) as new requirements evolve
Version bloat: as client versions proliferate, legacy server endpoints must be maintained
Client-side cache management and UI updates are painful when data changes

GraphQL

A query language and protocol for graph-based data.

Puts the client in control of requesting only the data it needs in a single round-trip.

Unlike SQL, where there’s a storage engine with tables and indices behind the query engine, a GraphQL server is powered by arbitrary code (which you write).

Developer friendliness: self-documenting schema, strong type system, interactive exploration.

GraphQL and Haxl

Chad Austin wrote a great overview of a GraphQL server that he built in Haskell using Haxl: chadaustin.me

In response to a single GraphQL request, this uses Haxl to efficiently query multiple endpoints in a minimal number of roundtrips.

Convergent evolution

Sharing some design features with Haxl is the Relay project: facebook.github.io/relay

Declarative “here’s the data this view needs” specifications
Abstracts away details of how and when to fetch data
When modifications are made to data, hides details of updates and cache consistency

Recap

Haxl demonstrates how to roll your own unobtrusive concurrency

There’s a ton of other FP-influenced work at Facebook:

Languages: Hack and Flow (built in OCaml)
User interfaces: React and React Native
Data: GraphQL and Relay

Key takeaway:

Ergonomics really matter!
All of these systems have had a ton of attention put into making them efficient and easy to work with.

Bad actors on the internet

Bad actors on the internet

Bad actors on the internet

Bad actors on the internet

Why does spam appear to be “solved”?

Fighting spam at Facebook

An example policy

History

History: FXL

Purely functional and strongly typed

Automatically batch and overlap data fetches

Fast turnaround

Our desired endpoint

Some requirements

A terrible hack

What’s wrong?

Explicit concurrency

The central problem

What do we want?

Static analysis

Monads and static analysis

Functors, applicatives, and static analysis

Enter Haxl

What’s going on?

What is the Haxl monad?

Problem?

Applicatives to the rescue!

A little bit of sequencing

Simplifying coding

Fancy language support

Why does caching matter?

Data sources

Architecture

Implementation

Refinement: blocked computations

Refinement: acquiring pending requests

Is that it?

Dynamic types at scale

Gradual typing

Gradual typing of PHP

Hackification

Type system ergonomics

Implementing Hack

Gradually typed Javascript: Flow

Philosophical differences

Flow vs TypeScript: pragmatics

Summary

React

Mobile development

React Native

Fetching network data for modern apps

Problems with REST

GraphQL

GraphQL and Haxl

Convergent evolution

Recap

What is the `Haxl` monad?