Bad actors on the internet
No, not that kind.
Fighting spam is an arms race.
The attack landscape is constantly evolving:
Fake accounts, account cloning
Social engineering
Hacked accounts posting spam
Malicious browser extensions
…and many more
Why does spam appear to be “solved”?
Massive automation
Fast engineering triage of breaking attacks
Continuous deployment of code to respond
Fighting spam at Facebook
Sigma is a rule execution engine
Every interaction on FB has associated rules
Posting a status update
Liking a post
Sending a message
Sigma evaluates each interaction to identify and block malicious acts
An example policy
“A user who is less than a week old posts a photo tagging >= 5 non-friends”
History
“A user who is less than a week old posts a photo tagging >= 5 non-friends”
The old Sigma rule language, FXL:
If (AgeInHours(Account) < 168 &&
Length(Difference(Tagged(Post),
Friends(Account))) >= 5)
Then [BlockAction, LogRequest]
Else []
History: FXL
Pluses:
Pure functions
Batched data fetches
Minuses became an increasing drag:
Home-grown language
Simple, non-extensible type system
Interpreted runtime
Too much C++ written
Purely functional and strongly typed
Policies can’t inadvertently interact with each other
Rule code can’t crash Sigma
Policies become easy to test in isolation
Automatically batch and overlap data fetches
Policies fetch data from many other systems
- Must use concurrency wherever possible
Concurrency must be implicit
- Engineers writing policies can concentrate on fighting spam
- Avoids clutter from efficiency-related details
- Code is easier to understand and modify
Fast turnaround
Push code to production in minutes
- Allows new or updated policies to go live quickly
Support for interactive development
- Spam fighters need to experiment, test code, and get feedback quickly
Our desired endpoint
x and y are Facebook users
We want to compute the number of friends that x and y have in common
Our ideal for expressiveness:
length (intersect (friendsOf x) (friendsOf y))
Some requirements
We have many varied data sources
- Synchronous vs async
- Batch vs one-shot requests
- Pooled vs per-request connections
Must minimise network roundtrips
Overlap accesses to different services
One “multi-get” access when a single service is used
Cache repeated requests for same/similar data
Need to abstract all this away
A terrible hack
{-# LANGUAGE NoImplicitPrelude #-}
module HackyPrelude where
length :: IO [a] -> IO Int
intersect :: Eq a => IO [a] -> IO [a] -> IO [a]
friendsOf :: UserID -> IO [UserID]
This approach “works”, but it’s awful.
What’s wrong?
Everything gets lifted into IO:
length :: IO [a] -> IO Int
No more pure, safe code :-(
No concurrency or batching when we execute two friendsOf actions:
length (intersect (friendsOf x) (friendsOf y))
Explicit concurrency
What if we use MVar to express this?
length (intersect (friendsOf x) (friendsOf y))
do
  m1 <- newEmptyMVar
  m2 <- newEmptyMVar
  forkIO (friendsOf x >>= putMVar m1)
  forkIO (friendsOf y >>= putMVar m2)
  fx <- takeMVar m1
  fy <- takeMVar m2
  return (length (intersect fx fy))
Can’t see what’s happening for all the MVar machinery!
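For comparison (not from the original slides), the async package tidies this up, but the concurrency is still explicit rather than something the runtime discovers for itself. This sketch assumes friendsOf :: UserID -> IO [UserID], as in the hacky prelude above:

import Control.Concurrent.Async (concurrently)
import Data.List (intersect)

countCommonFriendsIO :: UserID -> UserID -> IO Int
countCommonFriendsIO x y = do
  -- concurrently runs both fetches at once and waits for both results
  (fx, fy) <- concurrently (friendsOf x) (friendsOf y)
  return (length (intersect fx fy))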
The central problem
We need something where:
We can reorder an expression to optimize data fetching
No side effects (other than fetching)
No clutter from concurrency machinery
What do we want?
Inspect a rule to find all data fetches, without executing them
Re-organize fetches to use batching, concurrency, and caching
Execute fetches, wait for results
Resume execution once results come in, inspecting and re-organizing the next round of fetches
Static analysis
We want to be able to inspect the structure of code without executing it
Monads and static analysis
Let’s think briefly of a monad m as the “structure” of a computation.
Look at the flipped version of bind:
(=<<) :: Monad m => (a -> m b) -> m a -> m b
It transforms an m a into an m b.
To do so, it takes its instructions from a -> m b.
That function can use a to decide what m b it returns.
Thus we can’t analyse this statically (at least not easily):
- Won’t know what structure (m b) will be returned until the code is run.
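A toy sketch of the problem (UserID as in the surrounding slides; the fetch parameter is a hypothetical stand-in for a data access): the continuation chooses the next computation from a runtime value, so the overall structure cannot be discovered without running the code.

-- The shape of the computation depends on what the first fetch returns.
popularOrSelf :: Monad m => (UserID -> m [UserID]) -> UserID -> UserID -> m [UserID]
popularOrSelf fetch x y =
  fetch x >>= \fs ->
    if length fs > 100
      then fetch y    -- one possible structure (m b)...
      else return fs  -- ...or a completely different one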
Functors, applicatives, and static analysis
Compare this:
(=<<) :: Monad m => (a -> m b) -> m a -> m b
With these rather similar type signatures:
(<$>) :: Functor f => (a -> b) -> f a -> f b
(<*>) :: Applicative f => f (a -> b) -> f a -> f b
What’s the crucial difference?
Neither Functor nor Applicative can affect the “structure” f that gets returned.
They can only change b inside f.
Therefore they’re friendly to static analysis!
Enter Haxl
newtype Haxl a = Haxl { unHaxl :: IO (Result a) }
data Result a = Done a
| Blocked (Haxl a)
instance Monad Haxl where
  return a = Haxl (return (Done a))
  m >>= k = Haxl $ do
    r <- unHaxl m
    case r of
      Done a    -> unHaxl (k a)
      Blocked c -> return (Blocked (c >>= k))
What’s going on?
data Result a = Done a
| Blocked (Haxl a)
A computation has either
Completed successfully
Blocked on a pending data fetch
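A minimal sketch (not from the slides) of the driver loop this representation enables: run until the computation blocks, perform the pending fetches, then resume. Here performPendingFetches is a hypothetical placeholder for the batched fetching machinery described later.

runHaxl :: Haxl a -> IO a
runHaxl (Haxl io) = do
  r <- io
  case r of
    Done a    -> return a        -- computation finished
    Blocked c -> do
      performPendingFetches      -- hypothetical: issue this round's batched fetches
      runHaxl c                  -- resume the suspended computation

-- stub for illustration only; the real system batches per data source
performPendingFetches :: IO ()
performPendingFetches = return ()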
What is the Haxl monad?
Alternate names for this construction:
Resumption monad, concurrency monad
Modern name: free monad (ours has a few bells and whistles)
What?
We are separating the representation of the computation from the way it will be run
Haxl and Result give us an abstract syntax tree (AST) that we can manipulate before we execute the code
Problem?
countCommonFriends :: UserID -> UserID -> Haxl Int
countCommonFriends x y = do
  fx <- friendsOf x
  fy <- friendsOf y
  return (length (intersect fx fy))
We run out of AST to explore as soon as we hit the first friendsOf, thanks to use of >>= (via do desugaring).
Can’t see the second one, so no concurrent execution.
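To see why, desugar the do-notation by hand: the second friendsOf is buried inside a lambda, which cannot be inspected until fx has been fetched.

countCommonFriends x y =
  friendsOf x >>= \fx ->
    friendsOf y >>= \fy ->
      return (length (intersect fx fy))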
Applicatives to the rescue!
Can we rewrite our function to make more of the fetches statically visible?
Yes!
countCommonFriends x y =
length <$> (intersect <$> friendsOf x <*> friendsOf y)
A little bit of sequencing
instance Applicative Haxl where
  pure = return
  Haxl f <*> Haxl a = Haxl $ do
    r <- f
    case r of
      Done f' -> do
        ra <- a
        case ra of
          Done a'    -> pure (Done (f' a'))
          Blocked a' -> pure (Blocked (f' <$> a'))
      Blocked f' -> do
        ra <- a
        case ra of
          Done a'    -> pure (Blocked (f' <*> pure a'))
          Blocked a' -> pure (Blocked (f' <*> a'))
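The final case is the crucial one: when both sides block, their continuations are recombined with <*>, so the fetches hiding in both branches accumulate into the same round and can later be batched together.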
Simplifying coding
Two complementary approaches:
Alternate Haxl-friendly prelude
Fancy language support (ApplicativeDo)
Fancy language support
{-# LANGUAGE ApplicativeDo #-}
Facebook-developed language extension
When ApplicativeDo is turned on:
GHC will use a different method for desugaring do-notation
Attempts to use the Applicative operator <*> as far as possible, along with fmap and join
Makes it possible to use do-notation for types that are Applicative but not Monad
Example:
do
  x <- a
  y <- b
  return (f x y)
Translates to:
f <$> a <*> b
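With the extension enabled, the monadic countCommonFriends from earlier desugars to exactly this applicative form, since fx and fy are independent; spam fighters keep writing ordinary do-notation and get batching for free.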
Why does caching matter?
Consider our example fragment of code:
length (intersect (friendsOf x) (friendsOf y))
What if this was executed as two fetches?
And what if x == y?
If the two queries returned different results, we’d get bogus behaviour
For this application, caching matters for both performance and correctness
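A minimal sketch of the idea (names hypothetical; the real Haxl cache is more general): memoise each request for the lifetime of one computation, so every occurrence of friendsOf x sees the identical result.

import Data.IORef (IORef, readIORef, modifyIORef')
import qualified Data.Map.Strict as Map

type UserID = Int

-- Hypothetical stand-in for a real network fetch.
fetchFriendsRemotely :: UserID -> IO [UserID]
fetchFriendsRemotely uid = return [uid + 1, uid + 2]

type Cache = IORef (Map.Map UserID [UserID])

cachedFriendsOf :: Cache -> UserID -> IO [UserID]
cachedFriendsOf cache uid = do
  seen <- readIORef cache
  case Map.lookup uid seen of
    Just friends -> return friends               -- repeated request: served from cache
    Nothing -> do
      friends <- fetchFriendsRemotely uid        -- first request: really fetch
      modifyIORef' cache (Map.insert uid friends)
      return friends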
Data sources
Several “core data” systems at Facebook
Memcache
TAO
MySQL
Other data sources also involved in Sigma
- Machine learning systems
Architecture
Haxl core doesn’t know about data sources
Data sources are hot-pluggable (!)
A data source interacts with Haxl core in 3 ways
- Persistent state (e.g. threads, connection pools)
- Issuing data fetch requests
- Fetching the data
Implementation
-- Core abstraction
class DataSource req where
  {- ... -}

-- An example data source
data ExampleReq a where
  CountAardvarks :: String -> ExampleReq Int
  ListWombats    :: Id -> ExampleReq [Id]
  deriving Typeable

-- Fetch data generically
dataFetch :: DataSource req => req a -> Haxl a
Refinement: blocked computations
dataFetch :: DataSource req => req a -> Haxl a
Haxl core has to manage requests submitted with dataFetch
Once a round of computation stops making progress (every branch is blocked), we collect the pending requests
class DataSource req where
fetch :: [BlockedFetch req] -> IO ()
data BlockedFetch req =
forall a . BlockedFetch (req a) (MVar a)
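Here is a hedged sketch of how dataFetch could be assembled from these pieces, reusing the Haxl, Result, and BlockedFetch definitions above; the explicit per-round store parameter is an assumption (the real implementation threads this state through an environment).

import Control.Concurrent.MVar (newEmptyMVar, takeMVar)
import Data.IORef (IORef, modifyIORef')

dataFetchWith :: IORef [BlockedFetch req] -> req a -> Haxl a
dataFetchWith store req = Haxl $ do
  box <- newEmptyMVar
  -- record the request in the current round of pending fetches
  modifyIORef' store (BlockedFetch req box :)
  -- suspend; once the runner has called fetch on the round,
  -- the MVar is full and the computation resumes with the result
  return (Blocked (Haxl (fmap Done (takeMVar box))))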
Refinement: acquiring pending requests
Remember: Haxl core is agnostic to data sources
We use dynamic typing (Typeable) to manage them
-- hack to support parameterised types
class Eq1 req where
  eq1 :: req a -> req a -> Bool

class (Typeable1 req, Hashable1 req, Eq1 req) =>
      DataSource req where
  data DataState req
  fetch :: DataState req
        -> [BlockedFetch req]
        -> IO ()
DataState is an associated type
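As a hedged sketch, an instance for the ExampleReq source above might look like this (toy local answers; a real source would talk to a backend, keep connection state in DataState, and would also need the Eq1/Hashable1/Typeable1 superclass instances, omitted here).

instance DataSource ExampleReq where
  data DataState ExampleReq = ExampleState      -- would hold e.g. a connection pool
  fetch _state blockedFetches = mapM_ answer blockedFetches
    where
      answer (BlockedFetch req box) = case req of
        CountAardvarks s -> putMVar box (length (words s))  -- made-up answer
        ListWombats n    -> putMVar box [n]                 -- made-up answer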
Is that it?
A lot of other demanding and intricate work went into making Haxl run effectively at scale.
As of June 2015, Haxl-powered Sigma was handling over a million RPS.
For many more juicy details, see the article on code.facebook.com.
Code is open sourced on Hackage as haxl.
Haxl inspired Twitter’s Stitch project.
Dynamic types at scale
Facebook is famous for having been built in PHP.
As our code base grew to millions of lines in size:
PHP’s dynamic typing became increasingly troublesome
It became harder to understand and change code
“What type is this variable? I have no idea!”
We observed the same growing pains in JavaScript.
Gradual typing
In between static and dynamic type systems lies the interesting middle ground of gradual types.
Some variables are given explicit types, which can be checked statically
Others have types checked at runtime
Languages that start all-dynamic can acquire gradual typing
- Racket, Clojure
…as can languages that start all-static!
- C#
Gradual typing of PHP
At Facebook, we developed Hack: hacklang.org
Addresses the productivity and reliability problems of PHP (and we have lots of PHP!)
A gradually typed language
Add type annotations to existing code incrementally
IDE support: type-sensitive autocomplete, live typechecking and errors
Hackification
Well over 90% of Facebook PHP code is now statically typed via Hack.
- Much of our code was always statically typeable!
Some code is still too tricky to statically type.
- Tends to be isolated.
Type system ergonomics
PHP programmers rightly value rapid feedback.
- Fast iteration: “Save my file, hit reload in the browser, see what went wrong, fix it.”
With Hack, this short cycle time had to be preserved.
Implications are huge:
Type checking algorithms must be fast and mostly incremental
A persistent daemon maintains type system state
Filesystem change notifications lead to “instant” type checking
Typical typechecking time is 200ms (outliers up to 1000ms) for millions of lines of code
Implementing Hack
The Hack type checker is built in OCaml.
OCaml had, until recently, zero support for multiple CPUs.
But multiple CPUs can speed up type checking, and we care greatly about speed …
- Workaround: multi-process architecture using shared memory and lock-free concurrent data structures
Gradually typed JavaScript: Flow
Following the success of Hack at Facebook, we started work on the same problem space of types in JavaScript.
The result is the Flow type checker: flowtype.org
Similar ergonomics:
Heavily based on automatic inference
Fast incremental checks
Philosophical differences
TypeScript
Pessimistically assumes most JS code is statically untypeable
Only typechecks code that is explicitly annotated
Designed mainly for IDE tooling; unsound whenever convenient
Flow
Optimistically assumes that most JS is statically typeable
Uses type inference to fill in missing annotations
Designed to find errors; takes soundness far more seriously
Flow vs TypeScript: pragmatics
Somewhat similar ideas, different tradeoffs
Pros for TS:
Really slick Visual Studio integration
Support for typing of third-party js (but all those .ts files are a pain)
One-stop type checking and transformation
Pros for Flow:
Path-sensitive type narrowing
Better type checking of idiomatic js
Really fast
Gets non-nullability right (new TS support is unsound)
Summary
React
Highly influential FP-based framework:
Focus:
- Building large, understandable UI-heavy applications
FP heritage:
Immutable data, mostly pure functions
DOM updates abstracted and made efficient
Declarative “here’s how my UI should look at an instant in time” feel
Focus on composable components
Flow for type safety
Mobile development
iOS and Android development are complex and expensive
Huge quirky APIs take ages to master
Investment in one platform is not transferable to the other
Client-side “business logic” locked up in ObjC or Java (or worse, C++)
React Native
Similar playbook to React:
Declarative, self-contained UI components
Rapid feedback during development
But:
Integrates with platform native components
Uses native gesture recognition
Allows proper “look and feel”, not typical “write once, run anywhere” clunkiness
Fetching network data for modern apps
REST isn’t a great model in practice.
Consider the GitHub API.
Here’s a response to a “fetch me all comments on an issue” request.
[
{
"id": 1,
"url": "https://api.github.com/repos/octocat/Hello-World/issues/comments/1",
"html_url": "https://github.com/octocat/Hello-World/issues/1347#issuecomment-1",
"body": "Me too",
"user": {
"login": "octocat",
"id": 1,
"avatar_url": "https://github.com/images/error/octocat_happy.gif",
"gravatar_id": "",
"url": "https://api.github.com/users/octocat",
"html_url": "https://github.com/octocat",
"followers_url": "https://api.github.com/users/octocat/followers",
"following_url": "https://api.github.com/users/octocat/following{/other_user}",
"gists_url": "https://api.github.com/users/octocat/gists{/gist_id}",
"starred_url": "https://api.github.com/users/octocat/starred{/owner}{/repo}",
"subscriptions_url": "https://api.github.com/users/octocat/subscriptions",
"organizations_url": "https://api.github.com/users/octocat/orgs",
"repos_url": "https://api.github.com/users/octocat/repos",
"events_url": "https://api.github.com/users/octocat/events{/privacy}",
"received_events_url": "https://api.github.com/users/octocat/received_events",
"type": "User",
"site_admin": false
},
"created_at": "2011-04-14T16:00:49Z",
"updated_at": "2011-04-14T16:00:49Z"
}
]
Problems with REST
Kitchen-sink responses with a bunch of information you probably don’t need, because maybe someone will sometimes need some of it
Multiple fetches: data included by reference in a response require costly additional roundtrips, concurrency management, and error handling
Custom endpoint proliferation (and duplication) as new requirements evolve
Version bloat: as client versions proliferate, legacy server endpoints must be maintained
Client-side cache management and UI updates are painful when data changes
GraphQL
A query language and protocol for graph-based data.
Puts the client in control of requesting only the data it needs in a single round-trip.
Unlike SQL, where there’s a storage engine with tables and indices behind the query engine, a GraphQL server is powered by arbitrary code (which you write).
Developer friendliness: self-documenting schema, strong type system, interactive exploration.
GraphQL and Haxl
Chad Austin wrote a great overview of a GraphQL server that he built in Haskell using Haxl: chadaustin.me
In response to a single GraphQL request, this uses Haxl to efficiently query multiple endpoints in a minimal number of roundtrips.
Convergent evolution
Sharing some design features with Haxl is the Relay project: facebook.github.io/relay
Declarative “here’s the data this view needs” specifications
Abstracts away details of how and when to fetch data
When modifications are made to data, hides details of updates and cache consistency
Recap
Haxl demonstrates how to roll your own unobtrusive concurrency
There’s a ton of other FP-influenced work at Facebook:
Languages: Hack and Flow (built in OCaml)
User interfaces: React and React Native
Data: GraphQL and Relay
Key takeaway:
Ergonomics really matter!
All of these systems have had a ton of attention put into making them efficient and easy to work with.