Introducing Apache Foryβ„’ Rust: A Versatile Serialization Framework for the Modern Age

9 min read Original article β†—

TL;DR: Apache Fory Rust is a blazingly-fast, cross-language serialization framework that delivers ultra-fast serialization performance while automatically handling circular references, trait objects, and schema evolution. Built with Rust's safety guarantees and zero-copy techniques, it's designed for developers who refuse to compromise between performance and developer experience.


The Serialization Dilemma​

Every backend engineer has faced this moment: your application needs to serialize complex data structures such as nested objects, circular references, polymorphic types, and you're forced to choose between three bad options:

  1. Fast but fragile: Hand-rolled binary formats that break with schema changes
  2. Flexible but slow: JSON/Protocol with 10x performance overhead
  3. Complex and limiting: Existing solutions that don't support your language's advanced features

Apache Fory Rust eliminates this false choice. It's a serialization framework that delivers exceptional performance while automatically handling the complexities of modern applicationsβ€”no IDL files, no manual schema management, no compromises.

What Makes Apache Fory Rust Different?​

1. Truly Cross-Language​

Apache Fory Rust speaks the same binary protocol as Java, Python, C++, Go, and other language implementations. Serialize data in Rust, deserialize in Python β€” it just works. No schema files. No code generation. No version mismatches.

// Rust: Serialize

let user = User {

name: "Alice".to_string(),

age: 30,

metadata: HashMap::from([("role", "admin")]),

};

let bytes = fory.serialize(&user);

// Python: Deserialize (same binary format!)

user = fory.deserialize(bytes) # Just works!

This isn't just convenient β€” it changes how we develop microservices architectures where different teams use different languages.

2. Automatic Shared/Circular Reference Handling​

Most serialization frameworks panic when encountering circular references. Apache Fory tracks and preserves reference identity automatically:

Shared Reference:

use fory::Fory;

use std::rc::Rc;

let fory = Fory::default();

// Create a shared value

let shared = Rc::new(String::from("shared_value"));

// Reference it multiple times

let data = vec![shared.clone(), shared.clone(), shared.clone()];

// The shared value is serialized only once

let bytes = fory.serialize(&data);

let decoded: Vec<Rc<String>> = fory.deserialize(&bytes)?;

// Verify reference identity is preserved

assert_eq!(decoded.len(), 3);

assert_eq!(*decoded[0], "shared_value");

// All three Rc pointers point to the same object

assert!(Rc::ptr_eq(&decoded[0], &decoded[1]));

assert!(Rc::ptr_eq(&decoded[1], &decoded[2]));

Circular Reference:

use fory::{ForyObject, RcWeak};

#[derive(ForyObject)]

struct Node {

value: i32,

parent: RcWeak<RefCell<Node>>, // Weak pointer breaks cycles

children: Vec<Rc<RefCell<Node>>>, // Strong references tracked

}

// Build a parent-child tree with circular references

let parent = Rc::new(RefCell::new(Node { ... }));

let child = Rc::new(RefCell::new(Node {

parent: RcWeak::from(&parent), // Points back to parent

...

}));

parent.borrow_mut().children.push(child.clone());

// Serialization handles the cycle automatically

let bytes = fory.serialize(&parent);

let decoded: Rc<RefCell<Node>> = fory.deserialize(&bytes)?;

// Reference relationships preserved!

assert!(Rc::ptr_eq(&decoded, &decoded.borrow().children[0].borrow().parent.upgrade().unwrap()));

This isn't just a featureβ€”it's essential for graph databases, object-relational mappers, and domain models.

3. Trait Object Serialization​

Rust's trait system enables powerful abstractions, but serializing Box<dyn Trait> is notoriously difficult. Apache Fory makes it trivial:

use fory::{ForyObject, Serializer, register_trait_type};

trait Animal: Serializer {

fn speak(&self) -> String;

}

#[derive(ForyObject)]

struct Dog { name: String, breed: String }

#[derive(ForyObject)]

struct Cat { name: String, color: String }

// Register implementations

register_trait_type!(Animal, Dog, Cat);

// Serialize heterogeneous collections

let animals: Vec<Box<dyn Animal>> = vec![

Box::new(Dog { ... }),

Box::new(Cat { ... }),

];

let bytes = fory.serialize(&animals);

let decoded: Vec<Box<dyn Animal>> = fory.deserialize(&bytes)?;

// Polymorphism preserved!

decoded[0].speak(); // "Woof!"

decoded[1].speak(); // "Meow!"

Alternative: Using dyn Any without trait registration:

use std::rc::Rc;

use std::any::Any;

// No trait definition or registration needed

let dog: Rc<dyn Any> = Rc::new(Dog { name: "Rex".to_string(), breed: "Labrador".to_string() });

let cat: Rc<dyn Any> = Rc::new(Cat { name: "Whiskers".to_string(), color: "Orange".to_string() });

let bytes = fory.serialize(&dog);

let decoded: Rc<dyn Any> = fory.deserialize(&bytes)?;

// Downcast to concrete type

let unwrapped = decoded.downcast_ref::<Dog>().unwrap();

assert_eq!(unwrapped.name, "Rex");

Supports:

  • Box<dyn Trait> - Owned trait objects
  • Rc<dyn Trait> / Arc<dyn Trait> - Reference-counted trait objects
  • Rc<dyn Any> / Arc<dyn Any> - Runtime type dispatch without traits
  • Auto-generated wrapper types for standalone serialization

This unlocks plugin systems, heterogeneous collections, and extensible architectures that were previously impossible to serialize.

4. Schema Evolution Without Breaking Changes​

Microservices evolve independently. Apache Fory's Compatible mode allows schema changes without coordination:

use fory::{Fory, ForyObject};

// Service A: Version 1

#[derive(ForyObject)]

struct User {

name: String,

age: i32,

address: String,

}

let mut fory_v1 = Fory::default().compatible(true);

fory_v1.register::<User>(1);

// Service B: Version 2 (evolved independently)

#[derive(ForyObject)]

struct User {

name: String,

age: i32,

// address removed

phone: Option<String>, // New field

metadata: HashMap<String, String>, // Another new field

}

let mut fory_v2 = Fory::default().compatible(true);

fory_v2.register::<User>(1);

// V1 data deserializes into V2 structure

let v1_bytes = fory_v1.serialize(&user_v1);

let user_v2: User = fory_v2.deserialize(&v1_bytes)?;

// Missing fields get default values automatically

Compatibility rules:

  • βœ… Add new fields (default values applied)
  • βœ… Remove fields (skipped during deserialization)
  • βœ… Reorder fields (matched by name)
  • βœ… Change nullability (T ↔ Option<T>)
  • ❌ Type changes (except nullable variants)

This is critical for zero-downtime deployments and polyglot microservices.

The Technical Foundation​

Protocol Design​

Apache Fory uses a sophisticated binary protocol designed for both performance and flexibility:

| fory header | reference meta | type meta | value data |

Key innovations:

  1. Efficient encoding: Variable-length integers, compact type IDs, bit-packed flags
  2. Reference tracking: Deduplicates shared objects automatically (serialize once, reference thereafter)
  3. Meta compression: Gzip compression for type metadata in meta-sharing mode
  4. Little-endian layout: Optimized for modern CPU architectures

Compile-Time Code Generation​

Unlike reflection-based frameworks, Apache Fory generates serialization code at compile time via procedural macros:

use fory::ForyObject;

#[derive(ForyObject)]

struct Person {

name: String,

age: i32,

address: Address,

}

// Macro generates:

// - fory_write_data() for serialization

// - fory_read_data() for deserialization

// - fory_reserved_space() for buffer pre-allocation

// - fory_get_type_id() for type registration

Benefits:

  • ⚑ Zero runtime overhead: No reflection, no vtable lookups
  • πŸ›‘οΈ Type safety: Compile-time errors instead of runtime panics
  • πŸ“¦ Small binary size: Only code for types you actually use
  • πŸ” IDE support: Full autocomplete and error checking

Architecture​

Apache Fory Rust consists of three focused crates:

fory/ # High-level API

└─ Convenience wrappers, derive re-exports

fory-core/ # Core serialization engine

β”œβ”€ fory.rs # Main entry point

β”œβ”€ buffer.rs # Zero-copy binary I/O

β”œβ”€ serializer/ # Type-specific serializers

β”œβ”€ resolver/ # Type registration & dispatch

β”œβ”€ meta/ # Meta string compression

└─ row/ # Row format implementation

fory-derive/ # Procedural macros

β”œβ”€ object/ # ForyObject derive macro

└─ fory_row.rs # ForyRow derive macro

This modular design ensures clean separation of concerns and makes the codebase maintainable.

Benchmarks: Real-World Performance​

DatatypeSizeOperationFory TPSJSON TPSProtobuf TPSFastest
companysmallserialize10,063,906761,673896,620fory
companymediumserialize412,50733,83537,590fory
companylargeserialize9,183793880fory
ecommerce_datasmallserialize2,350,729206,262256,970fory
ecommerce_datamediumserialize59,9774,6995,242fory
ecommerce_datalargeserialize3,727266295fory
personsmallserialize13,632,5221,345,1891,475,035fory
personmediumserialize3,839,656337,610369,031fory
personlargeserialize907,85379,63191,408fory
simple_listsmallserialize27,726,9454,874,9574,643,172fory
simple_listmediumserialize4,770,765401,558397,551fory
simple_listlargeserialize606,06141,06144,565fory
simple_mapsmallserialize22,862,3693,888,0252,695,999fory
simple_mapmediumserialize2,128,973204,319193,132fory
simple_maplargeserialize177,84718,41918,668fory
simple_structsmallserialize35,729,59810,167,0458,633,342fory
simple_structmediumserialize34,988,2799,737,0986,433,350fory
simple_structlargeserialize31,801,5584,545,0417,420,049fory
system_datasmallserialize5,382,131468,033569,930fory
system_datamediumserialize174,24011,89614,753fory
system_datalargeserialize10,6718761,040fory

When to Use Apache Fory Rust​

βœ… Ideal Use Cases​

  1. Microservices with polyglot teams

    • Different services in different languages
    • Need seamless data exchange without schema files
    • Schema evolution across independent deployments
  2. High-performance data pipelines

    • Processing millions of records per second
    • Memory-constrained environments (use row format)
    • Analytics workloads with selective field access
  3. Complex domain models

    • Circular references (parent-child relationships, graphs)
    • Polymorphic types (trait objects, inheritance hierarchies)
    • Rich object graphs with shared references
  4. Real-time systems

    • Low-latency requirements (<1ms serialization)
    • Memory-mapped file access
    • Zero-copy deserialization critical

⚠️ Consider Alternatives If​

  1. You need human-readable data: Use JSON/YAML for debugging
  2. You need long-term storage format: Use Parquet for data lakes
  3. Your data is trivial: serde + bincode is simpler for basic types

Getting Started in 5 Minutes​

Installation​

Add to Cargo.toml:

[dependencies]

fory = "0.13"

Basic Object Serialization​

use fory::{Fory, Error, ForyObject};

#[derive(ForyObject, Debug, PartialEq)]

struct User {

name: String,

age: i32,

email: String,

}

fn main() -> Result<(), Error> {

let mut fory = Fory::default();

fory.register::<User>(1); // Register with unique ID

let user = User {

name: "Alice".to_string(),

age: 30,

email: "alice@example.com".to_string(),

};

// Serialize

let bytes = fory.serialize(&user);

// Deserialize

let decoded: User = fory.deserialize(&bytes)?;

assert_eq!(user, decoded);

Ok(())

}

Cross-Language Serialization​

use fory::Fory;

// Enable cross-language mode

let mut fory = Fory::default().compatible(true).xlang(true);

// Register with id/namespace for cross-language compatibility

fory.register_by_namespace::<User>(1);

// fory.register_by_namespace::<User>("example", "User");

let bytes = fory.serialize(&user);

// This can now be deserialized in Java, Python, Go, etc.

Register types with consistent IDs or names across all languages:

  • By ID (fory.register::<User>(1)): Faster serialization, more compact encoding, but requires coordination to avoid ID conflicts
  • By name (fory.register_by_name::<User>("example.User")): More flexible, less prone to conflicts, easier to manage across teams, but slightly larger encoding

Supported Types​

Apache Fory Rust supports a comprehensive type system:

Primitives: bool, i8, i16, i32, i64, f32, f64, String

Collections: Vec<T>, HashMap<K,V>, BTreeMap<K,V>, HashSet<T>, Option<T>

Smart Pointers: Box<T>, Rc<T>, Arc<T>, RcWeak<T>, ArcWeak<T>, RefCell<T>, Mutex<T>

Date/Time: chrono::NaiveDate, chrono::NaiveDateTime

Custom Types: Derive ForyObject for object graphs, ForyRow for row format

Trait Objects: Box<dyn T>, Rc<dyn T>, Arc<dyn T>, Rc<dyn Any>, Arc<dyn Any>

Roadmap: What's Next​

Apache Fory Rust is production-ready today, but we're just getting started and continuing active development:

βœ… Shipped in v0.13​

  • βœ… Static codegen via procedural macros
  • βœ… Row format serialization with zero-copy
  • βœ… Cross-language object graph serialization
  • βœ… Shared and circular reference tracking
  • βœ… Weak pointer support (RcWeak, ArcWeak)
  • βœ… Trait object serialization (Box/Rc/Arc)
  • βœ… Schema evolution in compatible mode

🚧 Coming Soon​

  • Cross-language reference serialization: serialize Rc/Arc to/from other languages.
  • Partial row updates: Mutate row format in-place

🎯 Help Wanted​

We're actively seeking contributors for:

  • Performance tuning: Profile and optimize hot paths
  • Documentation: More examples, tutorials, and guides
  • Testing: Fuzzing, property tests, edge case coverage

Production Considerations​

Thread Safety​

Fory becomes fully thread-safe after registration is complete. Once every type is registered (which requires &mut Fory), wrap the instance in an Arc and freely share it across worker threads for concurrent serialization and deserialization.

use fory::Fory;

use std::{sync::Arc, thread};

let mut fory = Fory::default();

fory.register::<Item>(1)?;

let fory = Arc::new(fory); // `Fory` is Send + Sync once registration is done

let item = Item::default();

let handles: Vec<_> = (0..4)

.map(|_| {

let fory = Arc::clone(&fory);

let input = item.clone();

thread::spawn(move || {

let bytes = fory.serialize(&input);

let decoded: Item = fory.deserialize(&bytes).expect("valid data");

(bytes, decoded)

})

})

.collect();

for handle in handles {

let (bytes, decoded) = handle.join().expect("thread finished");

// work with `bytes` / `decoded`

}

Error Handling​

Apache Fory uses Result<T, Error> for all fallible operations:

use fory::Error;

match fory.deserialize::<User>(&bytes) {

Ok(user) => process_user(user),

Err(Error::TypeMismatch) => log::error!("Schema mismatch"),

Err(Error::BufferTooShort) => log::error!("Incomplete data"),

Err(e) => log::error!("Deserialization failed: {}", e),

}

Documentation​

Community and Contribution​

Apache Fory is an Apache Software Foundation project with a vibrant, growing community:

How to Contribute​

We welcome contributions of all kinds:

  1. Code: Implement features from the roadmap
  2. Docs: Write tutorials, examples, and guides
  3. Testing: Add benchmarks, fuzz tests, integration tests
  4. Feedback: Report bugs, request features, share use cases

See CONTRIBUTING.md for guidelines.

License​

Apache Fory is licensed under the Apache License 2.0, a permissive open-source license that allows commercial use, modification, and distribution.

Conclusion​

Apache Fory Rust represents a paradigm shift in serialization:

  • No more trade-offs: Get performance and flexibility
  • No more boilerplate: Derive macros handle the complexity
  • No more lock-in: Trait-object and shared reference support by nature

Whether you're building microservices, data pipelines, or real-time systems, Apache Fory Rust delivers the performance you need with the ergonomics you deserve.

Try it today:

Join the community:

git clone https://github.com/apache/fory.git

cd fory/rust

cargo test --features tests

Share your experience:

  • Write a blog post about your use case
  • Present at your local Rust meetup
  • Contribute benchmarks from your domain