GitHub - binaryeq/daleq: datalog-based binary equivalence

Publications

Jens Dietrich, Behnaz Hassanshahi: DALEQ - Explainable Equivalence for Java Bytecode (preprint)
Jens Dietrich, Tim White, Behnaz Hassanshahi, Paddy Krishnan: Levels of Binary Equivalence for the Comparison of Binaries from Alternative Builds. ICSME'25
Jens Dietrich, Tim White, Valerio Terragni, Behnaz Hassanshahi: Towards Cross-Build Differential Testing. ICST'25

Videos

Build and Use CLI

DALEQ has been developed and tested with Java 17. DALEQ can be built with Maven (mvn clean package). This will create an executable daleq-<daleq-version>.jar in target. This jar contains all Java dependencies needed.

DALEQ needs the following (non-Java) tools to be installed separately:

souffle -- the datalog engine being used
diff -- the standard diff tool
diff2html -- convert diff output to html

The respective commands should be in the path. For Souffle, the executable can also be passed to DALEQ as a parameter. Then run DALEQ as follows:

java -jar daleq-<daleq-version>.jar
 -j1,--jar1 <arg>   the first jar file to compare (required)
 -j2,--jar2 <arg>   the second jar file to compare (required)
 -o,--out <arg>     the output folder where the report will be generated (required)
 -s1,--src1 <arg>   the first jar file with source code to compare (optional)
 -s2,--src2 <arg>   the second jar file with source code to compare (optional)
 -a,--autoopen      if set, the generated html report will be opened automatically (optional, don't use this for CI integration)
 -dq,--daleq        one of {sound,soundy,both} (optional, default is soundy)

Running the program will create a report report.html in the specified output folder.

The program returns with the following exit codes:

0 - all classes and resources are equal, source codes (if analysed) are equivalent
1 - some classes are not equal but still equivalent, resources are equal, source-codes (if analysed) are equivalent
2 - some classes are not equivalent (wrt daleq) or some resources (e.g. metadata) are not equal or some sources (if analysed) are not equivalent

This can be used to integrate the tool into CI processes.

Locating Souffle

Daleq will try to find the souffle executable in the path. This can be overridden by passing it as a JVM argument, i.e. -DSOUFFLE=<path-to-souffle>.

Overview

Daleq takes Java bytecode as input and produces a relational database. This is done in two steps.

Step 1 - EDB Extraction

flowchart LR
    bytecode["bytecode (.class)"] --> asm(("asm")) --> edb["EDB"] --> souffle(("souffle")) --> idb["IDB"]

A database representing the bytecode is extracted. An asm-based static analysis is used for this purpose. The result is the EDB (extensional database), a folder with tsv files, each tsv file corresponding to a predicate. Those facts represent the bytecode, including class name, bytecode version, super types and interfaces, access flags for method, fields and classes, bytecode instructions in methods, etc.

The representation is low-level, however, there is one significant abstraction (provided by asm): constant pool references are resolved.

Each fact has a unique generated id that is used to provide provenance.

Step 2 - IDB Computation

The IDB computation applies rules to compute a second set of facts (IDB files) that normalises the EBD to establish whether two bytecode compared are equivalent or not.

Rules that are applied for this purpose are defined in src/main/resources/rules/, advanced.souffle is the default set.

The API to generate the IDB is provided by io.github.bineq.daleq.Souffle::createIDB.

Step 3 - Comparing Bytecode

When trying to establish equivalence between two (compiled) classes, IDBs cannot be compared directly for the following reasons:

The provenance (id) terms -- this reflects the process of normalisation that might be different for each of the classes compared
the IDB contains additional facts for predicates to keep track of facts that have been removed (i.e. EDB facts that have no counterpart in the IDB), or moved.
facts corresponding to bytecode instruction have an instructioncounter slot to define the order of instructions within a method. Those might change in the IDB as normalisation may move/remove/insert facts corresponding to instructions.

Projection

The purpose of projection is to remove the aforementioned parts (facts and slots) from the database. At the moment normalisation is implemented in Java.

Workflow to Compare two Binaries

The classes io.github.bineq.daleq.idb.IDBReader and io.github.bineq.daleq.idb.IDBPrinter are used to read an IDB, and convert it into a textual representation suitable for comparison and diffing (using a standard text fiff tool such as diff or kdiff3). This also supports projection.

flowchart LR
    bytecode1["class1"] --> asm1(("asm")) --> edb1["EDB1"] --> souffle1(("souffle")) --> idb1["IDB1"] --> IDBReader1(("project"))  --> file1 --> diff --> result
    bytecode2["class2"] --> asm2(("asm")) --> edb2["EDB2"] --> souffle2(("souffle")) --> idb2["IDB2"] --> IDBReader2(("project"))  --> file2 --> diff

Normalisation Rules

The datalog rules being used can be found in src/main/resources/rules/. The structure of the rules is as follows:

core.souffle - mostly generated standard rules mapping EDB facts to IDB facts, plus some additional guard predicates that can be used to override default rules
commons/access.souffle - rules to derive facts for access flags such as final and public from access flags using bit encoding, for improved usability
normalisations/sound - normalisations that have no impact on program behaviour
normalisations/soundy - normalisations that are soundy in the sense that they might have some impact on program behaviour if the program uses reflection or similar dynamic programming techniques to inspect its own code. Those have been vetted for possible exploitability. For a conservative approach, those rules can be excluded from the analysis by using the sound CLI option.

Provenance

The first slot (term) for each fact is an id. For EDB facts, those ids are generated during the bytecode analysis / extracyion phase.

Example (from AALOAD.facts):

F1428629	org/apache/hadoop/hdfs/HAUtil::getConfForOtherNodes(Lorg/apache/hadoop/conf/Configuration;)Ljava/util/List;	3200

When the IDB is created, each IDB fact also has this slot. The values encode a derivation tree. Example:

This fact has been computed by applying rule R_AALOAD to fact F1428629. I.e. those ids encode a derivation tree:

graph TD;
R_AALOAD --> F1428629;

A less trivial example is R_42[F1,R_43[F2,F3]] , this encodes the following derivation tree:

graph TD;
    R_42 --> F1;
    R_42 --> R_43;
    R_43 --> F2;
    R_43 --> F3;

The project contains a parser to transform those strings into trees (io.github.bineq.daleq.souffle.provenance.ProvenanceParser).

The grammar of the encoded proofs is defined here: src/main/antlr4/io/github/bineq/daleq/souffle/provenance/Proof.g4.

The Generated Report

This is an example of a generated report. Here, we have compared the jar file for the artifact javax.transaction:jta:1.1 from Maven Central with the corresponding artifact rebuilt by Google's assured open source. Daleq employs various analysers to compare classes, meta-data and resources within the jars. The results are displayed in a table, the rows correspond to files within the jar(s), and the columns correspond to the various analysers. Analysers compare the content of files, specialised analysers are used to compare source code and bytecode. The actual daleq analyser results are displayed in the last column. Please have a look at our paper for more details.

Possible analysis result states are:

PASS the comparison yields true
FAIL the comparison yields false
N/A the comparison cannot be applied (e.g., a source Code comparison cannot be used to compare bytecode)
ERROR the comparison has resulted in an error

The markers next to each result are hyperlinks to generated pages with additional provenance. For non-equivalence statements, those pages are usually diff reports rendered in HTML. For daleq equivalence statements, advanced diff reports are generated based on the derivations recorded when datalog rules are applied. This is an example:

Constructing the EDB Only

usage: java -cp <path-to-built-jar> io.github.bineq.daleq.edb.FactExtractor
 -cl <classes>   the location of compiled classes, a jar file or folder
 -f <facts>      a folder where to create the extension database (input
                 .facts files), the folder will be created if it does not
                 exist
 -s <souffle>    a file where to create the souffle program (.souffle
                 file) containing imports and input predicate declarations
 -v              whether to verify the datalog facts against the schema,
                 must be true or false, default is true

Notes

Testing during Builds

Note that tests requires souffle to be set up (see also IDB Computation section). This has to be done locally at the moment, see also issue 11. Tests that require souffle will not fail but will be skipped if souffle is not available.

Performance and Issues with Souffle

Souffle occasionally fails. When souffle is used, a Thread::sleep instruction is used to pause DALEQ. This reduces the number of errors (perhaps a souffle race condition?), but makes DALEQ slower.

Souffle issues will be reported by the appplication, and flagged as error .

Adding Your Own Analyser

Additional analysers can be easily added by following those steps:

Write an analyser (say MyAnalyser) implementing io.github.bineq.daleq.cli.Analyser
In your project, add a file src/main/resources/META-INF/services/io.github.bineq.daleq.cli.Analyser
Add a line with the fully qualified name of MyAnalyser to this file
Build a jar file
When running DALEQ, add this jar file to the classpath
Note that the order in which columns are displayed in reports is determined by the property io.github.bineq.daleq.cli.Analyser::positionHint, a values between 0 and 100.

Contributing

To contribute to this project, please follow the steps below to set up your development environment.

First, create a virtual environment to isolate dependencies:

Activate the virtual environment and install the pre-commit hooks.

source .venv/bin/activate
pip install -r requirements.txt
pre-commit install

Now, the hooks will automatically run on each commit to check for any issues in the code. You can manually run the pre-commit hooks on all files at any time with:

pre-commit run --all-files