Show HN: TARDIS – Warp a process's perspective of time by hooking syscalls

github.com

77 points by DavidBuchanan 9 years ago · 30 comments

foob 9 years ago

It's worth mentioning that libfaketime [1] is a more mature alternative with macOS support and more complete coverage of the relevant system calls. Nothing against the current project but that might be a better choice for many people.

[1] - https://github.com/wolfcw/libfaketime

  • zitterbewegung 9 years ago

    Also you can google the name to figure out how to use it.

  • aray 9 years ago

    Libfaketime is great, especially if you have a good idea what time calls your target process is using.

    If they're inspecting EHDR and calling the VDSO directly, or they've statically compiled in their libc, then it won't help.

    I've also had a lot of issues getting it into tightly sandboxed contexts (e.g. the flash runtime in chrome).

  • kreetx 9 years ago

    I've been a satisfied user of libfaketime as well. Would be great if the OP could highlight how this project's features differ from it!

  • nzmsv 9 years ago

    Sadly, the license is GPL rather than the expected LGPL.

    • jwilk 9 years ago

      In what sense is LGPL "expected"?

      • zeckalpha 9 years ago

        For a lib

        • dTal 9 years ago

          This library is not meant to be incorporated into an application - it is meant to be preloaded to modify the behaviour of an existing program. So I don't see what permission to link into a proprietary program really gets you.

aray 9 years ago

Nice implementation! I built something similar a while ago to warp time in video games (for training reinforcement learning agents).

Some issues off the top of my head (that I ran into): VDSO censoring is a lot harder than just symbol overriding. The VDSO has to actually be removed from the aux vector (the third thing on the process stack when the process launches, after the arguments and environment variables). The EHDR entry (AT_SYSINFO_EHDR) is what you need to remove.

Gist for censoring EHDR: https://gist.github.com/machinaut/a08b581c921775263cf0e20ccc...

Some libcs (notably glibc) are really good at finding/using EHDR even if you do that symbol overriding, so removing the EHDR entry is the surest way to make sure the VDSO is gone.
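
To make the EHDR censoring concrete, here is a minimal sketch of the idea in Python. It is not the code from the gist. It assumes 64-bit auxv entries in native byte order, and it assumes that retagging the entry as AT_IGNORE (rather than deleting it) is enough for the consumer to skip it. The function names are made up for illustration; the AT_* constants are the standard values from <elf.h>.

```python
import struct

AT_NULL = 0            # terminator entry of the auxiliary vector
AT_IGNORE = 1          # entries with this type are meant to be skipped
AT_SYSINFO_EHDR = 33   # address of the vDSO ELF header

# One (type, value) pair. Assumption: 64-bit words, native byte order.
ENTRY = struct.Struct("=QQ")


def parse_auxv(raw: bytes) -> list:
    """Decode auxv bytes into (type, value) pairs, stopping at AT_NULL."""
    entries = []
    for offset in range(0, len(raw) - ENTRY.size + 1, ENTRY.size):
        a_type, a_val = ENTRY.unpack_from(raw, offset)
        if a_type == AT_NULL:
            break
        entries.append((a_type, a_val))
    return entries


def censor_vdso(raw: bytes) -> bytes:
    """Return a copy of raw with AT_SYSINFO_EHDR entries retagged as AT_IGNORE."""
    out = bytearray(raw)
    for offset in range(0, len(raw) - ENTRY.size + 1, ENTRY.size):
        a_type, _ = ENTRY.unpack_from(raw, offset)
        if a_type == AT_NULL:
            break
        if a_type == AT_SYSINFO_EHDR:
            # Overwrite only the type word; the value word is left untouched.
            struct.pack_into("=Q", out, offset, AT_IGNORE)
    return bytes(out)
```

On Linux, `parse_auxv(open("/proc/self/auxv", "rb").read())` lists the current process's entries. The sketch only transforms a byte string; actually censoring a target means rewriting the vector in that process's memory at startup, before libc reads it, which is not shown here.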

ptrace overhead is HUGE. Because you're debugging a userspace program with another program, every time call now results in 4 context switches (to/from your debugging program at every time call entry/exit). Even with both pinned to the same CPU, this is not fast.

This is where my least favorite part of the Linux kernel comes in handy: SECCOMP-BPF. Instead of stopping on _every_ syscall, you can write a list of syscall filter rules that matches only certain time-based syscalls with certain arguments. This greatly improves the performance (but for me, still not fast enough to play video games live).

At the end of the day I ended up reviving a >10 year old patch someone sent to the Linux kernel to add these parameters (time offset and time warp) to thread structs and do the warping in the kernel (much faster: you don't pay the context-switch overhead, etc). Sadly even this didn't work because our end application needed to run on multiple clouds in Docker, and we'd need access to the host kernel to do these operations.

I'd like to have an affine time warp as part of the cgroups, and then maybe extend it through runc so anyone can run time-warped docker containers, but maybe that's wishful thinking.
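
To spell out what "affine" means here, a minimal sketch: warped time is a linear function of real time, defined by a rate (the warp) and an offset. The class and field names are made up for illustration; this is neither the kernel patch nor any existing cgroup interface.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AffineWarp:
    rate: float    # warped seconds per real second (2.0 = twice as fast)
    offset: float  # warped time reported at the anchor instant
    anchor: float  # real time at which the warp was installed

    def warp(self, real_now: float) -> float:
        """Map a real timestamp to the timestamp the process should see."""
        return self.offset + self.rate * (real_now - self.anchor)
```

For example, `AffineWarp(rate=2.0, offset=1000.0, anchor=50.0).warp(53.0)` gives 1006.0: three real seconds have passed since the anchor, so the process sees six. Anchoring at the moment the warp is installed keeps the reported clock continuous when the rate changes.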

Overall I think this is great work, and super happy you posted it. I'd love to chat about it sometime.

(P.S. The most ironic part to me: my version of this was called 'timelord' :)

  • AstralStorm 9 years ago

    And the most foolproof way would be to run in a virtual machine or a prepared container. Pretty fast too.

    Having a clock cgroup would be easier and more useful than you'd think. Also, you can play tricks like ntpd does in a container. (e.g. adjtime)

    • aray 9 years ago

      This depends on the virtual machine or container!

      Ironically, because the folks working on containers/VMs are _really_ good at what they do, time access calls in particular have been really optimized (they get called a lot). This makes it very hard to intercept time calls at this layer! e.g. KVM and LXC both essentially hand time calls straight to the host.

      This means time intercepts at the VM/container layer need fundamental support (I mentioned affine time transformation in the Linux kernel in another comment), which doesn't work for people who need to deploy on current hosted container platforms.

    • MayeulC 9 years ago

      According to Wikipedia, there is a proposed time namespace. Controlled time warping could be a useful feature there.

      A proposal should be made, if that's not part of the planned features.

tyingq 9 years ago

There was a similar commercial product called "time machine" (I think) that sold like hotcakes during the Y2K prep work. It had versions for all the various RISC vendors, Linux, etc.

Edit: Yep, still exists. Was $2000 per server back then. Wonder if the price was some sort of inside joke... $2k to prepare for Y2K? https://www.cnet.com/news/new-tool-tests-for-y2k-compliance/

hoytech 9 years ago

There's also fluxcapacitor:

https://github.com/majek/fluxcapacitor

  • majke 9 years ago

    fluxcapacitor author here.

    Fluxcapacitor is focused on speeding up complex programs - most notably test suites (that do fork/execve). The idea is to cheat on time, to allow testing timeout-related branches in code. You can spin out a server and a client, write a test that needs to wait 60 seconds for completion and see it pass in 0.6 seconds.

    Tardis, on the other hand, seems single-threaded, which makes it useful for... not really sure. I guess as a demo of how to use ptrace.

    The problem with syscall interception with ptrace is that it doesn't work for golang. Golang doesn't use libc. This means there is no way to hook into the VDSO-based[1] calls. They are just a jump from userspace to a special userspace memory region, so ptrace won't ever see them.

    So, this approach, using ptrace, as used in tardis and fluxcapacitor will not work for golang.

    [1] http://man7.org/linux/man-pages/man7/vdso.7.html

    • aray 9 years ago

      Syscall interception works for _every_ program, it's just a matter of doing it correctly.

      VDSO is a small set of (3) calls which are not syscalls but direct calls (for speed/efficiency). Our goal is to remove this functionality to force libs to call through the (slower) syscall route instead.

      I mention in another comment how EHDR censoring is needed for robust VDSO removal.

      I've not run into a libc where censoring EHDR breaks time calls (i.e. where it doesn't fall back to syscalls), but possibly golang behaves this way.

      In that case it's straightforward to set up a fake VDSO; then, instead of censoring EHDR, you just replace it with your fake VDSO address and you're golden!

amelius 9 years ago

It uses ptrace, and ptrace sucks under Linux because you can't use it recursively (i.e., a process under ptrace can't ptrace another process).

wyldfire 9 years ago

I designed something similar for general fault injection [1] (and to learn Rust). There's no intercept written for time syscalls yet, but it's on the issues list.

[1] https://github.com/androm3da/libfaultinj

throwaway_374 9 years ago

So... in English... for us mere mortals with limited Linux kernel exposure... does this accelerate (or decelerate) the system clock, or is it mock patching of the system time function calls?

partycoder 9 years ago

There are people who did this to cheat on our games. But using server time exclusively can help mitigate this.

  • jfoutz 9 years ago

    It's also used in virus and malware detection. Lots of stuff lies low for a week or two to help hide the attack vector.

franze 9 years ago

cool name, sadly copyrighted http://tardis.wikia.com/wiki/Tardis:Copyrights

  • wyldfire 9 years ago

    As pointed out elsewhere, this page identifies a trademark on the term "TARDIS" and not a copyright. Terms like "TARDIS" cannot be copyrighted.

    But trademarks have limited scope, and the standard test for infringement is whether the use is "confusingly similar". Hilariously, BBC's TARDIS USPTO word mark [1] includes "... computer software for use in database management; ... computer, electronic and video games programs and equipment, namely, software,"

    [1] registration #4161487 -- http://tmsearch.uspto.gov/bin/showfield?f=doc&state=4803:loe...

    • jwilk 9 years ago

      The link doesn't work:

      This search session has expired. Please start a search session again by clicking on the TRADEMARK icon, if you wish to continue.

  • jwilk 9 years ago

    s/copyrighted/trademarked/
