C++ vs. OCaml: Ray tracer comparison (2005)
ffconsultancy.com

Kinda missing the point of C++ by using a linked list.
If you're going to spend the time to implement an algorithm in C++, at least take the time to make it cache-aware. Otherwise it's not really surprising that the performance is the same when you're doing the exact same thing.
Would be curious to see what the performance looks like with a tightly packed scene and static dispatch on a discrete number of primitives instead of using virtual functions.
I was curious whether using a vector instead of a list would have any effect even though the size is so small, and the answer seems to be no (as might be expected).
It seems like the listed OCaml program has a default level of 9, whereas the C++ one uses 6?
Times using original listing (minus print stmts):

    $ time ./ray-cpp
    real 0m3.369s  user 0m3.347s  sys 0m0.010s
    $ time ./ray-ml
    real 0m4.710s  user 0m4.703s  sys 0m0.003s

Time after changing the 9 in the OCaml program to a six:

    $ time ./ray-ml
    real 0m4.032s  user 0m4.026s  sys 0m0.003s

Seems like C++ has a good edge. The above was run on a 64-bit machine.

List is just one part of what makes the C++ impl sub-optimal. All the scenes are allocated on the heap, so you're actually bouncing around memory twice (once for the next linked-list entry, a second time for the actual heap allocation).
Lastly, the virtual call again has a chance to cache miss (although probably only once, seeing as we've just got a few impls).
In a small example like this there's a good chance that most of the heap allocations are close together but put this approach into production and you can very easily see fragmentation start driving up your cache misses.
(FWIW I'd also store radius*radius on creation, use a BSP/kd-tree and all sorts of other tricks that really help in these types of computations).
I did a quick and dirty test with levels = 14.
    # program            time   # change
    ./ray-ml:            21.46  # Original OCaml
    ./ray-cpp:           17.63  # Original C++
    ./ray-cpp-no-dtor:   13.86  # Removed all cleanup; OCaml doesn't do it
    ./ray-cpp-2:          8.90  # Changed list to vector
    ./ray-cpp-3:          7.89  # Removed virtual functions
    ./ray-cpp-4:          5.64  # Preallocated large array for all Groups

Solid stuff, glad to see someone testing my assumptions :).
If there's one thing I could teach new CS grads, it's how much your memory access patterns (and cache) really effing matter. Everything else is almost noise in comparison.
I'm pretty curious about this: have memory access patterns and cache awareness become more of a big deal in the past decade or so, or has it only recently reached my awareness? I studied CS in the early 2000s and while I was aware of various trade-offs between contiguous and linked data structures, it's only in the last few years that the importance of cache (and non-"system" languages' widespread flouting of its importance) has really bubbled up in my consciousness.
It's certainly likely that I just didn't take the right kinds of classes, or that I did but just missed or failed to really internalize this, but I'm curious if something else is going on. Has the bottleneck increasingly become memory rather than CPU? Have we more thoroughly exhausted other areas of optimization?
I was thinking about this while watching a talk[0] given by someone from Google about data structures in the STL, which decried that the standardized specification for `std::unordered_map` more or less forces an implementation to use buckets instead of the more cache-friendly technique of probing. I'm sure that the people who standardized it were very competent, which makes me think that, at least at the time, buckets were thought to be the better implementation.
[0]: https://www.youtube.com/watch?v=fHNmRkzxHWs
Edit: Added video link.
> Has the bottleneck increasingly become memory rather than CPU?
Yes.
Besides the basic memory latency vs. CPU clock speed issue, there's also:
- CPU caches have gotten bigger, so the payoff of making a program cache-friendly is larger
- CPUs have gained SIMD vector ALUs and increased superscalar behavior, so the CPU is capable of processing more data per clock cycle
- A cultural shift happened after the freewheeling Moore's Law era ended in the mid-2000s, and people started examining performance more closely
http://gec.di.uminho.pt/discip/minf/ac0102/1000gap_proc-mem_...
Thanks, this is what I had suspected, but it seems to be left unstated in everything I've read.
> Type inference removes the need to specify types explicitly over and over.
These days, with C++11's auto, this is improved a lot. It's not full type inference, but it eliminates most of the boilerplate, and arguably what's left is what you really want to keep for documentation purposes.
> The object-oriented representation of a trivial sum type as an inheritance hierarchy of classes is needlessly inefficient.
This use case strikes me as a particularly bad example of sum types: The author wants to represent different kinds of scene members, like spheres and groups, as a sum type. Using a sum type means that anyone wanting to add a new kind of object has to add it to the sum type and handle it everywhere, which will quickly become problematic. Using object-oriented design allows new object types to be introduced without updating everything.
Basically the sum type looks nicer here only because it's a toy example and we're not considering how it will evolve.
That said, I do think sum types are really useful in some use cases, especially in compilers, and I'm sad C++ doesn't have them. I tend to end up defining ASTs using Cap'n Proto as a convenient way to declare a tagged union, even though I have no intent to serialize the structures.
> No need to write constructors for tuples, records and variant types in OCaml.
To be fair, for an all-public struct (as in the example), you don't need constructors in C++ either.
Once you have private members, then of course you need a constructor.
> Native lists in OCaml.
C++11 initializer lists make it possible to write list/vector/array classes that basically "feel native".
> Higher-order functions instead of loops in OCaml.
C++11 lambdas, etc. However, in a lot of cases, loops are easier to read.
I've been really happy with C++ lambdas lately.
Would be nice to see Rust's Option<T> and Result<T,E> ported over to C++ since I find that error handling extremely elegant compared to most C++ approaches.
Already done (kinda). There is no match exhaustion, but C++ is getting an option type. (Part of the library fundamentals TS). http://en.cppreference.com/w/cpp/experimental/optional
Option has been around for a while as boost::optional or now std::experimental::optional. Result is coming and has a reference implementation as boost::expected. There's also a type-erased option type, "any".
They're quite nice, especially with a few trivial additions for ergonomics.
Yup, except you lost me as soon as you said "boost". I enjoy my fast compile speeds very much. :)
I work on a project that's a mix of F# and C++ and my code tends to break down very similarly to this.
The actual algorithmic lines are very similar in size and readability but when it comes to defining your data structures, F#/OCaml clearly wins.
The author of the website (Jon Harrop) is a very good writer, and though his books are expensive I'd recommend them to anyone learning OCaml or F#.
Specifically, F#/OCaml for Scientists is awesome (it comes in variants for both languages), as is F# for Numerics.
Yeah, C/C++ tends to have a memory-oriented way of defining data structures. Sometimes this is great, but most of the time there's a lot of mental translation and work.
The benefit of this comparison is not that the languages have nearly identical performance, but that OCaml can be briefer while accomplishing the same task, with 47% fewer lines of code.
Saying nothing about comparing LOC in an imperative/OO language to a functional one, does the brevity actually help a reader's understanding of the code at all? It seems to me that a lot of the comparisons call out descriptions of explicit actions in C++ where OCaml does the same action implicitly.
That seems like a language trade-off more than a feature.
Most people just consider the raw execution time to measure performance. In long-term applications that really matters.
In usual applications however most people don't realize that the development time should also be considered. In that sense a program written in a few lines of Python could actually outperform a super optimized C++ program because it usually requires more time and effort to code the same thing in C++.
From article:
> With two cross-overs, there is little difference between the performance of the OCaml and C++ implementations. Note that the C++ was compiled with CPU-specific optimizations whereas the OCaml binary will run on any 486 compatible.
Ok, so the author is suggesting numerical C++ compiler-generated SSE2+ code is comparable to OCaml targeting the 486, and thus with only the very slow stack-based x87 FPU available? SSE beats x87 hands down even without vectorization enabled, using a single float/double (scalar) per XMM vector register! That should make a significant difference in any ray tracer worth 2 cents.
I didn't bother reading the source code, but I think it's very likely he didn't write C++ code like you should for a performance sensitive numerical application.
Although maybe the code for scene graph, bounding box and hit test was bad? FPU performance starts to matter once you actually find the ray intersecting object...
Anyways, usually if you do use C++, you do so because of specific performance requirements. Not to have the shortest or cleanest possible implementation.
The author doesn't seem to know enough about what he's talking about. For example, he claims that the performance of passing a struct Vec { double x, y, z; } by value would be very poor, which is trivially not true, and he needlessly wrote constructors for this and similar POD structs, even explicitly claiming they are needed in C++ (and not in OCaml, implying the superiority of the latter on the conciseness criterion).
The compiler would have inferred the same constructor code?
You can omit the OO in C++ and achieve a similar code size. If that is a reasonable metric in the first place, is a question the article doesn't address.
I'm surprised the performance of the two implementations is so close. Aren't floating-point numbers in OCaml usually boxed? Is the compiler smart enough to optimize away the boxes in this situation?
The floats in the structs are unboxed. Float arrays and all-float structs have a special representation[1] in the compiler so they are stored unboxed. I believe the developers of OCaml have expressed some regrets about this, but in this particular case it's likely to be a win.
Within functions, the OCaml compiler can assign floats to registers, so those would also be unboxed at least for the duration of the function.
[1] https://realworldocaml.org/v1/en/html/memory-representation-...
One can do a ray tracer in O(1) at runtime in C++ ;)
https://github.com/phresnel/metatrace
Performance 100%!
OCaml is a really clean language though!
Amazingly, there is no easy way to define a type in C++ which can be "one of these or one of these", e.g. a sphere or a group in this case.
The author may have assumptions I'm missing here, but the standard way to have a "one of these or one of these" thing is to have two classes inherit from a given abstract base class and use a pointer to the base class.
If one wants the device specifically as a type, one could wrap the base-pointer in a class.
Is there something I'm missing?
https://programmers.stackexchange.com/questions/231092/is-pa... points out some differences.
Some of the things that make me miss OCaml variants/sum types when I have to use C++ subclassing/dynamic dispatch instead:

* very lightweight syntax both for creating them and matching on them (and lightweight representation; variants without arguments are just ints under the hood)

* compile-time checks on exhaustiveness in pattern matches (so no unexpected NULLs!)
Note that OCaml also allows open/polymorphic variants for the cases where you want to allow expanding the list of options – here the compiler will only check that e.g. a function that matches on only `A|`B won't be sent a `C, although it can be sent a member of a type that contains only `A's. This is typically used for public interfaces where you want to let your implementation expand later (e.g. your current function accepts `UTF8|`UTF16 but you might later decide to accept `WTF8 as well).
> The author may have assumptions I'm missing here but the standard way to have a "one of these or one of these" thing is to have two classes inherent from a given abstract base class and use a pointer to the base class.
I think this is what the author describes in the sentence that follows the one you quoted. He also calls it "a common (ab)use of object oriented programming."
So it's reasonable to define data types in this way in OCaml, but it's abusing OO to do so in C++? Given that classes are data types, that seems... less than consistent.
I would love to see this updated using C++14.
Totally. I've been mulling over the idea of porting it. Would be interesting to see how the performance compares vs the older style.
Between C++14 and some template usage, I'd imagine the LOC could be reduced a lot.
And now's a perfect time to reflect on how the two languages evolved during the past decade. C++ obviously moved forward quite a bit -- cleaning up the loop constructs in particular.
OCaml? Uh... there's F# for realists, Scheme for optimists, and Haskell for pedants. If you're actually writing OCaml web apps in OWebl I'd sure love to know why. Amazon released a native C++ SDK for AWS if you're into pain. If you're using "OPAM" to install the OCaml AWS client... wow.
Let's see: opam as you mention. Loads of new libraries[0]. Lots of small evolutions in the language in OCaml 4 [1] [2]. Lots of work on optimization in ocamlopt. Aarch64, POWER7 & POWER8, replacement ARMv7 backends.
OCaml is totally a practical language for writing real world applications. As I usually point out in these threads, Red Hat pay me to write OCaml programs[3] that are used by thousands of customers and many more free software users worldwide. The advantages (over other languages, not C++) are: easy linking to C libraries, building a native binary with no extra dependencies, and speed. Advantages over C++: safety, robustness, and compact source code size.
[0] https://opam.ocaml.org/packages/
[1] https://ocaml.org/meetings/ocaml/2014/OCaml2014-Leroy-slides...
Looks like solid work all around. But if we could weigh it relative to the advancements in C++ the last decade, it wouldn't even show up on the scale.
Red Hat could pay someone to do exactly what you're doing in many other languages with the exact same results. The only difference would be that the language in question would show up on the StackOverflow list of "most loved" or "most wanted" languages. Rust shows up in 50th place with less than 0.2% share and OCaml doesn't even make that list (TIOBE 9/2015).
Sounds argumentative, but is a genuine question: why have you chosen OCaml for this? What is so remarkable about it for this use case that 50+ more popular languages wouldn't do it?
Obviously there are downsides, so what is the upside?
Did you read my reply? To quote: "easy linking to C libraries, and builds a native binary with no extra dependencies, speed ... safety, robustness, compact source code size" There are certainly many more popular languages, but few with that combination of advantages. The project started in 2009, so you should discount any languages which were not mature before that date.
There are a few people at Red Hat and outside who contribute. To my knowledge none had prior experience with OCaml, but it's pretty easy to pick up.
Even in 2015, I can only think of D, Go and Haskell as viable alternatives.
You write OCaml for Red Hat programs that customers use? Or you write OCaml for Red Hat customers' programs? Either way, care to share any examples?
See rwmj's reference [3].
Scheme and OCaml don't have that much in common really (unless you mean derivatives like typed racket, but even then).
OWebl, in spite of its shiny website, is not what I would call "popular". You want to look at opium[1] or ocsigen/eliom [2].
Don't know what you have against OPAM, it's the best language-specific dependency manager I know.
1: https://github.com/rgrinberg/opium 2: http://ocsigen.org/eliom/
My problem with OPAM is that there's no Windows port, which makes developing OCaml programs on Windows difficult at best. In theory, we can compile all of the dependencies by hand, but in practice, the Oasis build scripts refuse to work on Windows for most packages that I've tried.
Personally, I would love to use OCaml for my company's projects, but the lack of true cross platform tooling makes it impossible at the moment.
That's a fair criticism. I think there is also an ongoing effort to make OPAM work on Windows, so this may get better in the future.
My only problem with OPAM from the precisely eight seconds I've spent looking at it: 1) unsigned packages 2) not named something goofy like "OPamL." Serious missed opportunity there, guys.
I think the daily quota of puns had been reached on that day. As for signing packages, I think it's being worked on, I remember reading something about it not that long ago.
Hum, OWebl doesn't exist. The author made some buzz because he's decent at marketing (and much better at it than the rest of the OCaml community, clearly), but it doesn't mean it's a thing.
There are other (good) web frameworks in OCaml (ocsigen, opium).
Yes, please open up the code, notice how the implementation of HTTP handling is approximately 30 lines long, and ask yourself if you would actually want to use that. ;)