Release Notes for Monte Carlo eXtreme v2016.4

Code name: Dark Matter, released on April 23, 2016

Click this link to download MCX v2016.4.

1. Introduction
2. About this release
3. What's new compared to 0.9.7-2
4. System requirements
5. Reference

1. Introduction

Monte Carlo eXtreme, or MCX, is a fast Monte Carlo simulation package for photon transport in 3D heterogeneous media. It uses Graphics Processing Units (GPUs) for massively parallel simulations and runs hundreds of times faster than a traditional single-threaded CPU-based simulation.

2. About this release

MCX v2016.4 (code named "Dark Matter") is a milestone release. It accumulates numerous improvements made over the past two years, including major new features and fixes for critical bugs. If you are using an older release of MCX, we strongly recommend that you upgrade immediately.

In this new release, MCX implements a fast, precise ray-tracing algorithm to propagate photons inside a voxelated domain, replacing the old approximate ray-tracer. This update significantly improves solution accuracy when a medium contains regions whose scattering properties differ from the background. MCX also enables atomic operations by default, so concurrent writes to the same voxel are no longer lost and more accurate results can be obtained near the source regions. Another major improvement is support for multiple GPUs: one can now use several GPUs in a computer simultaneously to obtain a proportional speed improvement!
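To make the idea of precise ray-tracing in a voxelated domain concrete, here is a minimal Python sketch. It is not MCX's CUDA implementation; the function names are hypothetical, and it assumes unit-length voxels aligned to integer coordinates. It computes the exact distance from a photon position to the first voxel face along its direction, then splits a path into per-voxel segments.

```python
import math

def dist_to_voxel_face(pos, direction):
    """Return (distance, axis) of the first voxel face the ray crosses.

    Assumes unit voxels whose faces lie on integer coordinates.
    Returns (inf, -1) if the direction vector is all zeros.
    """
    best_dist, best_axis = math.inf, -1
    for axis in range(3):
        p, v = pos[axis], direction[axis]
        if v > 0.0:
            d = (math.floor(p) + 1.0 - p) / v   # next face in +axis
        elif v < 0.0:
            d = (math.ceil(p) - 1.0 - p) / v    # next face in -axis
        else:
            continue                            # ray is parallel to these faces
        if d < best_dist:
            best_dist, best_axis = d, axis
    return best_dist, best_axis

def trace_segments(pos, direction, length):
    """Split a straight path of the given length into (voxel, path_length) pairs."""
    pos = list(pos)
    remaining = length
    segments = []
    while remaining > 1e-12:
        dist, axis = dist_to_voxel_face(pos, direction)
        step = min(dist, remaining)
        # Identify the voxel from the segment midpoint to avoid face ambiguity.
        mid = [p + 0.5 * step * v for p, v in zip(pos, direction)]
        segments.append((tuple(math.floor(m) for m in mid), step))
        pos = [p + step * v for p, v in zip(pos, direction)]
        if step == dist:
            # Snap onto the crossed face so round-off does not accumulate.
            pos[axis] = float(round(pos[axis]))
        remaining -= step
    return segments
```

For example, a path of length 2 starting at (0.5, 0.5, 0.5) and travelling along +x spends 0.5, 1.0 and 0.5 voxel lengths in voxels (0,0,0), (1,0,0) and (2,0,0). Because each segment length is exact, the optical properties of every voxel can be applied over exactly the distance travelled inside it.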

Moreover, the MCX project now has a new domain name and website (at )! We adopted a modern and intuitive web design and highlighted the main functionalities of the website: download, documentation, and user participation. We will continue improving this new web portal to include more information and better serve our users.

To engage with our user community, we launched two new efforts with this release:

1. MCX Speed Challenge

As we announced earlier this month at GTC'2016, we cordially invite CUDA developers and MCX users around the world to join us in improving MCX simulation speed. Anyone who achieves a speedup of over 50% on 3 of our standard benchmarks will receive a cash award. The full announcement and detailed rules for the MCX Speed Challenge can be found at

2. MCX GPU Benchmark

We invite all MCX users to submit their GPU hardware benchmark data to our website. From these submissions, we will compile a living document that ranks all submitted GPU hardware by MCX simulation efficiency. This benchmark report will give users guidance on building an efficient computer system for running MCX simulations. Also, for fun, users can compete with each other to find new ways to run MCX efficiently. To participate, please run the mcx_gpu_contest.m script in mcxlab.

In addition, starting from this release, we will adopt a time-based release strategy: we will release a new MCX version every 3 months. This way, new features and fixes can get to the hands of users with much shorter delays.

3. What's new compared to 0.9.7-2

Compared to the previous release (version 0.9.7-2, released in Sep. 2014), MCX v2016.4 gains the following new features and bug fixes:

  1. !!key!! new precise ray-tracing algorithm for accurate simulations
  2. !!key!! multiple GPU support
  3. !!key!! atomic operations enabled by default
  4. !!key!! two new RNGs - xorshift128+ and erand48, xorshift128+ is now the default
  5. !!key!! save detected photon seeds to enable photon replay
  6. !!key!! new MCX website:
  7. !!key!! significant (3x) speed improvement for Maxwell GPUs
  8. !!key!! add 3 standard benchmarks and launch "MCX Speed Challenge"
  9. !!key!! add support in MCXLAB to allow user-contributed GPU benchmark submissions
  10. !!key!! CUDA libraries are now statically linked in MCX/MCXLAB to simplify installation
  11. support two new source forms: 'line' and 'slit'
  12. fix bugs related to forward scattering bias due to round-off test
  13. better heuristics for determining thread/block configurations when using -A (Fanny Paravecino)
  14. fix JSON input crash
  15. fix refractive index mismatch bug in transmission calculations
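
Item 4 above names xorshift128+ as the new default RNG. For readers unfamiliar with it, the following Python sketch shows the general structure of the generator. The class name is hypothetical, and the shift constants (23, 18, 5) follow one published variant of xorshift128+; whether MCX uses exactly these constants, or this float conversion, is an assumption.

```python
MASK64 = (1 << 64) - 1  # emulate 64-bit unsigned wrap-around in Python

class XorShift128Plus:
    """Illustrative xorshift128+ generator with a 128-bit state."""

    def __init__(self, seed0, seed1):
        s0, s1 = seed0 & MASK64, seed1 & MASK64
        if s0 == 0 and s1 == 0:
            raise ValueError("the 128-bit state must not be all zeros")
        self.s = [s0, s1]

    def next_u64(self):
        """Advance the state and return a 64-bit unsigned integer."""
        s1, s0 = self.s[0], self.s[1]
        result = (s0 + s1) & MASK64          # the "+" in xorshift128+
        self.s[0] = s0
        s1 ^= (s1 << 23) & MASK64            # shift a
        self.s[1] = s1 ^ s0 ^ (s1 >> 18) ^ (s0 >> 5)  # shifts b, c
        return result

    def next_float(self):
        """Return a float in [0, 1) built from the top 53 bits."""
        return (self.next_u64() >> 11) * (1.0 / (1 << 53))
```

The generator needs only two 64-bit words of state and a handful of shifts, XORs and one addition per draw, which is why this family of generators suits per-thread use on a GPU.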

Pre-compiled MCX binaries are provided for Windows (64-bit), Linux (64-bit) and Mac OS (64-bit). For MCXLAB, mex files for both Matlab and Octave are provided on these platforms. All binaries have been tested on Fermi, Kepler and Maxwell GPUs.

The provided binaries require a Fermi (Compute Capability 2.0) or newer GPU. If you have an older GPU (CC 1.0 or 1.1), you will have to recompile mcx using "make fast".

The detailed change logs can be found in the ChangeLog and GitHub commit history pages.

4. System requirements

To install MCX v2016.4, you need:

  • a CUDA-enabled graphics card made by NVIDIA (a full list of supported cards can be found here)
  • a computer running GNU/Linux, Windows or Mac OS

The CUDA toolkit is no longer required in this release. However, if you run into CUDA errors, please download the latest CUDA driver from here

In this release, all precompiled binaries, including both mcx executables and mcxlab mex files, have built-in CUDA run-time libraries via static linking. Therefore, downloading and setting up the CUDA toolkit and the run-time library files (cudart.dll/ is no longer needed.

However, if you run into CUDA errors, please first try to update your NVIDIA graphics driver to the latest version.

If the latest graphics driver still cannot solve the problem, please download the "developer driver" for your GPU. You may download the developer driver as part of the CUDA Toolkit installation package.

To use MCXLAB v2016.4 in GNU Octave, you must install the following:

  • GNU Octave
  • libblas, libgfortran and libhdf5

Note: if you have a Maxwell GPU (GTX 980Ti or 980) and plan to run MCX on it, please first run the benchmark script "" or "run_benchmark1.bat" under the mcx/example folder. You should see about 29,000 photons/ms on the 980Ti and 20,000 photons/ms on the 980. If your simulation speed is around 1,200 to 1,500 photons/ms, you are affected by a bug in the CUDA driver. Please recompile MCX using CUDA 7.0 or 6.5, or wait for NVIDIA to release an updated driver. For details, please see
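
As a rough aid for interpreting the benchmark result, the following Python sketch converts a photon count and run time into photons/ms and flags speeds in the range reported above for the driver bug. The function names are hypothetical, the thresholds are taken directly from the figures in this section, and the check is only a heuristic: a slow result can have other causes.

```python
# Approximate expected speeds (photons/ms) quoted in these release notes.
EXPECTED_SPEED = {"GTX 980Ti": 29000.0, "GTX 980": 20000.0}

def photons_per_ms(nphoton, elapsed_ms):
    """Simulation speed in photons per millisecond."""
    if elapsed_ms <= 0:
        raise ValueError("elapsed_ms must be positive")
    return nphoton / elapsed_ms

def looks_like_driver_bug(speed, low=1200.0, high=1500.0):
    """True if the speed falls in the range associated with the driver bug."""
    return low <= speed <= high

# Example: 1e8 photons simulated in 5 seconds on a GTX 980.
speed = photons_per_ms(1e8, 5000.0)
print(speed, looks_like_driver_bug(speed))
```

In this example the computed speed matches the figure quoted for the GTX 980, so the check does not flag it.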

5. Reference

Qianqian Fang and David A. Boas, "Monte Carlo Simulation of Photon Migration in 3D Turbid Media Accelerated by Graphics Processing Units," Opt. Express, vol. 17, issue 22, pp. 20178-20190 (2009)
