HifiZine
The enthusiast's audio webzine

Virtualizing Roon Server – Part 2

This is a continuation of Virtualizing Roon Server, in which I set up a virtualized Roon Server on a couple of different small computers and ran some initial performance tests. In this part, I’ll do some more tests and wrap up with my thoughts on this experiment.

More load tests

This section continues the tests I started in the previous article. The aim is to better understand how Roon Server uses resources – in particular CPU cycles – to help with sizing a VM.

As before, each graph shows the CPU usage of the Roon VM as captured by Netdata, with percentage amounts on the left-hand scale relative to a single core. Full usage of the H2 CPU would be 400% and full usage of the NUC CPU would be 800%.

DSD oversampling

Previously I used oversampling to 768 kHz and a long convolution filter to try and stress the processor. Oversampling to DSD can be even more DSP-intensive. I set Roon to oversample to the maximum DSD rate that my ADI-2 Pro can accept, DSD256.
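
For a sense of scale, DSD rates are conventionally named as multiples of the 44.1 kHz CD sample rate, so DSD256 is a 1-bit stream at roughly 11.3 MHz. The following Python sketch works out the numbers. It assumes the 44.1 kHz family (48 kHz-based variants also exist), and the function name is only for illustration.

```python
BASE_RATE_HZ = 44_100  # CD sample rate; assumed base for the DSD multiples


def dsd_rate_hz(multiple: int) -> int:
    """Return the 1-bit sample rate in Hz for DSD<multiple> (e.g. 256 for DSD256)."""
    return multiple * BASE_RATE_HZ


for multiple in (64, 128, 256):
    print(f"DSD{multiple}: {dsd_rate_hz(multiple) / 1e6:.4f} MHz")
```

Note that a 1-bit DSD sample rate is not directly comparable to a multi-bit PCM rate such as 768 kHz, so these figures indicate the size of the output stream rather than the relative DSP cost.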

ODroid H2

With the ODroid H2 oversampling the synthetic test files to DSD256, CPU usage of the Roon VM sits right on 100%:

Roon’s DSP Engine page says that “Roon currently runs the DSP engine on one CPU core per zone,” so it seems that a single DSP thread is saturating one core. Sure enough, regular audio files played back with some stuttering, and playback from Qobuz stuttered and then skipped to the next track.

One option is to take it down a notch and oversample to DSD128 instead. In this case, the H2 runs at 65% of a core and Qobuz playback worked fine, at least in some short listening tests.

However, it’s also possible to parallelize the DSD oversampling – Roon has the ability to use more than one core for DSP in this particular case only. I didn’t realize at first because the option to turn it on is hidden in the UI if the processor has only two cores!

If the VM is allocated three cores, this option can be turned on, and DSD256 upsampling on the H2 then runs at about 112% of a core with 190% peaks:

In this case, there was no stuttering even with Qobuz playback. Incidentally, if the VM is set back to two cores, the option to parallelize the modulator will remain active although removed from the UI, and playback is still good.

NUC8i5BEH

The NUC had no problem with oversampling to DSD256, running around 50% of a logical core:

Streaming from Qobuz

Interestingly, streaming from Qobuz adds quite a bit of additional load to the processor while the file is downloading. For these tests, I queued up files at 16/44, 24/96, and 24/192 bit depth/sample rate. I started the first track, waited until about 30 seconds after the download stopped, then skipped to the next track. My network download speed is about 35 megabits/sec; someone with a higher network speed might see higher CPU usage.

ODroid H2

With no DSP, the ODroid H2 runs about 100% of a core while downloading:

All tracks are the same length – 4 minutes, within 10 seconds – so it’s curious that the 192k track is quicker to download than the 96k track.
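
One way to put rough bounds on these download times is to work out the uncompressed size of each track. The Python sketch below is only a back-of-envelope estimate: it assumes stereo tracks of exactly 4 minutes and the 35 megabits/sec link mentioned above, and it ignores the lossless compression applied to streamed files, so real downloads should be smaller and quicker. The function names are made up for this example.

```python
TRACK_SECONDS = 4 * 60        # assumed: exactly 4 minutes
LINK_BITS_PER_SEC = 35e6      # download speed quoted in the article


def raw_pcm_bits(bit_depth, sample_rate_hz, seconds, channels=2):
    """Size in bits of uncompressed PCM audio (an upper bound for a lossless file)."""
    return bit_depth * sample_rate_hz * channels * seconds


def download_seconds(size_bits, link_bits_per_sec):
    """Ideal transfer time, ignoring protocol overhead and server-side limits."""
    return size_bits / link_bits_per_sec


for bit_depth, rate in ((16, 44_100), (24, 96_000), (24, 192_000)):
    bits = raw_pcm_bits(bit_depth, rate, TRACK_SECONDS)
    secs = download_seconds(bits, LINK_BITS_PER_SEC)
    print(f"{bit_depth}/{rate / 1000:g}: {bits / 8e6:.0f} MB raw, about {secs:.0f} s at most")
```

Since the compression ratio varies from track to track, a 24/192 file need not be twice the size of a 24/96 file of the same length, which may be part of why the 192k track downloaded faster here.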

With PCM oversampling to 768k and a 128k-tap convolution filter, the H2 runs at about 140% of a core while downloading, with peaks up to 190%:

With DSD256 oversampling, parallel delta-sigma modulation turned on and three cores allocated to the Roon VM, the H2 runs around 200% while downloading, with peaks up to 270%:

I didn’t hear any glitches or stuttering during this last test, but I’ll continue to use the virtualized Roon Server this way and update here if any occur.

NUC8i5BEH

With no DSP, the NUC runs about 70% of a logical core while downloading:

With PCM oversampling to 768k and a 128k-tap convolution filter, the NUC runs at around 90% of a logical core while downloading, with peaks of up to about 130%.

With DSD256 oversampling, the NUC runs around 85% while downloading, with peaks as high as 160%:

Other tasks

Roon Server needs to do other things besides music playback. For example, using the UI while Roon is idling produces some blips in the CPU usage graph:

Heavier activity is generated by a library import. Here are the first three minutes on the H2 importing from a direct-attached drive.

The library import works the two allocated cores pretty well. I wondered if this might upset audio playback in some way. In the next graph, playback of a 16/44 file from Qobuz with 768k oversampling and convolution starts just before 1:24:30.

While a solid amount of the two allocated cores is being used, there were no playback glitches or stuttering. Roon seems to do a great job of prioritizing audio playback over other tasks.

After the library import, Roon by default analyzes the actual audio data in every file for purposes such as volume normalization and the waveform display. The amount of CPU allocated to this background task can be found in Settings → Library:

The default setting is Throttled, which uses about 40% of a core (regardless of the processor). After an initial library load, this analysis takes a long time (days or weeks), so if one were really in a hurry the VM could be allocated more cores for this task. This is more interesting to see on the NUC – I allocated all 8 logical cores to the Roon VM, then increased the Analysis Speed setting from Off to Throttled, then 1, 2, 4 and 8 cores:

One must be careful, though, as there are always trade-offs! The NUC is not designed for continuous high workloads and CPU temperatures rose quickly until all (physical) cores were sitting at 100 degrees:

Summary of numbers

The graphs and numbers above are relative to a single core. To get a better understanding of resource usage per physical machine, here are some of the key numbers converted to a proportion of the CPU – that is, where full usage of the CPU would be 100%.

I’ve used the constant load figures here, but as shown in the graphs there will usually be some additional peak usage. These are all with a single audio stream. While I myself only ever run a single stream, someone running multiple simultaneous streams would see these load figures scale accordingly.

(*) Three allocated cores recommended for the H2.
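
The conversion itself is just a division by the number of logical cores. As a minimal sketch in Python, using a few of the per-core figures quoted earlier in this article (the function name is hypothetical, and the linear scaling with stream count is an assumption rather than something measured here):

```python
# Logical core counts implied by the 400% and 800% full-usage figures above.
LOGICAL_CORES = {"ODroid H2": 4, "NUC8i5BEH": 8}


def share_of_cpu(per_core_percent, logical_cores, streams=1):
    """Convert a per-core percentage (as graphed by Netdata) to a share of the whole CPU.

    `streams` assumes load scales linearly with simultaneous audio streams.
    """
    return per_core_percent * streams / logical_cores


loads = [
    ("ODroid H2", "DSD128 oversampling", 65),
    ("ODroid H2", "DSD256, parallel modulator", 112),
    ("NUC8i5BEH", "DSD256 oversampling", 50),
]
for machine, test, per_core in loads:
    share = share_of_cpu(per_core, LOGICAL_CORES[machine])
    print(f"{machine:10s} {test:28s} {per_core:4d}% of a core = {share:.2f}% of the CPU")
```

These are steady-state values only; the peaks visible in the graphs would convert in the same way.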

Conclusion

I’ve been running a virtualized Roon Server for a few weeks now with two cores of the H2, with no issues. I’ve even tried running on a single core – without DSP – and had no problems with playback. For this article I tried stressing the virtualized Roon Server more and I’m impressed by how well it handled the loads I threw at it. I declare the experiment a success!

For regular playback, either processor is loafing along. In my efforts to use up CPU power with DSP, the only limitation I ran into is that the H2’s single-core performance is not enough for DSD256 upsampling, but turning on the (hidden) option to parallelize the modulator has solved that. As to whether these extreme oversampling frequencies are even a good idea, I guess that’s for each listener to decide.

A cautionary note: virtualization is not for everyone. For starters, if you don’t have other services that you want to run on your physical hardware, there’s not much point. And it does involve a deeper level of “IT,” although Proxmox VE makes it surprisingly easy.

Although I used the ODroid H2 and Intel NUC for this experiment, there’s no particular reason that either of these need be used for a virtualized Roon Server. In reality, I expect that anyone considering doing this will already have suitable hardware. I hope, however, that this article might encourage some to give it a try. Please post in the comments below if you do.

What about virtualizing Roon ROCK? I’ll take a look at that in the next article.

Postscript: Containerizing Roon Server

After completing this article, I decided to try running Roon Server in an LXC container. This could be thought of as “lightweight virtualization” and is directly supported by Proxmox VE. I won’t go into details on the procedure, but a few notes that may help if you decide to try it:

  • The container must be privileged (uncheck “Unprivileged” on the first screen when creating the container).
  • If you want to attach a drive to the container, mount and format it on the PVE host, then use a bind mount.

Here is the load graph on the ODroid H2 when streaming from Qobuz with the test playlist used earlier in this article:

Here is the load graph when streaming the playlist from Qobuz while upsampling to DSD256:

The Roon Server container uses less CPU than the Roon Server VM. The difference is greater than I expected.


Readers' comments

    I have a very different experience streaming from Qobuz. There’s a brief (~2s) spike at the start of each track (to ~160% of a core). But then the sustained CPU usage is no different from playing a file from my LAN.

    I am curious to know the reason for the disparity.

    * It could be because my music is remote-mounted from a NAS, rather than residing on a disk on the Roon Core machine.
    * It could be because my broadband connection (236 Mbps) is faster than yours.
    * Or perhaps it’s an artifact of your assigning only two cores to Roon during this test.

    Would be good to know which.

    More broadly, I really appreciate this series of posts, even though they do jumble together two separate matters for investigation.

    1. The benefits of virtualization
    2. Using netdata to explore the performance characteristics of Roon Core

    Of course, you need to do the latter if you want to configure the VM correctly for running Roon. But it’s also of independent interest to those of us profligate enough to consider devoting an entire Odroid H2 to running Roon.

    • Hah hah, profligate indeed! Thank you for trying that, that’s a very interesting result. Perhaps it’s a combination of download speed and file size; the latter seems inconsistent. I’ll try some more tests. In the meantime, here are the tracks I used:
      https://open.qobuz.com/track/1755947
      https://open.qobuz.com/track/63522440
      https://open.qobuz.com/track/53971256

      • I queued up your 3 tracks, waited till Roon quieted down (it’s been pretty busy this morning, doing SOMETHING, even without my playing any music), and then pressed play.

        With the first track (16/44.1), there was
        * a spike to ~140% which lasted about 6 seconds, then
        * Roon used ~5% for the rest of the track (with the usual random spikes)
        With the second track (24/96),
        * a spike to ~140%, which lasted about 12 seconds, then
        * Roon used ~9% for the rest of the track (with the usual random spikes)
        With the third track (24/192),
        * a spike to ~140% which lasted about 8 seconds, then
        * Roon used ~14% for the rest of the track (with the usual random spikes)

        So I’m definitely seeing (the familiar phenomenon) that high-resolution tracks chew up more CPU. But, aside from the initial spike (which is usually only about 2 seconds, for a local file), I don’t see any difference between the sustained CPU usage when playing a Qobuz file versus a local file.

        In the settings, you can cap the resolution sent by Qobuz (I usually keep it at 24/96, but removed the cap for this test). If you do that, then the CPU usage is reduced accordingly.

        • I added some graphs above when running Roon Server in a container, which doesn’t have the VM overhead. The download time differs from yours by about a factor of 5, which could reasonably be attributed to download speed. After that, it’s about 4, 8 and 10%.

          • I was gonna say that containerization seemed the more natural route, for this purpose, than virtualization. But, since I knew you were headed towards virtualizing Rock …

            Anyway, it’s good to see that, containerized, your Qobuz results are more in line with mine. My internet connection is 200 Mbps down/10 Mbps up, which might account for the faster download times.

            I guess my last question is: now that all the experiments are done, what are you gonna run RoonServer on?

          • “I guess my last question is: now that all the experiments are done, what are you gonna run RoonServer on?”
            Well, after all this time 🙂 I’m still using the virtualized RoonServer. I suppose because it’s fine and my library was already fully indexed.

        There is one important difference between playing local tracks and tracks from Qobuz. For local tracks, Roon buffers 30 seconds worth. So there’s a little spike in disk access every 30 seconds (like clockwork!), as Roon reads the next chunk of the file.

        For Qobuz files, Roon caches the whole track at the beginning (big spike in network traffic at the beginning of the track, then nothing till just before the next track starts).
