Thursday, May 2, 2013

QUItIndicators: Performance considerations

(This is part III; please also check parts I and II.)

Even with a small component like this, there are plenty of possibilities to improve (or sink) the performance. To make sure that our indicators perform as expected, we'll test them on a Nokia N9 and on a Raspberry Pi.

Both of these devices are relatively low-end by current standards. The N9 contains a 1GHz Cortex-A8 CPU which is still quite beefy, but its GPU (SGX530) is getting old and is unable to handle more complicated fragment shaders. For the RPi it is just the opposite: the CPU is a slowish 700MHz ARM11, while the GPU (VideoCore IV) is more performant than the N9's SGX530. Because of these qualities (and because both support Qt5, naturally) this duo is excellent for our performance testing.

So here's a short video showing how our indicators perform on the Nokia N9 and on the Raspberry Pi:




Performs pretty OK on both, right? The ProgressIndicator stress test isn't smooth on the RPi, which indicates that its CPU can't handle 100 indicators animated like that. So the stress test does what it's supposed to, and normal use cases perform well.

Even with the declarative QML language, good performance doesn't just happen; you need to work towards it. Looking at the sources of these indicators, at least these tips/notes can be made:

  • BusyIndicator and ProgressIndicator are separate components instead of just one component with e.g. an "indeterminate" property. This allows using a more light-weight BusyIndicator component (only one texture, animating only in the vertex shader, fewer properties etc.) for indeterminate use cases. The lesson to learn is to avoid making overly general, bloated QML components.
  • There are four different sizes for indicators, from small (64x64px) to huge (512x512px). These sizes were selected because power-of-two texture sizes are more optimal for the GPU. The sourceSize property is used to scale and cache textures of exactly the correct size.
  • BusyIndicator animation is achieved purely in the vertex shader. As the vertex shader is run once for every vertex instead of once for every fragment (pixel), it can be much more GPU-friendly. In the case of a 256x256px indicator, our vertex shader is executed over 650(!) times less often than the fragment shader.
  • To keep the number of vertices as small as possible while making sure the animation still looks smooth, the mesh size is chosen based on the indicator size. The GridMesh resolution for a 256x256 indicator is 10x10, while a 64x64 indicator manages with a 4x4 mesh.
  • When using items and images as sources for a ShaderEffect and not needing the original one, remember to set its visibility to false to prevent it from rendering.
  • ProgressIndicator can show percentages in the center and applies the vertex animation to them too, which requires that the whole Item is used as a source for the ShaderEffect. Initializing an Item as a ShaderEffect source is slightly slower than initializing an Image. This is because Items (with their child Items) need to be rendered into an FBO first, while Image textures are instantly available. When percentages are disabled (showPercentages: false), the Image is used directly and the stress test of 162 ProgressIndicators starts pretty much instantly. So if you need instantly appearing ProgressIndicators, don't show percentages on them.
  • Minimize the number of property changes. Property bindings in QML are so easy to make that without paying attention you may forget to disable them when they are not needed. As an example, the visibility of the ProgressIndicator percentages Text element was set to false when percentages were disabled, and it wasn't even used as part of the shader source. But because of an oversight, its text property was still updated whenever the ProgressIndicator value changed. That's bad, and it is fixed now to update the text only when showPercentages is true. The Qt Creator QML Profiler is the tool to use to analyze your code.
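
To put rough numbers on the vertex-versus-fragment point above, here is a small back-of-the-envelope sketch in Python. It is not part of the QUItIndicators sources, and the function names are made up for illustration. It assumes the fragment shader runs once per pixel of the indicator and the vertex shader once per mesh vertex. It also shows two ways of counting vertices: treating a 10x10 mesh as 100 grid points (which gives the "over 650" figure), and treating the resolution as a count of cells, so that an NxN mesh has (N+1)x(N+1) vertices. Either way the difference is in the hundreds.

```python
def fragment_invocations(width_px, height_px):
    """Assumed fragment shader runs: one per pixel of the indicator."""
    return width_px * height_px


def vertex_invocations(cols, rows, resolution_counts_cells):
    """Assumed vertex shader runs: one per mesh vertex.

    If the mesh resolution counts cells, a cols x rows mesh has
    (cols + 1) * (rows + 1) vertices; otherwise it is treated as
    cols * rows grid points.
    """
    if resolution_counts_cells:
        return (cols + 1) * (rows + 1)
    return cols * rows


def ratio(size_px, mesh, resolution_counts_cells):
    """How many times more often the fragment shader runs than the vertex shader."""
    fragments = fragment_invocations(size_px, size_px)
    vertices = vertex_invocations(mesh, mesh, resolution_counts_cells)
    return fragments / vertices


if __name__ == "__main__":
    # Indicator sizes and mesh resolutions mentioned in the list above.
    for size_px, mesh in ((256, 10), (64, 4)):
        as_points = ratio(size_px, mesh, resolution_counts_cells=False)
        as_cells = ratio(size_px, mesh, resolution_counts_cells=True)
        print(f"{size_px}x{size_px}px, {mesh}x{mesh} mesh: "
              f"{as_points:.1f}x (grid points) / {as_cells:.1f}x (cells)")
```

The same arithmetic also shows why the mesh is scaled with the indicator size: the smaller indicator has far fewer pixels to cover, so it can get away with a much coarser mesh.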

So with all this, does it mean that these indicator components are fully optimized? Well, of course not! There is still some generalization left and room for improvement. Additionally, although we have tried to offload most of the work from the CPU to the GPU, indicator animations like these should be run in a separate thread instead of the GUI thread. Only this would allow smooth 60fps even under heavy CPU load. For details, please read this blog post by Gunnar: http://blog.qt.digia.com/blog/2012/08/20/render-thread-animations-in-qt-quick-2-0/

Summing up briefly: when implementing QML components for applications, developers and designers should co-operate closely to bend the design to be as CPU- and GPU-friendly as possible while still keeping designers happy. Designers dig perfect pixels, but they also appreciate smooth 60fps. By making wise compromises and utilizing Qt5 correctly, we can deliver both.

Sources of QUItIndicator components & examples are available from: http://quitcoding.com/?page=work#indicators
