GPU Acceleration Explained

Modern GPUs use most of their transistors to perform calculations related to 3D computer graphics. Along with the 3D hardware, today’s GPUs include basic 2D acceleration and framebuffer capabilities (usually with a VGA compatibility mode). Newer cards such as the AMD/ATI HD5000-HD7000 series even lack 2D acceleration; it has to be emulated by the 3D hardware. GPUs were initially used to accelerate the memory-intensive work of texture mapping and rendering polygons, and later gained units to accelerate geometric calculations such as the rotation and translation of vertices into different coordinate systems. Recent developments in GPUs include support for programmable shaders that can manipulate vertices and textures with many of the same operations supported by CPUs, oversampling and interpolation techniques to reduce aliasing, and very high-precision color spaces. Because many of these computations involve matrix and vector operations, engineers and scientists have increasingly studied the use of GPUs for non-graphical calculations; they are especially suited to other embarrassingly parallel problems.
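
To make the "rotation and translation of vertices" idea concrete, here is a minimal sketch in plain Python. It is only an illustration, not how any particular GPU or driver implements the step: the function names (rotation_z_translation, transform_vertex, transform_all) are made up for this example, and a real GPU would run the per-vertex work in parallel hardware rather than in a loop.

```python
import math

def rotation_z_translation(angle_rad, tx, ty, tz):
    """Build a 4x4 homogeneous matrix: rotate about the Z axis, then translate."""
    c, s = math.cos(angle_rad), math.sin(angle_rad)
    return [
        [c,   -s,  0.0, tx],
        [s,    c,  0.0, ty],
        [0.0, 0.0, 1.0, tz],
        [0.0, 0.0, 0.0, 1.0],
    ]

def transform_vertex(matrix, vertex):
    """Multiply the 4x4 matrix by one (x, y, z) vertex, treated as (x, y, z, 1)."""
    x, y, z = vertex
    v = (x, y, z, 1.0)
    out = [sum(row[i] * v[i] for i in range(4)) for row in matrix]
    return (out[0], out[1], out[2])

def transform_all(matrix, vertices):
    # Every vertex is transformed independently of the others. That
    # independence is what makes the job embarrassingly parallel.
    return [transform_vertex(matrix, v) for v in vertices]

# Rotate a triangle 90 degrees about Z, then shift it one unit along X.
m = rotation_z_translation(math.pi / 2, 1.0, 0.0, 0.0)
triangle = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]
moved = transform_all(m, triangle)
```

Because no vertex depends on the result of another, the same matrix can be applied to millions of vertices at once, which is the kind of workload the paragraph above describes.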

Source: gpuhub

With the emergence of deep learning, the importance of GPUs has increased. In research done by Indigo, it was found that when training deep learning neural networks, GPUs can be 250 times faster than CPUs. The explosive growth of deep learning in recent years has been attributed to the emergence of general-purpose GPUs. There has been some competition in this area from ASICs, most prominently the Tensor Processing Unit (TPU) made by Google. However, ASICs require changes to existing code, and GPUs remain very popular.

GPU accelerated video decoding and encoding

The ATI HD5470 GPU (above) features UVD 2.1, which enables it to decode AVC and VC-1 video formats.

Most GPUs made since 1995 support the YUV color space and hardware overlays, which are important for digital video playback, and many GPUs made since 2000 also support MPEG primitives such as motion compensation and iDCT. This process of hardware-accelerated video decoding, where portions of the video decoding process and video post-processing are offloaded to the GPU hardware, is commonly referred to as “GPU accelerated video decoding”, “GPU assisted video decoding”, “GPU hardware accelerated video decoding” or “GPU hardware assisted video decoding”.
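
As a rough illustration of why YUV support matters for playback, the sketch below converts one pixel from a YUV-style representation to RGB, which is the per-pixel arithmetic that would otherwise fall to the CPU. It assumes full-range 8-bit YCbCr with the BT.601 coefficients used by JFIF; actual video streams often use other ranges or matrices, and the function names here are made up for the example.

```python
def clamp8(value):
    """Round to the nearest integer and clamp to the 0..255 range."""
    return max(0, min(255, int(round(value))))

def ycbcr_to_rgb(y, cb, cr):
    """Convert one full-range 8-bit YCbCr pixel to RGB.

    Assumption: BT.601 coefficients as used by JFIF, with the chroma
    channels centered on 128.
    """
    r = y + 1.402 * (cr - 128)
    g = y - 0.344136 * (cb - 128) - 0.714136 * (cr - 128)
    b = y + 1.772 * (cb - 128)
    return (clamp8(r), clamp8(g), clamp8(b))

# With neutral chroma (128, 128) the output is a grey equal to the luma value.
mid_grey = ycbcr_to_rgb(128, 128, 128)
```

The conversion is a small fixed matrix multiply per pixel, repeated for every pixel of every frame, so it maps naturally onto dedicated or parallel hardware.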

Newer graphics cards even decode high-definition video on the card, taking that load off the central processing unit. The most common APIs for GPU accelerated video decoding are DxVA for the Microsoft Windows operating system and VDPAU, VAAPI, XvMC, and XvBA for Linux-based and UNIX-like operating systems. All except XvMC are capable of decoding videos encoded with MPEG-1, MPEG-2, MPEG-4 ASP (MPEG-4 Part 2), MPEG-4 AVC (H.264 / DivX 6), VC-1, WMV3/WMV9, Xvid / OpenDivX (DivX 4), and DivX 5 codecs, while XvMC is only capable of decoding MPEG-1 and MPEG-2.

Nvidia’s NVENC uses the system’s GPUs to accelerate video encoding.

The video decoding processes that can be accelerated by modern GPU hardware are:

  • Motion compensation (mocomp)
  • Inverse discrete cosine transform (iDCT)
  • Inverse telecine 3:2 and 2:2 pull-down correction
  • Inverse modified discrete cosine transform (iMDCT)
  • In-loop deblocking filter
  • Intra-frame prediction
  • Inverse quantization (IQ)
  • Variable-length decoding (VLD), also known as slice-level acceleration
  • Spatial-temporal deinterlacing and automatic interlace/progressive source detection
  • Bitstream processing (Context-adaptive variable-length coding/Context-adaptive binary arithmetic coding)
  • Perfect pixel positioning
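
Two of the steps listed above, inverse quantization (IQ) and the inverse discrete cosine transform (iDCT), can be sketched in a few lines of plain Python. This is a simplified illustration under stated assumptions: it works on a single 8-sample row with one uniform quantizer step, whereas real codecs use 2-D blocks and per-coefficient quantization tables. The function names and the sample values are made up for the example.

```python
import math

N = 8  # block length; 8-sample blocks are assumed here for illustration

def _scale(k):
    """Orthonormal DCT scaling factor for coefficient index k."""
    return math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N)

def dct(samples):
    """Orthonormal 1-D DCT-II (the forward transform an encoder would apply)."""
    return [
        _scale(k) * sum(
            samples[n] * math.cos(math.pi * (2 * n + 1) * k / (2 * N))
            for n in range(N)
        )
        for k in range(N)
    ]

def idct(coeffs):
    """Orthonormal 1-D inverse DCT (DCT-III), the step a decoder performs."""
    return [
        sum(
            _scale(k) * coeffs[k] * math.cos(math.pi * (2 * n + 1) * k / (2 * N))
            for k in range(N)
        )
        for n in range(N)
    ]

def quantize(coeffs, step):
    """Encoder side: map each coefficient to an integer level."""
    return [int(round(c / step)) for c in coeffs]

def inverse_quantize(levels, step):
    """Decoder side (IQ): scale integer levels back to coefficient values."""
    return [level * step for level in levels]

row = [52, 55, 61, 66, 70, 61, 64, 73]  # hypothetical pixel row
levels = quantize(dct(row), step=4)
decoded = idct(inverse_quantize(levels, step=4))
```

The decoded row is close to the original but not identical, because quantization discards precision. Each output sample is an independent sum over the coefficients, which is why this step is a good fit for GPU hardware.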