GPU Clocking
Any electronic device needs a clock source to operate, and nVIDIA GPUs are no exception. In early GPUs, the phase-locked loop (PLL) clock generation system was derived from the one used by the SGS-Thomson (now STMicro) STG-1764 "Van Gogh", which served as the external RAMDAC on the NV1. Although the set of PLLs present varied from chip to chip, the overall implementation remained very similar from NV1 until NV20. A partial break from this system came with the introduction of multi-stage clocks in NV30, and a full break followed with NV40's implementation of clock domains[1].
Initial clock system (1995-2002)
The clock system in NV1 to NV2x is located either in the external DAC (NV1) or on chip within the PRAMDAC functional block (NV3 and later). Where it is integrated onto the chip, it is exposed in all generations at MMIO addresses between 0x680300 and 0x680FFF. In most cases (and in all cases prior to NV10), the registers starting at 0x680500 and ending at 0x6805FF are used for PLL configuration; each PLL has a 32-bit[2] register assigned to it, and there is a second 32-bit register (typically at 0x68050C) that is used to configure these PLLs. Each PLL register is split into three dividers[3]: MDIV in the low 8 bits (bits 7 through 0), NDIV in bits 15 through 8, and a three-bit PDIV in bits 18 through 16. These are combined with a base clock speed, which is either fixed by taking the signal directly from the on-board clock crystal or configured via straps, to produce the final clock speed using the following formula:
(base_clock_speed * NDIV) / (MDIV << PDIV)
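The bit layout and formula above can be sketched in a few lines of Python. This is only an illustration: the function names `decode_pll` and `pll_frequency_hz`, the register value 0x16409, and the 13.5 MHz base clock are assumptions chosen to give round numbers, not values read from real hardware.

```python
from typing import Tuple


def decode_pll(reg: int) -> Tuple[int, int, int]:
    """Split a raw 32-bit PLL register value into (mdiv, ndiv, pdiv)."""
    mdiv = reg & 0xFF          # bits 7..0
    ndiv = (reg >> 8) & 0xFF   # bits 15..8
    pdiv = (reg >> 16) & 0x7   # bits 18..16; bits above 18 are ignored
    return mdiv, ndiv, pdiv


def pll_frequency_hz(reg: int, base_clock_hz: int) -> float:
    """Apply (base_clock_speed * NDIV) / (MDIV << PDIV) to a register value."""
    mdiv, ndiv, pdiv = decode_pll(reg)
    if mdiv == 0:
        # An MDIV of 0 would divide by zero; this sketch simply rejects it.
        raise ValueError("MDIV must be non-zero")
    return (base_clock_hz * ndiv) / (mdiv << pdiv)


# Hypothetical example: MDIV = 9, NDIV = 100, PDIV = 1, 13.5 MHz base clock.
example_reg = (1 << 16) | (100 << 8) | 9   # 0x16409
example_hz = pll_frequency_hz(example_reg, 13_500_000)
```

With these illustrative values the divisor is 9 << 1 = 18, so the result is 13,500,000 * 100 / 18 = 75,000,000 Hz. How real hardware treats a zero MDIV is not covered here; the sketch raises an error only to avoid a division by zero.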
PLLs on the NV1
The NV1 has three PLLs.
Configuration
Notes
[1] Used to allow different parts of the GPU core to run at different clock frequencies.
[2] Only 19 bits are actually used.
[3] The NV1 technically has a fourth divider, ODIV, but it is always set to a value of 1 and is effectively never used.