Databases 101: What are UUIDs? Should we care?

Saurabh Kumar
3 min read · May 10, 2022

UUIDs/GUIDs are formally defined in RFC 4122. They are Universally Unique IDentifiers that can be generated without a centralized authority. Every UUID is 128 bits long and is commonly written as 32 hexadecimal characters separated by hyphens.

The great dilemma, UUID vs serial as a primary key :P

Some facts
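1. A hex character can be represented in binary using 4 bits, from 0000 to 1111.
2. The timestamp used in a UUID has 60 bits (a count of 100-nanosecond intervals since 15 October 1582, in UTC).
3. [12] [16] [32] is how those 60 bits are divided (high, middle, low) when generating a UUID.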
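
To make the [12] [16] [32] split concrete, here is a minimal Python sketch. The helper names `timestamp_60` and `split_timestamp` are made up for illustration; they are not part of any library.

```python
from datetime import datetime, timezone

# Epoch used by UUID timestamps: 15 October 1582, 00:00 UTC.
GREGORIAN_EPOCH = datetime(1582, 10, 15, tzinfo=timezone.utc)


def timestamp_60(moment):
    """Count of 100-nanosecond intervals between the epoch and `moment`."""
    delta = moment - GREGORIAN_EPOCH
    seconds = delta.days * 86_400 + delta.seconds
    return seconds * 10_000_000 + delta.microseconds * 10


def split_timestamp(ts60):
    """Split a 60-bit timestamp into its (high 12, middle 16, low 32) bits."""
    time_low = ts60 & 0xFFFFFFFF          # lowest 32 bits -> 8 hex chars
    time_mid = (ts60 >> 32) & 0xFFFF      # next 16 bits   -> 4 hex chars
    time_high = (ts60 >> 48) & 0x0FFF     # top 12 bits    -> 3 hex chars
    return time_high, time_mid, time_low


ts = timestamp_60(datetime(2022, 5, 10, tzinfo=timezone.utc))
high, mid, low = split_timestamp(ts)
print(f"{high:03x} {mid:04x} {low:08x}")
```

Each hex character carries 4 bits, which is why 12, 16 and 32 bits print as 3, 4 and 8 hex characters.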

UUID Generation
There are various ways to generate UUIDs, but every generation method adheres to the same common structure:

UUIDs Structure
xxxxxxxx-xxxx-Mxxx-Nxxx-xxxxxxxxxxxx (M is the version, N is the variant)

All UUIDs follow the same [8] [4] [4] [4] [12] format, where each number counts hex characters, not bits (that is, 32, 16, 16, 16 and 48 bits).
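
A quick way to check this layout is to split a UUID string on its hyphens. The sketch below uses Python's standard `uuid` module and a sample UUID; the variable names are only illustrative.

```python
import uuid

sample = "6157329e-608a-4b40-b4b0-0a31cca02bf1"

groups = sample.split("-")
group_lengths = [len(g) for g in groups]   # hex characters per group
hex_digits = sum(group_lengths)            # 32 hex characters in total
total_bits = hex_digits * 4                # 4 bits per hex character -> 128

version_digit = groups[2][0]   # the "M" position
variant_digit = groups[3][0]   # the "N" position

# The standard library parses the same string into a 16-byte value.
parsed = uuid.UUID(sample)

print(group_lengths, total_bits, version_digit, variant_digit, len(parsed.bytes))
```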

The following bullets explain how a version-1 UUID is generated from the timestamp and the MAC address of the device.

  1. TimeLow: 4 bytes (8 hex chars) holding the low 32 bits of the current UTC timestamp ~ 6157329e
  2. TimeMid: 2 bytes (4 hex chars) holding the middle 16 bits of the timestamp ~ 608a
  3. TimeHighAndVersion: 2 bytes (4 hex chars) with the 4-bit “version” in the most significant bits, followed by the high 12 bits of the timestamp ~ 4b40
    RFC 4122 describes this field as the timestamp “multiplexed” with the version number, which just means the two share the same 16 bits. Note that the leading 4 in this sample value marks a version-4 UUID; a real version-1 UUID would start this group with 1.
  4. ClockSequenceHiAndReserved and ClockSequenceLow: 2 bytes (4 hex chars) where the 1 to 3 most significant bits contain the “variant” of the UUID, and the remaining bits contain the clock sequence ~ b4b0
  5. Node: 6 bytes (12 hex chars) that represent the 48-bit “node id”, which is usually the MAC address of the host hardware that generated it ~ 0a31cca02bf1
Putting the five groups together: 6157329e-608a-4b40-b4b0-0a31cca02bf1
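
The field layout above can be sketched in Python by packing a timestamp, a clock sequence and a node ID by hand. `build_v1` is a hypothetical helper written for this illustration, and the input values are arbitrary; in practice `uuid.uuid1()` from the standard library does this work for you.

```python
import uuid


def build_v1(timestamp, clock_seq, node):
    """Pack a 60-bit timestamp, 14-bit clock sequence and 48-bit node into a version-1 UUID."""
    time_low = timestamp & 0xFFFFFFFF
    time_mid = (timestamp >> 32) & 0xFFFF
    # High 12 bits of the timestamp, with version 1 in the top 4 bits.
    time_hi_version = ((timestamp >> 48) & 0x0FFF) | (1 << 12)
    clock_seq_low = clock_seq & 0xFF
    # High 6 bits of the clock sequence, with the variant bits "10" on top.
    clock_seq_hi_variant = ((clock_seq >> 8) & 0x3F) | 0x80
    return uuid.UUID(fields=(time_low, time_mid, time_hi_version,
                             clock_seq_hi_variant, clock_seq_low, node))


# Arbitrary illustrative inputs: a 60-bit timestamp, a clock sequence, a node ID.
u = build_v1(0x123456789ABCDEF, 0x1234, 0x0A31CCA02BF1)
print(u, u.version)

# The standard library exposes the same fields on any UUID it generates.
real = uuid.uuid1()
print(real.time_low, real.time_mid, real.time_hi_version, real.clock_seq, real.node)
```

Reading the fields back (`u.time`, `u.clock_seq`, `u.node`) returns the original inputs, which is a handy sanity check on the bit positions.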

The version of a UUID tells you how it was generated. It is easy to identify: look at M, the first character of the third group.
xxxxxxxx-xxxx-Mxxx-Nxxx-xxxxxxxxxxxx (M indicates version)

1. Version-1 UUIDs are generated from a time and a node ID (usually the MAC address).

2. Version-2 UUIDs are generated from a local identifier (usually a group or user ID), a time, and a node ID.

3. Version-3 and version-5 UUIDs are deterministic: they are generated by hashing a namespace identifier and a name. The two versions differ in the hash function used (MD5 for version 3, SHA-1 for version 5).

4. Version-4 UUIDs are generated using a random or pseudo-random number (this is what we use most often).
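
As a small illustration, Python's standard `uuid` module can generate versions 1, 3, 4 and 5 (it has no generator for version 2). The name "example.com" below is just a placeholder input.

```python
import uuid

v1 = uuid.uuid1()                                    # time + node ID
v3 = uuid.uuid3(uuid.NAMESPACE_DNS, "example.com")   # MD5 of namespace + name
v4 = uuid.uuid4()                                    # random
v5 = uuid.uuid5(uuid.NAMESPACE_DNS, "example.com")   # SHA-1 of namespace + name

generated = (v1, v3, v4, v5)
versions = [u.version for u in generated]

# The version is also visible as "M", the character at index 14 of the string.
version_digits = [str(u)[14] for u in generated]
print(versions, version_digits)

# Versions 3 and 5 are deterministic: same inputs, same UUID.
same_v3 = uuid.uuid3(uuid.NAMESPACE_DNS, "example.com") == v3
same_v5 = uuid.uuid5(uuid.NAMESPACE_DNS, "example.com") == v5
print(same_v3, same_v5)
```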

The variant of a UUID indicates its layout and encoding, and can be inferred from N.
xxxxxxxx-xxxx-Mxxx-Nxxx-xxxxxxxxxxxx (N indicates variant)

Every character in a UUID is a hex character, so N is represented by four bits. The first 3 of those bits (most significant first) indicate the variant of a particular UUID.

MSB1  MSB2  MSB3  Meaning
0     X     X     reserved
1     0     X     current variant
1     1     0     reserved for Microsoft
1     1     1     reserved for future
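
The variant table can be turned into a few bit tests on N. `variant_of` is a hypothetical helper for illustration, and its return strings simply reuse the labels from the table; the standard library's `uuid.UUID(...).variant` property performs the same check.

```python
import uuid


def variant_of(uuid_text):
    """Classify a UUID string by the most significant bits of N."""
    n = int(uuid_text.split("-")[3][0], 16)   # N as a 4-bit number
    if n & 0b1000 == 0:
        return "reserved"                     # 0 X X
    if n & 0b0100 == 0:
        return "current variant"              # 1 0 X
    if n & 0b0010 == 0:
        return "reserved for Microsoft"       # 1 1 0
    return "reserved for future"              # 1 1 1


sample = "6157329e-608a-4b40-b4b0-0a31cca02bf1"
print(variant_of(sample))                     # N is "b", i.e. 1011 in binary
print(uuid.UUID(sample).variant == uuid.RFC_4122)
```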

Details to look out for!

  1. Why did I dig into so many facts about UUIDs? Is it of any use 🥱?
    I wanted to understand this beautiful article here about the tradeoffs of using UUIDs vs serials as a primary key, which I could not follow until I had structured these facts together ✌️.

Leave me a clap if you found this article insightful 😄. Happy reading!
