Comparing most populous regency to provinces in Indonesia

Kab. Bogor’s population is larger than that of 23 provinces
Author

Farhan Reynaldo Hutabarat

Published

June 10, 2025

Modified

June 10, 2025

While scrolling through my RSS feed, I came across Kieran Healy’s post about the surprising population of LA County compared to entire U.S. states. It got me thinking: surely the most populous regency in Indonesia would be somewhere in DKI Jakarta. So, inspired by Healy’s visualization, I set out to compare Indonesia’s most populous regency with its provinces.

First, I needed a reliable dataset of regency-level population for Indonesia. Navigating and retrieving the data from the BPS website wasn’t as easy as I thought it would be, so I turned to Humdata’s Indonesia Subnational Population Data, which is sourced from the 2020 Indonesia Long Form Population Census.

The code below outlines the steps I took to download and process the data.

Import libraries and download data
library(here)
library(janitor)
library(tidyverse)
library(readxl)
library(sf)
library(systemfonts)

pop_excel_url <- "https://data.humdata.org/dataset/6daa36c4-9ea3-4844-958b-042d2cc3d8b7/resource/2c8fce76-8ad7-4bce-ab02-374318302383/download/indonesia_uscb_202405.xlsx"
pop_gdb_url <- "https://data.humdata.org/dataset/6daa36c4-9ea3-4844-958b-042d2cc3d8b7/resource/b1763b8a-1c6b-4a2b-8f79-ddf92bfc0f82/download/indonesia.gdb.zip"

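# Make sure the target folder exists before downloading into it.
dir.create("data", showWarnings = FALSE)

if (!file.exists("data/indonesia_uscb_202405.xlsx")) {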
    curl::curl_download(pop_excel_url, "data/indonesia_uscb_202405.xlsx", quiet = FALSE)
}

if (!file.exists("data/indonesia.gdb.zip")) {
    curl::curl_download(pop_gdb_url, "data/indonesia.gdb.zip", quiet = FALSE)
}
Load and preprocess data
pop <- read_excel("data/indonesia_uscb_202405.xlsx", sheet = 3, skip = 1) |> 
  janitor::clean_names() |> 
  select(country_or_subnational_area_name:both_sexes_total) |> 
  rename(total_pop = both_sexes_total)

pop_district <- pop |> 
  filter(administrative_level == 2)

max_pop <- pop_district |> 
  arrange(desc(total_pop)) |> 
  head(1) |> 
  pull(total_pop)

pop_province <- pop |> 
  filter(administrative_level == 1) |> 
  mutate(is_less_than_max = total_pop < max_pop)

adm_level_1 <- st_read(dsn = "data/indonesia.gdb.zip", layer = "ID_GEOG_ADM1_2020_uscb_202405", quiet = TRUE) |> 
  left_join(pop_province, by = join_by(GEO_MATCH == geographic_match))
adm_level_2 <- st_read(dsn = "data/indonesia.gdb.zip", layer = "ID_GEOG_ADM2_2020_uscb_202405", quiet = TRUE) |> 
  left_join(pop_district, by = join_by(GEO_MATCH == geographic_match))

kab_bogor <- adm_level_2 |> 
  filter(AREA_NAME == "KABUPATEN BOGOR") 
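
Both `left_join()` calls above match on `GEO_MATCH`, and a left join quietly leaves `NA` in `total_pop` for any shape whose key has no counterpart in the population table. Here is a minimal sketch of a sanity check for that. The `shapes` and `pop_tbl` data frames and their values are made-up stand-ins, not the Humdata files, and the sketch uses base R so it runs without the packages loaded above.

```r
# Hypothetical stand-ins for the geometry layer and the population sheet.
shapes <- data.frame(
  GEO_MATCH = c("A_ID", "B_ID", "C_ID"),
  stringsAsFactors = FALSE
)
pop_tbl <- data.frame(
  geographic_match = c("A_ID", "B_ID", "D_ID"),
  total_pop = c(100, 250, 75),
  stringsAsFactors = FALSE
)

# Keys present in the shapes but missing from the population table:
# these rows would end up with NA population after a left join.
unmatched_shapes <- setdiff(shapes$GEO_MATCH, pop_tbl$geographic_match)

# Keys present in the population table but with no shape to draw.
unmatched_pop <- setdiff(pop_tbl$geographic_match, shapes$GEO_MATCH)

# Base R equivalent of the left join, keeping every shape row.
joined <- merge(shapes, pop_tbl,
  by.x = "GEO_MATCH", by.y = "geographic_match", all.x = TRUE
)

# Number of shapes left without a population value.
n_missing <- sum(is.na(joined$total_pop))
```

On the real data, the same idea would apply to `adm_level_1` and `adm_level_2`: counting `NA` values in `total_pop` after the join shows whether every polygon found its population row.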

And boy, was I wrong about Jakarta! Kabupaten Bogor turns out to be Indonesia’s most populous regency, with a staggering 5.56 million people. What’s even more shocking is that this 2,992 km² area is more populous than 23 of Indonesia’s provinces! The map below shows which provinces have fewer people than Kabupaten Bogor.
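
The count of 23 comes from comparing each province’s total against the largest regency total, which is what `is_less_than_max` encodes. Here is a minimal sketch of that tally on made-up numbers. The `toy` data frame and its values are hypothetical, not census figures, and the sketch uses base R so it runs without the packages loaded above.

```r
# Hypothetical rows mimicking the sheet: level 1 = province, level 2 = regency.
toy <- data.frame(
  area_name = c("Prov A", "Prov B", "Prov C", "Kab X", "Kab Y"),
  administrative_level = c(1, 1, 1, 2, 2),
  total_pop = c(900, 300, 450, 500, 120),
  stringsAsFactors = FALSE
)

regencies <- toy[toy$administrative_level == 2, ]
provinces <- toy[toy$administrative_level == 1, ]

# Most populous regency: the row with the largest total_pop.
top_regency <- regencies[which.max(regencies$total_pop), ]

# Provinces with strictly fewer people than that regency.
smaller <- provinces[provinces$total_pop < top_regency$total_pop, ]
n_smaller <- nrow(smaller)
```

Applied to the real tables, `sum(pop_province$is_less_than_max)` should give the same kind of count, provided no province has a missing population.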

Visualize data
label_mapper <- function(label_status) {
  ifelse(label_status == TRUE,
    "Province with fewer people than Kab. Bogor",
    "Province with more people than Kab. Bogor"
  )
}

ggplot() +
  geom_sf(data = adm_level_1, aes(fill = is_less_than_max), color = "white") +
  geom_sf(data = kab_bogor, fill = "darkorange", color = "white") +
  annotate("curve",
    x = 105.14, y = -9.46, xend = 106.81, yend = -6.94,
    arrow = arrow(length = unit(2, "mm")), linewidth = 0.4, curvature = 0.3,
    color = "darkorange"
  ) +
  annotate("text", x = 105.14, y = -10.56, label = str_wrap("Kabupaten Bogor, home to 5.56 million people", 30), size = 2.5, color = "darkorange", fontface = 2) +
  coord_sf(default_crs = sf::st_crs(4326)) +
  scale_fill_manual(
    labels = label_mapper, values = c("grey85", "steelblue"),
    name = ""
  ) +
  labs(
    title = str_wrap("Kabupaten Bogor has a larger population than 23 provinces", 50),
    caption = "Data from the 2020 Indonesia Long Form Population Census"
  ) +
  theme_void(base_family = "Noto Sans") +
  theme(
    plot.margin = margin(5, 25, 5, 25),
    plot.title = element_text(
      size = 16, hjust = 0, margin = margin(0, 0, 0, 0),
      face = "bold"
    ),
    legend.text = element_text(size = 10),
    plot.title.position = "plot",
    legend.position = "top",
    legend.direction = "vertical",
    legend.margin = margin(0, 0, 10, 0),
    legend.justification.top = c(0, 0),
    plot.caption = element_text(size = 6, color = "grey50", hjust = 0.93, 
                                face = "italic")
  )

This finding really emphasizes how skewed the population distribution in Indonesia is.

Citation

For attribution, please cite this work as:
Hutabarat, Farhan Reynaldo. 2025. “Comparing Most Populous Regency to Provinces in Indonesia.” June 10, 2025. https://weaklyinformative.com/posts/2025-06-10_comparing-most-populous-regency/.