The Daily Berlin

Berlin's Duplicate Image Problem: The Numbers Exposing a City Archive in Crisis

Thousands of redundant digital files are clogging public databases across Berlin's administration, and the data shows the clean-up bill is steeper than anyone budgeted for.

By Berlin News Desk · Published 4 July 2026, 9:00 pm

Photo: Luca Severin on Pexels

Berlin's Senate Department for Urban Development holds an estimated 2.3 million digital images across its planning and property archives, and internal audits completed in early 2026 found that roughly 340,000 of those files, about 15 percent, are exact or near-exact duplicates. The redundant data consumes approximately 18 terabytes of server capacity that the city pays to maintain, according to figures circulating among IT procurement officers ahead of this summer's budget review.

The problem matters right now because Berlin is mid-way through a flagship digitisation drive tied to the federal government's Onlinezugangsgesetz — the Online Access Act — which required all public services to be digitally accessible by the end of 2022. That deadline was missed by most German states, and Berlin's administration is now racing to comply while simultaneously discovering that the raw volume of images ported from legacy systems is far larger, and far messier, than the original migration contractors projected.

Where the Redundancy Is Concentrated

Two institutions sit at the heart of the problem. The Landesarchiv Berlin, based on Eichborndamm in Reinickendorf, holds digitised photographic records stretching back to the 1920s. A 2025 internal review, referenced in procurement documents reviewed for this article, found that automated scanning workflows had produced duplicate TIFF files for an estimated 12 percent of the photographic collection, adding up to tens of thousands of image pairs where only one version was needed. The second flashpoint is the Geoportal Berlin, the city's public-facing geographic information platform, which is maintained by the Senate Department for the Environment and run jointly with district offices, including those in Mitte and Pankow. Aerial survey images uploaded across multiple quarterly cycles have generated layered duplicates that slow query response times and inflate cloud storage invoices.

The BVG, Berlin's public transport operator, faces a related but operationally distinct version of the issue. It has been digitising maintenance documentation (photographs of U-Bahn infrastructure, signal equipment, and track conditions) since 2019 as part of a broader asset management overhaul. Sources familiar with BVG's IT contracts say the operator identified over 60,000 duplicate maintenance images during a 2025 system migration, a figure that complicated liability documentation for several stretches of the U5 line between Alexanderplatz and Hönow.

The Cost of Doing Nothing

Storage is cheap, the argument goes, until it isn't. Berlin's Senate IT service provider, ITDZ Berlin, charges public departments on a tiered model. Active archival-grade storage runs at roughly €0.08 per gigabyte per month under current framework contracts. At 18 terabytes (18,000 gigabytes) of confirmed duplicate image data, that works out to about €1,440 a month, or around €17,280 a year, in avoidable storage costs at current rates. The sum is modest on its own, but it recurs every year and is dwarfed by the staff and contractor hours required to clean the data manually.
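
For readers who want to check the storage figure above, the arithmetic can be reproduced in a few lines. This is a minimal sketch: the 18 terabytes and the €0.08 rate come from the article, while the function name and the decimal conversion of 1,000 gigabytes per terabyte are assumptions made for illustration. Counting 1,024 gigabytes per terabyte instead gives a slightly higher total.

```python
from decimal import Decimal

# Figures quoted in the article.
DUPLICATE_TB = 18
RATE_EUR_PER_GB_MONTH = Decimal("0.08")


def annual_storage_cost(terabytes, rate_per_gb_month, gb_per_tb=1000):
    """Yearly cost in euros of keeping `terabytes` of data in storage.

    gb_per_tb defaults to 1000 (decimal units), which is an assumption;
    pass 1024 to count in binary units instead.
    """
    monthly = Decimal(terabytes) * gb_per_tb * rate_per_gb_month
    return monthly * 12


if __name__ == "__main__":
    print(annual_storage_cost(DUPLICATE_TB, RATE_EUR_PER_GB_MONTH))        # decimal units
    print(annual_storage_cost(DUPLICATE_TB, RATE_EUR_PER_GB_MONTH, 1024))  # binary units
```

Decimal arithmetic is used rather than floating point so that the euro amounts come out exact.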

The more serious cost is administrative. When planning officials in districts like Friedrichshain-Kreuzberg pull historical building records to assess development applications near the East Side Gallery or along Revaler Strasse, duplicate images embedded in property files slow retrieval and, in at least two documented cases in 2024, caused conflicting version records to appear simultaneously in the same case file. Planning decisions cannot be delayed indefinitely on technical grounds.

Automated deduplication software — tools that compare pixel-level hashes and metadata to identify redundant files — is not new technology. Several Bundesländer, including Hamburg and Bavaria, have already tendered and awarded contracts for exactly this work. Berlin has not yet put a citywide deduplication contract out to tender, though the Senate Chancellery's digital transformation unit has the framework to do so under existing procurement rules without requiring a new Abgeordnetenhaus vote.
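
To give a sense of how simple the core of such a tool can be, the following is a minimal sketch, not a description of any product tendered in Hamburg or Bavaria. It swaps the pixel-level comparison mentioned above for a plain SHA-256 hash of each file's bytes, so it only catches exact duplicates; near-exact copies (a re-scan, or a re-saved TIFF with different metadata) would need a perceptual hash on the decoded pixels, which is not shown here. The function names and the directory layout are hypothetical.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def file_digest(path: Path, chunk_size: int = 1 << 20) -> str:
    """SHA-256 of a file's bytes, read in chunks so large scans fit in memory."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def find_exact_duplicates(root: Path) -> list[list[Path]]:
    """Return groups of files under `root` whose contents are byte-identical."""
    # Cheap first pass: files of different sizes cannot be identical.
    by_size = defaultdict(list)
    for p in root.rglob("*"):
        if p.is_file():
            by_size[p.stat().st_size].append(p)

    groups = []
    for same_size in by_size.values():
        if len(same_size) < 2:
            continue
        # Second pass: hash only the files that share a size.
        by_hash = defaultdict(list)
        for p in same_size:
            by_hash[file_digest(p)].append(p)
        groups.extend(sorted(g) for g in by_hash.values() if len(g) > 1)
    return groups
```

Grouping by file size before hashing avoids reading most files at all, which matters on an archive of millions of images. Deciding which copy in each group is the record of truth is a separate, administrative question that the code does not answer.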

Departments facing the worst backlogs should, at minimum, freeze new uploads to shared image repositories until a deduplication pass is completed on existing stock. For residents dealing with planning or housing applications that depend on archived imagery, particularly in redevelopment corridors like the area around Tempelhofer Feld, it is worth asking the relevant district office to confirm which image version is the active record before any documentation deadline passes. The numbers suggest that a single authoritative version can no longer be taken for granted.

About this article

Published by The Daily Berlin

This article was produced by The Daily Berlin editorial desk and covers news in Berlin. See our editorial standards for how we use AI.
