Issue 01 / May 2026

Causal AI for synthetic
banking data,
regulator-ready.

Powered by CAUSA — our causal AI data engine. Production-grade synthetic banking datasets and the compliance artefacts that DORA, AI Act, BCBS 239, PRA, and FCA examiners expect to find. Polish, CEE, and UK regulatory specifics built in.

25 minutes. Research conversation about your DORA, AI Act, BCBS 239, or PRA SS programmes. No pitch. No follow-up unless you ask for it.

Built from practice

Thirteen years engineering data infrastructure across European financial services — across four jurisdictions, across the regulatory stack: BCBS 239 lineage, KNF risk reporting, Solvency II data quality, model risk validation. First version of CAUSA completed end of 2024 after 18 months of solo R&D. The gap between what synthetic data tools provide and what regulators accept as evidence has been a constant. CAUSA closes it.

EU regulatory coverage

DORA · AI Act · BCBS 239 · CRD VI · KNF Rec S

UK regulatory coverage

PRA SS1/21 · PRA SS1/23 · FCA SDEG · UK DUAA · Consumer Duty

Domain specifics built in

PESEL · NIP · REGON · IBAN PL · BIK · Województwo

Deployment

EU-resident · Self-hosted · Air-gapped option

§ 01 / Forcing functions

Six regulations.
One window.

Between January 2025 and December 2027, every European and UK retail or corporate bank will need defensible synthetic datasets for at least three concurrent regulatory regimes. Banks that begin preparation in 2026 will pass first cycles cleanly. Those that begin in 2027 will be in remediation mode while AI Act enforcement starts.

Digital Operational Resilience Act

DORA

In force / 17 Jan 2025

Article 26 mandates threat-led penetration testing on production-like environments. Article 28 requires third-party risk frameworks for every ICT vendor. First completed TLPT cycle deadline: January 2028. Real customer data in TLPT environments risks a GDPR Article 32 breach. CAUSA produces an audit-defensible alternative.

Capital Requirements Directive VI

CRD VI / CRR III

In force / 11 Jan 2026

New output floor for internal models at 72.5%. Revised standardised approach for credit, market and operational risk. Banks running IRB models need parallel synthetic datasets to test recalibration without exposing real portfolio data. CAUSA preserves default-rate distributions, LTV cliffs, and counterparty concentrations.
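For readers less familiar with the floor mechanics, the arithmetic can be sketched in a few lines: risk-weighted assets from internal models may not fall below 72.5% of the figure produced by the standardised approaches. This is a simplified illustration that ignores transitional arrangements; the function name and the portfolio numbers are hypothetical.

```python
from decimal import Decimal

OUTPUT_FLOOR = Decimal("0.725")  # the 72.5% floor quoted above


def floored_rwa(rwa_internal: Decimal, rwa_standardised: Decimal) -> Decimal:
    """Return RWA after applying the output floor to internal-model RWA."""
    return max(rwa_internal, OUTPUT_FLOOR * rwa_standardised)


# Hypothetical portfolio in EUR millions (illustrative numbers only).
example = floored_rwa(Decimal("600"), Decimal("1000"))
print(example)
```

When the internal-model figure sits below the floor, as in this made-up case, the floored value is what drives capital. That is why recalibration tests need data whose default and concentration patterns are realistic.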

EU AI Act / High-risk systems

AI Act

Enforcement / 02 Dec 2027 *

Deferred from 2 Aug 2026 to 2 Dec 2027 via Digital Omnibus (07 May 2026, provisional — pending Official Journal publication). Article 10 mandates training data quality. Article 11 requires technical files documenting composition, bias testing, traceability. Penalty: €15M or 3% of global turnover. CAUSA compliance layer produces Article 11 artefacts inline.

Basel Committee / Risk Data Aggregation

BCBS 239

ECB priority / 2025–2027

14 principles for risk data aggregation and reporting. ECB declared this its #2 supervisory priority for 2025–2027. Lineage, completeness, accuracy, timeliness — all auditable against synthetic test cases. CAUSA multi-table integrity enables auditor-defensible lineage testing.

PRA / UK Operational Resilience

PRA SS1/21

Impact tolerance deadline / 31 Mar 2025

UK Supervisory Statement requires impact tolerance testing for important business services. Continuous demonstrable compliance, not point-in-time. Masked production data fractures referential integrity and breaks SOC/SIEM behaviour. CAUSA produces structurally realistic environments for safe live testing.

FCA / Synthetic Data Expert Group

FCA SDEG

Published / Aug 2025

Nine governance principles for synthetic data in UK financial services. Officially positioned as a Model Risk Management imperative under PRA SS1/23. CAUSA produces SDEG-aligned documentation alongside the dataset: explicit lineage, fidelity scorecards, privacy certificates.

Additional coverage: PRA SS1/23 Model Risk Management · KNF Rekomendacja S amendments · KNF Rekomendacja M · EBA SREP cycles · UK Data Use & Access Act 2025 · Polish-language regulatory artefacts available.

§ 02 / What we build

A synthetic
retail bank,
on demand.

Infundum produces multi-table synthetic datasets that look, feel, and behave like a real European retail bank — with the causal regulatory patterns your auditors expect to find.

Twelve linked tables. Fifty thousand customers. Five million transactions. PESELs that pass checksum validation. IBANs with valid PL26 control digits. Mortgage LTV distributions that show the 2014 and 2020 KNF Rekomendacja S amendment cliffs, because the engine encodes regulatory history in the generation step rather than hand-sampling or retrofitting it.
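As a concrete reference for what "passes checksum validation" means, here is a minimal sketch of the public PESEL control-digit rule. It is the standard published algorithm, not code taken from CAUSA, and the function names are our own for illustration.

```python
# Public PESEL rule: weight the first ten digits, then derive the eleventh.
PESEL_WEIGHTS = (1, 3, 7, 9, 1, 3, 7, 9, 1, 3)


def pesel_check_digit(first_ten: str) -> int:
    """Return the control digit for the first ten PESEL digits."""
    if len(first_ten) != 10 or not (first_ten.isascii() and first_ten.isdigit()):
        raise ValueError("expected exactly ten ASCII digits")
    total = sum(int(d) * w for d, w in zip(first_ten, PESEL_WEIGHTS))
    return (10 - total % 10) % 10


def is_valid_pesel(pesel: str) -> bool:
    """True if the value is eleven digits and its control digit matches."""
    if len(pesel) != 11 or not (pesel.isascii() and pesel.isdigit()):
        return False
    return pesel_check_digit(pesel[:10]) == int(pesel[10])
```

A check like this covers only the control digit. A fuller validator would also confirm that the encoded birth date is a real calendar date.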

  1. Schema preservation

    Foreign keys, dependencies, referential integrity, time-series consistency across linked tables — without statistical artefacts auditors flag.

  2. Domain specifics

    PESEL checksums, NIP/REGON validity, IBAN PL26 control digits, województwo distributions, BIK scoring brackets, KNF amendment effects. UK formats on request.

  3. Compliance artefacts

    Generated alongside the dataset: DORA TLPT Data Safety Annex, AI Act Article 11 technical file, BCBS 239 lineage map, PRA SS1/23 MRM evidence.

§ 03 / Reference scenario

Bank Wisła.

A fictional Polish universal retail bank. Mid-tier scale. Currently migrating from Oracle Exadata to Snowflake. €5B turnover, 1.2M customers (50k in demo), 280 branches. The reference dataset behind every Infundum evaluation.

Schema

CUST ACCT TXN CUST_ADDR ACCT_HOLD TXN_TYPE CUST_EMPL PROD_HIST BRCH CUST_KYC BIK_RPT RISK_SCR

Coverage

1,946 days · 50k customers · ~5M transactions · 280 branches · Transaction history 2021–2026

TABLE CUST · first 8 rows of 50,000
CUST_ID | FIRST_NAME | LAST_NAME | PESEL | WOJEWÓDZTWO | OPEN_DATE | SEGMENT | BIK
C00041829 | Marta | Wojciechowska | 84051902948 | Mazowieckie | 2014-06-12 | RETAIL_PRIME | 587
C00041830 | Tomasz | Nowak | 77112304671 | Małopolskie | 2009-03-04 | RETAIL_PREMIUM | 612
C00041831 | Agnieszka | Kaczmarek | 91020517385 | Wielkopolskie | 2018-11-22 | RETAIL_STANDARD | 534
C00041832 | Marcin | Lewandowski | 82071402516 | Mazowieckie | 2011-09-08 | RETAIL_PRIME | 598
C00041833 | Katarzyna | Dąbrowska | 89030616829 | Dolnośląskie | 2020-04-15 | RETAIL_STANDARD | 451
C00041834 | Paweł | Wiśniewski | 75091803742 | Pomorskie | 2007-02-19 | RETAIL_PREMIUM | 624
C00041835 | Magdalena | Kowalczyk | 92112808157 | Śląskie | 2022-08-30 | RETAIL_STANDARD | 503
C00041836 | Krzysztof | Zieliński | 80042705294 | Mazowieckie | 2013-12-03 | RETAIL_PRIME | 574
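Linked tables such as CUST, ACCT and TXN imply foreign keys, and a simple way for an evaluation team to check referential integrity is to count orphaned child rows. The sketch below uses an in-memory SQLite database with a handful of stand-in rows. The column names ACCT_ID and TXN_ID, the key relationships and the sample values are assumptions for illustration, since the full schema is not published here.

```python
import sqlite3

# Tiny stand-in tables; ACCT_ID and TXN_ID are hypothetical column names.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE CUST (CUST_ID TEXT PRIMARY KEY);
    CREATE TABLE ACCT (ACCT_ID TEXT PRIMARY KEY, CUST_ID TEXT);
    CREATE TABLE TXN  (TXN_ID TEXT PRIMARY KEY, ACCT_ID TEXT);
    INSERT INTO CUST VALUES ('C00041829'), ('C00041830');
    INSERT INTO ACCT VALUES ('A1', 'C00041829'), ('A2', 'C00041830');
    INSERT INTO TXN  VALUES ('T1', 'A1'), ('T2', 'A2'), ('T3', 'A9');
""")


def orphan_count(child: str, parent: str, key: str) -> int:
    """Count child rows whose foreign key has no matching parent row."""
    # Table and column names are interpolated, so pass trusted names only.
    sql = (
        f"SELECT COUNT(*) FROM {child} c "
        f"LEFT JOIN {parent} p ON c.{key} = p.{key} "
        f"WHERE p.{key} IS NULL"
    )
    return conn.execute(sql).fetchone()[0]


orphans = {
    "ACCT->CUST": orphan_count("ACCT", "CUST", "CUST_ID"),
    "TXN->ACCT": orphan_count("TXN", "ACCT", "ACCT_ID"),
}
print(orphans)
```

Transaction T3 deliberately points at a non-existent account, so the second count is non-zero. On a dataset with intact referential integrity every count should be zero.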

Chart 01 / PROD_HIST

Mortgage LTV by origination year

Loan-to-value ratio aggregated by year of origination across the 50,000-customer book.

Causal pattern · domain-encoded
[Chart: mortgage LTV by origination year. Y-axis 55% to 95%; x-axis 2007 to 2025; annotations at KNF Rec S / 2014 and KNF Rec S / 2020.]

The wow moment. CAUSA encodes KNF Rekomendacja S amendment logic directly in the generation layer. The 2014 and 2020 LTV cliffs are reproduced from regulatory domain knowledge — not retrofitted from output samples, not statistically fitted. Any mortgage analyst recognises them immediately.
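To make the difference between an encoded rule and a fitted distribution concrete, here is a toy sketch of the general idea: borrower demand is sampled first, then a policy cap that depends on the origination year is applied. It is not CAUSA's implementation, which is available only under NDA. The cap values are placeholders rather than the actual Rekomendacja S limits, and the demand distribution is an assumption.

```python
import random

# ILLUSTRATIVE caps only: placeholder numbers, not the real
# Rekomendacja S limits and not CAUSA's internal rules.
LTV_CAP_BY_REGIME = [
    (2020, 0.80),  # hypothetical cap from the 2020 amendment onward
    (2014, 0.90),  # hypothetical cap from the 2014 amendment onward
    (0, 1.00),     # hypothetical regime before 2014
]


def ltv_cap(year: int) -> float:
    """Return the cap of the latest regime that started on or before year."""
    for start, cap in LTV_CAP_BY_REGIME:
        if year >= start:
            return cap
    raise ValueError("no regime defined for this year")


def sample_ltv(year: int, rng: random.Random) -> float:
    """Draw borrower demand, then apply the policy cap for that year."""
    demand = rng.betavariate(5, 2)  # assumed shape, skewed toward high LTV
    return min(demand, ltv_cap(year))


rng = random.Random(42)  # fixed seed so the run is reproducible
loans = [(y, sample_ltv(y, rng)) for y in range(2007, 2026) for _ in range(200)]

max_by_year = {}
for year, ltv in loans:
    max_by_year[year] = max(max_by_year.get(year, 0.0), ltv)
```

Because the cap is part of the generating rule, the cliff appears in the year the rule changes, regardless of how many samples are drawn. A purely fitted model would need enough observations on both sides of the change to learn it.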

§ 04 / CAUSA engine

Three layers.
One pipeline.

CAUSA is our causal AI data engine. Three composable layers, producing the synthetic dataset and the regulator-ready paper trail in one run.

Specific architecture details are available under NDA during formal evaluation.

Layer 01

Generation engine

Causal AI generation

CAUSA encodes the structural and behavioral patterns that make banking data behave like banking data — credit policy cliffs, channel migration patterns, default cascades, marriage/divorce transitions — without ingesting customer records. Unlike generators based on statistical mimicry, the engine reproduces the underlying decision logic from domain knowledge.

Multi-table relational · 12+ entity schemas · Causal pattern preservation

Layer 02

Domain layer

Polish, CEE, UK specifics

Identifier validation, regulatory pattern preservation, supervisory expectations. Generic synthetic data tools require manual configuration for any of these. CAUSA applies them natively.

PESEL checksum
NIP / REGON validation
IBAN PL26 control digits
Polish names dictionary
Województwo distribution
BIK score brackets
KNF amendment effects
UK sort code · IBAN GB
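Two of the checks listed above follow public algorithms and can be sketched briefly: the ISO 13616 mod-97 test for IBANs (with the 28-character length of a Polish IBAN) and the weighted mod-11 control digit for NIP. This is a minimal illustration with our own function names, not CAUSA code. REGON and BIK are left out.

```python
NIP_WEIGHTS = (6, 5, 7, 2, 3, 4, 5, 6, 7)


def is_valid_iban(iban: str) -> bool:
    """ISO 13616 mod-97 check; also enforces the 28-character PL length."""
    s = iban.replace(" ", "").upper()
    if len(s) < 5 or not (s.isascii() and s.isalnum()):
        return False
    if not (s[:2].isalpha() and s[2:4].isdigit()):
        return False
    if s.startswith("PL") and len(s) != 28:
        return False
    rearranged = s[4:] + s[:4]  # move country code and check digits to the end
    # Letters map to 10..35 (A=10, ..., Z=35); digits stay as they are.
    numeric = "".join(str(int(ch, 36)) for ch in rearranged)
    return int(numeric) % 97 == 1


def is_valid_nip(nip: str) -> bool:
    """Weighted mod-11 check on the ten-digit Polish tax identifier."""
    s = nip.replace("-", "")
    if len(s) != 10 or not (s.isascii() and s.isdigit()):
        return False
    remainder = sum(int(d) * w for d, w in zip(s, NIP_WEIGHTS)) % 11
    # A remainder of 10 never equals a single digit, so such numbers fail.
    return remainder == int(s[9])
```

Both functions accept the usual human-friendly formatting (spaces in IBANs, hyphens in NIPs) and normalise it before checking.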

Layer 03

Compliance layer

Regulatory artefacts

For each generated dataset, CAUSA produces the documentation auditors and regulators expect. Generation and compliance artefacts arrive together — not as a separate consulting engagement.

DORA

TLPT Data Safety Annex

AI Act

Article 11 technical file

BCBS 239

Lineage map

PRA SS1/21

Resilience test evidence

PRA SS1/23

MRM validation pack

FCA SDEG

9-principles compliance

What each artefact closes

Artefact | Regulatory gap closed
Datasheet + Test Report | AI Act Art. 10(5)(a) exhaustion gate: proof that synthetic alternatives were assessed before real special-category PII processing
DORA TLPT Annex | DORA Art. 26 data exposure question: bank-controlled substitution rationale for the formal TLPT documentation pack
Lineage File | BCBS 239 Principles 2–3 audit trail gap: OpenLineage JSON-LD, ingest-ready for Collibra, Alation, Atlan
Validation Dossier | PRA SS1/23 Principle 4 independent challenge requirement + FCA SDEG Principle 6 fidelity evidence
Scenario Manifest | PRA SS1/21 severe-but-plausible scenario evidence + DORA operational resilience scenario testing
VERA QC report | GDPR Art. 32 technical measure documentation + AI Act Art. 10 bias-testing evidence gate
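For teams that have not worked with OpenLineage, the sketch below shows the general shape of a run event in that format, built as a plain Python dictionary and serialised to JSON. It illustrates the open specification only and says nothing about the actual layout of the Lineage File. The namespace, job name, producer URI and the choice of input and output tables are hypothetical, and the schemaURL version should be checked against the spec release you target.

```python
import json
import uuid
from datetime import datetime, timezone


def lineage_event(job_name: str, inputs: list[str], outputs: list[str]) -> dict:
    """Build a minimal OpenLineage-style COMPLETE run event as a dict."""
    namespace = "bank-wisla-demo"  # hypothetical namespace
    return {
        "eventType": "COMPLETE",
        "eventTime": datetime.now(timezone.utc).isoformat(),
        "run": {"runId": str(uuid.uuid4())},
        "job": {"namespace": namespace, "name": job_name},
        "inputs": [{"namespace": namespace, "name": n} for n in inputs],
        "outputs": [{"namespace": namespace, "name": n} for n in outputs],
        "producer": "https://example.com/synthetic-generator",  # placeholder URI
        # Assumed spec version; verify before use.
        "schemaURL": "https://openlineage.io/spec/2-0-2/OpenLineage.json#/$defs/RunEvent",
    }


# Hypothetical job: a risk score table derived from four source tables.
event = lineage_event("derive_risk_scr", ["CUST", "ACCT", "TXN", "BIK_RPT"], ["RISK_SCR"])
payload = json.dumps(event, indent=2)
```

A catalogue that ingests OpenLineage events can stitch a sequence of such records into a table-level lineage graph, which is the kind of evidence an audit-trail review typically asks for.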

§ 05 / Founder

Built
from practice.
v1 end of 2024.

AK

Arek Kordos

Founder · Kraków + London

"Thirteen years engineering data infrastructure across European financial services — across four jurisdictions, across the regulatory stack. The need for production-quality synthetic data with audit-grade provenance has been a constant. The available tooling has not."

Hands-on across the regulatory stack: BCBS 239 lineage, KNF risk reporting, Solvency II data quality, model risk validation, regulatory data delivery for IRB and stress testing. Founded Infundum after a focused R&D phase on the gap between general-purpose synthetic data tools and what a KNF, ECB, PRA or FCA reviewer accepts as evidence.

The gap I kept hitting: generic synthetic data tools either treated Polish identifiers as strings to be randomised, or generated statistical fidelity without the causal regulatory patterns auditors look for. The financial sector is the vertical where regulatory deadlines hit hardest between 2025 and 2028 — so that is where CAUSA started. Neither generic tools nor manual workarounds survive a serious KNF, ECB, PRA, or FCA review.

CAUSA is the engine I would have wanted to deploy myself in any of the programmes I worked on over the past decade. Built for the people on the inside.

13

Years across European financial services

18mo

Solo R&D before v1

4

Jurisdictions delivered into

§ 06 / Research

Twelve essays.
Two markets.

A research cluster on synthetic banking data, regulatory compliance, and causal AI — written for Chief Data Officers, Heads of Operational Risk, Compliance Officers, and Systems Integrators preparing for the 2026–2028 regulatory window. Polish content for KNF and ECB context. English for UK PRA, FCA, and pan-European audiences.

01
PL · EU

DORA Article 26 TLPT: test data for Polish banks ahead of the 2028 cycle

First mandatory TLPT cycle due by January 2028. Why real customer data breaches GDPR Art. 32 in pen-test environments, and how CAUSA produces audit-defensible alternatives.

Read article →
02
PL · PL

KNF Rekomendacja S and synthetic data: testing mortgage models without breaching GDPR

History of the Rec S amendments (2006, 2011, 2014, 2020) with the specific LTV cap changes. How CAUSA encodes regulatory effects as causal rules, without hand-sampling output data.

Read article →
03
PL · PL

Oracle to Snowflake migration at a Polish bank: 5 reasons for synthetic test data

The most common Polish banking migration of 2024–2026. Why legacy SQL subsetting across 100+ tables breaks UAT. SI engagement angle (Accenture, Capgemini, Sii).

Read article →
04
PL · EU

BCBS 239: the ECB's 2025–2027 priority for significant Polish banks

14 RDARR principles mapped to data engineering controls. Current Pillar 2 capital add-on precedents. Why no significant institution fully complies.

Read article →
05
PL · EU

AI Act Article 11: Polish banks' readiness for enforcement on 02.12.2027

The Digital Omnibus deferment (07.05.2026) explained. Article 10 vs 11 unpacked. Bias testing without breaching GDPR Art. 9. CAUSA compliance layer.

Read article →
06
EN · EU

DORA Article 26 TLPT: A GDPR-Compliant Approach to Test Data for European Banks

TLPT cycle anatomy. Article 32 + 28 GDPR risks of real data exposure. DORA Article 28 implications for synthetic data vendors. CAUSA self-hosted advantage.

Coming July 2026
07
EN · EU

AI Act Article 11 Technical File for Banking Credit Scoring: 2027 Readiness Guide

Why the December 2027 deferral is a clarifier, not a blocker. Article 10 vs 11 unpack. Bias testing without GDPR Article 9 violation. High-risk system inventory framework.

Read article →
08
EN · EU

BCBS 239 Lineage Testing with Synthetic Datasets: ECB 2025–2027 Supervisory Priority

14 principles mapped to data engineering controls. Current Pillar 2 capital add-on precedent cases. CAUSA multi-table integrity for lineage testing.

Coming July 2026
09
EN · UK

PRA SS1/21 Operational Resilience: Synthetic Data for Continuous Compliance

SS1/21 impact tolerance deadline (31.03.2025) passed. Continuous demonstrable compliance phase. Why masked data fails SOC/SIEM realism tests.

Coming July 2026
10
EN · UK

PRA SS1/23 Model Risk Management: Validating Multi-Table Banking Models Without PII Exposure

SS1/23 effective from 17.05.2024. Principle 4 independent effective challenge requirement. Why masking fractures referential integrity. MRM lifecycle documentation.

Coming July 2026
11
EN · UK

FCA SDEG: 9 Governance Principles for Synthetic Data in UK Financial Services

August 2025 FCA report. Each of 9 principles unpacked. PRA SS1/23 integration. UK DUAA February 2026 implications. CAUSA SDEG-aligned output.

Coming July 2026
12
EN · UK

UK DUAA 2025 and FCA Consumer Duty: Synthetic Data for Demonstrable Compliance

UK Data Use & Access Act 2025 "Recognised Legitimate Interests" framework. Consumer Duty (31.07.2023 / 31.07.2024) intersection with synthetic data testing.

Coming July 2026

Subscribe for new essay alerts: arek.kordos@infundum.io (manual list, no automation, ~1 message per month).

§ 07 / Common questions

Questions
every evaluation
team asks first.

What kind of AI powers Infundum?
CAUSA — our causal AI data engine. Unlike older approaches based on GANs, VAEs, or copula sampling, CAUSA encodes the structural and behavioral patterns of banking data as domain-aware causal rules — not just statistical distributions. This is fundamentally different from generic tabular generators that break down when schemas exceed 4–5 connected tables or when regulatory edge cases need to be preserved. Specific architecture details are available under NDA during formal evaluation.
How is this different from Tonic.ai, MOSTLY AI, or Gretel?
General-purpose synthetic data vendors are excellent at flat tabular generation. They are not built for Polish, CEE, or UK regulatory environments. PESEL handling, NIP/REGON validity, BIK scoring, KNF Rekomendacja S amendment effects, PRA SS1/23 evidence requirements — these are either absent or coarsely approximated. CAUSA is the layer that makes a synthetic dataset defensible in a KNF, ECB, PRA, or FCA review.
How do you guarantee no real customer data leaks into the output?
We never see real customer data. CAUSA operates on schema definitions and statistical parameters that you provide — or on our reference distributions if you prefer not to share even those. The auditable artefact bundle includes a formal dataset composition statement documenting the absence of any direct or indirect customer data path.
What is your position on DORA Article 28 third-party risk?
Infundum is architected as EU-resident SaaS or self-hosted in your environment — raw data never leaves your network in either model. We are pre-launch and accepting design-partner evaluations. Design partners receive a draft TPRM pack covering the Article 28 obligations: ICT services register entry, exit and termination provisions, audit rights, and reversibility clauses.
What does pricing look like?
Pricing is bespoke — set by deployment model (EU-resident SaaS or self-hosted), regulatory scope, and the presets you need. That mirrors how every serious vendor in this category contracts: no public rate card, because no two banking engagements are alike.

The shape is consistent. Evaluation begins with a paid, fixed-scope pilot — one preset, one use case (DORA TLPT data, AI Act bias testing, BCBS 239 lineage, or PRA SS1/23 MRM validation), six weeks, deliberately scoped to sit within a single project budget. The pilot fee is credited in full against a production licence. Production is an annual licence, multi-year terms discounted. Pre-launch design partners get preferential terms in exchange for reference participation.

We'll give you a concrete number on the research call, against your actual scope.
What exactly do I get in a pilot?
A fixed-scope, six-week engagement against one preset (for example, Polish retail banking) and one use case you choose — DORA TLPT data, AI Act bias-testing data, BCBS 239 lineage testing, or PRA SS1/23 MRM validation. You receive the multi-table synthetic dataset generated for your schema, the matching compliance artefact bundle for that use case (such as the DORA TLPT Data Safety Annex or the AI Act Article 11 technical file), a fidelity scorecard, and a dataset composition statement. Deployment is self-hosted, so raw data never leaves your environment. The engagement is scoped to sit inside a single project budget, and the pilot fee is credited in full against a production licence.

§ 08 / Talk to founder

Twenty-five
minutes.

If you are preparing for DORA TLPT, AI Act readiness, a BCBS 239 cycle, a PRA SS1/21 resilience programme, or a migration where real customer data cannot leave production, let's compare notes. No pitch.

What happens after the call: a written summary back to you within 24 hours. If there is mutual interest, a follow-up demo with your specific use case. If not, no follow-up at all unless you ask.

Designed for

  • Heads of Data Engineering and Data Platform leads
  • Data Architects and Solution Architects
  • QA Leads and Test Managers
  • DORA programme leads
  • Heads of Model Risk Management
  • Heads of Operational Risk
  • Data migration leads
  • SI banking practice managers
  • Chief Data, Risk, and Compliance Officers

Coverage

Poland · UK · DACH · Benelux · CEE