Duplicate Checks

This guide explains how to implement effective duplicate checks using FrankieOne's KYC services.

Introduction

Duplicate checks help detect and prevent the creation of multiple customer records for the same individual within KYC systems. They are essential for maintaining data integrity, reducing fraud risk, and ensuring compliance with regulatory requirements.

Duplicate Step in KYC Workflows

The duplicate step is a critical component of the KYC workflow, enabling organizations to detect and manage potential duplicate entities during onboarding. This step provides:

  • Automated identification of possible duplicates using configurable, rules-based matching logic.
  • Manual review and resolution capabilities for operators to confirm or dismiss flagged duplicates.
  • Seamless integration with existing KYC processes, supporting both operational efficiency and regulatory compliance.

To implement duplicate checks, you must add the duplicate step to your KYC workflow. This step will automatically run during the onboarding process, checking for potential duplicates based on defined matching rules. Speak with your FrankieOne representative to configure the duplicate step in your workflow.

Key Benefits

Integrating duplicate checks into onboarding delivers key benefits:

  • Strengthens fraud prevention by detecting and blocking attempts to create multiple accounts for the same individual or entity.
  • Improves data integrity by consolidating duplicate records, ensuring a single, authoritative customer profile.
  • Enhances customer experience by reducing errors and inconsistencies across systems.
  • Supports compliance with anti-money laundering (AML) and know your customer (KYC) regulations by maintaining accurate and up-to-date records.

This approach helps organizations maintain accurate records, reduce operational risk, and meet compliance requirements.


How It Works

The duplicate step in FrankieOne’s KYC workflow identifies potential duplicate entities based on configurable matching rules. It allows operators to review and resolve duplicates during the onboarding process, ensuring data integrity and compliance.

Prerequisites

  • KYC Workflow: The duplicate step must be included in the KYC workflow.
  • Matching Rules: Define matching rules in duplicate.json to specify how duplicates are identified.

Each rule's method determines how its fields are compared: EXACT requires identical values, while FUZZY tolerates variations in the data (e.g., typos). Only EXACT matching is currently supported; the FUZZY configuration is shown for future compatibility.

{
  "ruleSets": {
    "default": {
      "rules": [
        {
          "name": "External reference",
          "matchFields": ["EXTERNAL_REFERENCE"],
          "method": "EXACT",
          "riskFactor": "VERY_HIGH"
        },
        {
          "name": "Document identifiers",
          "matchFields": [
            "DOC_PRIMARY_IDENTIFIER", "DOC_SECONDARY_IDENTIFIER", "DOC_COUNTRY", "DOC_SUBDIVISION", "DOC_TYPE"
          ],
          "method": "EXACT",
          "riskFactor": "VERY_HIGH"
        },
        {
          "name": "Phone number",
          "matchFields": ["PHONE_NUMBER"],
          "method": "EXACT",
          "riskFactor": "HIGH"
        },
        {
          "name": "Email address",
          "matchFields": ["EMAIL_ADDRESS"],
          "method": "EXACT",
          "riskFactor": "MEDIUM"
        },
        {
          "name": "Given + Family name",
          "matchFields": ["GIVEN_NAME", "FAMILY_NAME"],
          "method": "EXACT",
          "riskFactor": "MEDIUM"
        },
        {
          "name": "Given + Family name + Date of birth",
          "matchFields": ["GIVEN_NAME", "FAMILY_NAME", "DATE_OF_BIRTH"],
          "method": "EXACT",
          "riskFactor": "VERY_HIGH"
        },
        {
          "name": "Given + Family name + Short form normalised address",
          "matchFields": ["ADDR_NORM_SHORT", "GIVEN_NAME", "FAMILY_NAME"],
          "method": "EXACT",
          "riskFactor": "HIGH"
        }
      ]
    },
    "customRuleset": {
      "rules": [
        {
          "matchFields": ["GIVEN_NAME", "FAMILY_NAME", "DATE_OF_BIRTH"],
          "method": "FUZZY",
          "confidence": {
            "threshold": 0.7,
            "fieldThreshold": 0.5,
            "fieldOverrides": {
              "GIVEN_NAME": 0.8,
              "FAMILY_NAME": 1
            },
            "isOverrideFieldNotRequired": true
          },
          "riskFactor": "HIGH"
        }
      ]
    }
  }
}
  • Risk Profiles: Configure risk profiles in risk_profiles.json to handle duplicates appropriately.

    {
      "riskProfiles": {
        "default": {
          "levels": [..],
          "factors": [
            {
              "name": "unresolved_duplicates",
              "description": "Unresolved duplicates",
              "handler": "unresolved_duplicates",
              "score_method": "lookup",
              "scores": [
                {
                  "name": "Any unresolved duplicates",
                  "range": {
                    "min": 1
                  },
                  "score": 20
                }
              ]
            },
            {
              "name": "true_positive_duplicates",
              "description": "Resolved duplicates",
              "handler": "true_positive_duplicates",
              "score_method": "lookup",
              "scores": [
                {
                  "name": "Any resolved duplicates",
                  "range": {
                    "min": 1
                  },
                  "score": 2
                }
              ]
            }
          ]
        }
      }
    }
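
To make the lookup scoring concrete, the following is a minimal sketch of how a factor such as unresolved_duplicates could contribute to a risk score. It assumes that a score entry applies when the duplicate count falls inside its range, with min and max treated as inclusive and a missing bound treated as open. The function name factor_score and this reading of range are illustrative assumptions, not a description of FrankieOne's implementation.

```python
from typing import Any

# Factor definitions copied from the risk_profiles.json example above.
FACTORS: list[dict[str, Any]] = [
    {
        "name": "unresolved_duplicates",
        "score_method": "lookup",
        "scores": [{"name": "Any unresolved duplicates", "range": {"min": 1}, "score": 20}],
    },
    {
        "name": "true_positive_duplicates",
        "score_method": "lookup",
        "scores": [{"name": "Any resolved duplicates", "range": {"min": 1}, "score": 2}],
    },
]


def factor_score(factor: dict[str, Any], count: int) -> int:
    """Return the score of the first entry whose range contains count, else 0.

    Assumption: "min" and "max" are inclusive and a missing bound is open.
    """
    for entry in factor["scores"]:
        bounds = entry.get("range", {})
        lower = bounds.get("min", float("-inf"))
        upper = bounds.get("max", float("inf"))
        if lower <= count <= upper:
            return entry["score"]
    return 0


# Hypothetical counts for one entity: two unresolved, none confirmed.
counts = {"unresolved_duplicates": 2, "true_positive_duplicates": 0}
total = sum(factor_score(f, counts[f["name"]]) for f in FACTORS)
print(total)  # 20
```

Under these assumptions, any unresolved duplicate adds 20 to the score, while a confirmed duplicate adds only 2, so leaving results unreviewed weighs more heavily than resolving them.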

Process Overview

1. Create Entity

Start by creating an entity using the KYC API, including attributes like name, date of birth, and identifiers.

curl -X POST \
  '{{baseHost}}/v2/individuals?level=base64' \
  -H 'Content-Type: application/json' \
  -H 'Accept: application/json' \
  -H 'api_key: YOUR_API_KEY' \
  -H 'X-Frankie-CustomerID: YOUR_CUSTOMER_ID' \
  -H 'X-Frankie-CustomerChildID: YOUR_CUSTOMER_CHILD_ID' \
  -H 'X-Frankie-Channel: YOUR_CHANNEL' \
  -H 'X-Frankie-Username: YOUR_USERNAME' \
  -d '{
    "individual": {
      "addresses": [
        {
          "type": "RESIDENTIAL",
          "streetName": "Phillip Street",
          "streetNumber": "10",
          "streetType": "Street",
          "locality": "Newtown",
          "district": "Sydney",
          "subdivision": "NSW",
          "country": "AUS",
          "postalCode": "2042",
          "status": "CURRENT"
        }
      ],
      "documents": {
        "IDENTITY": [
          {
            "primaryIdentifier": "000734130",
            "type": "PASSPORT",
            "country": "AUS",
            "attachments": [
              {
                "filename": "passport.jpg",
                "pageNumber": 0,
                "side": "FRONT",
                "type": "PHOTO",
                "data": {
                  "base64": "R0lGODlhAQABAIAAAP///wAAACH5BAEAAAAALAAAAAABAAEAAAICRAEAOw=="
                }
              }
            ]
          }
        ]
      },
      "name": {
        "givenName": "John",
        "familyName": "Doe",
        "displayName": "John Doe"
      },
      "dateOfBirth": {
        "year": "1990",
        "month": "01",
        "day": "01"
      },
      "gender": {
        "gender": "MALE"
      },
      "nationality": "NZL"
    }
  }'

2. Run Workflow

Execute the workflow that includes the duplicate step. This will trigger the duplicate check against existing entities.

curl -X POST \
  '{{baseHost}}/v2/individuals/ENTITY_ID/serviceprofiles/SERVICE_NAME/workflows/WORKFLOW_NAME/execute?level=base64' \
  -H 'Content-Type: application/json' \
  -H 'Accept: application/json' \
  -H 'api_key: YOUR_API_KEY' \
  -H 'X-Frankie-CustomerID: YOUR_CUSTOMER_ID' \
  -H 'X-Frankie-CustomerChildID: YOUR_CUSTOMER_CHILD_ID' \
  -H 'X-Frankie-Channel: YOUR_CHANNEL' \
  -H 'X-Frankie-Username: YOUR_USERNAME' \
  -d '{
    "comment": {
      "text": "Update after speaking to customer over phone directly"
    }
  }'

This request executes the configured workflow, including the duplicate step, for the specified individual's service profile.

Replace all placeholder values (e.g., YOUR_API_KEY, ENTITY_ID, SERVICE_NAME, WORKFLOW_NAME) with your actual data before executing.

{{baseHost}} should be set to your API base URL (e.g., https://api.frankieone.com).

3. Check for Duplicates

The duplicate step will automatically check the entity against existing records based on the defined matching rules. If duplicates are found, they will be returned in the response.
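
As an illustration of what EXACT rule evaluation involves, the sketch below checks a focus record against one existing record using a few rules from the default rule set. It assumes each entity has already been flattened into a dictionary keyed by the matchFields names, and that a rule matches only when every one of its fields is present and identical on both sides. The flattened record shape and the matched_rules helper are hypothetical; the real step runs inside FrankieOne and may normalise values before comparing them.

```python
from typing import Optional

# A subset of the default rule set from duplicate.json.
RULES = [
    {"name": "Email address", "matchFields": ["EMAIL_ADDRESS"], "riskFactor": "MEDIUM"},
    {"name": "Given + Family name", "matchFields": ["GIVEN_NAME", "FAMILY_NAME"], "riskFactor": "MEDIUM"},
    {
        "name": "Given + Family name + Date of birth",
        "matchFields": ["GIVEN_NAME", "FAMILY_NAME", "DATE_OF_BIRTH"],
        "riskFactor": "VERY_HIGH",
    },
]

# Hypothetical flattened record: match field name -> value (None if unknown).
Record = dict[str, Optional[str]]


def matched_rules(focus: Record, candidate: Record, rules: list[dict]) -> list[str]:
    """Return the names of rules whose fields are all present and identical."""
    hits = []
    for rule in rules:
        fields = rule["matchFields"]
        if all(focus.get(f) is not None and focus.get(f) == candidate.get(f) for f in fields):
            hits.append(rule["name"])
    return hits


focus = {
    "GIVEN_NAME": "John",
    "FAMILY_NAME": "Doe",
    "DATE_OF_BIRTH": "1990-01-01",
    "EMAIL_ADDRESS": "john@example.com",
}
existing = {
    "GIVEN_NAME": "John",
    "FAMILY_NAME": "Doe",
    "DATE_OF_BIRTH": "1985-06-30",
    "EMAIL_ADDRESS": None,
}

print(matched_rules(focus, existing, RULES))  # ['Given + Family name']
```

Only the name rule fires here: the dates of birth differ and the existing record has no email address, so the stricter rules do not match.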

4. Review Results

Review the results of the duplicate check. The response will include potential duplicates and their matching fields.

{
  "class": "DUPLICATE",
  "createdAt": "2025-05-05T00:48:37.323521Z",
  "entityId": "6547ff56-fcb6-4df4-96ad-4aa2cf273387",
  "objectId": "6547ff56-fcb6-4df4-96ad-4aa2cf273387",
  "objectType": "INDIVIDUAL",
  "processResultId": "01JTEYN3YBFC8DM4247JGAB7EZ",
  "providerResult": {
    "source": "builtin"
  },
  "requestId": "01JTEYN009B4EG2XW76H6YX7VM",
  "result": "HIT",
  "schemaVersion": 2,
  "state": "COMPLETED",
  "stepName": "DUPLICATE",
  "supplementaryData": {
    "duplicateEntityId": "cec1c265-961b-4961-a9f3-a7cffd7ac4a8",
    "matchedFields": [
      {
        "duplicateObjectId": "9204812d-a8d1-43f5-a0a8-cc500b0ff1c8",
        "matchStrength": 100,
        "objectId": "ba87928d-cb38-483b-bd4f-0560c6e8f809",
        "objectType": "NAME"
      }
    ],
    "matchedRules": [
      {
        "name": "Given_+_Family_name",
        "strength": "MEDIUM"
      }
    ],
    "type": "DUPLICATE"
  },
  "systemStatus": "VALID",
  "updatedAt": "2025-05-05T00:48:37.323521Z"
}

5. Resolve Duplicates

After reviewing the results, the operator can resolve duplicates by updating the process results. This is done by sending a PATCH request to the results endpoint with the appropriate manualStatus value.

There are three possible outcomes when resolving duplicates, each indicated by the manualStatus field:

  • FALSE_POSITIVE (FP):
    Indicates the result is not actually a duplicate. The process result is marked as FP, and no duplicate relationship is created.

    {
      "entityId": "focus-entity-id",
      "manualStatus": "FALSE_POSITIVE",
      "duplicateEntityId": "potential-duplicate-id",
      "result": "HIT"
    }

  • TRUE_POSITIVE_ACCEPT (TPA):
    Confirms the found entity is a duplicate, but onboarding continues for the focus entity. The duplicateEntityId is the duplicate, and the Process Result Object (PRO) belongs to the focus entity (entityId).

    {
      "entityId": "focus-entity-id",
      "manualStatus": "TRUE_POSITIVE_ACCEPT",
      "duplicateEntityId": "duplicate-entity-id",
      "result": "HIT"
    }

  • TRUE_POSITIVE_REJECT (TPR):
    Confirms the found entity is a duplicate, and onboarding is stopped for the focus entity. Here the focus entity (entityId) is treated as the duplicate, and the duplicateEntityId is the non-duplicate. The PRO still belongs to the focus entity.

    {
      "entityId": "focus-entity-id",
      "manualStatus": "TRUE_POSITIVE_REJECT",
      "duplicateEntityId": "non-duplicate-entity-id",
      "result": "HIT"
    }

Note:
It is possible for a single entity to have multiple duplicate results pointing to it, indicating it is a duplicate for several other entities. Always consider the full set of duplicate results when managing entity state.
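
One way to keep the full set of results in view is to group them by the entity they point at. The sketch below assumes that duplicate step results from several workflow runs have already been collected into one list (how they are retrieved is out of scope here) and that each has the shape of the sample result in the review step. The hits_by_duplicate helper and the short entity IDs are hypothetical.

```python
from collections import defaultdict


def hits_by_duplicate(results: list[dict]) -> dict[str, list[str]]:
    """Map each duplicateEntityId to the focus entityIds whose HIT results point at it."""
    grouped: dict[str, list[str]] = defaultdict(list)
    for res in results:
        if res.get("stepName") != "DUPLICATE" or res.get("result") != "HIT":
            continue  # ignore other steps and non-hits
        dup_id = res.get("supplementaryData", {}).get("duplicateEntityId")
        if dup_id:
            grouped[dup_id].append(res["entityId"])
    return dict(grouped)


# Trimmed, hypothetical results: two focus entities both matched entity-b.
results = [
    {
        "stepName": "DUPLICATE",
        "result": "HIT",
        "entityId": "entity-a",
        "supplementaryData": {"duplicateEntityId": "entity-b"},
    },
    {
        "stepName": "DUPLICATE",
        "result": "HIT",
        "entityId": "entity-c",
        "supplementaryData": {"duplicateEntityId": "entity-b"},
    },
]

print(hits_by_duplicate(results))  # {'entity-b': ['entity-a', 'entity-c']}
```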

curl -X PATCH \
  '{{baseHost}}/v2/individuals/{entityId}/results/duplicate' \
  -H 'Content-Type: application/json' \
  -H 'Accept: application/json' \
  -H 'api_key: YOUR_API_KEY' \
  -H 'X-Frankie-CustomerID: YOUR_CUSTOMER_ID' \
  -H 'X-Frankie-CustomerChildID: YOUR_CUSTOMER_CHILD_ID' \
  -H 'X-Frankie-Channel: YOUR_CHANNEL' \
  -H 'X-Frankie-Username: YOUR_USERNAME' \
  -d '{
    "processResults": [
      "PROCESS_RESULT_ID"
    ],
    "manualStatus": "FALSE_POSITIVE",
    "comment": {
      "text": "Update after reviewing duplicate results"
    }
  }'
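
For teams scripting this call rather than using curl, the sketch below builds the same PATCH request with Python's standard library. It only constructs the request and does not send it. The helper name build_resolution_request, the example host, and the credential values are placeholders chosen for illustration; the endpoint path, headers, and body fields mirror the curl example above.

```python
import json
import urllib.request

VALID_STATUSES = {"FALSE_POSITIVE", "TRUE_POSITIVE_ACCEPT", "TRUE_POSITIVE_REJECT"}


def build_resolution_request(
    base_host: str,
    entity_id: str,
    process_result_ids: list[str],
    manual_status: str,
    comment: str,
    auth_headers: dict[str, str],
) -> urllib.request.Request:
    """Build (but do not send) the PATCH request that resolves duplicate results."""
    if manual_status not in VALID_STATUSES:
        raise ValueError(f"unsupported manualStatus: {manual_status}")
    body = {
        "processResults": list(process_result_ids),
        "manualStatus": manual_status,
        "comment": {"text": comment},
    }
    headers = {"Content-Type": "application/json", "Accept": "application/json"}
    headers.update(auth_headers)
    return urllib.request.Request(
        url=f"{base_host}/v2/individuals/{entity_id}/results/duplicate",
        data=json.dumps(body).encode("utf-8"),
        headers=headers,
        method="PATCH",
    )


request = build_resolution_request(
    base_host="https://api.example.test",  # placeholder, use your own base URL
    entity_id="ENTITY_ID",
    process_result_ids=["PROCESS_RESULT_ID"],
    manual_status="FALSE_POSITIVE",
    comment="Update after reviewing duplicate results",
    auth_headers={"api_key": "YOUR_API_KEY", "X-Frankie-CustomerID": "YOUR_CUSTOMER_ID"},
)
print(request.get_method(), request.full_url)
```

Passing the returned object to urllib.request.urlopen would send it. Validating manualStatus before building the request catches typos locally instead of relying on an API error.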

Key Concepts

Entity Deletion and Duplicate Relationships

When an entity is deleted, any duplicate relationships involving that entity—whether it is the focus or the duplicate—are also removed. There may be multiple such relationships. After deletion, the system re-evaluates the duplicate state of any impacted entities. If an entity no longer has any relationships indicating it is a duplicate, its duplicate state may be revoked.

Note:
The duplicate state may or may not change immediately after deletion, depending on whether other duplicate-indicating relationships remain. See Multiple Duplicate-Indicating Results for more details.
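
The effect of a deletion can be pictured with a small model. The sketch below represents each duplicate relationship as a (source, duplicate) pair, where the second element is the entity marked as the duplicate; this representation and both function names are assumptions made for illustration only.

```python
Relationship = tuple[str, str]  # (source entity, entity marked as duplicate)


def delete_entity(relationships: set[Relationship], entity_id: str) -> set[Relationship]:
    """Drop every relationship that involves the deleted entity on either side."""
    return {(src, dup) for src, dup in relationships if entity_id not in (src, dup)}


def duplicate_entities(relationships: set[Relationship]) -> set[str]:
    """Entities that still have at least one relationship marking them as a duplicate."""
    return {dup for _, dup in relationships}


relationships = {("A", "B"), ("C", "B"), ("A", "D")}
remaining = delete_entity(relationships, "A")

print(sorted(duplicate_entities(relationships)))  # ['B', 'D']
print(sorted(duplicate_entities(remaining)))      # ['B']
```

After A is deleted, D has no remaining relationship and would lose its duplicate state, while B keeps it because C still points at it.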


Duplicate State and Relationship Management

The DUPLICATE state is an entity-level state (also reflected in the service profile) managed exclusively by the results of the Duplicate step and subsequent operator actions.

  • When a potential duplicate is detected, a Process Result Object (PRO) is created.
  • The entity’s state is not immediately changed; the operator must review and update the result.
  • Marking an entity as a duplicate also creates a relationship between the focus entity and the duplicate.
  • These relationships are updated as results are reviewed or as the workflow is rerun.

How a PRO Indicates a Duplicate

An entity is marked as a duplicate if there is at least one relationship (from a PRO) indicating it is a duplicate.


Result-Based Duplicate Marking

The outcome of operator review determines both the duplicate state and relationships:

  • TRUE_POSITIVE_ACCEPT (TPA) or TRUE_POSITIVE_REJECT (TPR):
    • The entity may be marked as a duplicate.
    • A duplicate relationship is created.
  • FALSE_POSITIVE (FP):
    • The entity may have its duplicate state revoked.
    • The duplicate relationship is removed.

Example Scenarios

Scenario 1: Single Focus Entity

  1. Entities: A, B, C.
  2. Onboard A; duplicate step finds B and C as potential matches.
  3. Operator marks B as FP:
    • No relationship created, no duplicate state change.
  4. Operator marks C as TPA:
    • Relationship created between A and C.
    • If C was not previously a duplicate, it is now marked as DUPLICATE.

Scenario 2: Multiple Entities Reference the Same Duplicate

  1. Entities: A, B, C.
  2. Onboard A; duplicate step finds B as a potential match.
  3. Operator marks B as TPA:
    • Relationship created (A → B).
    • B marked as DUPLICATE if not already.
  4. Onboard C; duplicate step finds B as a potential match.
  5. Operator marks B as TPA:
    • Relationship created (C → B).
    • B remains in DUPLICATE state.
  6. Later, operator marks A → B as FP:
    • Relationship (A → B) removed.
    • B remains in DUPLICATE state due to C → B.
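
Scenario 2 can be replayed with a small in-memory model. The DuplicateTracker class below is a hypothetical sketch, not part of the FrankieOne API: it stores one relationship per reviewed (focus, candidate) pair and derives the duplicate state from whichever relationships remain. Following the resolution outcomes described earlier, it treats the candidate as the duplicate for TRUE_POSITIVE_ACCEPT and the focus entity as the duplicate for TRUE_POSITIVE_REJECT.

```python
class DuplicateTracker:
    """Tracks which reviewed results currently mark an entity as a duplicate."""

    def __init__(self) -> None:
        # (focus, candidate) -> id of the entity that this result marks as the duplicate
        self.relationships: dict[tuple[str, str], str] = {}

    def resolve(self, focus: str, candidate: str, manual_status: str) -> None:
        key = (focus, candidate)
        if manual_status == "TRUE_POSITIVE_ACCEPT":
            self.relationships[key] = candidate
        elif manual_status == "TRUE_POSITIVE_REJECT":
            self.relationships[key] = focus
        elif manual_status == "FALSE_POSITIVE":
            self.relationships.pop(key, None)  # remove the relationship if it exists
        else:
            raise ValueError(f"unsupported manualStatus: {manual_status}")

    def is_duplicate(self, entity: str) -> bool:
        # An entity is a duplicate while at least one relationship marks it as one.
        return entity in self.relationships.values()


tracker = DuplicateTracker()
states = []
for focus, candidate, status in [
    ("A", "B", "TRUE_POSITIVE_ACCEPT"),  # scenario step 3
    ("C", "B", "TRUE_POSITIVE_ACCEPT"),  # scenario step 5
    ("A", "B", "FALSE_POSITIVE"),        # scenario step 6
    ("C", "B", "FALSE_POSITIVE"),        # extra step, not part of the scenario
]:
    tracker.resolve(focus, candidate, status)
    states.append(tracker.is_duplicate("B"))

print(states)  # [True, True, True, False]
```

The extra final step shows the other half of the rule: B loses its DUPLICATE state only once the last relationship pointing at it is removed.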

Step-Based Duplicate Marking

When the duplicate step is rerun (e.g., after entity data changes), previously detected duplicates may no longer match:

  • If an entity previously marked as a duplicate is no longer matched, the corresponding relationship is removed.
  • The PRO is marked as stale.
  • If no relationships remain indicating the entity is a duplicate, its duplicate state is revoked.

Example

  1. Entities: A, B.
  2. Onboard A; duplicate step finds B as a match.
  3. Operator marks B as TPA:
    • Relationship created (A → B).
    • B marked as DUPLICATE.
  4. Later, A’s data changes and the duplicate step is rerun.
  5. B is no longer matched:
    • Relationship removed.
    • If B has no other duplicate relationships, its DUPLICATE state is revoked.
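
The rerun behaviour amounts to reconciling previously confirmed relationships against the latest matches. The sketch below shows that reconciliation for a single focus entity; the reconcile_rerun function, its data shapes, and the PRO identifier are hypothetical.

```python
def reconcile_rerun(
    confirmed: dict[str, str],
    current_matches: set[str],
) -> tuple[dict[str, str], list[str]]:
    """Reconcile one focus entity's confirmed duplicates after a rerun.

    confirmed maps a duplicate entity id to the id of the PRO that recorded it.
    Returns the relationships that survive and the PRO ids to mark as stale.
    """
    kept = {dup: pro for dup, pro in confirmed.items() if dup in current_matches}
    stale = [pro for dup, pro in confirmed.items() if dup not in current_matches]
    return kept, stale


# A previously confirmed B as a duplicate; after A's data changed, B no longer matches.
kept, stale = reconcile_rerun({"B": "pro-1"}, current_matches=set())
print(kept, stale)  # {} ['pro-1']
```

Whether B's DUPLICATE state is then revoked depends on relationships held by other focus entities, which this per-entity sketch deliberately leaves out.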

Multiple Duplicate-Indicating Results

An entity can have multiple PROs and relationships indicating it is a duplicate for several other entities. The duplicate state is only revoked when all such relationships are removed.


Summary Table

Action                        Relationship  Duplicate State
Mark as TPA/TPR               Created       May be set
Mark as FP                    Removed       May be revoked
Entity deleted                Removed       May be revoked
Step rerun, match disappears  Removed       May be revoked

Notable Objects

  • Results:

    • Process-Result-Manual-StatusEnum-Duplicate
    • Process result should reference matching fields
  • Relationships:

    • Duplicates relationship in entity response
  • Audit Events:

    • Supplementary-Data-AuditEvent-Duplicate
    • Types: DUPLICATE, ENTITY_PROFILE_STATE_UPDATE
    • Profile state changes: Supplementary-Data-AuditEvent-Entity-Profile-State-Change