Case Study:

 

Race Condition Issues and Solutions in Batch Import Processes

Real Production Issue

Actual business logic will not be disclosed. Some details may be simplified, and service names or identifiers may be replaced or generalized.

  • Background

  • Evaluation

  • Trial and Error

  • Lessons Learned

Background

It's easy to design a system without any limitation
we colud import data one by one

Consider time, cost, and risk

We should batch-import data

  1. Two different souceBills might affect same record in our db
  2. If the operation happend at the same time, a race condition will occur

 

Here comes the problems

Race Condition #1:

Duplicate Insert                                          
                                           

INSERT ... ON CONFLICT DO NOTHING

 

Race Condition #2:

Lost Update
                                           

Transaction A: SELECT → DATA [a: 100]
Transaction B: SELECT → DATA [a: 100]  ← read same old DATA

Transaction A: UPDATE → [a: 100, b: 50]
Transaction B: UPDATE → [a: 100, c: 80]  ← Override A’s update

 

result: [a: 100, c: 80]
expect: [a: 100, b: 50, c: 80]

async upsertDerivedBill(configId: string, yearMonth: string, input: Input) {
  const key1 = hashToInt32(configId);
  const key2 = hashToInt32(yearMonth);

  return await db.transaction(async (tx) => {
    await tx.execute(sql`SET LOCAL lock_timeout = '5000ms'`);

    await tx.execute(sql`SELECT pg_advisory_xact_lock(${key1}, ${key2})`);

    const existing = await tx.select().from(DerivedBill).where(eq(name, input.name));

    if (existing) {
      return await this.mergeAndUpdate(tx, existing, input);
    } else {
      return await tx.insert(DerivedBill).values(input);
    }
  });
}

pg_advisory_xact_lock

Solutions

Idempotent Upsert  

Not enough,

Resolve duplicate insert, but did not resolve Lost update issue

Transaction-Level Advisory Lock  

One more thing

Integration test

Case Study:Race Condition

By Jay Chou

Case Study:Race Condition

  • 24