iRODS Logical Locking
June 8-10, 2021
iRODS User Group Meeting 2021
Virtual Event
Alan King
Software Developer, iRODS Consortium
What is a Data Object? What is a Replica?
Data Object: a logical representation of data that maps to one or more physical instances (Replicas) of the data at rest in Storage Resources
Replica: an identical, physical copy of a Data Object
How do we create and modify data in iRODS?
iRODS supports a POSIX-like interface for opening, writing, and closing. Every data movement operation in iRODS boils down to:
Open replica, move data to replica, close replica
Most users deal with high-level APIs (put, cp, repl, etc.) which are built using these lower-level APIs.
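The layering described above can be sketched with a toy in-memory model. This is only an illustration: `ToyReplica`, `open_replica`, `close_replica`, and `put` are hypothetical names for this sketch and are not iRODS API calls.

```python
import io


class ToyReplica:
    """Hypothetical in-memory stand-in for a replica (not an iRODS type)."""

    def __init__(self):
        self.data = b""
        self.is_open = False


def open_replica(replica):
    """Low-level step 1: open the replica and hand back a writable handle."""
    replica.is_open = True
    return io.BytesIO()


def close_replica(replica, handle):
    """Low-level step 3: commit the written bytes and close the replica."""
    replica.data = handle.getvalue()
    replica.is_open = False


def put(source_bytes, replica):
    """A high-level 'put' composed from the three low-level steps."""
    handle = open_replica(replica)   # open replica
    handle.write(source_bytes)       # move data to replica
    close_replica(replica, handle)   # close replica


replica = ToyReplica()
put(b"hello", replica)
```

In this toy model, every higher-level operation would reuse the same open, write, close sequence.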
iRODS also has the concept of policy enforcement points which are triggered by operations and can themselves trigger additional operations.
How do we define the state of data? What is Truth?
Truth: The latest data known to be "correct"; or, how the data "should" be
Replica status: The state of the data as it relates to the physical storage, the catalog, and the Truth
Good: Data is at rest, matches the catalog, and reflects the Truth
Stale: Data is at rest, but does not meet all criteria for being Good
- It may not match what is in the catalog: data transfer errors, mismatched checksum, corruption, etc.
- It may no longer reflect the Truth: more recently written data understood as being correct exist (and may or may not differ!)
- Note: stale does not necessarily mean the data are incorrect; they are simply not known to reflect the Truth
Replica Statuses
Value | ils | Status | Description |
---|---|---|---|
0 | X | stale | - data at rest may not match catalog |
1 | & | good | - data at rest matches catalog |
2 | ? | intermediate | - data is not at rest |
3 | ? | read lock | - allows open for read - locks out open for write |
4 | ? | write lock | - locks out all opens for this replica - when sibling replica marked intermediate |
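As a rough illustration, the table can be expressed as an enumeration plus a lookup for the `ils` marker. The names `ReplicaStatus`, `ILS_MARKER`, and `lock_allows_open` are assumptions of this sketch, not identifiers from iRODS; only the numeric values, markers, and lock rules come from the table.

```python
from enum import IntEnum


class ReplicaStatus(IntEnum):
    """Numeric replica status values as listed in the table."""
    STALE = 0
    GOOD = 1
    INTERMEDIATE = 2
    READ_LOCKED = 3
    WRITE_LOCKED = 4


# Marker shown in the "ils" column of the table for each status.
ILS_MARKER = {
    ReplicaStatus.STALE: "X",
    ReplicaStatus.GOOD: "&",
    ReplicaStatus.INTERMEDIATE: "?",
    ReplicaStatus.READ_LOCKED: "?",
    ReplicaStatus.WRITE_LOCKED: "?",
}


def lock_allows_open(status, for_write):
    """Apply only the lock rules stated in the table.

    Returns True or False for the two lock statuses, and None for the
    other statuses, because the table states no open rule for them.
    """
    status = ReplicaStatus(status)
    if status is ReplicaStatus.WRITE_LOCKED:
        return False              # locks out all opens for this replica
    if status is ReplicaStatus.READ_LOCKED:
        return not for_write      # reads allowed, writes locked out
    return None
```

For example, `lock_allows_open(3, for_write=False)` is `True`, while `lock_allows_open(4, for_write=False)` is `False`.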
Why Locking? Concurrency in a Distributed System
Uncoordinated, concurrent writing to a single replica can lead to data corruption.
Uncoordinated, concurrent writing to multiple replicas of the same data object causes truth corruption.
Uncoordinated, concurrent operation execution can lead to policy violations.
All of these things endanger our understanding of the state of the data, which is how we know that our data is stored and cataloged safely.
Example: iput to resource hierarchy
Data Corruption: Intermediate Replicas
Uncoordinated, concurrent writing to a single replica can lead to data corruption.
Problem: In-flight replicas can be opened and modified concurrently by multiple agents in an uncoordinated fashion, and the catalog does not reflect the current, true state of the data.
Solution: Mark in-flight replicas as intermediate at open time and update the status at close to reflect the state of the replica
- Status of the replica is accurately represented in the catalog
- The system and users can take appropriate action based on whether or not the replica is at rest
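A minimal sketch of this idea, assuming a plain dictionary as a stand-in for the catalog: the status is set to intermediate at open and finalized at close according to how the transfer went. The `catalog` dictionary, the `opened_for_write` helper, and the replica name are hypothetical, and mapping a failed transfer to stale is this sketch's reading of the stale definition given earlier.

```python
from contextlib import contextmanager

# Hypothetical catalog: replica name -> status string.
catalog = {"replica_0": "good"}


@contextmanager
def opened_for_write(name):
    """Mark the replica intermediate while open; finalize its status at close."""
    catalog[name] = "intermediate"      # recorded at open time
    try:
        yield
    except Exception:
        catalog[name] = "stale"         # transfer failed: data may not match catalog
        raise
    else:
        catalog[name] = "good"          # at rest and matching the catalog


# Successful write: the catalog shows "intermediate" while data is in flight.
with opened_for_write("replica_0"):
    observed_in_flight = catalog["replica_0"]
status_after_success = catalog["replica_0"]

# Failed write: the replica is left stale instead of silently looking good.
try:
    with opened_for_write("replica_0"):
        raise IOError("simulated transfer error")
except IOError:
    pass
status_after_failure = catalog["replica_0"]
```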
Truth Corruption: Logical Locking
Uncoordinated, concurrent writing to multiple replicas of the same data object causes truth corruption.
Problem: It is unclear which replica for a given data object represents the Truth when multiple replicas are in flight at the same time.
Solution: Prevent opening any replica for a given data object when any one of its replicas is opened for write.
- The opened replica is marked intermediate, as shown previously
- The other replicas are write locked which prevents any additional opens for read or write; it is clear which replica represents the Truth
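The locking behavior can be sketched with a toy model of one data object and its replicas. `ToyDataObject` and `ReplicaLockedError` are hypothetical names. The close behavior (written replica becomes good, siblings become stale) is an assumption of this sketch, based on the earlier definition of stale as no longer reflecting the Truth.

```python
class ReplicaLockedError(Exception):
    """Raised when an open is refused because the data object is locked."""


class ToyDataObject:
    """Hypothetical model: one status string per replica of a data object."""

    def __init__(self, replica_count):
        self.statuses = ["good"] * replica_count

    def open_for_write(self, index):
        # Refuse the open if any replica is in flight or write locked.
        if any(s in ("intermediate", "write lock") for s in self.statuses):
            raise ReplicaLockedError(f"data object is locked: {self.statuses}")
        # The opened replica becomes intermediate; its siblings are write locked.
        for i in range(len(self.statuses)):
            self.statuses[i] = "intermediate" if i == index else "write lock"

    def close(self, index):
        # Assumption: the written replica now holds the Truth, so it becomes
        # good, and the siblings become stale.
        for i in range(len(self.statuses)):
            self.statuses[i] = "good" if i == index else "stale"


obj = ToyDataObject(3)
obj.open_for_write(0)
during = list(obj.statuses)

second_open_refused = False
try:
    obj.open_for_write(1)           # a sibling replica cannot be opened
except ReplicaLockedError:
    second_open_refused = True

obj.close(0)
after = list(obj.statuses)
```

While replica 0 is open, the model reports one intermediate replica and two write-locked siblings, so there is never more than one candidate for the Truth.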
What write locks do NOT solve:
- Database race conditions
- Protection against rogue administrators
Future Work
Policy Violation: Operation Locking
Uncoordinated, concurrent operation execution can lead to policy violations.
Problem: When a data-modifying operation triggers policy that in turn performs further data-modifying operations, other concurrent, uncoordinated data-modifying operations can cause that policy to be violated.
Solution: Keep the data object locked over the lifetime of any given data-modifying operation.
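One way to picture operation locking is a lock that is held across the whole operation, including any follow-up work triggered by policy. The sketch below uses an in-process `threading.Lock` per logical path purely for illustration. It is not a distributed lock and not how iRODS implements this; `operation_lock`, `put_with_replication_policy`, and the path are hypothetical.

```python
import threading
from contextlib import contextmanager

_registry_guard = threading.Lock()
_locks = {}       # hypothetical registry: logical path -> lock
events = []       # (caller, action) pairs, in the order they happened


def _lock_for(logical_path):
    """Return the single lock associated with a logical path."""
    with _registry_guard:
        return _locks.setdefault(logical_path, threading.Lock())


@contextmanager
def operation_lock(logical_path):
    """Hold the data object's lock for the lifetime of one operation."""
    with _lock_for(logical_path):
        yield


def put_with_replication_policy(logical_path, caller):
    with operation_lock(logical_path):
        events.append((caller, "put"))
        # Policy-triggered follow-up runs before the lock is released.
        events.append((caller, "replicate"))


threads = [
    threading.Thread(target=put_with_replication_policy,
                     args=("/zone/home/user/obj", name))
    for name in ("a", "b")
]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Because the lock spans both steps, the two callers' put and replicate events cannot interleave in `events`.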
Future Work: Read Locks and Lock Checker