IOStreams and Filesystem
February 20, 2019
Renaissance Computing Institute
UNC-Chapel Hill
Kory Draughn
korydraughn@renci.org
Software Developer, iRODS Consortium
C API - iRODS and POSIX
iRODS provides a C API for managing data objects and collections.
The functions making up this API follow the POSIX standard as much as possible.
However, unlike the standard POSIX C API, using the iRODS API to manipulate objects requires a lot of code and easily leads to errors.
Enter New Libraries!
Coming in v4.3.0 are two new C++ libraries:
- iRODS IOStreams
- iRODS Filesystem
Our goals with these libraries are:
- Provide familiar interfaces
- Make it harder for developers to introduce bugs
- Make it easier for developers to interact with the virtual filesystem of iRODS
Agenda
- iRODS IOStreams
- iRODS Filesystem
- iput prototype using new libraries
What is iRODS IOStreams?
iRODS IOStreams is a collection of classes and functions that simplify data object I/O.
Features:
- Simple to use
- Built on top of the C++ IOStreams classes
- Provides a familiar interface to all C++ developers
- Works on the client-side and server-side
- Equivalent to C++'s std::fstream
What's Included in iRODS IOStreams?
At this time, the library consists of a single header file called dstream.hpp.
Four classes are defined by this header file:
- idstream - A stream class supporting only input
- odstream - A stream class supporting only output
- dstream - A stream class supporting input and output
- basic_data_object_buf - The stream buffer class in terms of which the three stream classes above are implemented
Old vs New - Open, Write, Close
Old
New
What's next for IOStreams?
We envision the library consisting of multiple header files, each providing a different capability. For example, the transport layer could be made customizable:
- dstream.hpp
- transport/default.hpp
- transport/rdma.hpp
- transport/udt.hpp
Abstractions of common patterns already seen in the standards.
Input welcome - tell us how to do this well...
What is iRODS Filesystem?
iRODS Filesystem is an implementation of the standard filesystem library introduced in the ISO C++17 standard.
It provides abstractions that simplify management of iRODS-based filesystem components such as paths, data objects, and collections.
Features:
- Implements a standardized interface
- Readable code
- Works on the client-side and server-side
- Users of Boost.Filesystem will feel comfortable with this library
- Throws exceptions with detailed messages for common errors
- Works with C++ standard library algorithms
iRODS Filesystem Facilities
Standardized Functions:
copy
copy_data_object
create_collection
create_collections
exists
equivalent
data_object_size
is_data_object
is_collection
is_other
is_empty
last_write_time
remove
remove_all
permissions
rename
status
status_known
iRODS Specific Functions:
data_object_checksum
set_metadata
remove_metadata
Standardized Types:
path
collection_iterator
recursive_collection_iterator
Old vs New - Iterating Over A Collection
Old
New
iput prototype
Metric | 4.2.4 | Prototype | Improvement |
---|---|---|---|
Lines of Code | 242 + 1149 | 281 | +80% |
1000 512k files | 17s | 5s | +70% |
2000 256k files | 32s | 6s | +81% |
4000 128k files | 58s | 8s | +86% |
8000 64k files | 111s | 10s | +90% |
16000 32k files | 212s | 18s | +91% |
1 10G file | 91s | 94s | -3% |
Uses both IOStreams and Filesystem.
This is a single test run.
Network: 1000T
Prototype: Used 16 threads
2 Machines: 32 cores each
TRiRODS February 2019 - IOStreams and Filesystem
By iRODS Consortium