Getting Started

June 5-7, 2018

iRODS User Group Meeting 2018

Durham, NC

Justin James

Applications Engineer, iRODS Consortium

Little Slips of Paper & Dependencies

Logging in to your VM:

curl -LO https://github.com/irods/irods_training/raw/ugm2018/training.pem
chmod 600 training.pem
ssh -i training.pem ubuntu@#.#.#.#

Install iRODS Build Requirements:

sudo apt-get update
sudo apt-get -y install git g++ make python-dev help2man unixodbc libfuse-dev libcurl4-gnutls-dev libbz2-dev zlib1g-dev libpam0g-dev libssl-dev libxml2-dev libkrb5-dev unixodbc-dev libjson-perl python-psutil python-jsonschema super python-exif odbc-postgresql

Windows (using PuTTY):

  1. Download PuTTY: http://www.putty.org/
  2. Download Private Key: https://github.com/irods/irods_training/raw/ugm2018/training.ppk
  3. Open PuTTY
  4. Enter Username
  5. Select Private Key
  6. Enter Target Host Name (or IP address)
  7. Save Session, Connect! (click Yes at the first-run Security Alert)

[Screenshots: Open PuTTY; Enter Username; Select Private Key (C:\Users\Student\Desktop\training20); Enter Host Name (or IP address), Name Session, Save Session, Open Connection; First-Run Security Alert]

Google Chrome can also connect via the "Secure Shell" extension:

https://chrome.google.com/webstore/detail/secure-shell/pnhechapfaindjhompbnflcldabbghjo

Download public and private keys:

https://github.com/irods/irods_training/raw/ugm2018/irodstraining
https://github.com/irods/irods_training/raw/ugm2018/irodstraining.pub

Then:

  1. Install extension
  2. Open New Connection
  3. Enter "ubuntu@#.#.#.#"
  4. Import public/private keypair (select both files at the same time)
  5. Connect!

Acquire the Prerequisites

Clone the training repository:

git clone https://github.com/irods/irods_training

Install and configure Postgres

ubuntu $ sudo apt-get -y install postgresql

ubuntu $ sudo su - postgres

Prepare database system for iRODS use:

postgres $ psql

CREATE DATABASE "ICAT";

CREATE USER irods WITH PASSWORD 'testpassword';

GRANT ALL PRIVILEGES ON DATABASE "ICAT" to irods;

\q

postgres $ exit

Configure the Repository, Install, Run setup

Install public key and add repository:

wget -qO - https://packages.irods.org/irods-signing-key.asc | sudo apt-key add -
echo "deb [arch=amd64] https://packages.irods.org/apt/ $(lsb_release -sc) main" | \
  sudo tee /etc/apt/sources.list.d/renci-irods.list
sudo apt-get update

Install from repository:

sudo apt-get -y install irods-server irods-database-plugin-postgres

Run setup with provided input file:

sudo python /var/lib/irods/scripts/setup_irods.py < /var/lib/irods/packaging/localhost_setup_postgres.input

Setup iRODS for Auditing

Before we continue with the training, we are going to set up auditing in iRODS so that we can report on all activity in our iRODS instance.

At the end of today's training, we will revisit this and visualize what has happened in iRODS throughout the day.

The first step is to install the auditing plugin:

sudo apt-get -y install irods-rule-engine-plugin-audit-amqp

Setup the iRODS Audit Plugin

Edit /etc/irods/server_config.json

  • add a new stanza to the rule_engines array after the irods_rule_engine_plugin-irods_rule_language-instance definition

  • add the audit namespace

        "rule_engines": [
            {
                "instance_name": "irods_rule_engine_plugin-irods_rule_language-instance",
                ...
                "shared_memory_instance": "irods_rule_language_rule_engine"
            },
            {
                "instance_name": "irods_rule_engine_plugin-audit_amqp-instance",
                "plugin_name": "irods_rule_engine_plugin-audit_amqp",
                "plugin_specific_configuration": {
                    "amqp_location": "ANONYMOUS@localhost:5672",
                    "amqp_options": "",
                    "amqp_topic": "audit_messages",
                    "pep_regex_to_match": "audit_.*"
                }
            },
            {
                "instance_name": "irods_rule_engine_plugin-cpp_default_policy-instance",
                ...

...

    "rule_engine_namespaces": [
        "", 
        "audit_"
    ], 
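
The same edit can be scripted rather than made by hand. The following is a minimal sketch, not part of the training material: the helper name add_audit_plugin is hypothetical, and it assumes the configuration has already been loaded into a Python dict and that "rule_engines" sits under "plugin_configuration" while "rule_engine_namespaces" is a top-level key. It inserts the audit stanza directly after the iRODS Rule Language instance and appends the audit_ namespace.

```python
import copy
import json

# Stanza and namespace copied from the configuration shown above.
AUDIT_STANZA = {
    "instance_name": "irods_rule_engine_plugin-audit_amqp-instance",
    "plugin_name": "irods_rule_engine_plugin-audit_amqp",
    "plugin_specific_configuration": {
        "amqp_location": "ANONYMOUS@localhost:5672",
        "amqp_options": "",
        "amqp_topic": "audit_messages",
        "pep_regex_to_match": "audit_.*",
    },
}
AUDIT_NAMESPACE = "audit_"
RULE_LANGUAGE_INSTANCE = "irods_rule_engine_plugin-irods_rule_language-instance"


def add_audit_plugin(config):
    """Return a copy of config with the audit stanza and namespace added.

    Hypothetical helper. Calling it twice does not add duplicates.
    """
    new_config = copy.deepcopy(config)
    # Assumption: rule_engines is nested under plugin_configuration.
    engines = new_config["plugin_configuration"]["rule_engines"]
    names = [engine["instance_name"] for engine in engines]
    if AUDIT_STANZA["instance_name"] not in names:
        # Place the stanza right after the rule language instance;
        # list.index raises ValueError if that instance is missing.
        position = names.index(RULE_LANGUAGE_INSTANCE) + 1
        engines.insert(position, copy.deepcopy(AUDIT_STANZA))
    namespaces = new_config.setdefault("rule_engine_namespaces", [""])
    if AUDIT_NAMESPACE not in namespaces:
        namespaces.append(AUDIT_NAMESPACE)
    return new_config


# Tiny stand-in for a real server_config.json, for illustration only.
sample = {
    "plugin_configuration": {
        "rule_engines": [
            {"instance_name": RULE_LANGUAGE_INSTANCE},
            {"instance_name": "irods_rule_engine_plugin-cpp_default_policy-instance"},
        ]
    },
    "rule_engine_namespaces": [""],
}
print(json.dumps(add_audit_plugin(sample), indent=4))
```

To apply this to a real server, one would load /etc/irods/server_config.json with json.load, pass the result through the helper, and write it back with json.dump, after making a backup copy of the file.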

Setup Monitoring

The iRODS audit plugin produces AMQP messages for each dynamic policy enforcement point.

We will now set up a Docker container that accepts these messages, stores them in an Elasticsearch database, and provides a web visualization tool.

First we need to install Docker:

curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo apt-key add -
sudo add-apt-repository "deb [arch=amd64] https://download.docker.com/linux/ubuntu $(lsb_release -cs) stable"
sudo apt-get update
sudo apt-get install -y docker-ce
sudo usermod -aG docker ${USER}

After doing this, log out of your virtual machine and log back in so that your group list is updated.

Setup Monitoring

From within your virtual machine, download the Docker image and run it:

docker pull irods/irods_audit_elk_stack
docker run -p 8080:15672 -p 5672:5672 -p 80:5601 -p 9200:9200 -dit irods/irods_audit_elk_stack

You now have a Docker container instance running within your virtual machine.

This instance is running the following:

  • RabbitMQ - Message broker that receives and queues the AMQP messages

  • Elasticsearch - Database that stores the audit messages

  • Logstash - Reads messages from RabbitMQ and writes them to Elasticsearch

  • Kibana - A data visualization front end for Elasticsearch

Configure the visualization

Port 80 on your VM is now mapped to the Kibana web tool.
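
The port mappings in the docker run command can be tabulated to see which service each VM port reaches. The following is a small illustrative sketch, not part of the training material: the helper published_ports is hypothetical, and the service names are an assumption based on the conventional default ports of RabbitMQ, Kibana, and Elasticsearch; the image itself may be configured differently.

```python
import shlex

DOCKER_RUN = (
    "docker run -p 8080:15672 -p 5672:5672 -p 80:5601 -p 9200:9200 "
    "-dit irods/irods_audit_elk_stack"
)

# Assumption: conventional default ports for each service.
DEFAULT_SERVICE_PORTS = {
    15672: "RabbitMQ management UI",
    5672: "RabbitMQ AMQP listener",
    5601: "Kibana",
    9200: "Elasticsearch HTTP API",
}


def published_ports(command):
    """Map host port -> container port for each '-p HOST:CONTAINER' flag.

    Hypothetical helper; handles only the simple two-part form used here.
    """
    tokens = shlex.split(command)
    mappings = {}
    for flag, value in zip(tokens, tokens[1:]):
        if flag == "-p":
            host, container = value.split(":")
            mappings[int(host)] = int(container)
    return mappings


for host_port, container_port in sorted(published_ports(DOCKER_RUN).items()):
    service = DEFAULT_SERVICE_PORTS.get(container_port, "unknown")
    print(f"VM port {host_port} -> container port {container_port} ({service})")
```

Under these assumptions, VM port 5672 is the AMQP endpoint that the audit plugin's amqp_location setting points at, and VM port 80 reaches Kibana.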

 

Let's configure Kibana.

  1. In a web browser, navigate to http://#.#.#.#/ to open the Kibana dashboard.

  2. Click on Management (left pane) -> Index Patterns.

  3. Type "irods_audit" in the index pattern field and click "Next step".

  4. Select "@timestamp" in the time filter field name and click "Create index pattern".

 

Now we want to create visualizations and a dashboard to view iRODS activity. We have already created a dashboard for you to load.

  1. Save irods_training/advanced/example_kibana_dashboard.json to your local computer (not the VM).

    • You can copy/paste this from your VM into a local file or clone the irods_training repository on your local machine.

  2. Click on Management (left pane) -> Saved Objects and click Import.

  3. Select the file you saved in step 1. Confirm the changes.

  4. Click on Dashboard (left pane) and then iRODS Dashboard.

  5. Click the Auto Refresh button at the top and change the refresh period to 30 seconds.

  6. Click on the clock icon (Last 15 minutes) at the top of the screen and select "Today".

 

You should see a dashboard similar to the one in the screenshot.

[Screenshot: iRODS Dashboard in Kibana]

If you have not yet executed a put or get against iRODS, the Bytes Read Per Minute and Bytes Written Per Minute panels will report no data.

 

What to Consider in an iRODS Deployment

Things to consider

  • Number of users and expected simultaneous connections

  • Expected ingest rate

  • Sizes of files

    • many small files (more overhead per connection)

  • Partial read / write vs whole file usage

  • Replication for durability

  • Replication for locality of reference

  • Load balancing vs High Availability

iRODS will run on a laptop or a rack of servers

Upgrading Large Installations

Things to consider

  • Database Snapshots

  • Attempt a graceful grid-wide shutdown ahead of time

  • Test Zones - do not upgrade blindly

  • Conformance Tests - try your edge cases

  • Federated Zones - how mixed is your deployment

 

 

Maintenance Window

  • In the Lab:

    • PostgreSQL 9

    • 10M Data Objects

    • VM with 10 GB RAM, 4 vCPUs, Rotational Disk

    • Upgrade from 4.1.8 to 4.2.0 took 13 minutes

  • Estimate:

    • 100M Data Objects to take ~2-3 hours
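
The estimate can be sanity-checked with a simple linear extrapolation from the lab measurement. This is only a sketch under a stated assumption: that upgrade time scales proportionally with the number of data objects, which the slides do not claim. The function name estimate_upgrade_minutes is hypothetical.

```python
# Lab measurement from the slide: 10M data objects, 4.1.8 -> 4.2.0 in 13 minutes.
LAB_OBJECTS = 10_000_000
LAB_MINUTES = 13


def estimate_upgrade_minutes(data_objects):
    """Linearly extrapolate upgrade time from the lab measurement.

    Assumption: time grows in proportion to the number of data objects.
    """
    return LAB_MINUTES * data_objects / LAB_OBJECTS


minutes = estimate_upgrade_minutes(100_000_000)
print(f"{minutes:.0f} minutes (~{minutes / 60:.1f} hours)")
# prints "130 minutes (~2.2 hours)"
```

The linear figure lands at the low end of the 2-3 hour estimate; real hardware, database tuning, and any non-linear effects could push it higher, so treat it as a planning floor rather than a prediction.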

Questions?

Anatomy of an iRODS installation

/etc/irods/core.* - iRODS Rule Language

/etc/irods/database_config.json - database configuration

/etc/irods/host_access_control_config.json - hostname filtering

/etc/irods/hosts_config.json - local /etc/hosts style configuration

/etc/irods/server_config.json - primary server configuration

/etc/irods/service_account.config - service account information

/usr/bin/* - iCommands

 

/usr/sbin/irodsAgent

/usr/sbin/irodsPamAuthCheck

/usr/sbin/irodsReServer

/usr/sbin/irodsServer

/usr/sbin/irodsXmsgServer

 

/var/lib/irods - service account home directory

 

/usr/lib/irods/plugins - plugins location

Introduction to iCommands

iRODS command-line equivalents of standard Unix operations:

  • ils

  • icd

  • ipwd

  • iput

  • iget

  • irepl

 

Use -h to get help with any particular iCommand.

ihelp will show all available iCommands.

 

Questions?

UGM 2018 - Getting Started

By iRODS Consortium

iRODS User Group Meeting 2018 - Advanced Training Module
