COSCUP 2025

Peter

2025/08/09

My Best Practices about Integrating NiFi with the PostgreSQL database

About me

  • Peter, GitHub

  • Active open source contributor

  • Speaker

    • COSCUP, MOPCON and so on

  • An engineer

    • DevOps

    • Back-end

    • System Architecture Research

    • Web Application Security

    • PHP, Python and JavaScript

  • Industrial Technology Research Institute

    • Smart Grid Technology (2017~2021)

  • Institute for Information Industry

    • Database, Data platform architecture (2021~)

Outlines

  • What's NiFi?

  • NiFi Components Introduction

  • Integrating PostgreSQL with NiFi

    • Case (1): Using NiFi and PostgreSQL to build an Aggregating Data Pipeline

    • Case (2): Using NiFi and PostgreSQL to build a Collecting Data Pipeline

  • How to deploy the NiFi Cluster?

  • How to measure the NiFi performance?

  • Summary and Future Work

Outlines

  • What's NiFi?

  • NiFi Components Introduction

  • Integrating PostgreSQL with NiFi

    • Case (1): Using NiFi and PostgreSQL to build an Aggregating Data Pipeline

    • Case (2): Using NiFi and PostgreSQL to build a Collecting Data Pipeline

  • How to deploy the NiFi Cluster?

  • How to measure the NiFi performance?

  • Summary and Future Work

What's NiFi?

  • Apache NiFi was originally developed by the NSA

  • It was donated to the Apache Software Foundation in 2014 and is now an Apache Top-Level Project

  • It's a data pipeline / data flow control service

Outlines

  • What's NiFi?

  • NiFi Components Introduction

  • Integrating PostgreSQL with NiFi

    • Case (1): Using NiFi and PostgreSQL to build an Aggregating Data Pipeline

    • Case (2): Using NiFi and PostgreSQL to build a Collecting Data Pipeline

  • How to deploy the NiFi Cluster?

  • How to measure the NiFi performance?

  • Summary and Future Work

NiFi Components Introduction (1/5)

  • FlowFile

    • A piece of data, e.g. a single record in a file

    • Or a dataset containing many records

    • E.g. 100 records can become 100 FlowFiles

    • Each FlowFile has its own attributes and content (see the example after this list)

      • The attributes are key-value metadata

        • e.g. size, path and permissions

      • The content is the actual data, e.g. CSV or JSON
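As an illustration, the attributes of one FlowFile in this talk's pipeline might look like the following (the filename value is hypothetical; data_size is the custom attribute added later by the ExecuteScript Processor):

filename: countries.json
path: ./
uuid: <auto-generated by NiFi>
data_size: 100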

NiFi Components Introduction (2/5)

  • Processor

    • It can be thought of as a logical processing unit

    • NiFi has many built-in processors

    • Each processor is configured (rather than coded) to

      • generate, process, convert and output FlowFiles

  • Connection

    • When creating the data flow, we establish the relationships between processors

    • Many connections can be established flexibly, and each route has its own state and meaning (e.g. success, failure)

    • The queue behavior of each connection can also be configured

NiFi Components Introduction (3/5)

  • Process (Processor) Group

    • If two data pipelines contain the same sub-pipeline, the shared part can be turned into a Process Group so that it is reusable

    • We can also split flows into separate Process Groups per project or team

    • Process Group permissions can be configured for different users

  • Funnel

    • Merges many source connections into a single connection, which helps keep the data pipeline readable

    • Many processors can be connected to the same downstream processor through one funnel

NiFi Components Introduction (4/5)

  • Controller Service

    • Manages connections to third-party or external services (e.g. a database connection pool)

    • Using a Controller Service lets processors manage and reuse the same third-party connection

  • Reporting Task

    • It plays an important role in monitoring NiFi

    • It turns hardware metrics into reporting information and publishes it

      • These hardware metrics can be CPU, memory and disk usage

    • We can send the monitoring information to Prometheus, Datadog or CloudWatch

      • Using these third-party services to visualize hardware utilization changes

NiFi Components Introduction (5/5)

  • Template

    • It can be used to transfer/migrate NiFi environments

    • For example, migrating data flows from NiFi machine A to NiFi machine B

NiFi Architecture Introduction

  • Architecture

    • NiFi is developed in Java and runs in the JVM

    • FlowFile Repository: stores each FlowFile's processing state and metadata

    • Content Repository: stores the actual FlowFile contents in its own directories; there can be multiple instances

    • Provenance Repository: stores the tracking (provenance) events of each FlowFile

    • Flow Controller: the core of NiFi; it schedules processors and dispatches resources

    • Web Server: serves the Web UI, where users build data flows via drag and drop


Outlines

  • What's NiFi?

  • NiFi Components Introduction

  • Integrating PostgreSQL with NiFi

    • Case (1): Using NiFi and PostgreSQL to build an Aggregating Data Pipeline

    • Case (2): Using NiFi and PostgreSQL to build a Collecting Data Pipeline

  • How to deploy the NiFi Cluster?

  • How to measure the NiFi performance?

  • Summary and Future Work

Building your own NiFi Environment (1/14)

  • There are several approaches to building a NiFi environment

    • Running NiFi directly on a dedicated or virtual machine

    • Running NiFi in a Docker container (a one-liner sketch follows this list)

    • Running NiFi with Docker Compose

  • We use the Docker Compose approach to build NiFi
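For reference, the single-container approach is a one-liner; a minimal sketch, assuming the official apache/nifi image with its default HTTPS port:

$ docker run -d --name nifi -p 8443:8443 apache/nifi:latest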

Building your own NiFi Environment (2/14)

Building your own NiFi Environment (3/14)

  • Step 1: Using the git clone command to clone the project

$ git clone https://github.com/peter279k/nifi_persistence
  • Step 2: Using the following commands to configure file persistence

  • Prerequisites

    • Docker is available; it should be 28.3.2 or later

    • The keytool command is available; it ships with Java 17 or later

      • sudo apt-get install -y openjdk-17-jdk

$ cp providers.xml.file providers.xml
$ cp docker-compose-file.yml docker-compose.yml
  • Step 3: Configuring the SAN entries for the HTTPS certificate

    • Remember to configure the correct IP addresses and DNS names for the SNI (Server Name Indication)

$ cat SAN.txt
ip:127.0.0.1,ip:146.71.85.60,dns:nifi.peterli.website,dns:nifi

Building your own NiFi Environment (4/14)

  • Step 4: Executing the run_nifi_file.sh shell script to set up NiFi

$ ./run_nifi_file.sh
Starting tricky NiFi persistence...
[sudo] password for lee:
......
[+] Running 5/5
.......
[+] Stopping 3/3
 ✔ Container nifi_persistence-registry-1   Stopped                                                     0.4s
 ✔ Container nifi_persistence-nifi-1       Stopped                                                    10.5s
 ✔ Container nifi_persistence-zookeeper-1  Stopped                                                     0.6s
Setting related nifi and nifi_registry permissions is done.
Generating the truststore.p12 and keystore.p12 with the keytool
......
Trust this certificate? [no]:  Certificate was added to keystore
Moving truststore.p12 and keystore.p12 to the ./nifi_conf directory
Remove the nifi-cert.crt file
Uncomment the next line after copying the /conf directory from the container to your local directory to persist NiFi flows
[+] Running 3/3
 ✔ Container nifi_persistence-registry-1   Started                                                     0.8s
 ✔ Container nifi_persistence-nifi-1       Started                                                     0.7s
 ✔ Container nifi_persistence-zookeeper-1  Started                                                     0.4s
Starting tricky NiFi persistence has been done.

Building your own NiFi Environment (5/14)

  • Step 5: Using the Netcat and cURL commands to verify NiFi

$ nc -vz localhost 8091
Connection to 127.0.0.1 8091 port [tcp/*] succeeded

$ curl -k https://127.0.0.1:8091/nifi/
<!--
  ~ Licensed to the Apache Software Foundation (ASF) under one or more
  ~ contributor license agreements.  See the NOTICE file distributed with
  ~ this work for additional information regarding copyright ownership.
  ~ The ASF licenses this file to You under the Apache License, Version 2.0
  ~ (the "License"); you may not use this file except in compliance with
 ......

Building your own NiFi Environment (6/14)

  • Step 6: Using the Cloudflare Tunnel to secure NiFi

    • From the steps above, our NiFi uses a self-signed certificate

    • We assume the SNI includes nifi.peterli.website

    • And the domain is managed by Cloudflare

    • First, go to the Cloudflare Zero Trust configuration page

Building your own NiFi Environment (7/14)

  • Step 7: Clicking "Tunnels" and creating the Cloudflare Tunnel

Building your own NiFi Environment (8/14)

  • Step 8: Choosing "Cloudflared" and naming the tunnel

Building your own NiFi Environment (9/14)

  • Step 9: Choosing the Docker environment to set up cloudflared

    • Using "docker run -d cloudflare/cloudflared:latest " to run the tunnel daemon (a fuller sketch follows)

    • Once the connection is established, the list will show the available connector
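The dashboard generates the full command including a tunnel token; a minimal sketch of its usual shape, with the token value as a placeholder:

$ docker run -d cloudflare/cloudflared:latest tunnel --no-autoupdate run --token <YOUR_TUNNEL_TOKEN>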

Building your own NiFi Environment (10/14)

  • Step 10.1: Creating the subdomain DNS record

    • Ensure any existing DNS record for the subdomain has been deleted first

    • After creating the DNS record, configure the local service type and URL to forward to

    • Then click the button to complete the configuration

Building your own NiFi Environment (11/14)

  • Step 10.2: Creating the subdomain DNS record

    • Click "Additional application settings", then enable "No TLS Verify" (skip TLS certificate verification)

Building your own NiFi Environment (12/14)

  • Step 11: Verifying the Cloudflare Tunnel list

    • We can see that our nifi.peterli.website setting is now in the Cloudflare Tunnel list

Building your own NiFi Environment (13/14)

  • Step 12: Verifying the NiFi website via the web browser

    • Open a web browser and visit "https://nifi.peterli.website" to verify the certificate

Building your own NiFi Environment (14/14)

  • Step 13: Verifying the NiFi website via the web browser

    • After logging in with the NiFi account, the NiFi UI should be displayed

Building your own PostgreSQL (1/2)

  • Step 1: Creating the docker-compose.yml file

    • Using a Docker Compose file to define the PostgreSQL service

    • We use PostgreSQL 16 for these case studies

    • Run the following commands to complete the PostgreSQL database setup

$ mkdir postgres-16
$ cd postgres-16
$ vim docker-compose.yml
$ cat docker-compose.yml
services:
  postgres_db:
    image: postgres:16
    ports:
      - "5433:5432"
    container_name: postgres_db
    environment:
      POSTGRES_DB: regions
      POSTGRES_USER: postgres
      POSTGRES_PASSWORD: postgres123
    volumes:
      - ./postgres_data:/var/lib/postgresql/data
    networks:
      - postgres-db-net

networks:
  postgres-db-net:
    driver: bridge
$ docker compose down
$ docker compose up -d --build

Verify the PostgreSQL DB Connection

Setup the PostgreSQL DB with Docker Compose
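The screenshots aren't reproduced here; a minimal sketch of verifying the connection from the host, assuming the postgresql-client tools are installed locally (the port and credentials come from the compose file above):

$ pg_isready -h 127.0.0.1 -p 5433
127.0.0.1:5433 - accepting connections
$ psql -h 127.0.0.1 -p 5433 -U postgres -d regions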

Building your own PostgreSQL (2/2)

  • Step 2: Entering the PostgreSQL Docker container

    • Creating the countries_count table

$ docker compose exec -it postgres_db bash
root@943ad56ebb06:/# psql -U postgres
psql (16.9 (Debian 16.9-1.pgdg120+1))
Type "help" for help.

postgres=# \c regions
You are now connected to database "regions" as user "postgres".
regions=# CREATE TABLE countries_count (
    country_id SERIAL PRIMARY KEY,
    counter INT,
    event_time TIMESTAMPTZ DEFAULT CURRENT_TIMESTAMP
);
CREATE TABLE

Outlines

  • What's NiFi?

  • NiFi Components Introduction

  • Integrating PostgreSQL with NiFi

    • Case (1): Using NiFi and PostgreSQL to build an Aggregating Data Pipeline

    • Case (2): Using NiFi and PostgreSQL to build a Collecting Data Pipeline

  • How to deploy the NiFi Cluster?

  • How to measure the NiFi performance?

  • Summary and Future Work

Case (1) Building an aggregating data pipeline

  • This data pipeline aggregates the fetched data

  • We design a data pipeline to complete this case

  • The building steps are as follows:

  • Creating the InvokeHTTP Processor

  • Creating the EvaluateJsonPath Processor

    • Using JSONPath to extract the data field ($.data) and store it in the "data" attribute

  • Creating the ExecuteScript Processor

    • Using a customized Groovy script to read the "data" attribute and retrieve the array length

  • Creating a PutSQL Processor

    • Executing SQL to store the aggregated results

Case (1) The InvokeHTTP Processor

Expected JSON Response
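The response screenshot isn't reproduced here; a hypothetical response shape consistent with this pipeline, where api.example.com is a placeholder URL and $.data is the array whose length gets counted:

$ curl -s https://api.example.com/countries
{
  "data": [
    {"id": 1, "name": "Taiwan"},
    {"id": 2, "name": "Japan"}
  ]
}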

InvokeHTTP Processor Setting

InvokeHTTP Processor Relationships

Case (1) The EvaluateJsonPath Processor

  • Creating the EvaluateJsonPath Processor

    • Using JSONPath to extract the data field ($.data) and store it in the "data" attribute

Expected JSON Response

EvaluateJsonPath Processor Setting
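The setting screenshot isn't reproduced here; a minimal sketch of the relevant properties, using the stock EvaluateJsonPath property names with values assumed from this case:

Destination: flowfile-attribute
Return Type: json
data: $.data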

EvaluateJsonPath Processor Relationships

Case (1) The ExecuteScript Processor

  • Creating the ExecuteScript Processor

    • Apache Groovy: a flexible and extensible Java-like language for the JVM

    • Using a customized Groovy script to read the "data" attribute and retrieve the array length

    • Creating a Groovy script as follows:

import groovy.json.JsonSlurper

// Fetch the next FlowFile from the incoming queue
FlowFile flowFile = session.get()
if (flowFile == null) {
    return
}

// The EvaluateJsonPath Processor stored the JSON array in the "data" attribute
String responseBody = flowFile.getAttribute('data')

if (responseBody != null) {
    log.info("Response JSON Data: " + responseBody)
    def jsonSlurper = new JsonSlurper()
    def data = jsonSlurper.parseText(responseBody)

    // putAttribute returns an updated FlowFile; keep and transfer that reference
    flowFile = session.putAttribute(flowFile, "data_size", data.size().toString())
    session.transfer(flowFile, REL_SUCCESS)

    return
}

session.transfer(flowFile, REL_FAILURE)

ExecuteScript Processor Setting
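The setting screenshot isn't reproduced here; a minimal sketch of the key ExecuteScript properties assumed for this case:

Script Engine: Groovy
Script Body: (the Groovy script above)
Script File: (empty)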

Case (1) The JDBC Controller Service(1/3)

  • Creating a JDBC Connection Pool Controller Service for the PutSQL Processor

    • Creating the JDBC Controller Service for connecting to the PostgreSQL DB

Case (1) The JDBC Controller Service(2/3)

  • Creating a JDBC Connection Pool Controller Service for the PutSQL Processor

    • Creating the JDBC Controller Service for connecting to the PostgreSQL DB

    • Download the PostgreSQL JDBC Driver from https://jdbc.postgresql.org/download

      • wget https://jdbc.postgresql.org/download/postgresql-42.7.7.jar

$ cd /path/to/nifi_persistence/
$ wget https://jdbc.postgresql.org/download/postgresql-42.7.7.jar
$ docker compose cp postgresql-42.7.7.jar nifi:/home/nifi/
$ docker compose exec nifi ls /home/nifi/
postgresql-42.7.7.jar

JDBC Controller Service Setting
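The setting screenshot isn't reproduced here; a minimal sketch of the DBCPConnectionPool properties, assuming the NiFi container reaches the database by host IP or a shared Docker network (the driver path matches the docker compose cp step above):

Database Connection URL: jdbc:postgresql://<db-host>:5433/regions
Database Driver Class Name: org.postgresql.Driver
Database Driver Location(s): /home/nifi/postgresql-42.7.7.jar
Database User: postgres
Password: postgres123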

Case (1) The JDBC Controller Service(3/3)

  • Creating a JDBC Connection Pool Controller Service for the PutSQL Processor

    • Enabling the configured JDBC Controller Service

Case (1) The PutSQL Processor

  • Creating a PutSQL Processor

    • Executing SQL to store the aggregated results

    • Configure this processor and add the INSERT SQL statement

      • INSERT INTO countries_count (counter) VALUES (${data_size})

import groovy.json.JsonSlurper

// Fetch the next FlowFile from the incoming queue
FlowFile flowFile = session.get()
if (flowFile == null) {
    return
}

// The EvaluateJsonPath Processor stored the JSON array in the "data" attribute
String responseBody = flowFile.getAttribute('data')

if (responseBody != null) {
    log.info("Response JSON Data: " + responseBody)
    def jsonSlurper = new JsonSlurper()
    def data = jsonSlurper.parseText(responseBody)

    // putAttribute returns an updated FlowFile; keep and transfer that reference
    flowFile = session.putAttribute(flowFile, "data_size", data.size().toString())
    session.transfer(flowFile, REL_SUCCESS)

    return
}

session.transfer(flowFile, REL_FAILURE)

The Groovy Script Body comes from the ExecuteScript Processor

Case (1) The Completed Data Pipeline

  • The completed data pipeline looks like a flow control chart

  • It determines how the data is routed in this pipeline

  • The LogAttribute Processor is used to record the messages

Case (1) Run the Completed Data Pipeline

  • Start all processors and make the whole data pipeline run!

Case (1) Verifying the Data Pipeline (1/3)

  • We ensure the processed data is stored in the DB correctly

Case (1) Verifying the Data Pipeline (2/3)

  • Entering the PostgreSQL shell and checking these records

Case (1) Verifying the Data Pipeline (3/3)

  • Entering the PostgreSQL shell and checking these records (see the query sketch below)
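The psql screenshots aren't reproduced here; a query along these lines checks the stored rows:

regions=# SELECT * FROM countries_count ORDER BY event_time DESC LIMIT 5;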

Case (1) Configuring the schedule

  • We can configure and schedule the InvokeHTTP Processor

    • For example, configuring it to fetch the remote URL every 5 minutes

    • Using the Timer driven scheduling strategy to complete the setting (a sketch follows this list)

      • The Run Schedule can be "1 sec", "1 min", "1 hour" or "1 day"
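A minimal sketch of the relevant fields on the processor's scheduling tab, assuming the 5-minute example above:

Scheduling Strategy: Timer driven
Run Schedule: 5 min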

Case (1) Verifying the schedule configuration

  • We can configure and schedule the InvokeHTTP Processor

    • Verifying the schedule configuration

Outlines

  • What's NiFi?

  • NiFi Components Introduction

  • Integrating PostgreSQL with NiFi

    • Case (1): Using NiFi and PostgreSQL to build an Aggregating Data Pipeline

    • Case (2): Using NiFi and PostgreSQL to build a Collecting Data Pipeline

  • How to deploy the NiFi Cluster?

  • How to measure the NiFi performance?

  • Summary and Future Work

Case (2) Building a collecting data pipeline

  • This data pipeline is used to collect the data

  • We design a data pipeline to complete this case. The building steps are as follows:

  • Creating two InvokeHTTP Processors

  • Creating two EvaluateJsonPath Processors

    • Using JSONPath to extract the data field ($.data) and store it in the "data" attribute

    • Using JSONPath to extract the data field ($.data) and store it in the "prov_data" attribute

  • Creating two ExecuteScript Processors

    • Using a Groovy script to read the "data" attribute and retrieve the array length

    • Using a Groovy script to read the "prov_data" attribute and retrieve the array length

  • Creating two PutSQL Processors

    • Executing SQL to store the collected results for both the "data" and "prov_data" records

Case (2) The ExecuteScript Processor (1/2)

  • This data pipeline is similar to case (1)

    • The difference is in ExecuteScript Processor-1

import groovy.json.JsonSlurper

// Fetch the next FlowFile from the incoming queue
FlowFile flowFile = session.get()
if (flowFile == null) {
    return
}

// Read the JSON array stored in the "data" attribute
String responseBody = flowFile.getAttribute('data')

if (responseBody != null) {
    log.info("Response JSON Data: " + responseBody)
    def jsonSlurper = new JsonSlurper()
    def data = jsonSlurper.parseText(responseBody)

    // Keep the updated FlowFile reference returned by putAttribute
    flowFile = session.putAttribute(flowFile, "data_size", data.size().toString())
    flowFile = session.putAttribute(flowFile, "data_type", "data_size")
    session.transfer(flowFile, REL_SUCCESS)

    return
}

session.transfer(flowFile, REL_FAILURE)

Case (2) The ExecuteScript Processor (2/2)

  • This data pipeline is similar to case (1)

    • The difference is in ExecuteScript Processor-2

import groovy.json.JsonSlurper

// Fetch the next FlowFile from the incoming queue
FlowFile flowFile = session.get()
if (flowFile == null) {
    return
}

// Read the JSON array stored in the "prov_data" attribute
String responseBody = flowFile.getAttribute('prov_data')

if (responseBody != null) {
    log.info("Response JSON Data: " + responseBody)
    def jsonSlurper = new JsonSlurper()
    def data = jsonSlurper.parseText(responseBody)

    // Keep the updated FlowFile reference returned by putAttribute
    flowFile = session.putAttribute(flowFile, "prov_data_size", data.size().toString())
    flowFile = session.putAttribute(flowFile, "data_type", "prov_data_size")
    session.transfer(flowFile, REL_SUCCESS)

    return
}

session.transfer(flowFile, REL_FAILURE)

Case (2) The PutSQL Processor (1/2)

  • This data pipeline is similar to case (1)

    • The difference is in the PutSQL Processors

    • Reusing the DBCPConnectionPool Controller Service

    • Creating the new data_count table in the PostgreSQL DB

regions=# CREATE TABLE data_count (
    count_id SERIAL PRIMARY KEY,
    counter INT,
    data_type VARCHAR(20),
    event_time TIMESTAMPTZ DEFAULT CURRENT_TIMESTAMP
);
CREATE TABLE

Case (2) The PutSQL Processor (2/2)

  • This data pipeline is similar to case (1)

    • The difference is in the PutSQL Processors

    • Note the SQL statements of the two PutSQL Processors:

      • INSERT INTO data_count (data_type, counter) VALUES ('${data_type:toString()}', ${data_size})

      • INSERT INTO data_count (data_type, counter) VALUES ('${data_type:toString()}', ${prov_data_size})

Case (2) The Completed Data Pipeline

  • The LogAttribute Processor is used to record the message

Case (2) Verify the Data Pipeline

  • We check the records in the data_count table to verify the data pipeline (see the query sketch below)
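The screenshot isn't reproduced here; a query along these lines verifies the per-type counts:

regions=# SELECT data_type, COUNT(*) FROM data_count GROUP BY data_type;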

Think Twice: Simplify Case (2) Data Pipeline

  • Let's consider simplifying the Case (2) data pipeline

  • Store the API URLs in the SQL database and remove the duplicated processors

    • Then query the URLs and send the API requests in a batched way

  • Here is the simplified data pipeline (a SQL sketch of the URL table follows)
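A minimal SQL sketch of this idea, with a hypothetical api_urls table that an ExecuteSQL-style processor could read and feed into a single InvokeHTTP Processor:

regions=# CREATE TABLE api_urls (
    url_id SERIAL PRIMARY KEY,
    data_type VARCHAR(20),
    api_url TEXT
);
CREATE TABLE
regions=# SELECT data_type, api_url FROM api_urls;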

Outlines

  • What's NiFi?

  • NiFi Components Introduction

  • Integrating PostgreSQL with NiFi

    • Case (1): Using NiFi and PostgreSQL to build an Aggregating Data Pipeline

    • Case (2): Using NiFi and PostgreSQL to build a Collecting Data Pipeline

  • How to deploy the NiFi Cluster?

  • How to measure the NiFi performance?

  • Summary and Future Work

How to deploy the NiFi Cluster? (1/2)

  • To make NiFi work with high availability,

    • consider deploying a NiFi cluster

  • ZooKeeper is used to coordinate the nodes in the NiFi cluster (a config sketch follows this list)

    • The NiFi cluster uses a zero-master clustering model

    • Cluster Coordinator node

    • Primary Node

    • Isolated Processors
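A minimal sketch of the per-node cluster settings in conf/nifi.properties; the property names are NiFi's, while the host name, port and ZooKeeper address are assumptions for illustration:

nifi.cluster.is.node=true
nifi.cluster.node.address=nifi-node1
nifi.cluster.node.protocol.port=11443
nifi.zookeeper.connect.string=zookeeper:2181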

How to deploy the NiFi Cluster? (2/2)

Outlines

  • What's NiFi?

  • NiFi Components Introduction

  • Integrating PostgreSQL with NiFi

    • Case (1): Using NiFi and PostgreSQL to build an Aggregating Data Pipeline

    • Case (2): Using NiFi and PostgreSQL to build a Collecting Data Pipeline

  • How to deploy the NiFi Cluster?

  • How to measure the NiFi performance?

  • Summary and Future Work

How to measure the NiFi performance?

  • There are several ways to measure NiFi performance

    • Measuring a specific Processor's performance with its status history

How to measure the NiFi performance?

  • There are several ways to measure NiFi performance

    • Measuring a specific Processor's performance with its operating information (a REST API sketch follows)
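Beyond the UI, similar numbers are exposed over the NiFi REST API; a minimal sketch, assuming token-based authentication and the HTTPS port from the setup above (the password and processor id are placeholders):

$ TOKEN=$(curl -k -X POST -d 'username=admin&password=<password>' https://localhost:8091/nifi-api/access/token)
$ curl -k -H "Authorization: Bearer $TOKEN" https://localhost:8091/nifi-api/system-diagnostics
$ curl -k -H "Authorization: Bearer $TOKEN" https://localhost:8091/nifi-api/flow/processors/<processor-id>/status/history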

Outlines

  • What's NiFi?

  • NiFi Components Introduction

  • Integrating PostgreSQL with NiFi

    • Case (1): Using NiFi and PostgreSQL to build an Aggregating Data Pipeline

    • Case (2): Using NiFi and PostgreSQL to build a Collecting Data Pipeline

  • How to deploy the NiFi Cluster?

  • How to measure the NiFi performance?

  • Summary and Future Work

Summary and Future Work (1/2)

  • NiFi is a great tool for building routine data collection and processing

  • Try to avoid overly complicated data pipeline flows

    • We can consider the following approaches to simplify a data flow:

      • Using the ExecuteStreamCommand Processor

      • Building your own Processor

  • Consider using OIDC to integrate with an authorization server

    • For example, the authorization server can be Keycloak or Google Identity (OIDC)

  • NiFi can be used as part of a Data Middle Platform

  • Consider deploying NiFi clusters for high availability

    • NiFiKop makes it easy to complete this goal

Summary and Future Work (2/2)

  • Monitoring NiFi and its data pipelines is important

    • Consider using the NiFi REST API for this purpose

  • The NiFi REST API also helps automate data pipeline management

  • Using NiFi for data collection scenarios is very appropriate

  • Using the LogAttribute Processor to collect logs

    • For example, Elasticsearch can be used to collect these logs

  • Consider deploying an event-driven architecture

    • Integrating NiFi with Kafka or RabbitMQ to develop an EDA

References

Any Questions?

COSCUP 2025: My Best Practices about Integrating NiFi with the PostgreSQL database

By peter279k
