Apache NiFi - What it Offers and Where and How it can be used

I am sure many of you have heard of NiFi, and those who have had the opportunity to work with it will surely appreciate what it offers. Having used Apache NiFi more than once in my software journey, I would like to share what NiFi is, the common scenarios where it can be used, and how it can be used. Simply put, NiFi is an integration framework and a data flow product. NiFi offers the following:

  • Transformation
  • Routing
  • Service Orchestration
  • Supports Multiple Formats
  • Supports Multiple Protocols
  • Flow Management and Flow Control
  • Auditing and Log Management
  • Exception Handling
  • Very easy to use
  • Good documentation
  • Active user community
  • Web Based UI

Now, having also worked in middleware, I know that the above points are also offered by ESBs, but there are a few things that ESBs offer and NiFi does not:

  • Traditionally, ESB products have their own language and semantics for message creation, navigation and extraction. NiFi does not have such a feature; instead, NiFi makes use of Java semantics.
  • NiFi does not support XA or complete transaction management, whereas ESBs do. NiFi supports only atomicity and durability, and atomicity applies at the processor level, not at the flow level. A flow in NiFi is a collection of processors that work together to solve a business use case; it can be visualized as an instance of the Pipe and Filter architecture pattern.
  • NiFi does not offer extensive application monitoring or BPM. ESBs offer them in their integrated suites.
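
To make the Pipe and Filter comparison concrete, here is a minimal sketch in plain Java. It does not use the NiFi API; the class name PipeAndFilterSketch, the runFlow method and the three sample steps are hypothetical and only illustrate how the output of one processor becomes the input of the next.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

// Hypothetical illustration only: none of these names come from the NiFi API.
final class PipeAndFilterSketch {

    // Runs the message through every processor in order; the output of one
    // step is the input of the next, as in the Pipe and Filter pattern.
    static String runFlow(List<Function<String, String>> processors, String input) {
        String current = input;
        for (Function<String, String> processor : processors) {
            current = processor.apply(current);
        }
        return current;
    }

    public static void main(String[] args) {
        List<Function<String, String>> flow = new ArrayList<>();
        flow.add(String::trim);          // clean-up step
        flow.add(String::toUpperCase);   // transformation step
        flow.add(s -> "[routed] " + s);  // tagging step standing in for routing
        System.out.println(runFlow(flow, "  order-42  ")); // prints "[routed] ORDER-42"
    }
}
```

Each step here succeeds or fails on its own, and nothing undoes earlier steps if a later one fails, which mirrors the point that atomicity stops at the processor boundary.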

Components of NiFi

Processor: This is the entity where the actual work happens. All the transformation and routing logic is written here. You can create custom processors or make use of built-in processors. If you want to write your own business logic, you will need to create your own processor. Processors can be launched from the NiFi UI.

FlowFile: This is the object that flows through the NiFi system. Data items are represented as FlowFile entities. You commit or roll back a FlowFile object inside a processor.

Connection: Connections act as links between processors and behave like queues. Back pressure can be enabled by setting upper bounds on these queues; the back-pressure threshold is the amount of data that a particular connection can hold. For example, if the producer is producing faster than the consumer can handle, you can set a threshold on the connection. Once the threshold is reached, the connection accepts no further data from the producer, and the backlog builds up in the previous connection instead.
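
The back-pressure idea can be sketched as a bounded queue in plain Java. This is not NiFi code: ConnectionSketch and its methods are hypothetical names, and the threshold here counts objects only. In NiFi itself the upstream processor simply stops being scheduled when the threshold is hit; the sketch models that by refusing the offer.

```java
import java.util.ArrayDeque;
import java.util.Queue;

// Hypothetical illustration only: a connection modeled as a bounded queue.
final class ConnectionSketch {
    private final Queue<String> queue = new ArrayDeque<>();
    private final int threshold; // assumed object-count threshold

    ConnectionSketch(int threshold) {
        this.threshold = threshold;
    }

    // True when the upstream side should stop sending data.
    boolean isFull() {
        return queue.size() >= threshold;
    }

    // Called by the producer; returns false when back pressure applies.
    boolean offer(String flowFile) {
        if (isFull()) {
            return false;
        }
        queue.add(flowFile);
        return true;
    }

    // Called by the consumer; returns null when the queue is empty.
    String poll() {
        return queue.poll();
    }

    int size() {
        return queue.size();
    }

    public static void main(String[] args) {
        ConnectionSketch connection = new ConnectionSketch(2);
        System.out.println(connection.offer("a")); // prints "true"
        System.out.println(connection.offer("b")); // prints "true"
        System.out.println(connection.offer("c")); // prints "false"
    }
}
```

Once the consumer polls an item, the queue drops below the threshold and the producer can send again.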

Controller Service: Controller services are shared services that can be used by processors and reporting tasks. For example, reusable objects like a database connection, a JMS connection or any caching element can be wrapped up in a controller service and then used in processors.

There are three repositories in NiFi: the FlowFile Repository, the Content Repository and the Provenance Repository.

FlowFile Repository: It holds the metadata of each FlowFile present in the system. It is a Write-Ahead Log. The metadata contains a pointer to where the actual FlowFile content is in the Content Repository, the state of the FlowFile, and the connection the FlowFile is tied to.

Content Repository: This is the place where the FlowFile content actually exists. NiFi achieves guaranteed delivery through this repository together with the Write-Ahead Log.

Provenance Repository: It maintains the history of each FlowFile. Each time a FlowFile event is triggered, a provenance event is created. Here, FlowFile events refer to FlowFile creation, update, deletion, cloning, etc.
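
The division of labor between the three repositories can be sketched with in-memory collections. This is a simplified model, not how NiFi stores anything on disk: RepositorySketch, the "claim-" identifiers and the event strings are all hypothetical. The sketch also assumes that a clone points at the same content as its source, so only metadata is copied.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory model of the three repositories.
final class RepositorySketch {
    // Content repository: claim id -> actual bytes.
    private final Map<String, byte[]> content = new HashMap<>();
    // FlowFile repository: FlowFile id -> pointer (claim id) into the content repository.
    private final Map<String, String> flowFiles = new HashMap<>();
    // Provenance repository: append-only history of events.
    private final List<String> provenance = new ArrayList<>();

    void create(String flowFileId, byte[] data) {
        String claimId = "claim-" + flowFileId;
        content.put(claimId, data);
        flowFiles.put(flowFileId, claimId);
        provenance.add("CREATE " + flowFileId);
    }

    // Copies only the metadata pointer; the content itself is shared (assumption).
    void cloneFlowFile(String sourceId, String cloneId) {
        flowFiles.put(cloneId, flowFiles.get(sourceId));
        provenance.add("CLONE " + sourceId + " -> " + cloneId);
    }

    // Follows the pointer from the metadata to the content.
    byte[] read(String flowFileId) {
        return content.get(flowFiles.get(flowFileId));
    }

    List<String> history() {
        return provenance;
    }

    public static void main(String[] args) {
        RepositorySketch repo = new RepositorySketch();
        repo.create("ff-1", new byte[] {1, 2, 3});
        repo.cloneFlowFile("ff-1", "ff-2");
        System.out.println(repo.history());
    }
}
```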

Web Based UI: NiFi provides a UI that allows users to create and start flows. Users can select custom or predefined processors and include them in flows through this web-based UI.

Common Scenarios where it can be used:

  • Run jobs that contain business logic at specific or periodic intervals. NiFi has Quartz-like scheduling functionality: you can build custom processors and schedule them to run at periodic intervals.
  • Pull files from a remote SFTP server, then download, read and process them. Make use of NiFi's built-in SFTP processors and write custom processors to process the file contents and apply business logic.
  • Put files on a remote SFTP server.
  • In addition to the above, NiFi has a whole lot of features, such as acting as an HTTP client or an HTTP server, connecting to SQL databases and JMS queues, working with email, parsing files, etc. Depending on the use case, we can make use of the available processors or build a new one.

How it can be used:

  • Execute the following in the command prompt: mvn archetype:generate. When you do that, you will need to enter an archetype number. To find it, search for org.apache.nifi:nifi-processor-bundle-archetype in the list returned in the command prompt and enter its number. When asked for the archetype version, enter the number that corresponds to the latest NiFi version.
  • You will see nar and processors folders. In processors, you create custom processors under java, and under resources you need to list the processors you have created in the META-INF/services/org.apache.nifi.processor.Processor file. Custom processors are the place where we write the business logic.
  • Once you build the project with Maven, you will get a nar file.
  • Install NiFi locally. You can download NiFi from https://nifi.apache.org/download.html. The latest version as of the publication date of this article is 1.13.
  • Take the nar file and deploy it in the NiFi lib folder. The lib folder holds the predefined nars that are part of the downloaded version of NiFi as well as the custom-built nars that we generate.
  • In nifi.properties in the conf folder, change the repository (FlowFile, Content, Provenance) location details. Since NiFi contents are stored on disk, durability is guaranteed. Also, change the HTTP host and port number used to launch the NiFi UI. There are many other properties that can be changed depending on business needs, but for most cases this should be enough.
  • Go to the NiFi bin folder and start NiFi.
  • In the browser, enter the host and port number that you have set in nifi.properties. You will see the NiFi UI. You can select the custom processors that you have created or the predefined processors and start creating flows. You can configure dynamic properties in the flow in the NiFi UI, which get passed to custom processors by the NiFi engine.
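
As a rough guide to the nifi.properties step above, the sketch below loads a sample set of entries with java.util.Properties and derives the URL to open in the browser. The five property keys are the ones a NiFi 1.x nifi.properties file uses for the repositories and the HTTP endpoint; the /data/nifi paths, localhost and port 8080 are placeholder values, and NiFiPropertiesSketch is a hypothetical helper, not part of NiFi.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

// Hypothetical helper; the values below are placeholders, not recommendations.
final class NiFiPropertiesSketch {

    static final String SAMPLE =
          "nifi.flowfile.repository.directory=/data/nifi/flowfile_repository\n"
        + "nifi.content.repository.directory.default=/data/nifi/content_repository\n"
        + "nifi.provenance.repository.directory.default=/data/nifi/provenance_repository\n"
        + "nifi.web.http.host=localhost\n"
        + "nifi.web.http.port=8080\n";

    // Parses properties-file text into a Properties object.
    static Properties load(String text) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }

    // Builds the address of the NiFi UI from the configured host and port.
    static String uiUrl(Properties props) {
        return "http://" + props.getProperty("nifi.web.http.host")
            + ":" + props.getProperty("nifi.web.http.port") + "/nifi";
    }

    public static void main(String[] args) {
        System.out.println(uiUrl(load(SAMPLE))); // prints "http://localhost:8080/nifi"
    }
}
```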
