Real-time Course and Use Case: Johannesburg, South Africa
This week I had the opportunity to teach a real-time data integration course in Johannesburg, South Africa. We discussed how HDF, Storm, Kafka, and Spark can be used to create applications that deliver data at near-real-time latency. I want to thank all the students for their hard work and for overcoming the WiFi restrictions we had. During the course we discussed an interesting use case that I would like to share, especially since it involved ELK (Elasticsearch, Logstash, and Kibana) and a transition to the HDF platform.
Use Case:
Currently, Elasticsearch is being used to capture logs from multiple application servers. There are performance hindrances in trying to identify all logs by host and application name. Although grok is being used to parse the logs, not all logs are being ingested into the index successfully. The solution is transitioning to HDF (NiFi) with the logs in JSON format, but there are issues adding flow file attributes to the content. A sketch of the desired enrichment follows.
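To make the goal concrete, here is a minimal before/after sketch of the enrichment the team is after. The attribute names host and application, and the log fields shown, are hypothetical stand-ins for whatever the flow actually sets upstream.

```text
Flow file attributes (hypothetical names, set upstream):
  host        = app-server-01
  application = billing-api

Incoming flow file content (one JSON log event):
  {"timestamp":"2018-06-01T10:15:00Z","level":"ERROR","message":"timeout calling payment service"}

Desired outgoing content, with the attributes merged into the JSON:
  {"timestamp":"2018-06-01T10:15:00Z","level":"ERROR","message":"timeout calling payment service","host":"app-server-01","application":"billing-api"}
```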
Solutions:
- Use the ReplaceText NiFi processor, referencing the flow file attributes via Expression Language, with the Append replacement strategy to add the attributes to the content (see the ReplaceText sketch after this list).
- Use the JOLT (JoltTransformJSON) NiFi processor to apply a JSON-to-JSON transformation on the raw JSON, appending the attributes to the output JSON (see the JOLT sketch after this list). (A good but older article on JOLT: https://community.hortonworks.com/articles/44726/json-to-json-simplified-with-apache-nifi-and-jolt.html)
- Use the UpdateRecord processor with RecordReader/RecordWriter controller services to update the contents of a FlowFile (see the UpdateRecord sketch after this list).
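For the ReplaceText approach, here is a minimal configuration sketch. Appending raw text to the end of a JSON object would break its validity, so this sketch swaps in the Regex Replace strategy and rewrites the closing brace instead; the Append strategy from the list works as-is for non-JSON text content. The host and application attribute names are assumptions.

```text
ReplaceText processor (sketch):
  Evaluation Mode       = Entire text
  Replacement Strategy  = Regex Replace   # Append also works if the content is not strict JSON
  Search Value          = \}$             # match the final closing brace of the JSON object
  Replacement Value     = ,"host":"${host}","application":"${application}"}
```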
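For the JOLT approach, here is a sketch of a chain spec using the default operation, which adds the fields when they are not already present. JoltTransformJSON evaluates Expression Language in the Jolt Specification property, so the flow file attributes can be referenced directly; again, host and application are assumed attribute names.

```json
[
  {
    "operation": "default",
    "spec": {
      "host": "${host}",
      "application": "${application}"
    }
  }
]
```

Set the Jolt Transformation DSL property to Chain for a spec in this array form, or to Default with just the inner spec object.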
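For UpdateRecord, here is a property sketch. Dynamic property names are RecordPaths pointing at the fields to set, and with Replacement Value Strategy set to Literal Value the property values can use Expression Language against flow file attributes. One caveat: the writer's schema must include the new fields or they will be dropped. The field and attribute names here are assumptions.

```text
UpdateRecord processor (sketch):
  Record Reader              = JsonTreeReader        # controller service
  Record Writer              = JsonRecordSetWriter   # writer schema must include host/application
  Replacement Value Strategy = Literal Value
  /host                      = ${host}               # dynamic property: RecordPath -> attribute value
  /application               = ${application}
```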
I was also able to experience an African safari, as well as the beautiful scenery and people South Africa has to offer!