My 2nd golang program

I have another draft post titled "My 1st golang program, not a hello world". It still has some pending work, so I am publishing this one first, and the next post will effectively become a prequel to this one :-)

Architecture

I was recently involved in some basic log processing on the VPC flow logs generated inside AWS. The application life cycle is divided into 3 steps.

  • VPC Flow Log Data --> Amazon CloudWatch
  • Amazon CloudWatch --> Amazon Kinesis
  • Amazon Kinesis --> AWS Lambda --> NoSQL/any other persistent storage

Objective

The primary objective of the program is to transfer the logs for processing. The secondary objective is to keep the architecture/platform serverless wherever possible and applicable.

We wanted to test with a minimum of 100 GB of data generated by the VPC flow logs. Unfortunately, the dev accounts that we have do not generate the required traffic, which triggered the idea of simulating the VPC flow logs. I have been intrigued by Golang of late, and the result is the following repo, which generates VPC flow log data and pushes it to a Kinesis stream.

Source

VPC Flow Log Generator

Components

There are 3 components in this repo:

  • flowlogs/vpcflowlogs.go --> Generates the actual flow log data for a given batch size N.
  • flmain/kinesisproducer.go --> Generates batches using goroutines and ingests the records into Kinesis.
  • flmain/kinesisconsumer.go --> Reads data from the Kinesis stream. Still buggy, yet to be fixed.

The schema that CloudWatch produces differs slightly from what is generated here, but the core VPC flow log structure is maintained. The structure is as follows:

Structure

type Vpcflowlog struct {
    Id          string
    Version     string
    Account     string
    Eni         string
    Source      string
    Destination string
    Srcport     string
    Destport    string
    Protocol    string
    Packets     string
    Bytes       string
    Windowstart string
    Windowend   string
    Action      string
    Status      string
}
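To give a feel for how a batch of N such records could be produced, here is a minimal sketch. It is not the code from the repo: GenerateBatch, the seed parameter, the placeholder account number and the value ranges are all assumptions made for illustration.

```go
package main

import (
	"fmt"
	"math/rand"
	"strconv"
)

// Vpcflowlog repeats the structure shown in the post.
type Vpcflowlog struct {
	Id          string
	Version     string
	Account     string
	Eni         string
	Source      string
	Destination string
	Srcport     string
	Destport    string
	Protocol    string
	Packets     string
	Bytes       string
	Windowstart string
	Windowend   string
	Action      string
	Status      string
}

// GenerateBatch is a hypothetical helper (not from the repo). It builds n
// synthetic records whose capture window starts at the given epoch second.
func GenerateBatch(seed int64, n int, start int64) []Vpcflowlog {
	if n < 0 {
		n = 0
	}
	r := rand.New(rand.NewSource(seed))
	actions := []string{"ACCEPT", "REJECT"}
	// ip and port return made-up values; the ranges are assumptions.
	ip := func() string {
		return fmt.Sprintf("10.%d.%d.%d", r.Intn(256), r.Intn(256), r.Intn(256))
	}
	port := func() string { return strconv.Itoa(1 + r.Intn(65535)) }

	batch := make([]Vpcflowlog, 0, n)
	for i := 0; i < n; i++ {
		batch = append(batch, Vpcflowlog{
			Id:          strconv.Itoa(i),
			Version:     "2",
			Account:     "123456789012", // placeholder account id
			Eni:         fmt.Sprintf("eni-%08x", r.Uint32()),
			Source:      ip(),
			Destination: ip(),
			Srcport:     port(),
			Destport:    port(),
			Protocol:    "6", // 6 is the IANA protocol number for TCP
			Packets:     strconv.Itoa(1 + r.Intn(100)),
			Bytes:       strconv.Itoa(40 + r.Intn(65000)),
			Windowstart: strconv.FormatInt(start, 10),
			Windowend:   strconv.FormatInt(start+60, 10),
			Action:      actions[r.Intn(len(actions))],
			Status:      "OK",
		})
	}
	return batch
}

func main() {
	for _, rec := range GenerateBatch(1, 3, 1500000000) {
		fmt.Printf("%+v\n", rec)
	}
}
```

Passing the seed in keeps a run repeatable, which is handy when comparing ingestion timings across runs.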

Execution!!!

Check out the repo

Copy the flmain and flow folders into your GOPATH

go get the dependencies, such as the AWS SDK

go run flmain kinesisproducer <streamname> <No. of Threads T> <Batchsize N> <Iterations I>

If it compiles successfully, the command above generates an array of VPC flow log records of size <Batchsize N> on every iteration, for a total of <Iterations I> iterations in each of the <T> threads. For example, go run flmain kinesisproducer kstream 100 300 200 will spawn 100 threads; each thread will run for 200 iterations, and on each iteration 300 new records are created, so a total of 100*200*300 = 6M records will be ingested.
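The thread/iteration/batch arithmetic can be sketched with goroutines and a WaitGroup. This is only an illustration of the fan-out under my own assumptions: run and the send callback are hypothetical names, and the real producer would push each batch to Kinesis instead of just counting.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// run spawns `threads` goroutines. Each goroutine performs `iterations`
// rounds and hands `batchSize` records to send on every round.
// It returns the total number of records handed to send.
func run(threads, iterations, batchSize int, send func(n int)) int64 {
	var total int64
	var wg sync.WaitGroup
	for t := 0; t < threads; t++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for i := 0; i < iterations; i++ {
				send(batchSize) // stand-in for generating and ingesting one batch
				atomic.AddInt64(&total, int64(batchSize))
			}
		}()
	}
	wg.Wait() // block until every goroutine has finished
	return total
}

func main() {
	// Same numbers as the example: 100 threads, 300 records per batch, 200 iterations.
	total := run(100, 200, 300, func(n int) {})
	fmt.Println(total) // prints 6000000
}
```

The counter is updated atomically because all goroutines share it; a channel collecting per-goroutine totals would work just as well.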

Before switching to batch ingestion, I was ingesting 1 record at a time into the stream.

For 1M records (batch size of 1000, 100 iterations and 10 threads), it took nearly 18 minutes to complete the ingestion into a Kinesis stream with 2 shards.

After switching to batch ingestion, the same 1M records, with a batch size of 500 (a Kinesis limitation: a single PutRecords request accepts at most 500 records), 100 iterations and 20 threads, took less than 10 seconds to ingest into Kinesis.
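Because of that 500-record cap, a larger batch has to be split before it is sent. Here is a small sketch of the splitting step only; chunk and maxRecordsPerPut are names I made up, the records are plain strings for simplicity, and the actual AWS SDK call is left out.

```go
package main

import "fmt"

// maxRecordsPerPut reflects the Kinesis PutRecords limit of 500 records
// per request.
const maxRecordsPerPut = 500

// chunk splits records into consecutive slices of at most size elements.
// The returned slices share memory with the input.
func chunk(records []string, size int) [][]string {
	if size <= 0 {
		return nil
	}
	var out [][]string
	for start := 0; start < len(records); start += size {
		end := start + size
		if end > len(records) {
			end = len(records)
		}
		out = append(out, records[start:end])
	}
	return out
}

func main() {
	records := make([]string, 1200) // pretend these are serialized flow logs
	for i, c := range chunk(records, maxRecordsPerPut) {
		// In the real producer each chunk would go out as one batch request.
		fmt.Printf("request %d: %d records\n", i, len(c))
	}
}
```

With 1200 records this yields three requests of 500, 500 and 200 records instead of 1200 single-record calls.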

Hats Off

There was a series of links that really helped me optimise at various stages and learn Golang. I have added a few here, and unfortunately did not note down some others. There is still scope for a lot of optimization, and I would like to hear more from the public forum.

This gentleman has written a lot about Golang. Though I didn't understand all of it, his blog posts were helpful in many places.

Golang nuts helped me understand and resolve a few issues that I was clueless about.

Nightmares:

Multithreading in Java is my biggest nightmare. Python and Go provide much simpler multithreading frameworks, easy for anyone to kick off with after a few reads. Channels and pointers, though, needed repeated reading before I understood them well!! Otherwise it is happy GOing

Misc:

In addition to what is described above, the Kinesis producer has a few extras. I would like to write more posts on those as well!!

  • statsd-graphite/grafana integration for metrics collection
  • golang instrumentation/profiling
  • Compression /Decompression (pending) to Kinesis

Hope you like it, or that it is useful for someone somewhere. Looking forward to writing more such ad hoc posts!!! Excuse typos!!

Thanks for visiting!!!
