Coding Challenge #9 - grep
Photo by Markus Winkler on Unsplash

Coding Challenge #9 - grep

This week your challenge is to build your own version of the Unix command line tool grep.

The Unix command line tools are a great metaphor for good software engineering and they follow the Unix Philosophies of:

  • Writing simple parts connected by clean interfaces - each tool does just one thing and provides a simple CLI that handles text input from either files or file streams.
  • Design for simplicity; add complexity only where you must.
  • Design programs to be connected to other programs - each tool can be easily connected to other tools to create incredibly powerful compositions.

Following these philosophies has made the simple Unix command line tools some of the most widely used software engineering tools which can be chained together to create far more complex and powerful set of tools that you’d expect.

You can read more about the Unix Philosophy on the Coding Challenges blog.

Following these philosophies has made the simple unix command line tools some of the most widely used software engineering tools - allowing us to create very complex text data processing pipelines from simple command line tools. There’s even a Coursera course on Data Engineering with Bash!


The Challenge - Building grep

The functional requirements for grep are concisely described by it’s man page - give it a go in your local terminal now:

% man grep
		 The grep utility searches any given input files, selecting lines that
     match one or more patterns.  By default, a pattern matches an input line
     if the regular expression (RE) in the pattern matches the input line
     without its trailing newline.  An empty expression matches every line.
     Each input line that matches at least one of the patterns is written to
     the standard output.        

N.B. grep can support basic regular expression and extended regular expressions. In GNU grep there is no difference between the two, both are extended. For none GNU versions of grep the -E option enables the extended pattern matching which is what we are going to look at in this challenge.


Difficulty - You Choose!

You can tackle this challenge at two different levels of difficulty:

  • Easy - use the regular expression library from your programming language.
  • Hard - implement your own regex engine.

You can do it either way, or even both! For both, build it the easy way then come back and re-implement the regex with your own engine. If you decided to do that, there’s a pretty good article explaining it at Let’s Build a Regex Engine.


Step Zero

Like most programming languages we’re zero indexed!

For this step, I’ll leave you to setup your IDE / editor of choice and programming language of choice. After that here’s what I’d like you to do to be ready to test your solution.

Download the following text: [<https://www.gutenberg.org/cache/epub/132/pg132.txt>](<https://www.gutenberg.org/cache/epub/132/pg132.txt>) and save it as test.txt. After that please download the additional test data from my dropbox here ****and unzip them, you should then have the following in your test directory:

% tree
.
├── rockbands.txt
├── symbols.txt
├── test-subdir
│   └── BFS1985.txt
└── test.txt        

If you’re on Windows you can use GoW or GitBash to get a bash console, or if you’re a PowerShell user I’m sure you’re capable of translating the commands! Well better than I can! ??


Step 1

In this step your goal is to implement support for an empty expression. An empty expression matches every line so the command grep "" test.txt will write every line of the file test.txt.

Implement your solution, run the command and check you get every line written out, you can automate this test with the Unix command line tools:

% grep "" test.txt | diff test.txt -
%        

Shows that your grep output the same contents as the file test.txt because no diff is output. If your grep is not correct for this step there will be a diff output.


Step 2

You can find Step 2 and beyond on the Coding Challenges website as Write You Own grep.

Or if you'd rather get the whole challenge delivered to you inbox every week, you can subscribe on the Coding Challenges Substack.

Mohit Jain

Software Tech Lead @Zemetric | ex Co-Founder @Evy Energy (Acquired)

1 å¹´

Here's my solution using the inbuilt RegExp: https://github.com/jainmohit2001/coding-challenges/tree/master/src/9 Too lazy to implement the whole RegEx considering we already build 3 parsers till now xD.

赞
回复
Anton Zaides

Writing about Engineering Management | Director of Engineering

1 å¹´

Nice challenge :) I'm less interested in building my coding skills at this stage, but your challenges are very well executed!

Jose ?? Estremadoyro Fort

Software Engineer | AI Assisted Coding Champ @Blend | Ex-Googler

1 å¹´

Just a friendly tip, Check the code on the final step, I can't see the -i.

John Crickett

Helping you become a better software engineer by building real-world applications.

1 å¹´

If you'd like it emailed to you, you can subscribe here: https://codingchallenges.substack.com/

要查看或添加评论,请登录

John Crickett的更多文章

  • From The Challenges - Memcached Server

    From The Challenges - Memcached Server

    ?? ?? Happy Birthday Coding Challenges - Two Years Old! ?? ?? Coding Challenges is turning two this week! To celebrate…

    13 条评论
  • Coding Challenge #85 - Time Zone Converter

    Coding Challenge #85 - Time Zone Converter

    Coding Challenge #85 - Time Zone Converter This challenge is to build your own Time Zone Converter. That is a tool to…

    12 条评论
  • From The Challenges - IRC Client

    From The Challenges - IRC Client

    Welcome To Coding Challenges - From The Challenges! In this Coding Challenges “from the challenges” newsletter I’m…

    5 条评论
  • Coding Challenge #84 - Mandelbrot Set Explorer

    Coding Challenge #84 - Mandelbrot Set Explorer

    This challenge is to build your own Mandelbrot set explorer. The Mandelbrot set is a set of fractals that exhibit great…

    4 条评论
  • From The Challenges - Cat

    From The Challenges - Cat

    Welcome To Coding Challenges - From The Challenges! In this Coding Challenges “from the challenges” newsletter I’m…

    7 条评论
  • Coding Challenge #83 - Markdown Presentation Tool

    Coding Challenge #83 - Markdown Presentation Tool

    Coding Challenge #83 - Markdown Presentation Tool This challenge is to build your own version of Go’s Present or…

    3 条评论
  • From The Challenges - Shell

    From The Challenges - Shell

    Welcome To Coding Challenges - From The Challenges! In this Coding Challenges “from the challenges” newsletter I’m…

    3 条评论
  • Coding Challenge #82 - Markdown To PDF Editor

    Coding Challenge #82 - Markdown To PDF Editor

    Coding Challenge #82 - Markdown To PDF Editor This challenge is to build your own tool to convert markdown to PDF. It…

    14 条评论
  • From The Challenges - Diff

    From The Challenges - Diff

    Welcome To Coding Challenges - From The Challenges! In this Coding Challenges “from the challenges” newsletter I’m…

    7 条评论
  • Coding Challenge #81 - Brainf*ck Interpreter

    Coding Challenge #81 - Brainf*ck Interpreter

    This challenge is to build your own Brainf*ck Interpreter. Just in case you’re wondering, yes the * should be a u, but…

    34 条评论

社区洞察

其他会员也浏览了