Groovy Fun with Git - Part 3 of 3

Nabil Hijazi

Retired from Software Engineering. Available for prompt engineering roles.

Published: February 27, 2018

Design and Coding of the Groovy Script

When I started considering Groovy as an alternative to Bash scripting, my goals were simple: keep the simplicity of Bash and the Bash scripting life cycle, but gain the power of Groovy (lists, maps, closures, and more).

In particular, I wanted a script, like a Bash script, that:

Runs standalone, with shebang first line
Can be distributed in one script file, and can run anywhere Groovy is installed.
Does not need to be built as a full-fledged Groovy/Java application, to be distributed as a jar.

Running as an executable jar may introduce classpath issues, and introduce the need for a wrapper shell script that sets classpath. This is an unwanted complexity.

To accomplish that "simplicity goal", I needed to prohibit the creation of new public classes and subclasses, which bring with them the complexity of inheritance hierarchies, polymorphism, design patterns, and so on. That machinery is needed when creating full Java applications, frameworks, or libraries. We are doing none of that.

I decided the only classes I would introduce have to be embedded classes, preferably restricted to static inner classes, if I really must have classes.

My original design included 4 classes to represent the data structure of Git objects: Node, Blob, Tree, and Commit. Blob, Tree, and Commit are subclasses of Node, and handle the variations in node content by type. The entire .git/objects directory can then be represented as List<Node>. The rest of the design is a loop over this list, producing lines of text. We need a few more classes to represent the output of this directory scan: a Report class that represents the output listing of objects, and a GvScript class that produces the GV file. The GV file contains commands in the DOT graph description language, which the dot utility turns into a graphic file.

That design does not meet the simplicity goal, although, by making all the classes inner classes, I can avoid build and package.

So I tightened my constraints further: no use of any classes or OO design. The cost is not being able to use polymorphic calls and having if statements based on the type of node. That seemed not too bad of a compromise. So all the classes were out. With this new constraint, the simplicity goal becomes simpler: "No classes. No OO".
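To make that trade-off concrete, here is a minimal sketch of branching on the node type instead of dispatching through a class hierarchy. The function name describeNode and its label strings are hypothetical and are not taken from the script.

```groovy
// Hypothetical helper: branch on the Git object type with a switch,
// where an OO design would call an overridden method on Blob, Tree, or Commit.
String describeNode(String type, String content) {
    switch (type) {
        case 'blob':   return "file content, ${content.length()} chars"
        case 'tree':   return "directory listing, ${content.readLines().size()} entries"
        case 'commit': return "commit: ${content.readLines()[0]}"
        default:       return "unrecognized type '${type}'"
    }
}

println describeNode('blob', 'hello\n')
```

Adding a new object type means adding a case to every such switch, which is the cost of giving up polymorphism.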

<note>

We can easily refactor the script into classes, in an almost mechanical way. Group methods that form a cohesive unit and put them in an inner class. Make all the methods static. Once you have a set of classes you're happy with, you can refactor further and introduce more OO concepts: type hierarchies, polymorphism, packages, and design patterns. But remember, you may have to start building and packaging in a jar. The complexity of the script will mushroom. I don't think it is worth the effort.

</note>

Adhering to the simplicity rule, the resulting script template now looks like this:

#!/usr/bin/groovy

// No global variables - only constants derived from args
// (all names in this template are placeholders)

final A = getAFromArgs()

final B = getBFromArgs()

def x

def y

def methodX() {

}

def methodY() {

}

This is the general outline of a Bash script. It should therefore be possible to translate a given Bash script into an equivalent Groovy script, perhaps with the help of a translation tool. That should be a fun project.

So, let's look at the working and tested code, to make sure the simplicity goal is not merely theory.

Basically, we loop through the .git/objects directory, which contains subdirectories named with two hex characters. Each subdirectory holds one or more files whose names are 38 characters long (we don't need to know that these names come from the SHA-1 of the file contents). We collect the data we need from each file into a tuple and add the tuple to a list.
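As a quick illustration of that naming scheme, the sketch below splits a 40-character object name into its directory and file parts. The object name used here is only an illustrative value.

```groovy
// Illustrative 40-character object name (assumed value, not from the article).
def objectName = 'ce013625030ba8dba906f756967f9e9ca394464a'

def dirName  = objectName[0..1]    // first two hex characters: the directory name
def fileName = objectName[2..-1]   // remaining 38 characters: the file name

assert dirName.length() == 2
assert fileName.length() == 38

// Joining the two parts gives back the full object name,
// which is what loadObjects() does with dir.name + file.name.
assert dirName + fileName == objectName
println ".git/objects/$dirName/$fileName"
```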

Once we have the objects available in a list, we can produce different outputs by looping through the list. We need two output lists (List<String>): one with the content of the node described by each tuple, the other with lines of graphic description language for that tuple.

That's the design. The rest is Groovy coding details.

Below is a listing of the main pieces of the code. You can pull the code from my GitHub gitobjects repository.

In a Bash script, the commands we need to produce printing of Git objects are:

git cat-file -t <objectname>

git cat-file -p <objectname>

git branch -v

To get the compressed file size of a Git object file, we need the OS command:

stat -c%s <pathname>

The script needs to make calls to the OS. So the first utility function to code is:

def callOS (command) {

command.execute().text

}

We need to check the pre-conditions before the execute(), and check for errors after. But essentially that is the Groovy call to OS function.
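A minimal sketch of what such checks might look like follows. The name callOSChecked and the choice to throw exceptions are assumptions made for illustration; the actual script may handle errors differently.

```groovy
// Sketch only: callOSChecked is a hypothetical name, not the author's function.
String callOSChecked(String command) {
    // Pre-condition: refuse a null or blank command line.
    if (!command?.trim()) {
        throw new IllegalArgumentException('command must not be empty')
    }
    def proc = command.execute()
    def out = new StringBuilder()
    def err = new StringBuilder()
    // Drain stdout and stderr, and wait for the process to finish.
    proc.waitForProcessOutput(out, err)
    // Post-condition: a non-zero exit code is reported as an error.
    if (proc.exitValue() != 0) {
        throw new RuntimeException("'${command}' exited with ${proc.exitValue()}: ${err}".toString())
    }
    out.toString()
}
```

Inside a repository with at least one commit, callOSChecked('git cat-file -t HEAD') would return the object type followed by a newline, so callers that compare the result usually trim it first.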

The top-level statements in the script do their work by setting up the environment and calling methods. The script parses its args, then calls loadObjects(), which scans the .git/objects directory, collects data from the files in it, and returns a list of tuples. Each tuple describes one file: (objectname, type, content, size). The size is the actual compressed size on disk. The size of the uncompressed file can be derived from the content field.
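The tuple layout can be sketched as follows. All four values are made up for illustration; in the script they come from loadObjects().

```groovy
// Illustrative values only: (objectname, type, content, size).
def tuple = new Tuple('ce013625030ba8dba906f756967f9e9ca394464a', 'blob', 'hello\n', '21')

// Fields can be read by position, as getReport() does with tuple[0] .. tuple[3].
assert tuple[1] == 'blob'

// They can also be unpacked in one step with Groovy's multiple assignment.
def (objectName, type, content, size) = tuple
println "$type $objectName: ${content.length()} chars, $size bytes compressed"
```

Positional access keeps the script free of classes, at the price of having to remember which index holds which field.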

Once we have this object list of tuples, we can produce the two required outputs: the GitReport, through getReport(), and the GvFile, through getGvScript().
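The listing in this article does not include getGvScript(), so the following is only a hypothetical sketch of how DOT lines could be derived from the tuples. It is not the author's implementation. The name sketchGvScript, the node labels, and the edge rules are assumptions, based on the text that git cat-file -p prints for commits ("tree <sha>", "parent <sha>") and for trees ("<mode> <type> <sha>", a tab, then the name).

```groovy
// Hypothetical sketch, not the author's getGvScript(): one DOT node per
// object, one edge per reference found in the `git cat-file -p` text.
String sketchGvScript(List objects, int labelLength = 10) {
    def lines = ['digraph git {']
    objects.each { tuple ->
        def (name, type, content) = tuple
        def id = name.take(labelLength)
        lines << "  \"${id}\" [label=\"${type} ${id}\"];"
        // For commits, only read the header lines (up to the first blank line).
        def candidates = (type == 'commit') ? content.readLines().takeWhile { it } : content.readLines()
        candidates.each { line ->
            def target = null
            if (type == 'commit') {
                def m = line =~ /^(tree|parent) ([0-9a-f]{40})$/
                if (m) target = m[0][2]
            } else if (type == 'tree') {
                def m = line =~ /^\d+ (blob|tree|commit) ([0-9a-f]{40})\t/
                if (m) target = m[0][2]
            }
            if (target) lines << "  \"${id}\" -> \"${target.take(labelLength)}\";"
        }
    }
    lines << '}'
    lines.join('\n')
}
```

The resulting text could be written to a file and rendered with the dot utility, for example dot -Tpng on that file.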

These three methods (loadObjects, getReport, and getGvScript) are the meat of this script. In addition, there is a bunch of utility methods grouped together into a utils section that can be turned into a Utils class - if we are so inclined.

These utility methods are self-explanatory. You can see them in the code.

You can pull the full source code from my GitHub at https://github.com/nabilh/groovy-scripts.git

Below is a list of all the methods in the script, the top-level script statements (outside of all the methods), and the three main methods.

Hope you will find this script useful in your Git travels, and a fun script in your Groovy scripting travels.

final LINE_LIMIT = 1
final BLOB_MAX_LINE_LENGTH = 40
final GV_MAX_LABEL_LENGTH = 10

def (REPO_DIR_NAME, GV_FILE) = parseArgs(args)

println "\nGit Report for $REPO_DIR_NAME"
println "Graphviz GV file ${GV_FILE}\n"

def objects = loadObjects(REPO_DIR_NAME)
def report = getReport(objects, LINE_LIMIT, BLOB_MAX_LINE_LENGTH)
def gvScript = getGvScript(objects, GV_FILE, GV_MAX_LABEL_LENGTH)

println report
writeGvFile(gvScript, GV_FILE)

def loadObjects(repoDirName) {

    final hexChars = ['0', '1', '2', '3', '4', '5', '6', '7', '8', '9', 'a', 'b', 'c', 'd', 'e', 'f']

    def objectList = []

    final objectsDirName = repoDirName + ".git/objects"

    final objectsDir = new File(objectsDirName)

    def objectsDirSorted = objectsDir.listFiles().sort {file ->
        -file.lastModified()
    } as List<File>

    objectsDirSorted.each {dir ->
        if (dir.name[0] in hexChars) {
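            // groovy.io.FileType.FILES restricts the walk to regular files
            dir.eachFile(groovy.io.FileType.FILES) {file ->
                // directory name (2 chars) + file name (38 chars) = object name
                def objectName = dir.name + file.name
                // cat-file output ends with a newline; trim the type so it compares cleanly
                def type = callOS("git cat-file -t $objectName").trim()
                def content = callOS("git cat-file -p $objectName")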
                def size = fileSize(objectName, objectsDirName)
                def tuple = new Tuple(objectName, type, content, size)
                objectList.add(tuple)
            }
        }
    }
    objectList
}

def getReport(objects, lineLimit, maxLineLength) {

    def lines = []

    objects.each {tuple ->

        def objectName = tuple[0]
        def objectType = tuple[1]
        def content = tuple[2]
        def size = tuple[3]

        def line = sprintf("%s %s", objectName, objectType)
        lines.add(line)

        def noOfLines = numberOfLines(content)
        def length = content.length()

        if (objectType.equals('blob')) {
            content = checkContent(content, lineLimit, maxLineLength)
        }
        content = addLineNumbers(content)
        if (noOfLines == 1) {
            line = sprintf("content %d line, %d chars, %s compressed:\n%s\n", noOfLines, length, size, content)
        } else {
            line = sprintf("content %d lines, %d chars, %s compressed:\n%s\n", noOfLines, length, size, content)
        }
        lines.add(line)
    }
    return lines.join("\n")
}
