Shift Smart Lesson #7: don't interrupt the pipeline with false positives

Tanya Janca had a great talk at the RSA conference the week before last on the Top 15 DevSecOps Worst Practices. Last week, I posted my thoughts on a few of her "worst practices," including a comment that the first one, breaking the build on false positives, was the "absolute worst." Anton Abashkin replied, "If we assume that no tool is free of false positives then is the correct strategy to never fail builds?" I decided to write this post to answer the question and go deeper.

I usually start with foundational concepts when discussing the Shift-Left or Dev[Sec]Ops transformation journey. However, this post begins in the middle, which is why the title says "Lesson #7" rather than "Lesson #1". I'll save 1-6 for later posts, but I need to familiarize you with two new roles to answer Anton's question. We'll touch on the coaching role now and introduce the other one, toolsmithing, a little later.

While a coach can be responsible for as many as 100 different development teams, their coaching is always in the context of a single two-pizza development team. The key to successful coaching is providing a shallow ramp for improvement with small achievable steps. The coach will help each development team pick and commit to adopting 1-3 risk-reducing practices every 90 days. So, the first part of my answer to Anton is...

Start with the policy dial very low and turn it up in small steps

For one of the team's apps, pick one vulnerability category, say vulnerabilities in the 1st party code that the team is writing. Depending on the team's situation, it might be better to start with vulnerabilities in the 3rd party libraries their app imports. Regardless, we're going to start by installing the tool in the pipeline configured to complain only about critical-severity findings and only in that narrow category (1st or 3rd party code). We're also going to run it in warn or inform mode. In GitHub pull request (PR) parlance, it'll be a branch protection status check, but it won't be "required."
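
To make that concrete, here is a minimal sketch of what a warn-only pipeline step could look like. It assumes a hypothetical scanner that writes a findings.json file with severity and category fields; your tool's actual output format will differ, so treat this as an illustration rather than a drop-in script.

```python
# warn_only_check.py -- non-required status check: report, never fail.
# Assumes a hypothetical scanner output file (findings.json) shaped like:
#   [{"id": "...", "severity": "critical", "category": "first-party", "title": "..."}]
import json
import sys

SEVERITY_IN_SCOPE = {"critical"}      # start the policy dial very low
CATEGORY_IN_SCOPE = {"first-party"}   # 1st party code only, for now

def in_scope(finding):
    return (finding["severity"] in SEVERITY_IN_SCOPE
            and finding["category"] in CATEGORY_IN_SCOPE)

def main():
    with open("findings.json") as fh:
        findings = json.load(fh)

    flagged = [f for f in findings if in_scope(f)]
    for finding in flagged:
        print(f"WARNING: {finding['severity']} finding {finding['id']}: {finding['title']}")

    print(f"{len(flagged)} finding(s) in scope for the current policy dial.")
    sys.exit(0)  # warn/inform mode: never block the merge

if __name__ == "__main__":
    main()
```

Wire it up as a status check that runs on every PR; since it always exits 0, it informs the team without blocking anything.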

The coach then asks the team if they'd be willing to commit to fixing all such findings in the next 1-3 sprints. If not, the coach keeps suggesting a smaller subset (arbitrarily picking half of the existing criticals, for instance) until they are willing to commit to it.

Once they get that small subset to zero, you switch the check to "required" so that it won't allow the merging of a PR with any new such vulnerabilities in the future. I prefer to call this "interrupting the pipeline" rather than "breaking the build," but I'm sure it matches Tanya's intent.

At the same time, you install a second non-required check that increments the policy dial one notch: either spread to another app, add 3rd party code vulnerabilities, or step up to high-severity vulnerabilities in 1st party code. Again, make this increment small enough that the team will drive it to zero in 1-3 sprints.

Continue incrementing the policy dial like this with one or more blocking checks and a single non-blocking one until you decide the juice is no longer worth the squeeze.
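
One way to represent the dial is a small, ordered policy table where every gate the team has already driven to zero is blocking and only the newest gate is informational. This is an illustrative sketch with hypothetical gate definitions and findings fields, not any particular tool's schema.

```python
# policy_dial.py -- evaluate findings against a graduated policy dial.
# Gates the team has already driven to zero are blocking; the newest
# gate stays informational until the team drives it to zero too.
import json
import sys

# Ordered from oldest commitment to newest.
POLICY_DIAL = [
    {"severity": "critical", "category": "first-party", "blocking": True},
    {"severity": "critical", "category": "third-party", "blocking": True},
    {"severity": "high",     "category": "first-party", "blocking": False},  # next notch
]

def matches(finding, gate):
    return (finding["severity"] == gate["severity"]
            and finding["category"] == gate["category"])

def main():
    with open("findings.json") as fh:
        findings = json.load(fh)

    blocked = False
    for gate in POLICY_DIAL:
        hits = [f for f in findings if matches(f, gate)]
        label = "BLOCKING" if gate["blocking"] else "warn-only"
        print(f"[{label}] {gate['severity']}/{gate['category']}: {len(hits)} finding(s)")
        if gate["blocking"] and hits:
            blocked = True

    sys.exit(1 if blocked else 0)  # non-zero exit fails the required status check

if __name__ == "__main__":
    main()
```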

Since feature branch PRs are meant to be short-lived, I've observed that following this approach you can drive the median time to resolve (MTTR) down to about one day for findings in scope for the current policy dial, whereas MTTR without this approach is often north of 200 days.
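
If you want to measure that, MTTR here is just the median of (resolution date minus detection date) over resolved findings that are in scope for the current dial. A trivial sketch, assuming you can export those timestamps from whatever tracks your findings:

```python
# mttr.py -- median time to resolve for findings in scope of the dial.
from datetime import date
from statistics import median

# Hypothetical export: (detected, resolved) date pairs for in-scope findings.
resolved_findings = [
    (date(2024, 3, 4), date(2024, 3, 5)),
    (date(2024, 3, 6), date(2024, 3, 6)),
    (date(2024, 3, 7), date(2024, 3, 9)),
]

days_to_resolve = [(closed - opened).days for opened, closed in resolved_findings]
print(f"MTTR: {median(days_to_resolve)} day(s)")
```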

Allow findings to be snoozed for up to 3 sprints

Even after you have branch protection status checks in place, there will be exceptional findings that take longer to resolve than the PR merge decision allows. In these cases, allow the development team to create a ticket to fix it later and "snooze" the finding.

Note that you can adopt this suggestion quickly with a bit of manual process, but the ideal snooze button requires toolsmithing because most tools don't have a native snooze feature. For those tools, it is best to have middleware between the tool's raw output and the pipeline check. Hopefully, this will be available off the shelf someday, but for now, your toolsmiths (the other new role) will need to build it for your pipelines. So this isn't the place to start, but if you implement everything else I mention here, it will be the next thing on your list.
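
As a sketch of that middleware, imagine a snoozes.json file kept in the repo and reviewed like any other change, where each entry names a finding, the tracking ticket, and an expiry no more than three sprints out. The file name and fields are hypothetical; the point is that expired snoozes automatically put the finding back in scope.

```python
# snooze_filter.py -- middleware between raw tool output and the pipeline check.
# Drops findings that have an unexpired snooze entry; everything else passes through.
import json
from datetime import date

def load_active_snoozes(path="snoozes.json"):
    # Example entry: {"finding_id": "SAST-123", "ticket": "JIRA-456", "expires": "2024-07-01"}
    with open(path) as fh:
        snoozes = json.load(fh)
    today = date.today()
    return {
        s["finding_id"]
        for s in snoozes
        if date.fromisoformat(s["expires"]) >= today  # expired snoozes no longer apply
    }

def filter_findings(findings, snoozed_ids):
    return [f for f in findings if f["id"] not in snoozed_ids]

if __name__ == "__main__":
    with open("findings.json") as fh:
        findings = json.load(fh)
    kept = filter_findings(findings, load_active_snoozes())
    print(f"{len(findings) - len(kept)} finding(s) snoozed, {len(kept)} still in scope")
```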

The security team is responsible for dealing with false positives

Allow development teams to mark anything they believe is a false positive as such, but treat every false positive as a bug in the tool, one that the toolsmiths I mentioned earlier, who sit on the security team, are responsible for getting fixed. Turn the rule off temporarily, possibly for more than just the team that reported the issue. Then involve the vendor or fix it yourself by modifying the rule. If you are using a SAST tool (as opposed to the IAST tools discussed below), I strongly recommend that you only consider SAST tools whose rules are easy to modify. Most false positives come from unidentified sanitizers, so the minimal capability here is the ability to specify approved sanitizers, but full rule editing can also be helpful.
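
To illustrate the division of labor, the toolsmiths might maintain something like the registry below: rules temporarily disabled while a vendor fix or local rule edit lands, plus the team's approved sanitizers, which the middleware (or the SAST tool's own rule configuration, if it supports it) uses to drop findings whose data flow passes through one of them. Everything here is a hypothetical sketch, not a real tool's configuration.

```python
# false_positive_registry.py -- owned by the security team's toolsmiths.
from datetime import date

# Rules disabled while the vendor (or a local rule edit) fixes the false positive.
DISABLED_RULES = {
    "sql-injection-variant-7": date(2024, 8, 1),   # re-enable after this date
}

# Functions the team has verified as effective sanitizers for their stack.
APPROVED_SANITIZERS = {"escape_html", "parameterize_query"}

def is_false_positive(finding, today=None):
    today = today or date.today()
    rule_off_until = DISABLED_RULES.get(finding["rule_id"])
    if rule_off_until and today < rule_off_until:
        return True
    # Hypothetical field: sanitizers the tool observed on the tainted data path.
    return bool(APPROVED_SANITIZERS & set(finding.get("sanitizers_on_path", [])))
```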

With this approach, the burden is on the folks choosing and running the tool (security) rather than the poor suckers stuck using it (development). This will motivate the security team to...

Pick a tool with low false positives

All static application security testing (SAST) tools work by inspecting your application's source code. Some things are easy to detect this way, but the highest-severity vulnerabilities, mainly the various injection vulnerabilities, are challenging to detect statically.

Think about the superhero concept of the multiverse, where every decision creates a new reality. Similarly, every branch, loop, or local state manipulation in an application creates a new possible global state. Static tools use statistics to figure out which global states are most likely, but it's little more than an informed guess, which is why the very best SAST tools cannot do much better than 30% false positive rates without rule modification, and SAST tools commonly have false positive rates north of 70%. So, if you must use SAST, you want one at the lower end of this spectrum.

However, there is another class of tools that is installed in your application as a dependency or injected at runtime in some other way. These are commonly referred to as interactive (sometimes intelligent) application security testing (IAST) tools. IAST tools observe the actual state of your program as it is exercised, so they are never wrong. They can, however, be incomplete if you aren't robustly exercising your application. But if you do any meaningful amount of automated testing in the pipeline (and the most basic expectation of DevOps is that you have robust automated testing), you will have more than enough coverage to find more true positives than SAST tools do, with near-zero false positives.

Other characteristics of IAST make it a better fit for DevOps than SAST. It aligns with the gradual improvement approach mentioned in my first suggestion above. As you add tests to bring up test coverage, you'll expect those tests to find new defects. With IAST, some of those defects will be vulnerabilities, and they'll get incrementally resolved along with the quality defects uncovered by your new tests. Also, as your test suite grows, you will invariably parallelize it. IAST parallelizes automatically since it's a function of the tests you are running, whereas SAST just takes longer and longer as your application grows.

I was so impressed with the accuracy of Contrast Security's IAST solution that I left my role as head of Dev[Sec]Ops Transformation at Comcast to work here.

As always, please let me know what you think in the comments.
