Software Security Bootcamp:

Architect's Edition

Susan Sons

IU-CACR, ICEI

sons@security.engineering

Today's Program:

  • Code Triage & Rescue
  • Building and Maintaining Security Programs
  • Security Culture 101
  • Distribution Logistics
  • Communicating Security
  • Vulnerability Response

Goals For the Day

Code Rescue

Disasters Happen.

An ounce of prevention is worth a pound of cure...

 

...but sometimes a pound, or a ton, of cure is needed.

 

We cannot go back in time.  The fact is, we live in an imperfect world, and some people will need to clean up messes: messes of their own making, and of others'.

Working a critical disaster, joys and challenges

  • The work is important.
  • The bar for success is known.
  • Success requires managing as many social problems as technical ones.
  • The ground will change beneath you.
  • There's pleasure in taking on the things that others are afraid to.

Why get good at this?

What does it take to be good at this?

Step Zero: Decide that you will be responsible.

  • Understand the problem: this requires technical acumen, social skills, and domain knowledge.  If you aren't yourself an experienced architect in this domain, you must have a very close partnership with one.

  • Set a clear, concrete, finite scope.

  • Spend time with people -- split technical and social leadership positions if needed to make this investment possible.

  • Expect drama.  Forgive drama.

  • Keep perspective: the purpose of a rescue is long-term sustainability.  Any other goal may be sacrificed to support this one.

How to Train for the Unknown

Breaking it down: Priorities

Three themes must be considered at each step in planning and carrying out the refactor:

 

  • The Timeline Balancing Act:
    Playing the long game while dealing with immediately
    pressing concerns, keeping perspective
     
  • Project Management Concerns:
    Resourcing, communication, stakeholder priorities
     
  • Technical and Architectural Strategy:
    Supporting toolchains, architecture principles, testing,
    code cleanliness, maintainability, security

Your refactor's most precious and finite resource is TIME.

  • Go for the snowball effect.

  • Cluster disruptions in order to minimize them.

  • Avoid Mythical Man-Month errors.

  • Stay out of rabbit holes.

  • Put long-term gains ahead of immediately pleasing people.

Q & A

Project Management:

You will not know the depth, breadth, or nature of the social and technical problems until you are halfway down into the abyss.

 

There is always another problem lurking.

 

That's okay.

Code Longevity:

  • Resources

  • Personnel (devpower & expertise)

  • Repository & Access

  • Build System

  • Tests

  • Documentation

  • Communication Channels

Pony Factor

How many currently active committers account for >50% of the code base?

Based on research by Daniel Gruno of Snoot.io
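A minimal sketch of the Pony Factor calculation, under stated assumptions: contribution is measured by commit count per active committer (one possible proxy, not necessarily the one used in the original research), and the answer is the smallest number of committers whose combined share exceeds 50%. The function name `pony_factor` and the sample data are hypothetical.

```python
def pony_factor(commits_by_author: dict[str, int], threshold: float = 0.5) -> int:
    """Smallest number of committers whose combined commits exceed
    `threshold` of the total. Commit count is an assumed proxy for
    "share of the code base"."""
    total = sum(commits_by_author.values())
    if total <= 0:
        return 0  # no activity, nothing to measure

    covered = 0
    # Take the most prolific committers first to find the smallest group.
    ranked = sorted(commits_by_author.values(), reverse=True)
    for rank, count in enumerate(ranked, start=1):
        covered += count
        if covered > threshold * total:
            return rank
    return len(ranked)


# Hypothetical project: one person wrote most of it.
print(pony_factor({"alice": 60, "bob": 25, "carol": 15}))
# Hypothetical project: work is spread more evenly.
print(pony_factor({"a": 30, "b": 30, "c": 20, "d": 20}))
```

Per-author commit counts for a real repository can be obtained with `git shortlog -s -n`. A result of 1 means a single person leaving could strand the project.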

Start With Recon and Comms

When Sputnik crashes down on your head, resist the urge to react immediately, unless it's to prevent immediate loss of life.  Gather information, start identifying the problem and scoping a response, and talk to people.


Write.  Write down your background planning, your thinking, your project scope.  Then, communicate with stakeholders face-to-face (or by teleconference) and follow up in writing.

 

Be kinder to everyone than you need to be, and be empathetic even when people are wrong.  Not because you're a sap, but because it's how you get people to communicate freely. Every disaster got that way somehow, and everyone near it fears blame. Leading a major refactor/rescue means keeping your focus more on outcomes than on blame.

DO NOT try to plan a smooth-running project.

You must plan for drama and messiness so that you are able to absorb it.

 

Give yourself--and your team--healthy margins for error.  This is how you beat code that is full of unknown landmines.

Stakeholders don't care how hard your job is.

  • It's your job to find the stakeholders; they won't come find you.
     
  • What are they trying to do with the software or system? What constraints do they operate under?
     
  • It's also your job to sort out the XY problems.
     
  • Manage expectations, and minimize negative impact on stakeholders.

It's under control.  I have a process for this.

Even though I'm mostly (completely) winging it.

A complex refactor requires a team.

An effective refactor team is a group of humans who:

  • Have complementary skill sets, and a diversity of outlooks.
     
  • In aggregate, have all of the skills needed to complete the refactor.
     
  • Have or can quickly build a working rapport that allows for comfortable, informal conversation.
     
  • Have bought in to the refactor process.
     
  • Have enough resources to do the work that's needed.
     
  • Can check some ego at the door.

Q & A

Triage

Triage is not about understanding the situation in total.

 

Triage is discovering the greatest points of crisis and how they relate to one another, so that the patient can be stabilized to the point that we can worry about their general health.

Let's talk about

BUGS

Fixing bugs is temporary. More bugs are coming.


Long-term impact comes from making bugs easier to fix, and eliminating or preventing classes of bugs.
 

A good refactor results in a long tail of bug fixing.

Do bugs exist?

Are those bugs vulnerabilities?

High-Return Technical Improvements:

  • Code Access
  • Build Process
  • Testing Infrastructure and Automation
  • Documentation
  • Refactors that accomplish:
    • Major code reduction
    • Major improvements in internal compartmentation
    • Major tightening of internal APIs
    • Migration away from dangerous dependencies
  • Bugs that are immediate security crises.

Information Security Practice Principles (ISPP)

  • Comprehensivity: Am I covering all of my bases?
  • Opportunity: Am I taking advantage of my environment?
  • Rigor: What is correct behavior, and how am I ensuring it?
  • Minimization: Can this be a smaller target?
  • Compartmentation: Is this made of distinct parts with limited interactions?
  • Fault Tolerance: What happens if this fails?
  • Proportionality: Is this worth it?

Finding Your Way In the Dark: Information Security From First Principles

16:35 Friday

What makes devs' work harder?

What about interim maintenance?

Cyber-physical systems (ICS/SCADA/etc.): emergency fixes only. Any other changes are at cross-purposes with the refactor.

With software, and with systems made up of general-purpose hardware components...you have a choice to make:

 

Emergency fixes only, focus all resources on the refactor.

--OR--

Develop in parallel: trade-off of lower end-user friction for MUCH higher resource and coordination needs during the rescue.

Build-and-Replace

vs.

Multi-Stage Refactor

The Cost of Continuous Time Estimation:

About 10% of your team's time.

Continuous Time Estimation with approval (assuming slowest response is 2-4 working hours):

15-20% of your team's time.
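To make these percentages concrete, here is a small worked calculation. The team size (5 people) and the 40-hour week are hypothetical assumptions; only the 10% and 15-20% figures come from the slide.

```python
def overhead_hours(team_size: int, hours_per_week: int, percent: int) -> float:
    """Weekly hours the whole team spends on estimation at a given
    overhead percentage."""
    return team_size * hours_per_week * percent / 100


TEAM_SIZE = 5         # hypothetical
HOURS_PER_WEEK = 40   # hypothetical

# 10%: continuous time estimation.
# 15% and 20%: the range for continuous time estimation with approval.
for percent in (10, 15, 20):
    hours = overhead_hours(TEAM_SIZE, HOURS_PER_WEEK, percent)
    print(f"{percent}% overhead = {hours} team-hours per week")
```

Under these assumptions, 20% overhead on a five-person team equals 40 hours a week: the full working time of one team member.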

Q & A

Security Programs

(and why you care)

Security Needs

  • Rigor and regularity: do it consistently, and iterate over time
     
  • Authority without split agency:
    The authority and the responsibility lie in the same place
     
  • Resources
     
  • The ability to deal with incidents promptly and effectively

A Program Is:

  • A set of policy documents that capture scope, goals, and responsibilities as well as process.
     
  • A way to update the documents and policy/process.
     
  • A budget.
     
  • A policy for modifying or granting exceptions to the policy.

Minimum Viable Program

As NTPSec Project Manager, Mark Atwood accepts risk on behalf of the project.  In the event of a security incident, Information Security Officer Susan Sons is empowered to declare the incident and manage the NTPSec Project's response.  Security documents are maintained at <url>.

Policy Is a Tool

  1. Set up roles and responsibilities that work.
    Avoid split agency at all costs.
     
  2. Document process so it can be iterated on.
     
  3. Capture what is being done and why, so that one can resource those efforts.
     
  4. Carefully document process for known scenarios, and responsibilities and goals for unknown scenarios.

The life of an infosec policy:

Develop, Adopt, Educate, Follow, Enforce, Revise

What usually happens:

Develop, Adopt, Educate, Follow, Enforce, Revise

Roles and Responsibilities

No one can eliminate every risk.

 

Neither the architect nor the ISO/CISO should accept risks.

Process:

Start light, iterate over time based on experience.

 

It can help to break out separate policies to minimize editing.

Goals and Scope:

The why tends to be as important as, or more important than, the how.

 

If you don't agree on what your team is responsible for, and what your priorities are, then anything not explicitly stated in policy can't be solved.

Emergency Procedure:

  • Make lines of authority and communication clear.
     

  • Start with your "black swans" and "grey pigeons"
     

  • Don't try to document every possible scenario: rely on
    people and resources.
     

  • This is all worthless if no one practices.

Questions To Answer:

  • Who accepts risk?
  • Who declares and/or runs an incident?
  • Are all potential vulnerabilities incidents?
  • What outside resources (Company CISO and security team, vendor, etc.) are we depending on and what are our agreements with them?
  • Which things are not our problem?
  • What's most important?  Integrity of the software, integrity of the data the software processes, uptime of the software, cross-platform compatibility, predictable releases?  Other concerns?  What's the list in order of priority?
  • What communication paths are used for regular reporting?  What about vulnerability handling?
  • How do we deal with outside vulnerability reports?  How do outsiders know this?

Questions To Answer (2):

  • How are we managing the authoritative copy of the code base?
  • What is our minimum testing and assurance level?  What is our desired level?
  • What are our coding standards?
  • What tools will we use to ensure all of this happens?
  • Who has the power to make exceptions to policy, and how is this accomplished?
  • How do we rate vulnerabilities, and how do we communicate severity (likelihood x impact) to stakeholders?
  • What kinds of process (architectural reviews, code reviews, etc.) are built into our project in order to solve problems as early in the lifecycle as possible, where they are easiest to solve?
  • What are our black swans and grey pigeons?  How have we planned to mitigate these?
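One way to answer the severity question above is a simple likelihood x impact score. This is a minimal sketch, not a standard: the 1-5 scales, the band cut-offs, and the names `VulnRating`, `score`, and `band` are all assumptions made for illustration. A real program would define its own scales in policy.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VulnRating:
    """Severity as likelihood x impact, each on an assumed 1-5 scale."""
    name: str
    likelihood: int  # 1 = rare, 5 = almost certain (hypothetical scale)
    impact: int      # 1 = negligible, 5 = catastrophic (hypothetical scale)

    def __post_init__(self) -> None:
        # Reject values outside the agreed scale so scores stay comparable.
        for field_name in ("likelihood", "impact"):
            value = getattr(self, field_name)
            if not 1 <= value <= 5:
                raise ValueError(f"{field_name} must be 1-5, got {value}")

    @property
    def score(self) -> int:
        return self.likelihood * self.impact

    @property
    def band(self) -> str:
        # Cut-offs are illustrative assumptions, not an industry standard.
        if self.score >= 15:
            return "critical"
        if self.score >= 8:
            return "high"
        if self.score >= 4:
            return "medium"
        return "low"


# Hypothetical findings, sorted so stakeholders see the worst first.
findings = [
    VulnRating("verbose error page", likelihood=3, impact=1),
    VulnRating("unauthenticated admin endpoint", likelihood=5, impact=4),
    VulnRating("weak session timeout", likelihood=2, impact=2),
]
for f in sorted(findings, key=lambda f: f.score, reverse=True):
    print(f"{f.band:8} {f.score:2}  {f.name}")
```

Reporting the band alongside the two inputs lets stakeholders see why a finding was rated the way it was, not just the final number.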

Money and Other Resources

This will always be your limiting factor.

Winning the budget game requires:

  • Being able to communicate exactly what your team is doing, and how it benefits the organization.
     
  • Showing what you should be doing, and what resources are required.
     
  • Offering ways to measure effectiveness:
    • Relative to multiple priorities
    • In response to changes in team, program, and organization
       
  • Seating risk in the lap of those who control the resources.

Security Culture

We all want to write secure code.

What we actually write is the best code that we can, while meeting all the other constraints and stressors on our work.

Distribution

Communicating About Security

Vulnerability Management

(if you think you have no vulns, you aren't looking hard enough)

Using and Sharing This Work:

Software Security Bootcamp: Architect's Edition by Susan Sons is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

Permissions beyond the scope of this license may be available; send inquiries to sons@security.engineering.

Slides from the full-day training "Software Security Bootcamp: Architect's Edition" at CraftConf2018
