Blog

  • In defense of Microsoft’s SDL

    Richard Bejtlich says on Twitter:

    I would like fans of Microsoft’s SDLC to explain how Win 7 can contain 4 critical remote code exec vulns this month

    I am surprised that Richard – an old hand in our circles – can say such things. His question assumes defect-free commercial code is even possible – something no one, Microsoft included, manages to produce. As much as we’d all like to have defect-free code, it’s just not possible. It’s about risk reduction in a reasonable time frame for an acceptable price. The alternative is no software – either cancelled through cost overruns or delayed beyond use. This is true of the finance industry, health, government, mining, militaries, and particularly ISVs, even ISVs as well funded as Microsoft.

    In the real world,

    • We create building codes to reduce fires, damage from water leaks, damage from high winds, and to improve earthquake survivability. But houses still burn down, water floods basements all the time, tornadoes destroy entire towns, and unfortunately, many buildings are damaged beyond repair in earthquakes.
    • SOX requires organizations to have good anti-fraud and governance controls, yet IT projects still fail and companies still go out of business because senior folks do the wrong thing or auditors stuff up.
    • PCI requires merchants and processors to handle CC details properly, yet we still have CC fraud (albeit much less than before PCI)
    • We engineer bridges not to fall down, but they still do.
    • The SDL requires certain calls not to be used. This should prevent common classes of buffer overflow. However, you can still write code like this:
    char *MyFastStrcpy(char *dest, const char *src)
    {
       char *save = dest;
       while(*dest++ = *src++);
       return save;
    }

    Is code that calls that function likely to have buffer overflows? It sure is. Standards and better design eliminate stupid issues like the above.
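
    Here’s a minimal sketch of the bounded alternative such a standard pushes you towards – the function name is my own invention, and strlcpy() or strcpy_s() do the same job where they’re available:

    #include <stddef.h>

    /* Bounded copy: always NUL-terminates, never writes past dest_size bytes. */
    size_t MySaferStrcpy(char *dest, size_t dest_size, const char *src)
    {
       size_t i = 0;

       if (dest == NULL || src == NULL || dest_size == 0)
          return 0;

       while (i < dest_size - 1 && src[i] != '\0') {  /* leave room for the NUL */
          dest[i] = src[i];
          i++;
       }
       dest[i] = '\0';
       return i;                                      /* characters actually copied */
    }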

    It’s not a perfect world.

    Nearly all the code MS works on dates back to before the SDL push in 2001. Windows 2008 has roots in code started in the late 1980s. They literally have a billion-plus lines of code running around with devs of all competencies poking at it. The idea that there should be zero defects is ludicrous.

    Richard, if you’ve completed a non-trivial program (say over 100,000 lines of code) that has not had a security defect from the time you started writing it, you’re a coding god. Everyone else has to use programs like the SDL to get ahead. Those who don’t, and particularly those that do no assurance work, are simply insecure. This is risk management 101 – an unknown risk is considered “HIGH” until it is evaluated and rated.

    Let’s take the argument another way. If the SDL has failed (and I think it is succeeding), what would be the signs?

    We know empirically that LOC ~= # of security defects. However, the number of critical remotely exploitable issues affecting Windows 7 is dramatically less than that of XP at the same point after release. Like 10x less. That’s an achievement no one else in the entire industry has matched, despite everyone knowing how Microsoft achieved it.

    What are the alternatives? Until Oracle saw the light a few years ago, they had the hilarious “Unbreakable” marketing campaign. Sadly for them, they were all too breakable. See D Litchfield for details. Not reviewing your code, or keeping dirty secrets secret, does not make things secure. Only through policies requiring security, standards that eliminate insecure calls like dynamic SQL or strcpy(), careful thought about security in the requirements process, secure design, secure coding, code reviews, and pen tests to validate the previous steps do you have evidence of assurance that you are actually fairly secure. The SDL is a framework that puts that cycle into motion.

    Oracle got it. They’ve been pumping out 30-40+ CPUs per quarter for several years in a row now. I’d prefer 4 remotely exploitable issues once or twice a year to 40 every three months, thanks. But even so, I’m glad Oracle has jumped on the SDL bandwagon – they are fixing the issues in their code. One day, possibly in about 5 to 10 years, they’ll be at the same or similar level that MS has been at for a few years now.

    I agree that monocultures are bad. I use a Mac and I have been unaffected by malware for some time. But do I believe for even one second that my Mac is secure just because it’s written by Apple and not Microsoft? Not in a million years. Apple have a long way to go to get to the same maturity level that Microsoft had even in 2001.

    All code has defects. Some code has far fewer defects than other code, and the code Microsoft has written in the last few years is in that category.

  • Code of Hammurabi – or 4000 years later, we still haven’t got it

    The Code of Hammurabi is one of the earliest known written laws, and possibly pre-dates Moses’ descent from the Mount.

    In it, we get a picture of the Babylonians’ laws and punishments. In particular, there’s this one:

    If a builder builds a house for someone, and does not construct it properly, and the house which he built falls in and kills its owner, then the builder shall be put to death. (Another variant of this is: if the owner’s son dies, then the builder’s son shall be put to death.)

    (Source: Wikipedia)

    So essentially, this is one of the earliest building codes. Pretty harsh, but you know…

    What this means is that only qualified builders prepared to take the risk of death built houses. This obviously focuses the mind.

    In our industry, we have hobbyists and self-taught folks working side by side with software engineers and computer scientists, but they usually share one thing in common: they know nothing of security.

    This is like an accountant graduating without knowledge of auditing principles or GAAP. It’s exactly like a civil engineer being unaware of the load stresses and environmental factors that require safety margins and tolerances to be built into every structure.

    When the average person goes to a builder or architect and asks for a house to be built, we expect them to know how to build a two or three story building such that it not only complies with minimum code requirements, but will not collapse. When they fail at this, we strike those builders off the master builders’ register and they can no longer build homes. We can sue them for gross negligence.

    When the average small company does their books, they expect the accountants they hire to know how to do double entry bookkeeping, and to be aware of local, state and federal tax rules. When they fail to do so, they lose their CPA accreditation and we can sue them for gross negligence.

    When a city or state wants to build a new bridge, they expect the winning tenderer to design the bridge to last for the expected period of time, satisfy all state and federal road and safety laws, and obtain specialist advice for key elements of construction, such as wind tunnel tests. If the bridge falls down, this is usually the end for that building group and they are sued out of existence.

    Why is it so different in our field? What we do is not art. SQL injection is so utterly preventable, and has been for over 10 years, that I truly believe it is gross negligence to have injectable code running anywhere today.
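
    To show just how preventable it is, here’s a minimal sketch of a parameterised query using SQLite’s C API. The table and column names are made up for illustration, but the pattern – bind the user-supplied value, never concatenate it into the SQL string – is the whole fix:

    #include <stdio.h>
    #include <sqlite3.h>

    /* The bound value can never change the structure of the query, so it
       cannot inject SQL. Table and column names here are hypothetical. */
    int lookup_user(sqlite3 *db, const char *username)
    {
       sqlite3_stmt *stmt = NULL;
       int rc = sqlite3_prepare_v2(db,
          "SELECT id, email FROM users WHERE username = ?", -1, &stmt, NULL);
       if (rc != SQLITE_OK)
          return rc;

       sqlite3_bind_text(stmt, 1, username, -1, SQLITE_TRANSIENT);

       while (sqlite3_step(stmt) == SQLITE_ROW)
          printf("%d %s\n", sqlite3_column_int(stmt, 0),
                 (const char *)sqlite3_column_text(stmt, 1));

       sqlite3_finalize(stmt);
       return SQLITE_OK;
    }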

    There is a huge difference between using MYOB to run a small business and building a cubby house. Yet the cubby house is all that 99.9% of developers are capable of today. They lack the most basic awareness of software security, the one key non-functional requirement of all software – from games through to national treasury finance systems.

    Efforts like Rugged Software and OWASP are vital. We must get out to universities and make sure that security is taught, that all IT, CS, and software engineering graduates have done at least one 13-week subject on it, and that majoring in software security is the easiest possible path. We must get out to employers and make sure they require all new hires to know about it and be able to code for it. Moreover, if they buy off the shelf software, we must get them to include clauses in contracts, such as the OWASP Secure Software Contract Annex, to protect themselves from gross negligence such as SQL injection or XSS. We must reach out to frameworks and make them utterly aware that what they do affects millions of developers and they simply must be better at security than everyone else.

    It’s time for the software industry to grow up, realize that fortunes, privacy and lives really are at risk, and accept that what we do is a repeatable engineering process, not some black art. We have to have consequences.

  • OSCON 2010 Wrap Up

    Well, OSCON is over for another year. It’s been a great conference. Shame there were essentially no security talks (1/216 talks is not good enough). I will have to talk to them next year about including a Security track or let OWASP organize a Security Camp, like Scala and the cloud folks had this year.

    I went to a great number of interesting sessions. Most were not that well attended, which probably means that I’m a freak who loves oddball stuff. That’s a shame, because I got a heap out of the conference overall.

    Some highlights:

    Cloud talks were everywhere. This is the new Ajax. I went to enough cloud talks to be all clouded out. A common theme is who owns the data and how open are cloud systems, really? Open core versus open source was a huge meme.

    Breaking it open: How one consulting firm took it open. This was easily the most thought provoking session I attended in all of OSCON. It’s a shame only about 20 others caught it too. Rob and Alexandra were totally engaging and gave heaps of insights into what worked, and more importantly some of the hiccups that hit them unexpectedly, like the Excel spreadsheet from hell.

    Moving to the cloud with NYTimes. This was one of those cloudy talks I told you about. I didn’t learn a lot that I didn’t already know, but it was interesting to learn how it went down at NYT.

    Deploying an open source cloud on a shoestring. This topic was close to my heart as we’re doing it, but I sank a little when I learnt of the exact scale of the AT&T Labs deployment. In the end, I think different folks have different meanings for “shoe string”. Good talk, as I learnt a fair amount about realistic cloud architecture.

    Eucalyptus: the open source infrastructure for cloud systems. Another cloud talk.

    Data center automation with Puppet. I went to the latter part of the Puppet tutorial, so I didn’t learn much new at this session, but that’s okay. Puppet will probably end up as part of our infrastructure (need to talk to the guys).

    Driving Apache Traffic Server. This talk rocked. Lots of cool information about how Yahoo does their CDN, and the resurrection of a really old (and closed) code base into what is ATS today. I’m going to try it out, but it may not suit our needs as it doesn’t do true SSL load balancing (today).

    The hallway track was also pretty dang fine. I met and introduced myself to many folks I’d only seen on Twitter or the blogosphere. I took in the latter half of the phpBB BoF and met the guys, which was cool as I could finally put names to faces. I ran the OWASP BoF session on Thursday night, which had a few folks turn up, including the Portland Chapter Leader.

    Internet sucked big time most days. I think the next time I come, I’ll bring a smaller laptop or an iPad or something with a long battery life, as finding space near power all the time sucked. This is partially because my Mac’s battery or logic board is failing, but it is also partially a home truth – there was only so much coding I did during the days, as I found most of the talks so engaging and relevant to my interests.

    It’s interesting to see the latest fads. For those who had > 13″ laptops, Macs were about 20 to 1 the favorite choice. Netbooks were very common (probably about 15-20% of the crowd), but I saw more iPads than netbooks, which surprised me as it’s so read only, and this conference is not a read only crowd.

    All in all, a satisfying and interesting conference let down by the complete lack of security talks.

  • OSCON 2010 – Day 2

    Woke up at 5.55 am. Mr Body is seriously confused. I finished breakfast by 7 am. This is not right.

    Scalable Internet Architecture – Theo Schlossnagle

    I’m very sorry Theo, but I couldn’t take much more hand waving and so I left at half time. I think this is more about where I am in my career – most folks seemed interested and what not, but this session was the wrong one for me. So I bailed.

    Using Puppet – A beginner’s tutorial – James Turnbull and Jeff McCune

    James and Jeff were on song, and I really should have gone with my initial gut feeling and gone for this talk from the get go. Excellent hands on tute filling in the gaps in my Puppet knowledge. I’ll be taking the lessons learnt from the half of the tute I managed to attend back with me. We might even implement Puppet! 🙂 Seems fairly straightforward.

    Request Tracker Bootcamp – Jesse Vincent

    This is the primary reason I plumped for OSCON 2010. There are a few talks over the next few days, but we use RT … badly … within our organization, and that needs fixing. I learnt nearly everything I needed to use it properly, including:

    • The RTFM module – mark out a solution as the appropriate answer. This is exactly what we need
    • The RTIR incident response module – a solution for CERT style incident handling. This is not quite what we do, but I will look at it anyway.
    • The PGP plugin. Definitely going to try and get this going.
    • How to fix a few niggling issues. I entered some new tickets for me to handle when I get back. 🙂
    • How to configure custom fields … good to know for future enhancements to our use of RT

    I’m glad I attended – this was a great session and definitely recommended to anyone using RT. Jesse is a good tutor, and as the original author, he definitely knows his stuff.

    In other news

    I headed into Portland city for the first time to go get my Mac serviced. The light rail is eerie – it’s just like Melbourne trams, but more segregated from traffic. Getting taxis is a fool’s errand in Portland. Public transport is king here, baby.

    My Mac’s battery had been dying unexpectedly over the last month or so, especially at the least reasonable times. At other times, the battery would last a good two and a bit hours (like today), but the randomness of it all is distressing, especially when there’s data loss. So I made an appointment with the Apple Genius Bar yesterday, and popped in today.

    They ran some diagnostics. My battery was found to be acceptable albeit towards the end of its working life. The power adapter is fine. That means if the issue continues, I will need a new logic board. On an out of warranty Mac. DOH! Could get expensive.

    I came back and had dinner with an ex-colleague of mine – Paul Hanchett. They’re doing it hard in the USA. We didn’t really have much of a GFC in Australia, but here… whoa. I see on Twitter (#oscon) I missed OSCON Ignite. The in crowd liked it very much, but … I’d still rather have dinner with an old friend than do more slides today.

  • OSCON 2010 Day 1

    Travelling to the USA was as exhausting as ever.

    I flew on the new A380 with Qantas. Nice plane. As per usual, there’s a mix of flight attendants – the openly hostile, the “can’t see you, didn’t see you”, and my favorite, the “never around”. We were down the back of the aircraft, which is fun if you like turbulence (I do), but not so fun for the elderly couple next to me. There was a party of teens in the middle section who had no in-flight entertainment units. The units are ancient and have been recycled from other aircraft – they take about 10 minutes to reboot and run Red Hat Linux from 2002 on some VIA Cyrix processor. So it was like a little party. I got like one hour of fitful sleep in the 16-hour flight.

    LAX is better. They ripped out the old customs hall, and replaced it with something like an airport instead of a manifestation of hell on earth. The customs folks even smiled. I’m not sure what about, but it feels more human than previous times.

    There was a stuff-up with my hotel and teknology fangummy (they’ve heard about it but don’t think it’ll catch on), but luckily, Australian business hours had just begun and I was in about 28 hours after leaving my folks’ place.

    PHP Quality Assistance – Sebastian Bergmann

    My first tutorial was with Sebastian Bergmann of PHPUnit fame.

    This was an awesome tutorial, and I found out a lot about tools that I had only just started to scratch the surface of. I am definitely going to set up a continuous integration server for my projects whilst I’m in Portland.

    Sebastian was a good speaker, but I would have liked more demos in the first half. The demo of Hudson was possibly more informative than the slides themselves. Definitely recommend seeing more Sebastian Bergmann talks!

    Slides

    Productive Programmer – Neal Ford

    I attended this session with high hopes as most ThoughtWorks folks I’ve met have been very switched on. Neal seems very switched on, but … this talk started out very slow and covered blindingly obvious things that I think we’re all familiar with (source code control, comfy chairs, etc). The tutorial was definitely looking like one of those hated “hand waving” tutorials.

    I considered bailing but none of the other talks in this time slot really were yelling my name. I might have tried Chris Shiflett’s tute as he’s a friend, but I wouldn’t learn much there, so I stayed for the second half.

    The second half was a Top 10 with a small Top 10 “Corporate Code Smells” inside. Luckily, the second half was a bit more edifying and informative, but more from a “food for thought” point of view rather than any special insights into enterprise architecture or techniques that I’ve never heard before. This could be due to the point where I am in my career, but I was hoping for more.

    The main things I learnt were the hard lessons from Neal’s career. I wish there were more war stories with solutions, and far more detail throughout. If I were Neal, I’d think hard about the OSCON audience. These guys are mostly devs looking to make the jump to architect. Refactor the talk to be about that jump, the patterns, the scalability of ideas, and so on. Then it would be a HUGE improvement over the comfy chair talk we got today.

    The thing I really didn’t like was the slamming of WebSphere (“#1 Code Smell. There’s a reason that WSAD is not WHAPPY”). Slammed not once, but twice in the same list. I don’t like WSAD that much either (it’s an overpriced Eclipse + J2EE reference container + IBM’s own special plugins and “enterprise” / cluster juice), but it’s like saying “your tool sucks”. Yes, but it didn’t need to be said twice in the same list, and I think most folks in the room who actually use it are forced to use it, and are unlikely to be able to move away from it. If you need the things WSAD can do, there are few alternatives today.

    Slides TBA

    OWASP – Birds of a Feather

    I’ve set up an OWASP Birds of a Feather session at 8 pm on Thursday night in D136. Hope to see you there!

  • FIFA Fraud – Football Federation Australia must be investigated

    In today’s Age, there’s an article on how Australian taxpayer money is being used to bribe FIFA and other national soccer body officials to garner support for Australia’s World Cup Bid in 2022.

    Item 1. It is actually illegal to spend Australian government money on bribes, gifts, holidays, and so on. This is contrary to the Bribery Act.

    Item 2. Bribery is most likely illegal in all other FIFA playing countries, such that asking for or receiving kick backs and gifts such as pearl necklaces and holidays is illegal.

    The Federal Police should go in and investigate these claims, and prosecute them to the maximum extent that the law allows. We send folks who hold up a 7-11 for a measly $250 to jail for a couple of months to a few years depending on how stupid the crooks are. In this case, the “crooks” (in my opinion) are running double books and stealing Australian tax payer money to the tune of several million dollars per year. Bribery is theft pure and simple and is dealt with that way under Australian law.

    Why is bribery and fraud so insidious? It is an opportunity cost. If the bribe did not need to be paid (and it NEVER does), then you can use that money for other things, such as health care, education, social programs, roads, and infrastructure. The more fraud you accept, the higher your taxes and the less you receive for them. In Australia’s case, $20m per year is nothing and the consultants and FFA are busy laughing it off. Wrong. For a third world country where the bribes are most likely to be accepted, this is actually death – actually no roads – actually no infrastructure. It’s evil, and that’s why we and many other countries have laws against it.

    FIFA must immediately sack those who received or asked for gifts and change their processes to be bribe / fraud resistant and with huge sanctions on those who breach them – such as a 20 year disqualification from holding the World Cup for the countries involved, and immediate life bans from FIFA level competitions for those who seek to profit from their position.

    FFA must immediately sack these “consultants”, and anyone in FFA who thinks running double books is a good idea. They must change their processes so that when they spend Australian tax payer funds, they adhere to all our laws, including the Bribery Act.

    The AFP must look into these allegations and prosecute. This is like a thousand 7/11’s being held up, except Australian tax payer funds were in the till.

    My guess? Nothing at all will happen. Welcome to your corrupt World Cup, a poisoned chalice for all those who covet it.

  • Risk Management 103 – Choosing Threat Agents

    A key component in deciding a risk is WHO is going to be doing the attack. [Image: the OWASP Top 10 threat model architecture, depicting a risk path.] The image is from the excellent OWASP Top 10 2010, and I will be referencing this diagram a great deal.

    We’re talking about the attackers (threat agents) on the left today. So you’re busy doing a secure code review or a penetration test (how I loathe that term – so sophomoric) and have found a weakness. You’ve written up a fantastic finding and need to rate it so that your client (whether internal or external, for money or for free) can do something about it. It’s vital that you don’t undercook or overcook the risk. Undercooking the risk looks really, really bad when you get it wrong and the wrong business decision is made to go live with a bad problem. Overcooking the risk erodes trust, and often leads to the wrong fixes being made or none at all, which is worse. You can tell if you’re overcooking a risk if your clients are constantly arguing with you about risk ratings. Let’s get to a more realistic risk rating first time, every time.

    Risk Management 103 – Establishing the correct actor

    I am more likely to be successful than a script kiddy who is more likely to be successful than my mum. Unfortunately, there’s just one of me, but there’s a million script kiddies out there. That doesn’t mean you should use them. Script kiddies are simply unlikely to find business logic flaws and access control flaws, such as direct object references. So you should reflect this in your thinking about risk – even though it might be simpler to go with what everyone already knows:

    • Skill level – what sort of skill does the threat agent bring to the table? 1 = my mum, 5 = script kiddy (generous), 9 = web app sec master
    • Discovery – how likely is it that this group of attackers will discover this issue?
    • Ease of exploitation – how likely is it that this group of attackers will exploit this issue?
    • Size of attacker pool – 0 = system admins or similar, 9 = the entire Internet (== script kiddies)

    So you need to do the calculation for the weakness you found for these various groups to determine the maximum likelihood. This often leads into impact. Let’s go with a direct object reference, such as the AT&T attack.

    Likelihood – AJV

    • Skill level – 9 web app sec master
    • Motive – 4 possible reward
    • Opportunity – 7 Some access or resources required
    • Size – 9 anonymous internet users (remember, this attack relied upon a User Agent header for authentication)
    • Ease of Discovery – 7 easy
    • Ease of exploit – 5 easy
    • Awareness – 9 public knowledge
    • IDS – Let’s go with Logged without review (8)

    This brings us to a total of 58 out of 72. I put this as a “HIGH” likelihood in my risk charts.

    Likelihood – Script kiddy

    • Skill level – 3 some technical skills (script kiddy)
    • Motive – 4 possible reward
    • Opportunity – 7 Some access or resources required
    • Size – 9 anonymous internet users (remember, this attack relied upon a User Agent header for authentication)
    • Ease of Discovery – 1 Practically impossible
    • Ease of exploit – 1 theoretical
    • Awareness – 1 Unknown
    • IDS – Let’s go with Logged without review (8)

    This brings us to 34. So we shouldn’t consider script kiddies when there might be a motivated web app sec master on the loose. But is that entirely realistic? Honestly, no.

    Who is really going to attack this app?

    Think about WHO is likely to attack the system:

    • Foreign governments – check.
    • Web app sec masters – Our careers are worth more than the kudos.
    • Bored researchers trying to make a name for themselves – check even though quite dumb (see previous bullet)
    • Script kiddies – check but fail. Realistically, unless someone else wrote the script, they wouldn’t be able to do this attack.
    • Trojans – check but fail for the same reason as script kiddies.
    • My mum doesn’t know what a direct object reference is. Not going to happen.
    • Terrorists – check, but seriously: remember that dying by winning lotto, buying a private plane with the winnings, having the plane struck by lightning eight times on its four-leaf-clover-encrusted hull, parachuting out, and then having both the main and the secondary chutes fail is more likely than a terrorist attack. Don’t use this unless you’re after Department of Homeland Security money, as everyone else will just laugh at you. Especially if you use it more than once.

    So let’s go with #1 as this is an attack that they would be interested in. They have resources and skilled web app sec masters, so this attack likelihood is a HIGH. So let’s work out the impact for this scenario:

    Sample Impact Calculation

    There’s a lot of subjectivity here. You can close that down significantly by talking it over with your client. This doesn’t mean you should go with LOW every time you have the conversation, but instead set out objective parameters that suit their business and this application. Yes, this takes a fair amount of work. You can either do it before you deliver the report, or after you deliver the report. If you choose the latter path too often, your reputation as a trusted advisor can be found in the client’s trash bin, along with your reports and the client relationship.

    Let’s do the calculations based upon the sketchy information I have from third hand, unreliable sources and vastly more reliable Tweets. i.e. I’m almost certainly making this up, but hopefully, you’ll get the picture.

    • Loss of confidentiality. Check big time. All data disclosed (9)
    • Loss of integrity. In this case, no data was harmed in the making of this exploit (0)
    • Loss of availability. If every government tried it at once, I’m sure there’d be a DoS but let’s be generous and say minimal primary services interrupted (5) as the system would have to be taken offline or disabled after it was discovered
    • Loss of accountability. It’s already anonymous (9)
    • Financial damage. AT&T is big. Really really big. In the grand scheme of things, this probably didn’t hurt them that much. That said, it has to be in the millions. So let’s go with minor effect on annual profit (3)
    • Reputation damage. AT&T’s reputation is somewhat already tarnished, so let’s go with loss of major accounts (4) as I’m sure RIM will pick up all of those .mil and .gov accounts very soon now.
    • Non-compliance. PII is about names and addresses, but AFAIK, e-mail addresses are not protected at the moment. Happy to hear otherwise – leave comments. Let’s go with clear violation (5)
    • Privacy violation. 114,000 is the minimum number, so let’s go with (7), though it could tip towards (9)

    This gives us 42 / 72, which is a MEDIUM impact (just shy of “HIGH” at 46), giving an overall risk of HIGH. That is about right, and thus it should have been caught by a secure code review and fixed before go-live.
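
    If you want to check the arithmetic, here’s a small sketch that simply sums the factor values above. The factor arrays are the numbers from this post; mapping a score out of 72 to LOW/MEDIUM/HIGH is still up to your own risk charts:

    #include <stdio.h>

    #define NUM_FACTORS 8

    /* Sum eight 0-9 factors, giving a likelihood or impact score out of 72. */
    static int owasp_score(const int factors[NUM_FACTORS])
    {
       int total = 0;
       for (int i = 0; i < NUM_FACTORS; i++)
          total += factors[i];
       return total;
    }

    int main(void)
    {
       /* skill, motive, opportunity, size, discovery, exploit, awareness, IDS */
       int master[NUM_FACTORS]       = { 9, 4, 7, 9, 7, 5, 9, 8 };
       int script_kiddy[NUM_FACTORS] = { 3, 4, 7, 9, 1, 1, 1, 8 };

       /* confidentiality, integrity, availability, accountability,
          financial, reputation, non-compliance, privacy */
       int impact[NUM_FACTORS]       = { 9, 0, 5, 9, 3, 4, 5, 7 };

       printf("Likelihood (web app sec master): %d / 72\n", owasp_score(master));
       printf("Likelihood (script kiddy):       %d / 72\n", owasp_score(script_kiddy));
       printf("Impact:                          %d / 72\n", owasp_score(impact));
       return 0;
    }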

    Next … Risk Management 104 – Learning to judge impacts

  • Looking for inspiration

    Like many technical writers, I am constantly looking for ways to improve my writing skills. I don’t think there will ever be a time when I think “Okay, that’s good enough” and stop criticizing my own work.

    I am constantly in awe of other authors, particularly those that have published great works. I seek out author interviews on scholarly websites, and places like Galley Cat, in an effort to glean small insights into the life of an author.

    I started out by reading author interviews for any morsels on how they organized their day and their writing space. This accelerated once I started working from home. This is a futile project – each author, if they mention it at all, has a completely different day structure and writing space than the author before. Some write early in the morning (impractical for me); some write late at night (I’d love to, but I have a 2.5 year old who says “close eyes” and means it); some write in glorious writing palaces replete with overstuffed furniture and old books; others write in longhand at the local coffee shop or library. No two are the same. Sigh.

    The one common theme is that they write every day. Iain Banks, one of my favorite authors, writes for only part of the year and takes the rest off, but still manages a punishing schedule and daily word count to pump out beautiful works of art.

    Another common theme is supportive family and friends. I can attest to that – my cats led a lonely life whilst I was tapping away at the OWASP Guide 2.0 for several months. I don’t think I could ever do that again – not least for family reasons.

    Technical writing for web application security is far different from any form of fiction. It’s different from most non-fiction – and it’s dramatically different from sports writing. We’re expected to dumb down (“communicate”) with our peers in a way that nearly no other technical field would allow. In my field, respect is paid to those who can communicate highly technical, very advanced concepts in a way that could be understood if they were on the back of a “Fantale” wrapper.

    I am not disparaging my field, for I love it, but I do object that our terms of art – our shorthand – are so easily sacrificed. I need to learn how to write dumber, become one with my inner dumb writer, and make sense even when it makes no sense to write for the average tabloid reader. I think we underestimate our readers’ intelligence and insult them terribly every time we pump out a report in basic English (that is, using only the 500 most common words).

    Viva la revolucione! Whoops, that wasn’t basic English. My bad. As for more author interviews – it’s like reading a good autobiography – hard to put down. I think I will continue to seek out author interviews, even though I think they will in the end not shape my writing style nor my work space to any great degree.

  • Risk Management 102 – when is a high a high

    There are a lot of consultants (and clients) who know little to nothing about proper risk management. This is not their fault – it was never taught in computer science or most similar courses. If you get good at it, you’re unlikely to be a developer or a security consultant. That’s a shame, because risk management has a lot to offer both consultancies and their clients if done properly.

    The problem is that most consultants think technical risk, and will happily assign “Extreme” risks to things like server header info disclosures. Many clients actively campaign to reduce risk ratings for whatever reason, some for valid reasons, others not. And they will win if the risk ratings are wishful thinking or outright wrong. This could cost the organization billions of dollars if a HIGH risk becomes a LOW risk and is accepted, when really it’s a sort of a MEDIUM to HIGH risk depending on the situation.

    We as consultants have a responsibility to THINK about the findings we put into reports. Don’t be a chicken little, but also don’t be bullied into reducing bad risks, or you’ll be chosen for your outcomes rather than your honesty and integrity. Be open and honest about how you came to that risk decision, talk over the factors, and help the client understand and agree with the choices you’ve made. So don’t just stick “HIGH” in there – you need the entire enchilada. Lastly, be reasonable when you’ve made a mistake, and ensure there are as few mistakes as possible, as they’re a huge reputation risk.

    Clients have a responsibility to talk over the risk ratings so they fully understand the risk. All parties should agree to document the original risk, the discussion about the risk, and any revisions to the rating and / or vulnerability. Maybe there’s a control that’s being missed, or maybe there’s a misunderstanding of how easy the attack is to perform. Otherwise, there’s no accountability. In the end, consultants should never change a risk without documenting that change.

    How to improve the situation

    I like the OWASP Risk Rating methodology. The primary reason is that two different consultants can come up with the same result independently, removing a lot of the subjectivity and argument from the equation. I like to include the entire calculation as this allows clients to repeat my work and thus understand why it turned out the way it did.

    There are issues with the OWASP Risk Rating methodology:

    • It’s far too easy to generate “Extreme” risks. Extreme risks are really, really rare. They are company ending, life ending, project ending, shareholder value strippers, reputation destroyers. Think BP and the Gulf Coast. SQL injection at TJ Maxx is an extreme risk (despite them still being in business, it did cost a lot).
    • It’s difficult to game the numbers to create “Low” risks when you know that it really should be a “Low”. I basically take nine off the top, as I’ve never gotten a value less than nine. This helps a bit, but even then.
    • It’s hard to do it manually. I use Excel spreadsheets, but you may want to automate it more.
    • You must talk to your customers first. Otherwise, you need to take out the business elements (financial, legal, compliance, privacy) as you will not be able to lock these in.
    • Impact values are not the same for the entire review. They change as per the asset value/classification, and you will most likely have more than one asset value / classification in your review. There’s a difference between contexts, help files, PII, and credit cards. Document which one applied.

    That said, the OWASP risk rating methodology is way better than pretty much everything else out there for web apps. CVSS is not suitable as it’s designed for ISVs who produce software, and that doesn’t describe most enterprise, hobby, or open source projects. If you need to do AS 4360 risks, CVSS is not going to cut the mustard.

    Risk Management 102.

    We spend a lot of time arguing with some clients because we haven’t thought through our risk carefully enough, or worse, just used the one from the last report. No two clients and no two apps are ever the same. Therefore, the risk ratings for each of your reports MUST be different. Spend the time to do it right the first time, or you’ll spend a lot more time later when your client argues with you. And they may have a point.

    • Try not. Do… or do not. There is no try. The likelihood rating is solely about the likelihood of the MOST SKILLED threat agent SUCCEEDING at the attack / weakness / vuln you’ve described.
    • The impact rating is solely about the WORST impact of the attack / weakness / vuln using the threat agent you’ve described.

    For example, you have a direct object reference in the URL and no other controls – my Mum could do this attack. The IMPACT is off the charts, and the likelihood too. Just because a n00b consultant with an automated tool is unlikely to do more than annoy the web server, doesn’t mean that’s the threat agent you should document.

    If you came so, so close to exploitation and you just know that it could be bad, but you failed miserably after several hours, exploitability has to be set to 0. Seriously. The impact has to be low too, as there’s no impact that you’ve proven. To document anything else is wrong. I’m happy for folks to write up how close they came, and draw attention to it in the executive summary and in the read out, but to put a high likelihood says that you’re lame, and a high impact says you’re a chicken little. Don’t do it.

    If you’re unsure, map out different attackers (n00b consultants with automated tools, script kiddies, organized crime, web app sec masters), work out how likely they are to succeed at the attack, and then work out what the impact is for each of these threat agents. Do the math and use the most likely choice with that most likely choice’s impact. Don’t under- or over-blow it – even if a web app sec master could totally rip a copy of the database with both hands tied, the overall risk is likely to be low if that attacker is not a realistic threat for this application.
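
    As a sketch of that mapping exercise – the factor values below are invented purely for illustration, and the code only tabulates the candidates; choosing the most realistic agent, and reporting that agent’s impact, is still your call:

    #include <stdio.h>

    #define NUM_FACTORS 8

    static int total(const int f[NUM_FACTORS])
    {
       int sum = 0;
       for (int i = 0; i < NUM_FACTORS; i++)
          sum += f[i];
       return sum;
    }

    int main(void)
    {
       /* hypothetical likelihood and impact factors for each candidate agent */
       struct {
          const char *name;
          int likelihood[NUM_FACTORS];
          int impact[NUM_FACTORS];
       } agents[] = {
          { "n00b with automated tool", { 3, 1, 7, 9, 3, 1, 1, 8 }, { 2, 0, 1, 9, 1, 1, 0, 0 } },
          { "script kiddie",            { 3, 4, 7, 9, 1, 1, 1, 8 }, { 5, 1, 1, 9, 1, 1, 2, 2 } },
          { "organised crime",          { 6, 9, 4, 4, 3, 3, 4, 8 }, { 9, 7, 5, 9, 7, 5, 5, 7 } },
          { "web app sec master",       { 9, 4, 7, 9, 7, 5, 9, 8 }, { 9, 0, 5, 9, 3, 4, 5, 7 } },
       };

       for (size_t i = 0; i < sizeof(agents) / sizeof(agents[0]); i++)
          printf("%-26s likelihood %2d / 72, impact %2d / 72\n",
                 agents[i].name, total(agents[i].likelihood), total(agents[i].impact));
       return 0;
    }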

    Lastly, don’t go the terrorist route. You are more likely to win lotto, fall out of your new private plane from 30,000 feet and then get killed by lightning than you are ever likely to be a victim of terrorism. Chicken little scenarios work once or twice, but you’re just wasting everyone’s time and scorching the earth for all those who follow you.

  • Intelligent Session Manager Architecture

    As security researchers, I think we’ve let users down in the quest to close down questionable and unlikely events. The problem is that even though they’re unlikely, these events – such as MITM attacks – work nearly 100% of the time. They make great demos to scare folks who don’t understand what they’re seeing. It’s a shame that they just don’t occur in the real world all that often. So let’s move beyond “Expire it after 10 minutes”, and towards a session manager that actually helps the business, makes users love you, and really closes out some of these attacks.

    The reasoning behind 10 minutes is a balance between the business (who’d prefer no time outs at all, and would love a magic “remember me” function that is somehow secure) and tin foil freaks like me who know how incredibly simple MITM, session fixation, and session hijacking can be. Much of our advice has been based on 1970s standards and thinking, and on 1990s-style attacks that still work, primarily because we’ve been asking for the wrong solutions, like short time outs and not letting users log on twice.

    As Dr Phil says, “How’s that working out for you?”

    So let’s think about ways to improve session managers to blunt the known attacks. We know that TLS has issues with MITM attacks, but we’re very lucky that this is a local attack (for now). Such attacks are also exceedingly unlikely outside of security conference wireless networks, and motivated attacks on behalf of organized crime (very rare but devastating – see TJ Maxx).

    However, some of the other assumptions we’ve made when recommending bad ideas don’t take the user of the application into account. My wife does all of our shopping online. The system is awful. It times out within a short period, and it usually takes 4 to 5 attempts to finish an order. I’m sure there’s some poor risk manager going “WTF? PCI is stupid – we have to implement 10 minute time outs for a process that lasts 30-40 minutes?” Let’s move beyond quick fire “gimme” penetration test results, and think about HOW the USER is impacted when we make recommendations with our consultancy hats on.

    What goes wrong if it takes 40 minutes to assemble a shopping list? Do we have a financial loss? No. Do we have a reputation loss? Yes. Do we have a shareholder loss? No. Do we have a privacy impact? No. Do we have a regulatory impact? Only if you consider PCI DSS a regulation worthy of its name. What can we do to make it better?

    With the online shopping example, losses start when we can order stuff. Easy! Keep everything intact (and allow items to be placed in and removed from the cart), but make the user re-authenticate to purchase or to see their profile if it’s been more than 10 minutes. But with 100% of session managers today, that very act is impossible without significant customization, and we all know there’s some B-list pen tester willing to ping you on long timeouts if you do write that secondary all-singing, all-dancing session manager. THINK BEFORE YOU RECOMMEND RECEIVED WISDOM!

    Realistically, we need to set some baseline parameters for every session manager.

    • Strong. Session tokens should be random enough to resist being brute forced in a reasonable time frame. I still see this although it’s been solved on most platforms since 1996 or so.
    • Controlled. Session managers should only accept their own session tokens.
    • Session hijacking resistant. Session managers should rotate their tokens from time to time automatically. Every five minutes is fine, as is every request as long as there’s a sliding window of acceptable tokens to allow the most used button (Back) to work. All frameworks should possess a regenerate token API – it’s ridiculously hard in all frameworks but PHP today.
    • Session hijacking resistant. Session managers should watch headers carefully and reject requests that don’t perfectly match up with previous requests. There is no reason for a user agent or a bunch of other headers (up to and including REMOTE_ADDR) to change within a session.
    • CSRF proof. Session managers should tie themselves to requests, and check that the session and forms match up. OWASP CSRF Guard can do it, and realistically, this should be standard in every session manager.
    • Cloudy web farm support. It’s very hard to do federated session state with most session managers, and the hackiest solutions I’ve seen for getting around this are due primarily to the isolated session manager mentality. There are good last-writer-wins replication mechanisms around, including “deliver at least once” – not everyone needs this functionality, but those who do really need it badly. This can be used as a precursor to…
    • Notifications. Most SSO products use work arounds so that the primary session manager times out before the SSO token does. This means that there are active SSO sessions you could reconnect to if you know what you’re doing. Let’s make it easy for folks like Ping to get notified when regenerate, idle, absolute and logout events occur.
    • Adaptive timeouts. Sessions that “expire” should be put into a slush pool that comes alive again, up to an absolute limit. But the instant that a user wants to perform a value transaction, the session manager should require re-authentication (a small sketch of this follows the list).
    • Integration with common SSO protocols. SAML and WS-Federation are the two most popular SSO mechanisms out there. Realistically, all session managers should be aware of how these work, and tie into them strongly so that if folks use SAML/WS-Federation, this can be tied to the session token in use. How many times have we seen these two operate in completely separate worlds and then been a target for replay, session expiry and other attacks.
    • Destroy means destroy. Make it easy for devs to do the right thing when the user clicks logout. Not only clear the session properly, but also all associated copies of that token – headers, cookies, DOM, etc, etc.
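
    As a flavour of what a couple of these look like in code, here’s a minimal sketch – the struct layout, field names and the ten-minute window are my own assumptions, not any particular framework’s API:

    #include <stdbool.h>
    #include <string.h>
    #include <time.h>

    /* Hypothetical session record captured when the session is established. */
    struct session {
       char   user_agent[256];   /* header fingerprint taken at login        */
       char   remote_addr[64];
       time_t last_activity;     /* drives the idle / "slush pool" window    */
       time_t last_auth;         /* when credentials were last presented     */
    };

    /* Session hijacking resistance: reject requests whose headers no longer
       match the ones seen when the session was established. */
    bool headers_consistent(const struct session *s,
                            const char *user_agent, const char *remote_addr)
    {
       return strcmp(s->user_agent, user_agent) == 0 &&
              strcmp(s->remote_addr, remote_addr) == 0;
    }

    /* Adaptive timeout: browsing can continue on an idle session, but any
       value transaction (purchase, profile view) forces re-authentication
       once the last authentication is older than ten minutes. */
    bool requires_reauth(const struct session *s, bool value_transaction, time_t now)
    {
       const time_t reauth_window = 10 * 60;
       return value_transaction && (now - s->last_auth) > reauth_window;
    }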

    Notice that I didn’t put one of the lazy pen tester’s favorites in the above list – “Logging on more than once”. I REALLY don’t care about that. I care about what VALUE TRANSACTIONS you can do within the assigned sessions. If there’s a problem with value transactions, preventing two sessions at once isn’t going to save your bacon. Transaction signing / SMS authentication / re-authentication will help, or if it’s about resource consumption, then transaction governors like in ESAPI will help. THINK BEFORE YOU PUT STUPID THINGS IN YOUR REPORTS.

    Many of these items are in ESAPI. That’s awesome, but it would be nice if all session managers dealt with sessions to support users and business uses, rather than obscure and unlikely attacks.