Thursday, 29 December 2011

A different 2D array in bash

I have recently been writing a pretend Object Oriented Programming layer on top of bash. Basically, it uses some global vars, functions within functions and a ton of eval to create the illusion of OOP in bash.

Something like:

#!/usr/lib/oobash.sh

# Import the Car class
import Car

# Create a new "object"
new Car as mycar

# Set a variable
mycar.speed 100

# Call a function
mycar.drive
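
Under the hood, the illusion boils down to using eval to define "method" functions at runtime, with global variables holding the state. A very rough sketch of the idea (purely illustrative, not the actual oobash implementation) might look like this:

#!/bin/bash

# Rough sketch of faking an "object" with eval. Hypothetical; the real
# oobash layer does considerably more than this.
new()
{
    # Usage: new <Class> as <name>
    local class="$1" name="$3"   # class is unused in this tiny sketch

    # A "member variable" setter: mycar.speed 100 stores 100 in a global var
    eval "${name}.speed() { ${name}_speed=\"\$1\"; }"

    # A "method" that reads that global var back
    eval "${name}.drive() { echo \"Driving at \${${name}_speed:-0} mph\"; }"
}

new Car as mycar
mycar.speed 100
mycar.drive     # Driving at 100 mph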

This will be made available at http://code.google.com/p/object-oriented-bash/ soon, but in the meantime I have needed to implement a 2D array in bash. A quick Google search shows a couple of ways of doing this. The first is to dynamically create arrays using eval. The other approach is to use a huge offset, e.g. 1000, and refer to array[i][j] as array[i*1000+j]. The latter requires you to keep track of two indices, which is sort of annoying.
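
For reference, rough sketches of those two existing approaches might look something like this (hypothetical snippets, not part of oobash):

#!/bin/bash

# Approach 1: dynamically created arrays via eval, one bash array per "row"
eval "row0=(a b c)"
eval "row1=(d e f)"
i=1; j=2
eval "echo \${row$i[$j]}"              # prints: f

# Approach 2: a single flat array with a large offset per "row"
declare -a flat
offset=1000
flat[$((1*offset + 2))]="f"
echo "${flat[$((1*offset + 2))]}"      # prints: f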

Edit:

I noticed someone has already done something similar at http://oobash.sourceforge.net/. I'll take a look at it at some point and if it does the job I'll abandon my project.

So, in an attempt to find something different, I came up with the following. The idea is to store some metadata about how many elements we're adding.

An array with just one element might look like this:

[0] = 1
[1] = "Hello there"

The element at index 0 is the number of elements that follow. As the value is 1, the next element contains the value, which would basically be a scalar. Now, an array might look like this:

[0] = 2
[1] = "Hello"
[2] = "there"

Element 0 tells us to expect 2 values. Nice and simple huh? A more complete example:

[0] = 1
[1] = "Hello"
[2] = 3
[3] = "How"
[4] = "are"
[5] = "you?"

You get the idea. But, we have a problem. With arrays, one often wants to pop from the end of the array. In order to do that, we have to start at the end and work backwards. But the following example would break that:

[0] = 1
[1] = "Hi"
[2] = 3
[3] = "aaa"
[4] = 1
[5] = "bbb"

If we worked backwards from 5, we would get to element 4, which looks like an array count, and conclude that (4,5) was a scalar pair. That isn't actually the case: element 4 is just a data value that happens to look like a count. But there is a solution.

[0] = 1
[1] = "Hi"
[2] = 1
[3] = 3
[4] = "aaa"
[5] = 1
[6] = "bbb"
[7] = 3

At the cost of some space, we've added a count element at the end of each array. That way we can go to the last element, and know how far backwards to go.

Well, that's the theory. Now let's look at a push:

#!/bin/bash

# Function: array_push()
# Params:
#   name     - name of array receiving data
#   indexVar - name of variable to store index used
#   ...      - list of one or more values to be pushed
# Returns:
#   Nothing really
array_push()
{
    # Require at least 3 parameters
    if [ "${#}" -lt "3" ]; then
        return
    else
        # Set parameter variables
        local name="$1"; shift
        local indexVar="$1"; shift

        # Get the next index
        eval local index=\$\{\#$name\[\@\]\}

        # Make a note of how many elements we're pushing
        local count="${#}"

        # Set the index var to our index. This can then be used later 
        # to retrieve data from the array.
        eval $indexVar=$index

        # Record the count
        eval $name\[$index\]=$count

        # Moving up in the array
        ((index++))

        # We now iterate over the values, adding them to the array
        local end=$((index+count))
        while [ "$index" -lt "$end" ]; do
            eval $name\[$index\]=\"$1\"
            shift
            ((index++))
        done

        # Finally, make another record of the count
        eval $name\[$index\]=$count
    fi
}

declare -a aTest

echo First push
array_push aTest iIndexA "Good day."

echo "index: $iIndexA"
echo new array is ${aTest[@]}
echo

echo Second push
array_push aTest iIndexB "To" "you" "sir!"

echo "index: $iIndexB"
echo new array is ${aTest[@]}

Running it gives us:

$ ./push.sh 
First push
index: 0
new array is 1 Good day. 1

Second push
index: 3
new array is 1 Good day. 1 3 To you sir! 3
$
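
For completeness, a pop built on the trailing count might look something like the sketch below (hypothetical, with no error handling; it assumes the layout produced by array_push above and runs in the same script):

# Function: array_pop()
# Params:
#   name      - name of array to pop from
#   resultVar - name of array variable to receive the popped values
array_pop()
{
    local name="$1" resultVar="$2"

    # Index of the last element, which holds the trailing count
    eval "local last=\$(( \${#$name[@]} - 1 ))"
    eval "local count=\${$name[$last]}"

    # The values sit immediately before the trailing count
    local start=$((last - count))
    eval "$resultVar=(\"\${$name[@]:$start:$count}\")"

    # Remove the values plus the leading and trailing counts
    local i
    for ((i = start - 1; i <= last; i++)); do
        eval "unset '$name[$i]'"
    done
}

array_pop aTest aPopped
echo "popped: ${aPopped[@]}"     # popped: To you sir!
echo "array is now ${aTest[@]}"  # array is now 1 Good day. 1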

Wonderful. We're not doing anything useful, but the data is there. It was at this point that I got bored of this approach. The problem is that it doesn't extend beyond two dimensions. The only way around this would be some sort of JSON-like beginning/end tags instead of a count. E.g.

[0] = {
[1] = 'a'
[2] = }
[3] = {
[4] = 'b'
[5] = 'c'
[6] = {
[7] = 'd'
[8] = }
[9] = }

This could get a bit slow to process. We'd have to do a lot of string comparisons, and we'd need to quote values to prevent confusion.

Alternatively, we could use an array as usual, but have our functions automatically test values to see if they're an array. That blog post shall follow shortly.
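
As a taste of what such a test might look like, here is a rough sketch (just an illustration for now) that inspects the output of declare -p:

#!/bin/bash

# Hypothetical helper: succeeds if the named variable is an indexed array
is_array()
{
    [[ "$(declare -p "$1" 2>/dev/null)" == "declare -a"* ]]
}

declare -a aList=(1 2 3)
sScalar="hello"

is_array aList   && echo "aList is an array"
is_array sScalar || echo "sScalar is not an array"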

Wednesday, 26 October 2011

Fundamental principles of better security

Defence in depth

This is the daddy of all security advice. The idea is to not rely on any one particular technology, process or approach to security. By building up layers of defence you can tolerate the eventual compromise of one of the layers. For example, don't just run AV software and hope for the best; harden your system and keep it up to date. Then use technologies such as whitelisting, sandboxing and URL blacklisting to further reduce risks. Don't just rely on a perimeter firewall or NAT; firewall between internal networks and use IDSs.

Security is a process 

Similarly to the defence in depth advice, don't ever expect to be completely secure. No product or solution will ever make you 100% hacker proof, regardless of what the marketing material states. Security is a process of identifying vulnerabilities, prioritizing and mitigating risks, making things a little bit better each time. 

Security through obscurity

Sticking with obvious advice, don't rely on security through obscurity. This means relying on the secrecy of a design, implementation or process for security. For example, rather than relying on some obscure proprietary encoding to protect data, you should use well known and tested cryptography. It shouldn't matter that someone knows your data is protected with 256-bit AES.

Crypto is hard


Speaking of cryptography: it is difficult. Very difficult. So unless your clever idea for protecting data has been rigorously tested by the crypto community, you probably shouldn't use it. Use a standard library instead; don't roll your own. You're going to get it wrong. Even using standard libraries correctly can be tricky, so be careful.

KISS

The phrase Keep It Simple, Stupid applies to most areas of life, including security and IT in general. Basically, complexity is the enemy of good security. Complex systems are difficult to understand, manage and maintain. An unnecessarily complex system will degrade over time, which generally means holes and weaknesses developing. Bugs creep in and remain hidden. People become afraid of it, unwilling to investigate and fix issues.

Test, test, test

Speaking of bugs: test the security of code during development by means of code review, static analysis and so on. Test your applications before they go into production, e.g. through audits, fuzzing and standard QA testing. Test your infrastructure once it's in production by red-teaming and using external audits / pen testing.

Temporary is rarely temporary

This might not sound like security advice, but it is. The classic is the "any/any" firewall rule. It probably got put there during development or testing. It fixed the problem, and probably has a comment like "TODO - Work around to get customer working before deadline. Must investigate and fix properly - A N Other, March 2005". Although sometimes unavoidable, at least don't pretend that you'll get around to fixing it later. Just accept that it is going to be yet another permanent hack, and perhaps invest another 10 minutes to make it suck slightly less.

Weakest Link

Aside from all that dodgy hacked code and long forgotten cronjobs, your systems are in pretty good shape. However, security is like a chain: it is only as strong as the weakest link. Perhaps your database server is uber-hardened. But if you've got a five year old cronjob running somewhere on an unpatched six year old server that connects to your database with full admin privileges, that hardening is meaningless.

Principle of least privilege

So, why would that cronjob that simply dumps data need to run as an admin? It was probably created when the database only had one user - the admin. Applying the principle of least privilege means that people and systems should only be authorised to do what is required for them to successfully do their jobs. Another example is developers having root access on production systems. Is that really necessary? Or even sysadmins having full access to the development source code repositories. Following this principle helps guard against accidental and malicious abuse of power.

You cannot create data retrospectively

But you can throw it away. This is an operational point that extends beyond security. You can do some very interesting and insightful things with operational metrics, giving you new visibility into your system and therefore greater understanding. From a security perspective, an IDS alert might coincide with a jump in open file descriptors on a system, or a sudden jump in memory used. Perhaps the number of lines per minute in an application log file jumps or drops. By collecting, graphing and reviewing this data you could lower the time it takes to understand an attack. There is more to life than just network traffic. Not using certain data? Archive or delete it after 12 months. Simple.

Secure at the source

As a general principle, you want to secure data as soon as possible. If data is being generated on a system, you wouldn't wait until it has been transferred halfway around the world before attempting to secure it. This can go as far as never letting unencrypted sensitive data even reach a hard disk.
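
As a trivial illustration (hypothetical names and paths), data can be encrypted in the pipeline that produces it, so the plaintext never touches disk:

# Dump a database and encrypt it in one pipeline; only ciphertext is written out
mysqldump somedb | gpg --encrypt --recipient backups@example.org > /backups/somedb.sql.gpg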

Security is a user problem

On the one hand you have security, on the other you have usability. No matter how clever a security technology is, if it needs to be driven by a human then it will need to be usable. When designing a system, try to work security into it from the start so that it doesn't become a bolt-on that gets in the way of usability.

Security is everybody's responsibility

Another classic principle: security is the responsibility of everyone, from the receptionist to the CISO. Why? Well, weakest link for a start. The receptionist might not have root access on the production servers, but they are probably on the network and could be a pivot in further attacks. Their account would probably also make a good source of spear-phishing emails to internal addresses. Educate people to take an interest in doing their best to maintain security. Reward them for reporting suspicious activity (e.g. a social engineering phone call).

Remote code execution == game over

Basically, if a box has been compromised to the extent that the perpetrator has run code on it, you have to assume you have lost control of that system. Rootkits are becoming more sophisticated by the month and you probably do not have the time to fully investigate exactly what happened to the system. Investigate it, understand what they succeeded at, what they failed at and perhaps even what their motivation was. Then rebuild the box and start from scratch. Also, don't rely on backups taken after a compromise.

You're going to get hacked

Almost finally. It's not a matter of if, but when. Work on the assumption that you are going to get hacked and look at how it's going to happen and how you can minimize the impact on the business. Make sure you have an incident response process. The idea is to stop an attack as soon as possible, reducing the impact whilst recording data for later forensic analysis (a hardened centralised syslog server might help).

Finally

This list isn't complete. I will add to it over time.

Monday, 30 May 2011

London Sky - Flickr pictures

A slight tangent from the usual rare posts, but I've just uploaded some London skyline pics to Flickr.

The rest can be found here: Flickr

Sunday, 1 May 2011

M4sl0w's hierarchy of needs (for geeks)



What is Maslow's hierarchy of needs?

I suggest you look here or Google for it, but in summary it is a psychological theory to describe the levels of human needs and desires and how higher levels of needs can only be satisfied once all the levels below are satisfied.

Does this apply to geeks?

Well, even geeks have basic needs that must be met. I'll discuss them in turn below.

Caffeine


All geeks need caffeine in some form. There are a few crazy exceptions, but they're by far the minority. Popular sources of caffeine are coffee and soft drinks. Energy drinks are of course an excellent source of this essential nutrient and have helped many a geek through a late night coding frenzy.

Awesome workstation and shiny gadgets

So, you're getting your regular dose of free caffeine. This will no doubt be the first thing you do when you get to the office, if you haven't already picked a drink up on the way in. You sit down at your desk and you're greeted with what you're going to be staring at for the next eight hours. Do you look at it and have flashbacks to the eighties, or do you admire the technical marvel before you? As they say, "if you need something done, you may as well do it on something cool and expensive".

Fun and interesting colleagues

As a geek you probably dread social interaction. Or perhaps not. I think that cliché is somewhat dated. But you probably have to work with people around you. Perhaps even regularly. The goal here is to have fun at the same time, building up a sense of team and therefore a shared sense of purpose. That way you are more productive and get things done sooner leaving you with more time to sip your caffeine drink of choice. If you're lucky, you might even meet your colleagues after work in some form of social event.

Interesting work, a sense of control and doing things right.

Do it. Do it now. Do it right now. Do it right.

So, you're sipping your regular dose of caffeine, sitting back at your shiny workstation after an interesting and thought-provoking conversation with a colleague. It is now time to get on with the day and do some work.

I think most people, and especially geeks, have a basic desire to work hard, work efficiently and to do the right thing. As a geek, you are being paid not just to mash the keyboard from time to time, but to use your expertise to assess and prioritize what work needs to be done and how. A sense of control is very important here - if you and your colleagues know what needs to be done, know how it needs to be done, know that it can be done and that it is the right thing to do, but cannot do it for political reasons, then you're probably going to be somewhat demotivated (wow, that last 'sentence' was an abuse of the English language, but you get my point I hope).

It is the role of a good manager to align the goals and opinions of the geek to the business requirements.

And, if you're lucky, your work might even be particularly interesting. Everything is interesting if you think about it the right way :) What was the other one? There are no boring tasks, only boring people.

Play time and training

Now that you're being productive and the business is happy with your and your team's work, what else is there?

Geeks do what they do because they love it, not because it is a career path (by definition, otherwise they're just random people who happen to work in IT etc). It is a geek's natural instinct to experiment, play, create and break things. And to learn during the process - understand how things work, what makes things tick and where the boundaries lie.

So, what is the ultimate geek motivation? To play, experiment, test, break, fix, create and get paid for it. That is why Google has their 20% time. This means that you get to spend 20% of your time doing whatever you like, which can sometimes result in the birth of a new product or service. It gives geeks a chance to think "outside of the box" (Bingo!) and to try or learn new things.

Why bother? Isn't it a waste of time? Well, perhaps. But there is an asymmetry in the potential gains. Basically, there is a fixed loss - 20% of an employee's time. However, the potential gain has no boundaries. Perhaps someone comes up with something truly phenomenal which ends up being worth millions to the business. Or perhaps they create something useful that gets released to the community and makes the world a slightly better place.

And who knows, the company might even lower its staff turnover and save some money on recruitment.

Finally, a word on training. There are two types of training found during employment: basic training required to do the job effectively, and personal training which indirectly benefits the company but adds to the employee's sense of purpose and career progression. Both forms of training are motivating, with the latter being more so. Unfortunately, training is rarely a priority for companies and doesn't seem to exist much in the wild. Lucky for them, geeks tend to take responsibility for their own training, but this does generally result in higher staff turnover - if someone has dedicated most of their free time and spare cash to training themselves, why wouldn't they move on to reap the rewards of a better income or more interesting work?

Sunday, 27 February 2011

10 useful little bash tricks

We're going to have a quick look at a collection of ten little bash tricks that you may or may not know. Or possibly forget. Or perhaps know, but don't tend to use often and so this might be a good reminder.

Please note, this stuff was tested on Bash 4.1.5. You can find further details and examples for all these tricks in the Bash manpage.

1::Using previous command

This trick is very simple. If you run a command and then have to repeat it (for example using sudo), then instead of typing the whole command again you can refer to it using !!.

user@host:~$ aptitude install blah
[100%] Reading package lists^C
user@host:~$ sudo !!
sudo aptitude install blah
[sudo] password for user:
...
user@host:~$


As you can see, it even prints out the new full command.

2::Using previous argument(s)

Similar to the first trick, bash will allow you to easily reference the last argument of the previous command.

In this first example we're going to touch a few files, then decide to remove the last one.

user@host:~$ touch a b c
user@host:~$ rm !$
rm c
user@host:~$ ls
a  b
user@host:~$


As you can see, !$ only expanded to c. If you only had one argument, it would expand to that one argument.

We can also reference all the previous arguments using !*

user@host:~$ touch a b c
user@host:~$ rm !*
rm a b c
user@host:~$ ls
user@host:~$


3::Process Substitution

Right, now it's time to move on to something a little different.

We want to compare the files in two directories to see which are in common, missing or changed etc. It would make sense to use the diff utility to help. We are also going to use md5sum to get a fingerprint of the files so we can see if they have changed. All we have to do is look through each directory (using find), do an md5sum of each file and compare the two lists.

Now, diff usually expects to operate on two files. We could redirect our results into files and then diff those files, but there is a smarter way. By using the <(list) form we can run our finds and diff will treat the results as files.

diff <(cd boot-a; find . -exec md5sum {} \;) <(cd boot-b; find . -exec md5sum {} \;)

< 9c830b456ed37e0c0d63f2528fb43de5  ./initrd.img-2.6.35-23-generic
7,8d5
< f70fd0262a6f8e1e82028237e94657fc  ./config-2.6.32-25-generic
< c16c8e1705a54db8b96adce7de17710a  ./vmcoreinfo-2.6.32-25-generic
10d6
< b1f002028905e42f594c83603606ca0b  ./config-2.6.35-23-generic
12,16d7
< 668d54c704f22d85cb548f27a7e19d41  ./config-2.6.35-22-generic
< 882721477fcf705c02b60a4bb219b0c8  ./vmlinuz-2.6.32-25-generic
< 88472c27c3d18a832b5d85aea1edc541  ./abi-2.6.35-22-generic
< 496b4aed0005b07574e9dd3895089b23  ./abi-2.6.35-23-generic
< 6ee40238bdabb40a862c6f63262efa9d  ./System.map-2.6.32-25-generic
18d8
< 4e02de05c66cb0b322dc5c71f43882e9  ./System.map-2.6.35-23-generic
84c74
< ac9641014f0b460fc6eb5d2936bd869c  ./grub/menu.lst
---
> c5cce31e9e9eb884685137aac1ab8a8a  ./grub/menu.lst


As you can see, boot-a had quite a few files that boot-b was missing, and grub/menu.lst was different between the two directories.

4::Here document and redirect

Sometimes you need to put together a little config file, or perhaps a test HTML page. It is possible to do it quickly without having to resort to opening some sort of editor such as vi.

user@host:~$ cat << EOF > test.conf
> option1=blah
> option2=something else
> debug=true
> # end of config
> EOF
user@host:~$ cat test.conf
option1=blah
option2=something else
debug=true
# end of config
user@host:~$


What we did was specify a here document using << EOF. This keeps reading from the terminal until it encounters the string EOF. That input is then redirected into a file, in this case test.conf.

5::Brace expansion when copying

This is a very simple but extremely useful little trick that I'm sure you already know, but because of how often it is useful, I thought I'd put it here anyway.

Let us say you have a file you want to quickly back up. Instead of doing something like cp file file.backup, we can use brace expansion to shorten the command:

user@host:~$ ls
special.conf
user@host:~$ cp special.conf{,.backup}
user@host:~$ ls
special.conf  special.conf.backup
user@host:~$


6::Nested brace expansion

Sticking with brace expansion, it is worth remembering that you can nest the lists:

user@host:~$ touch {a{1..5},b{1,2,4},c{5..9}}
user@host:~$ ls
a1  a2  a3  a4  a5  b1  b2  b4  c5  c6  c7  c8  c9


7::Quick maths

Need to do a quick calculation? Already have a terminal open? Don't bother firing up a calculator, just do the calculation in bash:

user@host:~$ echo $((2**32))
4294967296
user@host:~$


See the Bash manpage for the supported operators.

8::Tilde expansion

Bash allows you to quickly reference the current directory ($PWD) and the old one ($OLDPWD). This could be useful if you forgot to use pushd/popd:

user@host:~/a$ cd ../b
user@host:~/b$ ls ~+
inside_b
user@host:~/b$ ls ~-
inside_a
user@host:~/b$ echo $OLDPWD
/home/user/a
user@host:~/b$ 


So, ~+ expands to $PWD and ~- expands to $OLDPWD.

9::Setting default values

When you have a bash script that takes command line arguments you might want to set default values if no arguments are supplied by the user.

In the bash script below, we default arg1 to "a" if argument 1 is not supplied. We default arg2 to "b" and arg3 to "c". Of course, the order of the arguments matters.

#!/bin/bash

arg1=${1:-"a"}
arg2=${2:-"b"}
arg3=${3:-"c"}

echo $arg1 $arg2 $arg3


When we run the script without arguments, it uses the defaults. But when we give it arguments, it uses those values.

user@host:~/test$ bash test.sh
a b c
user@host:~/test$ bash test.sh x y z
x y z
user@host:~/test$


10::TCP and UDP

Finally, bash allows you to send TCP and UDP traffic directly using a special device of the format /dev/tcp/host/port (use udp instead of tcp if necessary).

Firstly, we set file descriptor 3 to use that device. We can do that with exec, as the manual suggests:

"Note that the exec builtin command can make redirections take effect in the current shell."

user@host:~$ exec 3<>/dev/tcp/localhost/80
user@host:~$ ls -la /proc/$$/fd/
total 0
dr-x------ 2 user user  0 2011-02-27 16:34 .
dr-xr-xr-x 7 user user  0 2011-02-27 16:34 ..
lr-x------ 1 user user 64 2011-02-27 16:34 0 -> /dev/pts/8
lrwx------ 1 user user 64 2011-02-27 16:34 1 -> /dev/pts/8
lrwx------ 1 user user 64 2011-02-27 16:34 2 -> /dev/pts/8
lrwx------ 1 user user 64 2011-02-27 16:34 255 -> /dev/pts/8
lrwx------ 1 user user 64 2011-02-27 16:34 3 -> socket:[156914]
user@host:~$


We quickly use $$ (our shell's PID) to list the open file descriptors for our process. As you can see, FD 3 has been opened for a socket connection, in this case a TCP connection to localhost on port 80.

We can now redirect echo into FD 3. Note that we use the -e option so that bash interprets the escaped characters, giving us the new lines needed by the HTTP protocol. We then read the response back from FD 3 using cat.

user@host:~$ echo -e "GET / HTTP/1.0\n\n" >&3
user@host:~$ cat <&3
HTTP/1.1 200 OK
Date: Sun, 27 Feb 2011 16:34:26 GMT
Server: Apache/2.2.16 (Ubuntu)
Last-Modified: Sun, 27 Feb 2011 16:25:27 GMT
ETag: "901-e-49d4602cddfad"
Accept-Ranges: bytes
Content-Length: 14
Vary: Accept-Encoding
Connection: close
Content-Type: text/html

<h1>Hai!</h1>
user@host:~$


Cool huh? Not a telnet, netcat or wget in sight...

Saturday, 5 February 2011

A simple Monte Carlo simulation with R

I recently read How to Measure Anything: Finding the Value of Intangibles in Business by Douglas W. Hubbard. It's a fascinating and informative read on the problems and solutions of measuring "soft" variables typically found in business. Fluffy variables like productivity, quality and risk can be measured if you use the right techniques and work within the limitations of measurement and statistics.

There is an excellent example of using a Monte Carlo simulation (or method) to calculate the risk of leasing a new machine in a manufacturing process. You can find the example on pages 82 through to 86.

Given my hatred of spreadsheets and having recently started playing with R, I thought I would have a go at replicating the simulation using R.

This is what I wrote. Please note I'm still an R n00b, so no doubt some things could be done better.

######################### Variables #######################

# Firstly set this to TRUE if we want to save our plot as a 
# PNG and if so, what file and dimensions
bDoPNG <- FALSE
sFile <- "htma.png"
iWidth <- 1024
iHeight <- 768

# The following values represent our 90% confidence interval 
# (CI) ranges for the various inputs to our simulation.

# We are 90% confident that the maintenance savings per unit
# is between $10 and $20
vMaintenanceSavingsPerUnit <- c(10,20)

# We are 90% confident that the labour savings per unit 
# is between $-2 and $8
vLabourSavingsPerUnit <- c(-2,8)

# We are 90% confident that the raw material savings per unit 
# is between $3 and $9
vRawMaterialsSavingsPerUnit <- c(3,9)

# We are 90% confident that the production level per year 
# will be between 15K and 35K units
vProductionLevelPerYear <- c(15000,35000)

# The annual lease is $400K so we need to save this amount 
# just to break even for the investment
iAnnualLease <- 400000

# This is a quick cheat which basically means there are 
# 3.29 standard deviations in a 90% confidence interval
iStdDevCheat <- 3.29

# This is the number of simulations we are going to run
iNumberOfSims <- 100000

##################### Generate the basic data ###################

# A new data frame initiated to have iNumberOfSims rows in it
dData <- data.frame(seq(1,iNumberOfSims))

# We use the rnorm function to generate a distribution across 
# all the simulations for the maintenance savings. The mean is 
# literally just the midpoint of the range (e.g. (10+20)/2) and we
# also give it the standard deviation of (20-10)/3.29.
dData$MainSavings <- rnorm(iNumberOfSims, 
mean(vMaintenanceSavingsPerUnit), 
diff(vMaintenanceSavingsPerUnit,1,1)/iStdDevCheat)

# Same again for the labour savings
dData$LabourSavings <- rnorm(iNumberOfSims, 
mean(vLabourSavingsPerUnit), 
diff(vLabourSavingsPerUnit,1,1)/iStdDevCheat)

# And the raw material savings
dData$RawMaterialsSavings <- rnorm(iNumberOfSims, 
mean(vRawMaterialsSavingsPerUnit), 
diff(vRawMaterialsSavingsPerUnit,1,1)/iStdDevCheat)

# And finally the production levels
dData$ProdLevel <- rnorm(iNumberOfSims, 
mean(vProductionLevelPerYear), 
diff(vProductionLevelPerYear,1,1)/iStdDevCheat)

# We can now create our total savings column based on the 
# inputs given. Because R is a vector language, the below 
# operation is applied to each row automatically.
dData$TotalSavings <- (dData$MainSavings + dData$LabourSavings +
dData$RawMaterialsSavings) * dData$ProdLevel

# Later on it will look better on the graphs if we deal
# with numbers in thousands so create a couple of shortcut variables
dData$TotalSavingsThousands <- dData$TotalSavings/1000
iAnnualLeaseThousands <- iAnnualLease/1000

# We now let R generate a histogram of our savings but without 
# actually plotting the results. We will end up with a series of 
# buckets (aka breaks) which will go on the X axis and the number 
# of simulations that fell within each bucket (on the Y axis)
hHist <- hist(dData$TotalSavingsThousands,plot=FALSE)

# We create a new data frame for the breaks and counts 
# excluding the last break
dHistData <- data.frame(
breaks=hHist$breaks[1:(length(hHist$breaks)-1)],
count=hHist$counts)

# We can calculate the chance of the project making a loss as 
# the sum of counts where the breaks were less than 
# the annual lease (ie. $400K).
fPercentChanceOfLoss <- 100*sum(subset(dHistData,
breaks<iAnnualLeaseThousands,
select=count))/sum(dHistData$count)

# Calculate the median of the savings. That is 50% of
# the simulations had savings less than the median 
# and 50% had savings of more than the median.
fMedian <- median(dData$TotalSavingsThousands)

# We put that chance of loss in a sub title
sSubTitle <- sprintf("%02.2f%% chance of loss at $400K expense, 
median savings at $%02.0fK", fPercentChanceOfLoss, fMedian)

# Check whether we want to save our PNG
bDoPNG && is.null(png(sFile, width=iWidth, height=iHeight))

# Now draw the actual histogram, setting some labels but without 
# drawing the axis
hist(dData$TotalSavingsThousands,col="lightblue", main="Histogram of Savings",
xlab="Savings per Year ($000s in 100,000 increments)",
ylab="Senarios in Increment", axes=F)

# Add the sub title
mtext(sSubTitle)

# Draw the Y axis using default parameters
axis(2)

# Now draw the X axis explicitly setting values of the ticks/breaks
axis(1, at=hHist$breaks)

# That's it, turn off output if saving PNG
bDoPNG && is.null(dev.off())

And this is the pretty graph it produced. It should look similar to the one on page 86.

So yeah, go buy the book. Read it. Then have fun with R :)