Wednesday, December 23, 2009

I just read the “Does the Distro Matter” post and immediately felt compelled to relay some of my experiences, not with distributions, but with recruiters. What is the deal with these people? Does it matter if a company is looking for experience with RHEL5 and you have experience with CentOS5? No! From a technical point of view they are almost identical. I would personally say that if you have 5-10 years of experience with almost any distribution you should be able to move into a new one easily. But core skills do not matter to recruiters, only how well you match the buzzwords. I helped a friend install a distributed MySQL environment for a small company once, but since it was not my company it apparently does not count as real experience. I may be a Java programmer with experience in several IDEs, but since Microsoft Visual Studio is not one of them it is the same as if I had answered “What’s Java?” Why is this? What is wrong with these people?

I blame three things:

1) The recruiters, or even the company’s central HR, are not in the field and thus do not understand the buzzwords on their sheet. Most of them would not know the difference between one distribution of Linux and another. They also would not understand KRB/AD cross-platform integration if it hit them in the face. They only know what is on the list in front of them. To them the difference between FC10 and FC11 may as well be the difference between speaking Basque and Thai. The people who would understand never get to see your resume.

2) The people who are looking for employees are too specific – mostly because the screeners don’t know anything. Do you ask for someone who understands the differences between Exchange Server 2003 and Server 2003 with Exchange? No, but they will ask for someone with RHEL5 (even if they are using CentOS5) because that is what they are using. They could just ask for 15 years of Linux experience, but they don’t. In addition, it would seem most people write the job requests so specifically that only an internal candidate could possibly fill them. For example, “Must have 5 years experience with Joseph ERP” is not valid when Joseph ERP has fewer than 2000 outstanding user licenses across the globe and has only produced a commercial product for the last 6 years! Just because the guy you’re replacing had 5 years of experience with it does not mean you need to be looking for someone with the exact same experience. Sometimes this is done so they can hire the internal candidate they had in mind before HR told them they had to post the job publicly, sometimes it is done so they can decline any candidate they don’t feel is a good personal fit without filling out a ton of paperwork with HR, and sometimes it is because the person writing the job description was not thinking; i.e. the Joseph ERP example above.

3) The current applicant pool is so great they can get away with it. When I am losing out to people with master’s degrees, $100K worth of certifications, and 15 years of experience, for a $20/hour job on a 3 month contract, you know there is a problem. That is too much experience for that job. On the other hand, if you have reached upper management, you have apparently been poisoned from playing in the field. No one wants to hire a former Director to manage their system migrations, so I should be grateful that they have the good sense to realize the overqualified person is a bargain buy. But this does mean that almost no matter how strictly you write your requirements and how literally the recruiters follow them, eventually you will find a person who fits your requirements, or at least lies well enough to get in the door.

As an abstract complaint, why do they ask what other jobs you have applied for and then disparage where you have applied? And why would they say they know someone at company X, but when it is not the person you know at company X they believe you are lying about knowing anyone at company X? And finally, if you do not have the technical experience to understand what I am telling you, why are you interrogating me regarding my technical experience? When I say that I believe DR is dead because of the reduced cost of CDP w/ FO, why do you conclude I have no DR experience? I have many other complaints but I think most of them have been covered in prior posts.

Tuesday, December 22, 2009

Host Identity Protocol

This is going to be an uncharacteristically short post; however, I am wondering what is up with the Host Identity Protocol. For those who do not know, HIP allows for the implementation of the HIT, or Host Identity Tag. This is designed to solve the issue of mobile users losing their sessions and streams when moving between networks. The assumption is that all, or at least the majority, of services identify a client based upon their IP address. According to the abstract this will also make setting up servers behind NAT firewalls easier and increase security by uniquely identifying each host, similar to an SSH fingerprint.

However I have three questions that lead me to ask why this is a real project:
1) Session IDs - When an application server needs to identify a client, it issues a session ID to the client. For example, when you hit my Tomcat server and start a particular session I don't care what network the client is on or what physical server the client may be hitting; I need to identify the client for the life of its session. Session IDs are already in place - so what portion of session IDs fails so badly that it needs to be replaced with an entirely separate project? (A quick illustration follows this list.)
2) Existing Identifiers - Why not use existing identifiers such as the MAC address of the client's network interface? While it is true that one's IP can change all too frequently during the course of normal mobile operation, the MAC does not.
3a) Security vs. Privacy - The answer to the above question leads to my final concern. MAC addresses can change when a user changes network interfaces. However, the protocol suggested, as I understand it, sits between the addressing and transport layers. That is traditionally tied to a network interface, so changing interfaces would also change your HIT. The other objection is that MAC addresses can be easily changed, which breaks them from being a consistent and secure identifier. But HIP suffers from this as well.
3b) Security vs. Privacy - There was a way to solve this security issue introduced back in the 90's; it was called CPU ID. This was a hardware-based ID that was easy for all applications to read but very difficult to change or mask. The problem soon became that everyone screamed PRIVACY! It turns out people do not want to be uniquely identified with all of their activities online at all times. As such, manufacturers quickly included the option to voluntarily turn off CPUID, and that was so popular it started to come turned off by default. This, of course, made it a completely invalid identifier.
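To illustrate the session ID point above, here is a minimal sketch using curl against a hypothetical Tomcat application at example.com (the URLs and paths are assumptions, not from the original post). The session cookie identifies the session regardless of what IP address the client happens to be using at the moment:

# First request: the server issues a session cookie (e.g. JSESSIONID), which we save to a local cookie jar
curl -c cookies.txt https://example.com/app/login
# Later request, possibly from a different network and IP: present the same cookie and the session continues
curl -b cookies.txt https://example.com/app/account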

So, unless you are planning on replacing an entirely server-side, expiration-based session ID, I do not see how this will work. However, at the same time, I cannot see why you would want to replace session IDs. This leaves me to ask - why is this a real project?

This is an actual question I am hoping someone can answer, so please feel free to comment on this via the blog comments or, like most of you do, via Twitter or Facebook.

Friday, September 4, 2009

Open Source PDF Printing in Windows

Configuring broad application PDF and TIFF printing using Ghostscript and RedMon port redirection.

If you have ever wanted to produce PDFs without having to pay for Adobe's PDF writer, please read on. If you have no need to create PDFs for free, then this is not the article for you.

Setting up Ghostscript
Download the latest version of Ghostscript.
  • Download from http://www.cs.wisc.edu/~ghost/
  • (AFPL is the Aladdin Free Public License version, and GPL is the OK-for-commercial-use version; otherwise they are nearly identical.)
Install into C:\gs
  • Run the installer and accept all the defaults except the location. Change that to C:\gs. At this time there are still some issues with directory names that include spaces.

Setting up RedMon port redirection
Download and install the latest version of RedMon.

Setting up printer

Since Ghostscript is a command line tool that uses its own environment, each printer will either be used at the Ghostscript command prompt or via a pre-written script. Obviously a script is better. We then must configure the port redirector to use this script.


Printer Script

A script is just a text file that contains the required commands and settings. We will be creating one called “tif.rsp” and we will be placing it in the C:\gstools directory.

  1. Create a text file
  2. Cut and paste the following lines.

-Ic:\gs\gs8.54\lib;c:\gs\fonts
-sDEVICE=tiffg4
-sOutputFile=c:\print.tif
-dNOPAUSE
-dSAFER
-dBATCH
-r600
-sPAPERSIZE=a4

  3. Save this as tif.rsp
  4. Copy tif.rsp to C:\gstools
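If you want to test the script outside of the port redirector, you can feed it straight to Ghostscript at a command prompt. This is only a sketch: example.ps stands in for whatever PostScript file you have handy, and the gs8.xx path must match your install.

c:\gs\gs8.xx\bin\gswin32c.exe @c:\gstools\tif.rsp example.ps

The @ syntax tells Ghostscript to read its options from the .rsp file, so the output should land at c:\print.tif as set in the script.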

Script items explained

If you want to change some of the variables in order to make more printers that do different things, here is a list of each and what they do.

  • -Ic:\gs\gs8.64\lib;c:\gs\fonts: Tells the system where to find Ghostscript and its fonts. It is highly likely you will need to change the version number to match your particular setup.
  • -sDEVICE=Device Name: Instructs GS to use a particular virtual encoder (output device). Some acceptable encoders are

  • jpeg (several additional options are required to control quality with JPEG)
  • pdfwrite (sounds like this prints PDFs to me)
  • tiffg4 (black and white, CCITT Group 4 compression)
  • tiffgray (eight bit grey scale)
  • tiff12nc (12 bit RGB color)
  • tiff24nc (24 bit RGB color)
  • tiff32nc (32 bit CMYK color)
  • tiffsep (creates one 32 bit CMYK file plus an 8 bit grey file for each separation)

There are also devices for fax, BMP, PCX, Photoshop PSD, Adobe PDF, PS, EPS, PXL, and many more.
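For example, a pdf.rsp variant of the script above might look like this. This is a sketch only; the gs8.xx path and the c:\print.pdf output location are assumptions you would adjust to your own install.

-Ic:\gs\gs8.xx\lib;c:\gs\fonts
-sDEVICE=pdfwrite
-sOutputFile=c:\print.pdf
-dNOPAUSE
-dSAFER
-dBATCH
-sPAPERSIZE=a4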


Create a virtual printer
    1. Go to Printers and Faxes (Off of the start menu or in the control panel)
    2. Add a printer
    3. Choose Local
    4. Set the port to RPT1 (created by the RedMon installer.)
      • If this port does not exist, add it.
      • Select the Create a New Port option and choose Redirected Port.
      • Name it RPT1: -or- RPT2:, or RPT3:, etc.
    5. Choose the Generic manufacturer and MS Publisher Imagesetter for the type.
    6. Open the printer just created
    7. Go to properties
    8. Go to the Ports Tab
    9. Set the configuration options for the port to:
    10. Redirect this port to the program: C:\gs\gs8.xx\bin\gswin32c.exe (obviously you will need to change these to your particular location for each)
    11. Arguments for this program are: @c:\gstools\tif.rsp -

(See the window snapshot below for an example.) Pay attention that there is a hyphen at the end of the Arguments for this program field.



Prompt for filename

When output is set to Prompt for filename, the redirection program should write its output to a file. The name of the file, which is obtained from a Save As dialog, can be passed to the program by inserting %1 in the program arguments. If you wish to place a literal %1 in the program arguments and do not want it substituted with the filename, you must instead use %%1.

For example, the program arguments might include:

 -sOutputFile="%1"

This method is recommended for use with Ghostscript, and is commonly used with a PostScript printer driver and the Ghostscript pdfwrite device to create a PDF writer.
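As a sketch, assuming the same C:\gs layout used above (adjust the version path to your install), the “Arguments for this program” field for a PDF writer might look something like this, all on one line, with Output set to Prompt for filename:

-Ic:\gs\gs8.xx\lib;c:\gs\fonts -sDEVICE=pdfwrite -sOutputFile="%1" -dNOPAUSE -dSAFER -dBATCH -

The trailing hyphen still tells Ghostscript to read the print job from standard input, and %1 is replaced with whatever filename you pick in the Save As dialog.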

Do not share a printer which uses RedMon with Prompt for filename. RedMon will not allow this because the Save as dialog box would appear on the server computer, not the client which submitted the print job.

Fine Tuning

After you install the printer, you may want to fine-tune its properties. To do so, open the Printers settings folder again, right click on the printer, and select `Properties`.

  1. Under `Device Settings`, set very low values (e.g. 5) to the following two parameters:
    1. Minimum font size to download as outline
    2. Maximum font size to download as bitmap
  2. Next, go to the `Advanced` tab, and from there, select `Printing Defaults…`. In the window that opens, select `Document options`, then `Postscript options`. Set the following two options:
    1. PostScript Output Option: Optimize for Portability
    2. TrueType Font Download Option: Outline


For large format color printing I recommend using the Xerox Document Center CS50 PS driver. Just make sure you tell the driver that the paper sizes you want to use are loaded. To do this, go back to the properties window, click on the Device Settings tab, and under the Tray Assignment menu select a few paper sizes. Personally I believe that, in addition to Letter, Legal and Tabloid are musts.


Appendix:
Web pages to know for this application set:

http://www.cs.biu.ac.il/~herzbea/makeP.htm#_Creating_PDF_using_Windows_Office_A

http://www.cs.wisc.edu/~ghost/doc/cvs/Devices.htm

http://server3.nethost.co.il/set_tif.html

http://www.noliturbare.com/

http://www.cs.wisc.edu/~ghost/doc/cvs/Use.htm#Known_paper_sizes

Sunday, May 17, 2009

Installing VMServer 2 on Ubuntu 9.04

I just installed VMServer 2 on Ubuntu 9.04, and like most Linux VM installs, there is a trick. Below you will find the cheat sheet version in the hope that it saves some of you some time when you go to install VMServer 2 on your Ubuntu machine. I make the assumption that you have installed things before, and maybe used VMware in the past, but if that is not the case and you would like a more detailed guide just let me know.

0) Prep the system
Find out what kernel you're running with uname -r
Install the headers with sudo apt-get install linux-headers-`uname -r` build-essential xinetd (the backticks substitute your kernel release into the package name automatically).

1) Get VMWare server. http://www.vmware.com/products/server/

2) Sign up or login & don't forget to make note of your serial number

3) Download the Linux tar file, not the rpm

4) Expand the tar file
tar xvfz VMware-server-*.tar.gz

5) cd into the folder

6) Fix the installer (Otherwise you will have trouble compiling vsock)
Either - install the patch
get the patch (created by delgurth @ http://blog.delgurth.com) from http://ubuntuforums.org/attachment.php?attachmentid=94477&d=1227872015
  • If you have not done the install (and assuming you're still inside the expanded VMServer install folder)
patch bin/vmware-config.pl /path/to/vmware-config.pl.patch
  • If you have already installed
sudo patch /usr/bin/vmware-config.pl /path/to/vmware-config.pl.patch

Or - Fix it manually (Finish install first)
tar xvf /usr/lib/vmware/modules/source/vsock.tar
cd vsock-only
sed -i 's/^\#include //' autoconf/*.c
include
make
sudo cp vsock.o /lib/modules/$(uname -r)/misc
sudo ln -s vsock.o /lib/modules/$(uname -r)/misc/vsock.ko
sudo depmod -a
sudo /etc/init.d/vmware restart
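If you want a quick sanity check that the module actually built and loaded, something like the following should do it:

# vsock should show up once the VMware services have restarted
lsmod | grep vsock
modinfo vsock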

7) Install / reconfigure
  • If you have not installed
sudo ./vmware-install.pl
  • If you have already installed but skipped the vsock install,
sudo /usr/bin/vmware-config.pl

8) Preparing a user
During the configuration, when prompted for the admin user, you can enter your user name or, if you accepted the default, you can use root
  • If you're using root you will need to set a root password, since by default root on Ubuntu is locked without a password (make it very strong)
sudo passwd root

9) Fixing permissions
Log in by opening a web browser and going to https://server:8333
Enter root or, if you set the admin, the admin's user name
Enter your password
Go to Permissions
Click on New Permission
Enter your user name
Choose the Admin role
Remove root, if root was set as the default admin user

Thursday, April 30, 2009

RSync and SSH Keys - A Presentation on backups

Recently I did a presentation on RSync and RSnapshot focusing on using them for backups. You can download the presentation from http://www.theonealandassociates.com/files/rsyncPortable.zip or, if you do not have Open Office yet (a free and powerful office suite comparable to and compatible with MS Office and Word Perfect) or another application that can handle the ultra efficient open document formats, you can get the PowerPoint version (at twice the total download size) at http://www.theonealandassociates.com/files/rsyncPortable1_with_ppt.zip
The presentation is narrated and is easy to follow, but for those of you looking for the Cliff's Notes version:

1. You do not need to set up an rsync server to use rsync. The server function handles file browsing and other functions, and setting it up is not required for transfers.
2. If you're going to automate your backups from one computer to another, you should implement some basic security. Moving your files over SSH, for example, is easy, but you need to set up a pair of SSH keys so that you don't have to enter a password to ssh from one server to another. Simply perform the following commands from the production server (now known as Server A) and just use remote execution to perform your work on the backup server (henceforth known as Server B).
2.i) backupuser@ServerA:~> ssh-keygen -t rsa
    a) Do not enter a passphrase (just hit enter)

2.ii) backupuser@ServerA:~> ssh backupuser@ServerB mkdir -p .ssh
    a) This creates an ssh directory for the backup user on server B

2.iii) backupuser@ServerA:~> cat .ssh/id_rsa.pub | ssh b@B 'cat >> .ssh/authorized_keys'
    a) This moves the contents of your public key to the remote servers authorized keys file
    b) You can just as easily open a second terminal window, log into Server B, vi the .ssh/authorized_keys file, and cut and paste into it from a vi window showing your .ssh/id_rsa.pub file on Server A
2.iv) backupuser@ServerA:~> ssh b@B chmod 0700 .ssh/
2.v) backupuser@ServerA:~> ssh b@B chmod 0600 .ssh/authorized_keys
    a) If you don't restrict the permissions, SSH will ignore the file by default and the whole thing will fail to function.

3. Set up an automated script containing a command like
3.i) rsync -a -r -v -t -z --stats --progress -e ssh /dir/for/source/files/ backupuser@ServerB.MyDomain.com:/dir/for/destination/files/
3.ii) There are more detailed instructions for Windows and Linux inside the presentation; a minimal automation sketch follows below.
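For example, a bare-bones cron setup might look like this (the script path, log file, and schedule are placeholders; swap in the rsync command and hosts from step 3.i):

#!/bin/bash
# /usr/local/bin/nightly-backup.sh - push local files to the backup server over SSH
rsync -a -v -t -z --stats -e ssh /dir/for/source/files/ backupuser@ServerB.MyDomain.com:/dir/for/destination/files/ >> /var/log/nightly-backup.log 2>&1

# crontab entry (crontab -e as the backup user): run every night at 2:30am
30 2 * * * /usr/local/bin/nightly-backup.sh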

You can verify the whole thing is working, and troubleshoot problems, by ssh'ing with the verbose option -vvv and looking at the recipient's logs in /var/log/secure
backupuser@ServerA:~> ssh -vvv b@B
backupuser@ServerB:~> sudo tail -f /var/log/secure



You're basically done, though the presentation fills a nice half hour time slot and provides more detail; as such I highly recommend downloading it from the links at the top of the post ;)

Wednesday, April 29, 2009

Please show your work and label your units

A small rant on units. I have run into meaningless use of units so often lately that I thought I would post a small rant here. “Please show your work and label your units” is something every instructor I have had since grade school has reiterated, so why do we have a problem with people failing to do just that? Units are important because they give meaning to numerical measurements. To do so, the units used in quantitative measurements should be appropriate for what is measured, consistent with comparable measurements, and above all well labeled.

To begin with, I would like to address the disturbingly prevalent absence of units. For example, I was recently looking at a simple matrix chart in a magazine comparing baby monitors. One column was labeled distance and had values such as 700, 300, 500, but nowhere in the article or the chart did it tell you 500 what; inches, meters, feet, miles, what?

The second frustrating error is the inconsistent use of units. To use the prior example, I would have only slightly less disdain if the article indicated 700 in., 300 m, 500 ft. Yes, one now has the information required to understand the measurement and to do the comparison, with conversion, but the inconsistent units do not let us use the chart to make quick comparisons.

The third issue I have is the use of units that either don't make sense for the measurement or don't convey accurate information. Simply said, units need to make sense for what they measure. For example, the axiom that one should always leave two car lengths between your car and the car ahead of you suffers from this bad choice of units. The unit, car lengths, is first non-standard (ranging from about 2.7 m for a Smart to 5.7 m for a Lincoln Town Car) and second, meaningless. I say meaningless because stopping distance is a function of the speed the vehicle is traveling. If I am traveling at 10 kph I can easily stop in less than 3 meters. However, if I am traveling at 150 kph I would not be able to register the need to stop and have my foot hit the brake, let alone stop, inside 5 times that distance. Let us look at it this way: we want a fixed unit to govern distance with regard to a variable distance/time function; thus we need a constant time unit. Simply said, if you stay two seconds behind the car ahead of you, the distance grows at higher speeds (roughly 5.6 meters at 10 kph, but about 83 meters at 150 kph), thus accommodating for the greater stopping distance required.
Now I understand that when what you are looking to measure quantitatively is intrinsically tied to a qualitative value it gets harder. For example, a recruiter asked me how many years of experience I have writing disaster recovery plans. My first thought was, who does nothing but write disaster recovery plans for their career; that must be a rather specialized market. I was tempted to say 10 years, since my first disaster recovery project was in 1998, when I took over a flawed DR project, and the last one I wrote was in 2008, when I wrote one as part of developing standard operational documentation. However, the first plan was for an environment I understood, with a budget I understood, and was limited in scope (servers and licenses for the IT department which served a greater college at the local university), and the last one was integrally tied to the development of a brand new .com operational procedures manual which I was helping start from scratch. And while I had written several in between (~5 in total), they had all been part of normal operations for the company and not separately measured. In addition, many of those had the assistance of an established continuity plan to borrow from. I would argue hours, not years, would have been a more accurate reflection of my experience. For example, if you have two candidates, one of whom has 10 years of experience and one who has 2 years, which one should you hire for a managerial position? What if the one with 10 years managed one worker who clocked only 8 hours a week for 10 years, while the second one managed a department of 50 full time employees and a dozen contractors for two years? One has more “years” but at the same time far fewer hours of managerial experience. However, this is nearly an impossible paradigm shift for most people. In the above example, if I had replied to the recruiter's experience query with “I have over 2,500 hours of disaster recovery experience” it is likely I would have received a blank stare.
In conclusion, please use appropriate, consistent, and well labeled units in all quantitative measurements. Doing so will make your work measurably more useful.

How to hack your head

This post is taken from a conversation on the Phoenix Linux Users Group list, and given that the information is both obvious and important, but never followed, I felt it was important to present it here for all of you in the hope it will help change the way you work.

Hack your Head
or
How to maintain optimal thought processing abilities.

When working on systems, we often get into the zone, don't eat, don't stretch and don't maintain optimal glucose for thought processes. We therefore get REALLY STUPID and don't notice it. This can happen late at night, or at noon, but when it happens, we must acknowledge our own limitations.

If you have attacked a problem for 30 minutes using caution, and addressing resources, and cannot fix it, get up and walk around, go outside and look at the sun. Be assured that some part of your higher functioning is still working on it in a creative way. Poking a problem for 30 hours with the same information you were unable to think with initially is not productive, it's stupid. You might need to attempt to explain the problem to another (like we do on the PLUG list) in order to get clarity, for instance. We learn to package and organize abstractions to develop if/then/therefore logic via PLUG list discussions, and/or by questioning our initial assumptions. There are a great many people who never learned systems analysis using documentation; Linux veterans did not have the luxury of Google (the RHCE exam does not allow you to use anything but the system itself). Others essentially do not think in language, but use higher functioning to solve problems. All of us develop a higher functional way to solve problems, but sometimes that process fails and we must therefore use language or logical dissection to find our way out. We all LOVE doing this, it's incredibly addicting, but it also has some mental health risks that must be mitigated with lifestyle changes.

Be sure to follow these daily cautionary steps to remain healthy:
1) Eat a good mix of protein, polyunsaturated fats and carbs on a regular schedule.
2) Sports drinks are just going to make you crash badly; however, adding B and C vitamins with a good breakfast, lunch and dinner will go a long way toward allowing you to think effectively. Caffeine does assist with some types of tunnel attention, but can also cause health issues. The best and brightest don't drink coffee all day - that's for marketing people.
3) When you are too tired or ill to work, you must acknowledge it. Too many systems administrators just work and work and work, over and beyond what is healthy and make grave mistakes when tired.
4) Build a healthy life away from computing to provide for emotional balance. We get so far into the abstract analytical virtual realm and develop functional stunting, especially under the pressures of 24X7 Uptime.
5) Talk to others on a deep personal level; if you have no one to talk to, call your own voice mail or record yourself. Einstein and other thinkers of the last century all kept uber personal journals. The mere act of talking about things or examining issues through grief and anger to laughter will assist the development of free flowing healthy emotional states and that all important core of individuality and muscled critical thought.
6) Do various balanced emotional and physical things that restore your individualism, such as walking/biking, laughing, playing games, hugging, chasing the opposite sex, dreaming or creating, and listening to music. Allowing children to swing and listen to music is known to stimulate intelligence.
The sheer number of IT professionals and college students taking SSRIs (selective serotonin reuptake inhibitors) is astounding, and certainly not necessary. Exercise has been shown to be more statistically effective over time than SSRIs. Tobacco has long been used as an anti-anxiety medication; however, smoking does kill. Anxiety from balancing unrealistic, unevolved demands from people who cannot understand you when you talk is best mitigated with laughter and zen detachment.

I am sure you all can relate to Number 10 on the top ways to Hack your Brain http://brainz.org/brain-hacks/
O'Reilly has some good books that are an amusing way to wait for your greater intelligence to find the best solution to another problem.
1. Mind Performance Hacks: Tips & Tools for Overclocking Your Brain (Hacks) by Ron Hale-Evans
2. Google Hacks: 100 Industrial-Strength Tips & Tools by Tara Calishain
3. On Intelligence by Jeff Hawkins
4. Mind Wide Open: Your Brain and the Neuroscience of Everyday Life by Steven Johnson
5. Getting Things Done: The Art of Stress-Free Productivity by David Allen
6. Firefox Hacks: Tips & Tools for Next-Generation Web Browsing (Hacks) by Nigel McFarlane
7. Knoppix Hacks: 100 Industrial-Strength Tips and Tools by Kyle Rankin
8. How the Mind Works by Steven Pinker
9. This Is Your Brain on Music: The Science of a Human Obsession by Daniel J. Levitin

Derived from the PLUG HackFest Series,
Written by Lisa Kachold,
Edited by Bryan O'Neal

Saturday, February 14, 2009

Simple Thoughts on Disaster Recovery

Introduction to a disaster recovery plan
In its simplest form a disaster recovery plan is just what it sounds like; you plan to recover from a disaster. This may sound vague, but much like the advice “you should purchase insurance” the details differ for every person, group, department, and company. The insurance you need is not the same as the insurance your 19 year old college bound daughter needs, and it is certainly not the same insurance Xerox Corp of America needs. However, I will try to give some general advice about disaster recovery plans for small businesses, and then some simple how-to sections that will get you started.
This posting will cover:
  1. The justification for a disaster recovery plan, including how to calculate your budget for implementing a disaster recovery plan.
  2. How to determine what you need and build a plan.
  3. And a simple example of how I do backups.

Justification:
Q) Do I need to worry about disaster recovery? If I have an insurance policy would that not cover everything?

A) Yes, you should still have an active disaster recovery plan. No, your insurance policy does not cover the value of your data loss, and it likely does not cover the income you will lose from being unable to conduct business. Your theft/fire/flood insurance is great for replacing your hard assets, such as your building. It may even compensate you for some of your soft assets. Yet it cannot replace soft assets such as your operational data, and the compensation will never equal the damage from the loss.

Q) How much will it cost me?
A) Less than you think, and far less than it's worth. However, the end cost depends on your needs. A range for most small businesses that I feel confident about starts at a variable cost of $250 per year on the low end and goes up to a total allocated cost of $25K a month on the high end. I know that is a big range, so I'll try to break it down a bit over the course of our discussion.

Q) How will I know how much my company should budget for this?
A) That depends on what you need to do. One obvious answer is: how much can you afford? If you think you can't afford anything, then it is time to see what other opportunities you need to forgo in order to take care of this, because what you cannot afford to do is not implement a recovery plan. First, perform some basic risk assessment. You are really looking for two key components: how likely is it that something will happen, and what will it cost when that something occurs?

Calculating the risk and reward
(Skip if you are not the cost accounting type).

Risk Beta
In its simplest form, risk beta is the likelihood of a specific event occurring in relation to the likelihood of the group of events happening. This helps us decide if a particular investment is good or bad when compared to other investments. What is the likelihood you will have a catastrophic data failure in some period of time? Let us use a five year period, assume you have one server in your office, you have no paper files, and you have no backups. Now let's add up the odds of each way you can lose the server - fire, flood, theft, hardware failure, etc. - happening in the next five years. We will call this number X and it will range from 0 (no chance any of it will happen) to 1 (100% chance something will happen). What X actually is depends on a lot of variables and some statistical shortcuts. In addition, you may choose to weigh certain effects differently based on impact cost. (Side Note 1: Consult with a professional?)
If you really simplify it, most people can come up with a rough number from general experience. Have you or any of your neighbors been broken into? Have there been any fires or floods? How often does this happen? If two of your 4 neighbors have had a break-in during the last five years, and their computer equipment was damaged or stolen, then you may assume you have a 40% chance (2 businesses out of 5 total in your area) of being broken into in a five year period. Each chance of a failure is cumulative and should be added together; at the same time, each option for redundancy reduces the chance. If you have a 40% chance of being broken into and a 100% chance of losing half of your computer equipment during the break-in (a 50% chance of losing any one computer), then the chance you will lose your server is 40% x ½. If you also have a 10% chance of fire then you would have a (40% x ½) + 10%, or 30%, chance of data loss over five years, or a 0.5% chance you will lose all of your data in any given month. Similarly, if you have two locations each having a 0.5% chance of total data loss in any given month, each site backs up the other site, and it takes at least one month to get a site back up with the data from the other site, then you would have a 0.5% x 0.5%, or 0.0025%, chance of total data loss. As you can see the backup reduces the likelihood of total data loss considerably. For our purposes we will make up a number like .0234, or a 2.34% chance, for total data loss.
Please note that a comprehensive qualitative risk analysis, which separately handles each risk, vulnerability, and control, is an amazing asset and should be required for every medium business and many small businesses. In addition, there is more to worry about than just one server and theft.
Here is a sample chart of how data loss can occur.


Risk value:
What is the data worth? The value of the risk is the total potential loss times the risk beta, i.e. the likelihood the loss will happen. Again, let us assume you have no data that carries a legal liability for loss, such as client social security numbers or medical records. Odds are high that if you had to put your staff through something like HIPAA training you have professional IT people on hand who understand these issues. So those situations aside, let's look at just the effects on your average small company. Typically we like to say that 45% of companies that suffer a catastrophic data loss never reopen, and 90% are out of business within two years. Again I am greatly oversimplifying this, but let us say that over our five year period you could expect to earn ten million in revenue, or two million a year. Now let us say you go out of business exactly two years after losing your data. I realize you would not have earned to your potential during those two years, but we are keeping things simple. In essence you lost the potential revenue for the remaining three years, or six million. You have a 90% chance of this happening if you lose your data and a 2.34% chance of the loss occurring. So really you are looking at a normalized cost of ~$126K over five years, or ~$25K per year.
Return on Investment (Simplified)
Now, if you have a current risk beta of 2.34% for a risk value of $25K/year and you can reduce your risk of loss (beta) down to 0.234%, then your new risk value is $2.5K per year, and thus you have a return of $22.5K per year on the investment. Now let us assume you can do this (reduce the risk) for only $2.25K per year. This would give you a 1,000% return on your investment. What you actually need to spend, what kind of reduction in risk you can get, and what your opportunity costs are, are all things to consider when making your budget and evaluating the project. Now, many people will argue that cost reduction and risk abatement are not the same as revenue return, but that is a discussion for another time. We are, after all, trying to keep things simple.
(Side Note 2: Does every business require this?)

How to plan
If you’re lucky enough to be working in a company that has a continuity plan, take a look at it and pull out the metrics by which you need to measure success. The main thing to consider is how long you can be offline, as a whole and as each individual piece of your company. Odds are that your creative team can be offline longer than your accounting department, which could probably stand a longer outage than your sales force. Then again, perhaps not. It depends on what drives your business. No matter how you stack it, every project gets broken down into 6 stages: Investigation, Evaluation, Proposal, Implementation, Documentation, and Testing. Please notice I place documentation towards the end. This is because what happens during implementation may be different from what was originally planned. In addition, documents need to be constantly updated to stay useful. Similarly, testing needs to be regularly performed. I cannot stress this enough: no matter how good the plan, no matter how foolproof you feel your implementation is, you will find issues during testing. Addressing these issues makes all the difference when a real disaster strikes.
A disaster recovery plan, in essence, determines how you will rebuild your business. For any department that includes each worker's PC, all the servers and services, phone systems, sales systems, etc. It is not enough to think about how you will back up your data; instead you must plan for how you are going to get everyone working again. At this time I should also note that if your work is primarily paper driven, you should consider initiating a paperless workplace, or at the very least digital archiving. The reason is simple: it is much easier to preserve, safeguard, and restore digital documents than it is paper files.
The big divide, as I see it, is whether you need data only, a cold site, a warm site, or a full hot site. Some essential functions may need multiple points of redundant fault tolerance with a full hot site waiting for them. Others may need little more than their data. The example is this: if the sole server controlling your company goes down and you have a tape with all the business data, up to the moment it went down, how long will it take you to get the business back up and running normally? If you have just the tape and need to order a new server, it could take days, or even weeks, to get a box; then you need to install and configure the server and all of your applications before you can restore the data. On the other hand, if you had a cold box already, you would just need to do the configuration and restore. If you had something hot, with everything already installed and all the data on it, you could point people to it and start running. As you can see, I am defining data only (sub-cold), cold, warm, and hot as the following:
Data only: The data only plan is by far the cheapest, since very little capital expenditure is required to implement this strategy. However it also has the longest delay before you can start working again. In the old days, a data only plan would cause a work delay of a minimum of one month. However, with the advent of commodity server farms, like Go Daddy, and the increasing ability to telecommute, a good shop can have some services up the same day and the entire IT/IS infrastructure can usually be rebuilt in less than a month. The fact remains that your IT/IS shop will need to procure a new site and set up, from scratch, everything you need to get people working.
Cold site: A cold site is a place you can go and start anew at a moment's notice. However, you will have to reproduce everything your people need to work. For example, if you have an agreement to use an associate's unused office space, which has desks, power, and basic connectivity, then you have a cold site. You will still need to provide all the equipment required for your people to do their jobs.
Warm site: With a warm site nearly everything is in place; you just need to make some adjustments, upload the latest data, and away you go. The best way that I have seen this plan work is when a company has some extra space, be it the factory floor, a warehouse, or office space arranged through corporate agreements. With an in-place phone system, a spare server or two in waiting, and a stack of spare desktop PCs waiting for users, you can often have a warm site configured for use in a matter of hours. This sort of setup is not regularly maintained, uses excess resources to keep capital costs down, and basically provides a head start on rebuilding the enterprise. If you need a warm site but do not have the spare resources for it, then I would recommend contracting with a company that specializes in providing warm sites for your use. Nearly every metropolitan city has one and the costs vary, but if you need one you will usually find the costs can be reasonable. However, if you have strong business relations with companies that, for one reason or another, have unused office space, I would recommend setting up an arrangement to use their space if required. Often the involved parties can come to an extremely reasonable and mutually beneficial agreement.
Hot site: With a hot site, you not only have all of the physical plant items in place, but you regularly maintain them so that at a moment's notice you can switch your entire enterprise over. Obviously this is the most expensive option and is often implemented for only the most critical divisions of the company.
Concerns for manufacturers: If the company in question has critical divisions that produce goods, not services, then you will also need to consider what would happen if your production floor was suddenly unusable. If you have ample storage, have a stockpile of completed work, and work in a primarily push-based business, you may be fine until you can replace the lost PPE. But if you are working with a JIT inventory system in a pull environment, you may need to consider a way to reproduce your production center in a matter of hours, not days. I only state this because much of what I am discussing here concerns disaster recovery for your intellectual workers - sales, marketing, accounting, customer service, etc. - and does not meet the needs of manufacturing environments.

Categorizing your needs:
Of course this kind of planning extends to more than just your data. For example, if you need to have your sales force on the phones and at their desks working on computers 24 hours a day, 7 days a week, and you cannot be down for more than an hour, then you need phones, desks, computers, the server, your data, and your people up and running in a new space within an hour. Fortunately, if you really do need this kind of insurance, there are a number of companies that sell this sort of service by overbooking space for just such an occasion. A simplistic view of this situation is like hotelling your workers someplace. Some company has phones, internet, desks, and computers for 100 people. They sell you an insurance policy, depending on terms and conditions, that allows you to place a given number of people in that space at a moment's notice and leave workers there for a period of time. They sell a similar policy to 1,000 other companies and gamble that no more than five will need it at any given time. But most of this is covered in your continuity plan, should you be fortunate enough to have one.
Minor loss contingency – be prepared
I should also note that your disaster recovery plan should take into account other forms of loss beyond just total catastrophic failure caused by such things as fire. For example, you need to think of things like power outages. How often does the power go out in your area? How long is it out? How much does that outage affect your workers? What level of emergency power is required to mitigate this loss? And is it worth the investment? How often do people overwrite their documents, with or without backups? How can you ensure backups happen, and how can you make recovery more efficient? There are a wide variety of scenarios you should consider. A comprehensive IT/IS plan will spend more time on day to day operation and accident mitigation than it will on catastrophic failure. However, it is potential catastrophic failure that usually prompts the initiation of the plan.
The Plan
Now that you know what you need to get back online and how quickly it needs to be done, you should have an idea of how to prepare for such an event and how to execute a recovery plan. There is no magic formula; however, there are some decent templates out there that are industry based. In addition, your insurance company may have a specific template or general requirements for you to follow, so check with them first. Personally I prefer a document that can be used for everyday issues, not just total loss situations. These operational style manuals are often more time consuming to produce and maintain, but they are well worth it. I like to categorize the document by resource, functional area, and rank. For example, you should know what to do if you lose a small group of PCs, or a copier, or a key component in your assembly line; i.e. the resource. Furthermore, you should know what to do differently if the loss occurs in the accounts receivable department versus the marketing department. And finally you should know if there are any differences if the resource is primarily used by a worker versus a director. If you are a larger company you may want to consider looking into some of the newly emerging standards, such as BS 25999 and NFPA 1600. In any case, now is a good time to look for a template and start filling in the details.

A simple example:
Let us assume that I have three physical boxes sitting in my server room. Collectively they run three non-public web sites. All three of them are file servers; maybe one is a fax server, one a print server, one an email server, and one a domain controller, and four applications and a unifying database are served from them. What is our plan?
First off, what were our causes of loss? 40% of our loss came from hardware failure. Typically this means failure of the component storing the data: the hard drive. How can you protect against this? Well, how often do you need to be backed up and how much time do you have to do it? If the answer is as often and as quickly as possible, I would recommend starting with a RAID. RAID stands for redundant array of independent disks; there are a wide variety of implementations, and if you want a discussion on them let me know, but I am going to recommend RAID 5 as a general purpose implementation. In RAID 5 data is stored across many disks instead of one, and a portion of each disk is used to store parity information about the data on the other disks. If one disk fails you can take it out, put another one in, and rebuild the information. You can even configure a hot spare that just sits there idle waiting for a failure; when one occurs, the data from the lost disk is rebuilt on the spare. All you need to do is order another spare and toss away the bad one (read: dispose of it according to company policy to safeguard against data leakage). The odds of more than one disk failing (n+1 if you have n hot spares) at the same time are exponentially lower than the odds of any single disk failing. The more disks you have in the array and the more spares you have waiting for failure, the less likely you will have an unrecoverable failure.
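If you go the software RAID route on Linux, a minimal sketch with mdadm looks like the following (the device names are examples only, and remember that a RAID protects against disk failure, not against deletion or disasters):

# Create a RAID 5 array from three disks with one hot spare standing by
sudo mdadm --create /dev/md0 --level=5 --raid-devices=3 --spare-devices=1 /dev/sdb1 /dev/sdc1 /dev/sdd1 /dev/sde1
# Watch the array status and any rebuild onto the spare
cat /proc/mdstat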
The next largest piece of the pie was human error. Typically this does not result in total loss of all the data, but it can. Most often it occurs when one person overwrites changes made by another person, or accidentally deletes a file or folder, or overwrites work they did earlier in the day, etc. There are two methods I like to use against it: the snapshot and the repository.
The snapshot simply takes a picture (a copy) of the data at certain times of the day. Thus if I spend all night working on something and save the last piece at 4am, a snapshot is taken at 6am, and someone deletes the file at 8am, then at 10am I can recover the version I had on disk the last time the snapshot was taken; in this case at 6am. Snapshots are really just a few very convenient backups taken at certain points in time. Often they will reside on the same machine as the original data, and a half dozen copies are saved before overwriting. The space required depends on how they are implemented. Those systems that use a pre-cache setup with a diff mechanism only require space equivalent to N*X + C, where N represents the average size of the changes made during any given snapshot interval, X represents the number of snapshots kept, and C is some predefined overhead. In the middle are those that use a post-cache and diff system; the space used for this is estimated as A + (N*X) + C, where A now represents the original size of all the data being monitored. On the high end you copy all the original files plus a full copy of each changed file, resulting in A + (K*X), where K represents the average full size of the files changed during a snapshot window. (For example, if you monitor 100 GB of data, change about 1 GB per interval, and keep 14 snapshots, a diff-based scheme needs on the order of 14 GB plus overhead, while keeping full copies of changed files can need far more.) Please keep space requirements in mind when selecting a snapshot method.
Windows has its own implementation (Shadow Copy) for its servers, as do Mac and Solaris; however, you can use RSnapshot with just about any system out there. Personally, for an 8-5 business, I like to take snapshots twice a day, once at 7:00am and once at 12:00pm, as most of these human errors occur first thing in the morning and just after lunch. Taking the snapshots shortly before these problematic times provides the best recovery scenario. In addition, I will typically dedicate 15% of the ready drive space to snapshots, which for many companies is about a week's worth at twice a day.
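To make that concrete, here is a rough sketch of the twice-a-day schedule using rsnapshot (the paths and the interval name are assumptions; note that rsnapshot.conf fields must be separated by tabs, and older releases use “interval” where newer ones use “retain”):

# Excerpt from /etc/rsnapshot.conf
snapshot_root	/srv/snapshots/
interval	twicedaily	14
backup	/home/	localhost/

# Crontab entries for the 7:00am and 12:00pm snapshots
0 7 * * * /usr/bin/rsnapshot twicedaily
0 12 * * * /usr/bin/rsnapshot twicedaily

Fourteen retained copies at two per day works out to the week of history mentioned above.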
I am sure you can see the limitation of this method. Inherently it is time based, not change based. I may start on something at 12:30pm, finish at 9pm, have the work destroyed at 4am, and never have it captured by these snapshots! In addition, if you have a dozen people working on the same set of documents you could accidentally overwrite each other's changes all the time and not notice for a week. This is where the repository comes in and we get into the world of version control.
Let me give an example of two lawyers and a paralegal working on a large case document. Typically not everyone will be writing to this document at the same time (if that is the case there are versions of version control for you too); instead one person may open it to review and update a section and then close it. Let us look at the case where lawyer A opens it to draft a new section. Lawyer B asks paralegal C to take a look at another section and correct the language, which she promptly does. Lawyer B then verifies the work was done. When lawyer A saves the document, the changes made by paralegal C are gone. What is worse is that no one knows those changes are gone. With version control you check documents in and out of the repository, and the repository keeps track of everyone who has a copy and all the changes. If we had version control enabled, when lawyer A tried to save the changes a conflict notification would have been raised, indicating the document had been changed since it was checked out. Depending on what kind of repository you have, it will even highlight those changes for lawyer A so that they can be accepted or rejected. Even if they are rejected, a copy of those changes is still saved in the repository and you can go back and forward to look at any version you desire. The downside is that if you use a repository designed for, say, text files, and you start dumping large CAD files in there, you are going to run out of space fairly quickly, with every change being kept as a separate file. But not to worry; there are versions specifically for CAD files that understand the file format and will keep only the changes. This is also true for products like Adobe Photoshop where file size can be a major issue.
A number of implementations exist from companies like Microsoft, Amazon, and Google, and which one you should use depends on what is being held in the repository. For text documents and other simple files I like SVN. It is easy to set up, and easy-to-use clients exist for almost every platform.
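As a minimal sketch of the SVN workflow (the repository path and file names here are made up):

# On the server: create the repository once
svnadmin create /srv/svn/docs
# On each workstation: check out a working copy, make changes, and commit
svn checkout svn+ssh://server/srv/svn/docs
cd docs
svn add case-document.odt
svn commit -m "Draft new section"
svn update   # pulls everyone else's changes and flags conflicts instead of silently overwriting them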
Backups - The other 21%
To take care of the other 21% of your loss candidates, I like good old-fashioned backups. When most people think of backups they think of tapes; I, however, do not advocate tapes. The main reasons are that high capacity tape drives are mildly expensive, tapes require manual attention, and tapes are fairly vulnerable to environmental conditions. The other issue I have is that tapes are cumbersome to back up to and restore from. Unless you have really big money to afford a halon-equipped, environmentally controlled vault with your servers serviced by robotic tape changers, you are instead likely to have a person changing the tapes once or twice a day by hand, with those tapes stored and treated improperly. What if they forget? What if they get a tape out of sequence when doing an incremental backup? What if the tapes get left next to the server (which they usually do) and are stolen, or burn, or are flooded out with the server? What if I have them taken off site in a hot car, or in a car that's stolen? It just is not a practical or safe way to keep your backups. For a lot less money you can do off-site backups to a machine you control using rsync and SSH. Or you can outsource to Google, Amazon, Microsoft, Apple, or even Iron Mountain (the last one is exceptional for long-term storage of your paper records too).
Personally, for smaller companies, I like one or two run-of-the-mill PCs with large hard drives sitting in a secure location (preferably a secure co-lo, but if you're small enough, perhaps even the CIO's house) with an encrypted file system (in case of theft), a VPN tunnel to the office network (no external exposure), and the box locked down to just SSH (Side Note 3: Limiting exposure). Then you put a simple script on every computer in the office to back up to a secure spot on one of the servers. I prefer using rsync for the backups and ensuring different logins for each system. The files on the servers are backed up over a secure connection, again using rsync, to the off-site servers. For the mail and databases, I like to use a combination of offline backups and log transfers (like HADR for DB2) and keep regular copies replicated amongst the office servers. However, I will follow this up by moving the offline backups to the off-site servers via SSH. The off-site servers will then take an RSnapshot after each backup so that you can maintain a history of changes. Don't worry about moving that much data off site; it happens in the wee hours of the morning, and for most setups you can do this with little to no real bandwidth issues. In my next few posts I will show you how to do exactly this, as well as explore some of the variations offered by Windows, Mac, and Linux operating systems.
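A bare-bones sketch of that off-site leg, assuming the off-site box is reachable over the VPN as offsite.example.com and has an rsnapshot interval named nightly (both names are made up):

# On the office server: push the offline dumps to the off-site box over SSH
rsync -az -e ssh /srv/backups/ backupuser@offsite.example.com:/srv/incoming/
# On the off-site box, after the transfer: roll a snapshot to keep a history of changes
/usr/bin/rsnapshot nightly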
If you need to perform fast “bare metal” restores, regularly imaging the system is the only practical method. However, this has traditionally been difficult to achieve, given the machine needs to be offline for a complete cold image to be taken. To get around this I often employ virtualization. Virtualization is incredibly affordable, with a low TCO and high returns. Snapshots of the entire operating system can be taken at regular intervals without adversely affecting operation. Systems can be easily duplicated, split, merged, and managed. Furthermore, the images can be made into incredibly compact files for easy backup to an off-site repository. The only drawback to virtualization is the use of specialized low-level equipment that the virtual server cannot emulate. This could be anything from a BrookTrout fax card to the serial controller designed to operate your tandem mass spec machines. If this is the case I would suggest you start by attempting to divorce what you need to have in the BMR (bare metal recovery) from the specialized hardware. Working with the vendor you may find that a client-server relationship can be set up, or you may find they offer virtual hardware controllers for popular platforms, like VMware, that then operate the real hardware, often at minimal cost. Either way, if you need to have cold images, virtualization is one of your best avenues to a solution. (Side Note 4: additional value in virtualization).

Redundancy: I would like to point out that live redundancy is your first line of defense and should be incorporated wherever possible. If I have 5 sales offices and I can never miss a call, I probably don't need to dramatically overstaff, keep spare offices for each, or contract with a call center. Instead I just need to make sure my phone system rolls any unanswered call going to one office over to all other locations. If someone calls the main line of my sales office in Mesa and no one answers the call (perhaps due to an emergency evacuation), my phone system rings every other sales office until someone picks up. In addition, all sales staff are appropriately trained to handle the customer. While it is much nicer for someone to get a representative they have a relationship with, any Tempe sales rep can still assist any Mesa customer, and when the Mesa reps are back they know their customer called and what the Tempe rep helped them with. The end result is that, for almost no cost, you get a dramatic reduction in missed calls and unhappy customers. Another example close to my heart is front-end web services. I would never have a single web server, application server, or database server. Similarly, I would not have cold spares waiting around idle. Instead I would have a variety of machines clustered together with fault tolerance as part of the core design. If one server, or even half the servers, fail, no one using the system should ever notice a glitch, even mid-transaction. Instead the automated systems should seamlessly fail over to the working systems when a system becomes unavailable due to a failure or just high load.

Testing
The single most important piece of advice I can give, no matter what your disaster recovery plan is, is to test it. You may back your data up every night, but when was the last time you made sure those backups worked? I cannot tell you the number of IT shops I have worked with that have never tested their backup plan. For that matter, I have found that in nearly two-thirds of the small businesses I have consulted for, the automated nightly backups fail every night. I like to test backups the same way I like to do inventory checks: once a day/week/month you randomly test a small subsystem. Often you will be forced to do this as people use the system, but controlled tests are better. Each of these periodic tests should not last very long and should be just enough to ensure that particular aspect of your plan works. Once a year, or maybe once every two years, you should revisit your plan, make changes if necessary, and then run through a complete disaster recovery simulation. This is the only way to know for sure you are adequately protected. A small spot-check script along these lines is sketched below.
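Here is a minimal sketch of that kind of spot check, assuming the hypothetical office backup layout from earlier in the post (a per-machine mirror under /srv/backups/<hostname> reachable over SSH); it grabs a few random local files and confirms the backed-up copies match:

#!/bin/sh
# Spot-check last night's backup: compare a handful of random files against
# their copies on the backup server. Host, login, and paths are placeholders.
BACKUP_USER="backup-$(hostname)"
BACKUP_HOST="backup01.office.lan"
DEST="/srv/backups/$(hostname)"

# Pick three random files from the areas the nightly job copies.
find /home /etc -type f 2>/dev/null | sort -R | head -n 3 |
while read -r f; do
    local_sum=$(md5sum "$f" | awk '{print $1}')
    remote_sum=$(ssh "${BACKUP_USER}@${BACKUP_HOST}" "md5sum '${DEST}${f}'" | awk '{print $1}')
    if [ "$local_sum" = "$remote_sum" ]; then
        echo "OK:       $f"
    else
        echo "MISMATCH: $f"   # either the file changed today or the backup is broken
    fi
done

A mismatch on a file that has not changed since last night is your cue to dig into the backup job before you actually need it.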


Notes:

Side Note 1:
If you are a medium-size business, or intend to use this number for more than the most general ballpark, I cannot overemphasize the need to hire a professional to help you; actuarial adjustment can be quite difficult to master. What I am presenting here is an overly simplistic example for illustration purposes only. However, depending on your insurance policy, you may be able to obtain a decent risk adjuster at a discounted rate. In addition, implementing a comprehensive disaster recovery plan may reduce your corporate insurance premiums. The same is true for a continuity plan and a reasonable set of standard operating procedures. Even a simple employee manual for each department can often win you a decent insurance discount.



Side Note 2:
This entire post was spurred by a discussion with a friend who said that some businesses would have no issue with data loss, such as the local pizza guy. I firmly disagreed. I believe that any company with records that correspond to real or potential money (client lists, prepaid expenses, unearned revenue, accounts receivable, accounts payable, short- or long-term liabilities, perhaps a warranty or service agreement with a client) will suffer greatly when all of that information is lost. But this is a discussion for another time.



Side Note 3:
Please note that limiting external exposure is essential, but VPNs are not always convenient. In addition, if you have a box where, in case of emergency, you cannot get to the physical console, you need to have at least SSH exposed. However, you can hide the SSH port, limit its access to specified remote networks, implement port knocking, or apply a myriad of other minor security precautions that help add peace of mind. A short sketch of the first two follows.
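For example, moving the daemon off port 22 and only accepting connections from a known network might look like this; the subnet, port number, and interface are assumptions for illustration, not a recipe:

#!/bin/sh
# Allow SSH only from a trusted remote network and quietly drop everything else.
# Pair this with "Port 2222" in /etc/ssh/sshd_config to actually move the daemon.
TRUSTED_NET="10.8.0.0/24"   # e.g. the office VPN block (placeholder)
SSH_PORT="2222"             # the relocated SSH port (placeholder)

iptables -A INPUT -p tcp --dport "$SSH_PORT" -s "$TRUSTED_NET" -j ACCEPT
iptables -A INPUT -p tcp --dport "$SSH_PORT" -j DROP

None of this replaces keys, fail2ban-style rate limiting, or keeping the daemon patched; it just shrinks the number of strangers who ever get to knock on the door.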


Side Note 4:
In addition to offering a great way to image running servers, virtualization is a phenomenal way to reduce your costs and increase departmental performance. This is obviously a topic for another post, but if you are wrestling with scalability, recovery, consistency, remote management, or security, I recommend looking into virtualization. It is not right for every situation, and using it as a panacea can cause more harm than good, but it can also be a wonderful tool in your IT arsenal.

Tuesday, February 10, 2009

How-To set up WebFolders on Windows XP

I recently set up a remote file server for a client that has no central office but needed a large amount of storage space. In addition to SFTP/SCP/rsync and other common access methods, I installed WebDAV. I wrote this simple how-to on setting up web folders about five years ago and have been recycling it ever since, so I thought I would post it here. I will not go into how to set up the WebDAV server at this time; it is very easy to do, and if someone wants a how-to guide I would be happy to put one together (a bare-bones sketch is below for the curious). I hope you enjoy one of the simplest manuals I ever wrote :)
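For the curious, here is the smallest possible sketch of the server side. This assumes Apache 2 on a Debian-style box and is purely illustrative; the directory, user name, and /WebDAVRoot alias are made up to match the example URL in the steps below, and a real guide would spend more time on HTTPS and locking:

# Bare-bones Apache 2 WebDAV share on Debian/Ubuntu (run as root).
a2enmod dav dav_fs
mkdir -p /srv/webdav && chown www-data:www-data /srv/webdav
cat > /etc/apache2/conf.d/webdav.conf <<'EOF'
Alias /WebDAVRoot /srv/webdav
<Location /WebDAVRoot>
    Dav On
    AuthType Basic
    AuthName "WebDAV"
    AuthUserFile /etc/apache2/webdav.passwd
    Require valid-user
</Location>
EOF
htpasswd -c /etc/apache2/webdav.passwd someuser
/etc/init.d/apache2 reload
# Only expose this over HTTPS; Basic auth sends credentials essentially in the clear.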

Setting up Web Folders on Your Computer

1. Minimize or close your browser.

2. Double-click the My Computer icon on your desktop.

3. Double-click the My Network Places icon that appears on the left set of options.

4. From the Network Tasks menu on the left, click Add Network Place. This will open the Add Network Place Wizard.

1. The Add Network Place Wizard opens. Click Next.

2. At the next step of the Wizard, you are asked where you want to create this network place. Click on "Choose another network location" and then click Next.

3. You will be prompted for the Internet address of the network place, i.e. your WebDAV server. For example: https://Server.DomainName.com/WebDAVRoot/Folder

4. Then click the Next button.

5. Enter the user ID and password provided to you.

6. Enter a name for the folder that will help you easily identify this network place, such as “Home” or “Office”, and then click Next.

7. Click the Finish button. You will be prompted for your username and password once more as the Wizard completes the setup.


If you have any problems, test your ability to connect:

1. See if you're online. The simplest way to do this is to open a web browser, go to http://news.google.com/, and see if today's headlines load. If they do, you are likely online.

2. See if you can connect to the DAV server. Remember that URL you entered above that looked like https://Server.DomainName.com/WebDAVRoot/Folder? Try entering it into your web browser and see what happens. Do you get prompted for an ID and password? If so, you can connect to the server.

3. Enter your ID and password. If you can see your folder's contents, then your ID and password are correct.

4. If you can complete all of the above steps but still cannot set up a web folder, see the note below; a quick command-line version of this check is also sketched after this list. If any of the steps above fail and you cannot solve them yourself, please contact your IT support group.
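If you have curl handy, steps 2 and 3 can be rolled into one command. This is my own aside rather than part of the original manual, and the URL and user ID below are the same placeholders used above:

# Ask the server what it supports; you will be prompted for your password.
# A "DAV:" line in the response headers confirms WebDAV is enabled and your
# credentials work; a connection error means the server is unreachable.
curl -i -X OPTIONS -u YourUserID https://Server.DomainName.com/WebDAVRoot/Folder/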


Note: Several people have reported problems setting up WebDAV on Windows XP. If you are experiencing difficulty, please try performing the following steps before repeating the setup:

1. Double-click My Computer.

2. Double-click local disk C: (If at any point it says "these files are hidden," click "show the contents of this drive" under System Tasks on the left.)

3. Double-click the WINDOWS folder.

4. Double-click the system32 folder.

5. Scroll down the list until you find a file called webfldrs or webfldrs.msi. Double-click this file.

6. Click "Select reinstall mode."

7. Uncheck "Repair all detected reinstall problems."

8. Check the last four options beginning with "Force all files ..." and ending with "Validate shortcuts."

9. Click OK.

10. Click Reinstall.