Risk: One of the few certainties in life.
~involves changes today for a better tomorrow.
~involves the uncertainty that change entails
Stansfield Turner’s new book includes near-war risk (v23n1p12)
“Colonel William Odom alerted Zbigniew Brzezinski at 2:26 a.m. that the warning system was predicting a 220-missile nuclear attack on the U.S. It was revised shortly thereafter to be an all-out attack of 2200 missiles. Just before Brzezinski was about to wake up the President, it was learned that the “attack” was an illusion - which Turner says was caused by “a computer error in the system.” His book makes various suggestions that would greatly reduce the threats of accidental nuclear war. ‘We have had thousands of false alarms of impending missile attacks on the United States, and a few could have spun out of control.’”
RISK MANAGEMENT
~ concern: future - what could go awry?
~ concern: change - how will changes affect our ability to deliver as required?
~ must identify all risks and decide which ones are the right risks to take or deal with.
Reactive vs. Proactive Risk Strategies:
~ REACTIVE - “Indiana Jones school of risk management” or “fire fighting mode”. Not worrying about any problems which might arise until they actually do.
~ from AOL Long Distance (non)billing (Steve Klein ACM SIGSOFT v23n4p22)
A long distance service provider began to market its services through AOL. They don’t mail out bills; instead, a link to your bill appears when you sign on to AOL. This doesn’t work for Mac users who connect via an ISP. Their solution is to have the customer call every month and request that the bill be sent via email. The bill is sent in HTML format, which doesn’t work for people whose email doesn’t support HTML. Don’t know how they will fix it. (See also contingency plan and reactive strategy)
~ PROACTIVE: potential risks are identified, probability and impact are considered, then they are prioritized. Finally a contingency plan is developed.
Software Risks:
~ 2 Characteristics of risk: uncertainty and loss.
~ KNOWN RISKS: Can be uncovered after careful evaluation.
~ taken from Critical mass or critical mess at Los Alamos? (ACM SIGSOFT v23n4p21)
“A software problem caused two uranium assemblies in a criticality facility to accelerate toward one another...when the joystick interface did not respond a subroutine returned an ASCII ’?’ to the main program for the potentiometer settings that controlled the stepping motor speed. The main program was never developed to deal with a question mark and translated this value to the number equivalent... (the number 63). The number 63 corresponds to a large negative position that caused the stepping motor to drive in at full speed when it was selected for movement.”
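The failure mode in this report can be sketched in a few lines of Python. This is only an illustration under assumptions: the function names and the accepted range are hypothetical and are not taken from the Los Alamos software. It shows how an unchecked error marker silently becomes the number 63, and how validating the input first would surface the problem instead.

```python
def naive_setting(raw):
    """Mimics the reported bug: any character is translated to its character code."""
    return ord(raw)


def checked_setting(raw, low=0, high=9):
    """Hypothetical safer version: accept only ASCII digits inside an assumed range."""
    if not (raw.isascii() and raw.isdigit()):
        raise ValueError(f"unexpected potentiometer setting: {raw!r}")
    value = int(raw)
    if not low <= value <= high:
        raise ValueError(f"setting {value} is outside {low}..{high}")
    return value


print(naive_setting("?"))  # 63, the ASCII code of the question mark
```

Calling checked_setting("?") raises a ValueError, so the caller has to decide what to do with the motor rather than driving it with a meaningless value.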
~ PREDICTABLE RISKS: Can be guessed from past experience.
~ex: Texas prisoner convicted of rape employed to enter Metromail survey results, harasses respondents.
~ UNPREDICTABLE RISKS: Nuff Said?
~ from Satellite transmission snafu leads to diplomatic incident (Nick Brown ACM SIGSOFT v23n1p11)
A technical error caused the contents of one channel to accidentally be transmitted onto another channel. The problem was that the channel, viewed in Saudi Arabia, was supposed to show French government-run general interest and news programming but instead displayed hard-core porn. Because of that, Arabsat cancelled its contract with France Telecom.
~ PROJECT RISKS: risks that threaten the plan for completion (budget, time)
~ TECHNICAL RISKS: If the technology threatens the quality or timeliness of the product relative to other products
~ BUSINESS RISKS: If a change in business threatens the product’s usefulness (change of company policy or market interest)
~ ex: Difficulties in developing large systems: IRS etc. (v22n4p25)
The IRS abandoned its Tax Systems Modernization effort, on which it has spent $4 billion. IRS stated the systems “do not work in the real world”. The FBI threw away a $500-million fingerprint-on-demand computer system
Risk Identification:
~ 2 subtypes of risks: generic and product-specific
~ Develop a RISK ITEM CHECKLIST
~ PRODUCT SIZE RISKS: how much is reused code, LOC, #changes
~ Taken from Medicare computer project terminated (Edupage, ACM SIGSOFT, v23n1p10)
“Clinton Administration has terminated a contract with GTE for a new computer system to handle Medicare ... proved to be so antiquated and complicated that they frustrated GTE’s efforts. Medicare officials say they will now work on individual pieces of the system rather than attempting to do the entire project at once.”
~ BUSINESS IMPACT: when business goals and issues affect technical work
~ CUSTOMER RELATED: if customer interaction issues could adversely affect the project.
~ Taken from Electronic airline ticketing by Robin Burke (v22n4p27)
A woman called to confirm her reservation on a return flight. She was told she had used her ticket a week earlier. This wasn’t true. What had happened was the agent that pulled the ticket accidentally pulled the receipt instead of the ticket and
~ PROCESS: if the software-design process is so poorly defined that it affects the project
~ TECHNOLOGY: when something new is tried, chance for unknown risk is high
~ Taken from Five Million Dollar Bug by David Kennedy (v22n4p26,27)
Tokyo University is developing a microchip that can be placed on cockroaches which will direct their movement. At the time of the article the roach would run around the room out-of-control.
~ DEVELOPMENT ENVIRONMENT: If proper development tools are not available
~ STAFF SIZE/EXPERIENCE: nuff said.
~ Taken from Comvor: Hamburg police computer system problems (Martin Virtel ACM SIGSOFT v23n4p22)
Hamburg police have wasted $70 million on a new computer system. The project has been going on for years. 356 jobs have been cut because the new system is supposed to eliminate the need for these jobs. They also needed the money to pay for new computers for the new system. Now the police officers do all the work. The old system costs around $200 thousand per month to maintain. Police people were put in charge of development. Interior minister blames the complexity of the system.
~ RISK DRIVER: cause of a risk
~ 4 levels of impact: negligible, marginal, critical, and catastrophic
~ RISK COMPONENT: what a risk will damage
~ PERFORMANCE RISK: will it be fit for intended use?
~ COST RISK: will it remain in budget?
~ SCHEDULE RISK: will it stay on schedule?
Risk Projection:
~ Also called RISK ESTIMATION
~ 2 factors for rating: likelihood and consequences
~ 4 risk projection activities:
~ 1) Scale of estimated likelihood of risk
~ 2) List of consequences of risks
~ 3) Estimate impact of risk on project
~ 4) Note the overall accuracy of the risk projection
~ RISK TABLE
~ See p. 143
~ 5 columns: description, category, probability (%), Impact, RMMM
~ Average the impact values of the risk components to get the overall impact.
~ 1ST ORDER PRIORITIZATION: Table is sorted by probability/impact
~ 2ND ORDER PRIORITIZATION: Only tackle risks above a defined line
~ RMMM plan must be developed for all risks above cutoff line.
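The risk table and its two prioritization passes can be sketched in Python as follows. This is a minimal illustration under assumptions: the numeric ranking of the impact levels (1 = catastrophic through 4 = negligible), the class and function names, and the sample rows are all hypothetical and do not come from these notes or from a real project.

```python
from dataclasses import dataclass

# Assumed numeric ranking for the four impact levels (1 = most severe).
IMPACT_RANK = {"catastrophic": 1, "critical": 2, "marginal": 3, "negligible": 4}


@dataclass
class Risk:
    description: str
    category: str
    probability: float  # 0.0 to 1.0
    impact: str         # one of the IMPACT_RANK keys
    rmmm: str = ""      # pointer to the RMMM plan, filled in later


def prioritize(risks):
    """First-order prioritization: high probability first, then high impact."""
    return sorted(risks, key=lambda r: (-r.probability, IMPACT_RANK[r.impact]))


def above_cutoff(risks, cutoff_rows):
    """Second-order prioritization: keep only the rows above the cutoff line."""
    return prioritize(risks)[:cutoff_rows]


# Hypothetical example rows.
table = [
    Risk("Size estimate too low", "product size", 0.60, "critical"),
    Risk("Staff inexperienced", "staff", 0.30, "critical"),
    Risk("Customer changes requirements", "customer", 0.80, "critical"),
    Risk("Delivery deadline tightened", "business", 0.60, "catastrophic"),
]

for risk in above_cutoff(table, cutoff_rows=2):
    print(f"{risk.probability:.0%}  {risk.impact:<12}  {risk.description}")
```

Here the cutoff line is expressed as a number of rows; a probability or impact threshold would work equally well. Only the rows returned by above_cutoff would need an RMMM plan.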
~ Assessing Risk Impact:
~ 3 factors determine likely consequence of a risk: nature, scope and timing.
~ nature: the problems that are likely if a risk occurs
~ scope: the severity and overall distribution of a risk occurring.
~ ex taken from v22n2p20
A computer malfunction causes panic selling at Hong Kong stock exchange. Computer reported a 4% drop. Worldwide scope.
~ timing: when and for how long the impact will be felt.
~ ex of crucial timing from v22n2p21
A blown fuse took out a large portion of Iowa’s 911 emergency phone system for three hours over Thanksgiving weekend. Isolating the problem was difficult because of the system complexity.
~ Steps to determining the overall consequences of a risk:
~ 1) Determine the average probability of occurrence for each risk component.
~ 2) Determine the impact for each component (fig 6.1)
~ 3) Complete the risk table. Analyze the results as described.
Now we must prioritize the risks and think of how to avert them.
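The three steps above can be illustrated with a short Python sketch. It rests on assumptions that are not in these notes: probability estimates are taken to come from several team members and averaged, the impact levels are mapped to the numbers 1 (catastrophic) through 4 (negligible), and every name and sample value is hypothetical.

```python
from statistics import mean

# Assumed numeric scale for the four impact levels (1 = most severe).
IMPACT_SCALE = {"catastrophic": 1, "critical": 2, "marginal": 3, "negligible": 4}


def average_probability(estimates):
    """Step 1: average the individual probability estimates for a component."""
    return mean(estimates)


def overall_impact(component_impacts):
    """Steps 2 and 3: average the per-component impact levels into one value."""
    return mean(IMPACT_SCALE[level] for level in component_impacts.values())


# Hypothetical estimates for one risk.
probability = average_probability([0.5, 0.75, 1.0])
impact = overall_impact(
    {"performance": "critical", "cost": "marginal", "schedule": "critical"}
)
print(f"probability={probability:.2f} overall impact={impact:.2f}")
```

The two resulting numbers are what would be entered in the probability and impact columns of the risk table.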
~ RISK REFERENT LEVEL: the level of performance, cost, or schedule degradation at which risks will cause the project to be terminated
~ Taken from Comvor: Hamburg police computer system problems
(Martin Virtel ACM SIGSOFT v23n4p22)
Hamburg police have wasted $70 million on a new computer system. The project has been going on for years. 356 jobs have been cut because the new system is supposed to eliminate the need for these jobs. They also needed the money to pay for new computers for the new system. Now the police officers do all the work. The old system costs around $200 thousand per month to maintain. Police people were put in charge of development. Interior minister blames the complexity of the system.
~ REFERENT POINT: point at which risks cause the project to stop (see/correct fig 6.4)
~ at referent point the decision to continue or stop is weighed equally
~ during risk assessment we do the following:
1. define risk referent levels
2. attempt to develop a relationship between each risk (nature, scope and timing) and each referent level
3. predict the points that define the region of termination
4. try to predict how combos of risks will affect a referent level.
Risk Mitigation, Monitoring and Management
~ All of the above were to develop a strategy for dealing with risk.
~ 3 points to an effective strategy: risk avoidance, monitoring and management.
~ RISK MITIGATION: avoiding risk.
~ best idea for a proactive strategy against risk.
~ ex of mitigation failure from Trumbull in a china shop by William Hugh Murray (v22n1p17)
When a squirrel got into a transformer and brought down the external power supply at the infrastructure computer center the UPS kicked in, engine generators came on line and the center operated for about an hour and a half. Then the external power was restored. The external power, UPS and engine generators went into a deadly embrace. The whole system went down and wouldn’t come back up. The operators had tested to make sure the auxiliary system would come on. They never bothered to check what would happen when the primary system came back up.
~ RISK MONITORING: monitoring the likelihood of a risk (whether it is going up or down) and the effectiveness of risk mitigation. Also determines the origin of a problem.
~ RISK MANAGEMENT: applies once mitigation has failed. Develop a contingency plan. Decide which RMMM steps are cost-feasible.
~ from Sun Valley ski area forgets to back up access database (David Kipping ACM SIGSOFT v23n3p25)
Sun Valley ski area used an electronic season discount pass. All ticket information is stored in a computer database. The hard disk failed and the ticketing database was destroyed. There was no backup for the ticketing database. All pass-holders (several thousand) are asked to come to the Sun Valley offices for reregistration.
~ RMMM steps cost money.
~ 80% of the project risk can be accounted for in 20% of the risks considered.
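The 80/20 observation suggests concentrating RMMM spending on the few risks that carry most of the total. The Python sketch below shows one way to pick them out. It assumes each risk has already been given a single numeric weight (for example probability times estimated cost); that weighting, the function name, and the sample figures are hypothetical and are not part of these notes.

```python
def critical_few(weights, share=0.8):
    """Return the highest-weighted risks that together reach `share` of the total."""
    total = sum(weights.values())
    ranked = sorted(weights.items(), key=lambda item: item[1], reverse=True)
    chosen, running = [], 0.0
    for name, weight in ranked:
        if running >= share * total:
            break  # the risks chosen so far already cover the requested share
        chosen.append(name)
        running += weight
    return chosen


# Hypothetical weights for five risks.
weights = {
    "requirements change": 60,
    "staff turnover": 25,
    "tool delays": 8,
    "reuse shortfall": 4,
    "vendor slip": 3,
}
print(critical_few(weights))  # ['requirements change', 'staff turnover']
```

With these sample numbers, two of the five risks account for 85% of the total weight, so they would be the first candidates for mitigation effort.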
Safety Risks and Hazards
~ Some risks can happen after the product has been delivered.
~ from AOL Long Distance (non)billing (Steve Klein ACM SIGSOFT v23n4p22)
A long distance service provider began to market its services through AOL. They don’t mail out bills; instead, a link to your bill appears when you sign on to AOL. This doesn’t work for Mac users who connect via an ISP. Their solution is to have the customer call every month and request that the bill be sent via email. The bill is sent in HTML format, which doesn’t work for people whose email doesn’t support HTML. Don’t know how they will fix it. (See also contingency plan and reactive strategy)
~ usually the value of a computer-based system outweighs the risks in the field.
~ SOFTWARE SAFETY and HAZARD ANALYSIS uncover and fix these bugs.
The RMMM Plan
~ Documents all steps taken towards resolving possible risks.