Risk: One of the few certainties in life.

 

~involves changes today for a better tomorrow.

~involves the uncertainty that change entails

 

Stansfield Turner’s new book includes near-war risk (v23n1p12)

“Colonel William Odom alerted Zbigniew Brzezinski at 2:26 a.m. that the warning system was predicting a 220-missile nuclear attack on the U.S. It was revised shortly thereafter to be an all-out attack of 2200 missiles. Just before Brzezinski was about to wake up the President, it was learned that the ‘attack’ was an illusion - which Turner says was caused by ‘a computer error in the system.’ His book makes various suggestions that would greatly reduce the threats of accidental nuclear war. ‘We have had thousands of false alarms of impending missile attacks on the United States, and a few could have spun out of control.’”

 

RISK MANAGEMENT

                               ~ concern: future - what could go awry?

                               ~ concern: change - how will changes affect our ability to deliver as required?

                               ~ must identify all risks and decide which ones are the right risks to take or deal with.

 

Reactive vs. Proactive Risk Strategies:

                               ~ REACTIVE - “Indiana Jones school of risk management” or “fire fighting mode”. Not worrying about any problems which might arise until they actually do.

                              ~ from AOL Long Distance (non)billing (Steve Klein ACM SIGSOFT v23n4p22)

                               A long distance service provider began to market its services through AOL. They don’t mail out bills; instead, a link to your bill appears when you sign on to AOL. This doesn’t work for Mac users who connect via an ISP. Their solution is to have the customer call every month and request that the bill be sent via email. The bill is sent in HTML format, which doesn’t work for people whose email doesn’t support HTML. Don’t know how they will fix it. (See also contingency plan and reactive strategy)

                               ~ PROACTIVE: potential risks are identified, their probability and impact are considered, and then they are prioritized. Finally a contingency plan is developed.

 

Software Risks:

                               ~ 2 Characteristics of risk: uncertainty and loss.

                               ~ KNOWN RISKS: Can be uncovered after careful evaluation.

                                                             ~taken from Critical mass or critical mess at Los Alamos? (ACM SIGSOFT v23n4p21)

                                                             “A software problem caused two uranium assemblies in a criticality facility to accelerate toward one another...when the joystick interface did not respond a subroutine returned an ASCII ’?’ to the main program for the potentiometer settings that controlled the stepping motor speed. The main program was never developed to deal with a question mark and translated this value to the number equivalent... (the number 63). The number 63 corresponds to a large negative position that caused the stepping motor to drive in at full speed when it was selected for movement.”
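The failure above comes down to an error marker being treated as data: the ASCII code of '?' is 63. Below is a minimal sketch, in Python rather than whatever language the facility used, of rejecting non-numeric input before it reaches any motor logic. The function name parse_speed_setting and the idea that settings arrive as text are assumptions made for illustration only.

```python
def parse_speed_setting(raw: str) -> int:
    """Return a numeric setting, or raise if the input is not a plain number.

    Hypothetical helper: the real interface and value ranges are not known.
    """
    text = raw.strip()
    if not (text.isascii() and text.isdigit()):
        raise ValueError(f"not a numeric setting: {raw!r}")
    return int(text)


# Converting the error marker to its character code silently yields a number.
marker_as_number = ord("?")  # 63, the value named in the report
```

With a check like this, a '?' from the subroutine would surface as an error the main program has to handle instead of becoming a position value.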

                              ~ PREDICTABLE RISKS: Can be guessed from past experience.

                                                             ~ex: Texas prisoner convicted of rape, employed to enter Metromail survey results, harasses respondents.

                               ~ UNPREDICTABLE RISKS: Nuff Said?

                                                             ~ from Satellite transmission snafu leads to diplomatic incident

                                                                (Nick Brown ACM SIGSOFT v23n1p11)

                                                               A technical error caused the contents of one channel to accidentally be transmitted onto another channel. The problem was that the channel, viewed in Saudi Arabia, was supposed to show French government-run general interest and news programming but instead displayed hard-core porn. Because of that, Arabsat cancelled its contract with France Telecom.

                               ~ PROJECT RISKS: risks that threaten the plan for completion (budget, time)

                               ~ TECHNICAL RISKS: risks from the technology itself that threaten the quality and timeliness of the product relative to other products

                               ~ BUSINESS RISKS: If a change in business threatens the product’s usefulness (change of company policy or market interest)

                                                             ~ ex: Difficulties in developing large systems: IRS etc. (v22n4p25)

                                                              The IRS abandoned its Tax Systems Modernization effort, on which it has spent $4 billion. IRS stated the systems “do not work in the real world”. The FBI threw away a $500-million fingerprint-on-demand computer system.

 

 

 

 

Risk Identification:

                               ~ 2 subtypes of risks: generic and product-specific

                               ~ Develop a RISK ITEM CHECKLIST

                                                             ~ PRODUCT SIZE RISKS: how much is reused code, LOC, #changes

                                                                                           ~ Taken from Medicare computer project terminated

                                                                                              (Edupage, ACM SIGSOFT, v23n1p10)

                                                                                               “Clinton Administration has terminated a contract with GTE for a new computer system to handle Medicare [...] proved to be so antiquated and complicated that they frustrated GTE’s efforts. Medicare officials say they will now work on individual pieces of the system rather than attempting to do the entire project at once.”

                                                             ~ BUSINESS IMPACT: when business goals and issues affect technical work

                                                             ~ CUSTOMER RELATED: if customer interaction issues could adversely affect the project.

                                                                                           ~ Taken from Electronic airline ticketing by Robin Burke (v22n4p27)

                                                                                              A woman called to confirm her reservation on a return flight. She was told she had used her ticket a week earlier. This wasn’t true. What had happened was the agent that pulled the ticket accidentally pulled the receipt instead of the ticket and

 

 

                                                             ~ PROCESS: if the software process is so poorly defined that it affects the project

                                                             ~ TECHNOLOGY: when something new is tried, chance for unknown risk is high

                                                                                           ~ Taken from Five Million Dollar Bug by David Kennedy (v22n4p26,27)

                                                                                              Tokyo University is developing a microchip that can be placed on cockroaches which will direct their movement. At the time of the article the roach would run around the room out-of-control.

                                                             ~ DEVELOPMENT ENVIRONMENT: If proper development tools are not available

                                                             ~ STAFF SIZE/EXPERIENCE: nuff said.

                                                                                           ~ Taken from Comvor: Hamburg police computer system problems

                                                                                              (Martin Virtel ACM SIGSOFT v23n4p22)

                                                                                              Hamburg police have wasted $70 million on a new computer system. The project has been going on for years. 356 jobs have been cut because the new system is supposed to eliminate the need for these jobs. They also needed the money to pay for new computers for the new system. Now the police officers do all the work. The old system costs around $200 thousand per month to maintain. Police people were put in charge of development. The interior minister blames the complexity of the system.

                               ~ RISK DRIVER: cause of a risk

                                                             ~ 4 levels of impact: negligible, marginal, critical, and catastrophic

                               ~ RISK COMPONENT: what a risk will damage

                                                             ~ PERFORMANCE RISK: will it be fit for intended use?

                                                             ~ COST RISK: will it remain in budget?

                                                             ~ SCHEDULE RISK: will it stay on schedule?

 

Risk Projection:

                               ~ Also called RISK ESTIMATION

                               ~ 2 factors for rating: likelihood and consequences

                               ~ 4 risk projection activities:

                                                             ~ 1) Establish a scale for the estimated likelihood of a risk

                                                             ~ 2) List of consequences of risks

                                                             ~ 3) Estimate impact of risk on project

                                                             ~ 4) Note the overall accuracy of the risk projection

                               ~ RISK TABLE

                                                             ~ See p. 143

                                                             ~ 5 columns: description, category, probability (%), Impact, RMMM

                                                                                           ~ Average the impact of each risk component to get the overall impact.

                                                             ~ 1ST ORDER PRIORITIZATION: Table is sorted by probability/impact

                                                             ~ 2ND ORDER PRIORITIZATION: Only tackle risks above a defined line

                                                             ~ RMMM plan must be developed for all risks above cutoff line.
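The risk table and its two prioritization passes can be sketched as follows. This is an illustration under stated assumptions, not the layout from p. 143: the sample risks are invented, the cutoff is modelled as a simple row count, and ties in probability are broken by impact severity.

```python
from dataclasses import dataclass

# Lower rank = more severe. Level names follow the 4 levels of impact above.
IMPACT_RANK = {"catastrophic": 1, "critical": 2, "marginal": 3, "negligible": 4}


@dataclass
class Risk:
    description: str
    category: str
    probability: int  # percent, 0 to 100
    impact: str       # one of the keys in IMPACT_RANK
    rmmm: str = ""    # pointer to the RMMM plan entry, if any


def prioritize(risks, cutoff_rows):
    """First order: sort by probability, then impact. Second order: cut off."""
    ordered = sorted(risks, key=lambda r: (-r.probability, IMPACT_RANK[r.impact]))
    return ordered[:cutoff_rows]


# Hypothetical entries, for illustration only.
table = [
    Risk("Staff turnover", "STAFF SIZE/EXPERIENCE", 60, "critical"),
    Risk("Size estimate too low", "PRODUCT SIZE", 60, "catastrophic"),
    Risk("Tools unavailable", "DEVELOPMENT ENVIRONMENT", 20, "marginal"),
]
above_line = prioritize(table, cutoff_rows=2)
```

Every risk left in above_line would then need an RMMM entry; the rest are re-examined later rather than discarded.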

 

 

                               ~ Assessing Risk Impact:

                                                             ~ 3 factors determine the likely consequences of a risk: nature, scope and timing.

                                                                                           ~ nature: the problems that are likely if a risk occurs

                                                                                           ~ scope: the severity and overall distribution of a risk occurring.

                                                                                                     ~ ex taken from v22n2p20

                                                                                                        A computer malfunction causes panic selling at the Hong Kong stock exchange. Computer reported a 4% drop. Worldwide scope.

                                                                                           ~ timing: when and for how long the impact will be felt.

                                                                                                    ~ ex of crucial timing from v22n2p21

                                                                                                       A blown fuse took out a large portion of Iowa’s 911 emergency phone system for three hours over Thanksgiving weekend. Isolating the problem was difficult because of the system complexity.

                                                             ~ Steps to determining the overall consequences of a risk:

                                                                                           ~ 1) Determine the avg probability of occurrence for each risk piece.

                                                                                           ~ 2) Determine the impact for each piece (fig 6.1)

                                                                                           ~ 3) Complete the risk table. Analyze the results as described.

                                                             Now we must prioritize the risks and think of how to avert them.
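The first two steps above can be sketched as small helpers. This is a rough illustration only: the numeric scale for the impact levels, the round-half-up rule, and the sample estimates are assumptions, and fig 6.1 is not reproduced here.

```python
import math
from statistics import mean

# Assumed ordering, least to most severe; the index is used as a score.
IMPACT_LEVELS = ["negligible", "marginal", "critical", "catastrophic"]


def average_probability(estimates_percent):
    """Step 1: average several individual probability estimates (in percent)."""
    return mean(estimates_percent)


def overall_impact(component_impacts):
    """Step 2: average the per-component impact levels into one level.

    Halves are rounded up, toward the more severe level (an assumption).
    """
    scores = [IMPACT_LEVELS.index(level) for level in component_impacts.values()]
    return IMPACT_LEVELS[math.floor(mean(scores) + 0.5)]


# Hypothetical estimates for one risk.
probability = average_probability([30, 50, 40])
impact = overall_impact(
    {"performance": "critical", "cost": "marginal", "schedule": "critical"}
)
```

The two results would fill the probability and impact columns of one row in the risk table (step 3).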

                                                             ~ RISK REFERENT LEVEL: the level of cost overrun, schedule slippage or performance degradation at which risks will cause the project to be ended

                                                                                           ~ Taken from Comvor: Hamburg police computer system problems

                                                                                              (Martin Virtel ACM SIGSOFT v23n4p22)

                                                                                             Hamburg police have wasted $70 million on a new computer system. The project has been going on for years. 356 jobs have been cut because the new system is supposed to eliminate the need for these jobs. They also needed the money to pay for new computers for the new system. Now the police officers do all the work. The old system costs around $200 thousand per month to maintain. Police people were put in charge of development. The interior minister blames the complexity of the system.

                                                             ~ REFERENT POINT: point at which risks cause the project to stop (see/correct fig 6.4)

                                                                                           ~ at referent point the decision to continue or stop is weighed equally

                                                             ~ during risk assessment we do the following:

                                                                                           1. define risk referent levels

                                                                                           2. attempt to develop a relationship between each risk (nature, scope and timing) and each referent level

                                                                                           3. predict the points that define the region of termination

                                                                                           4. try to predict how combos of risks will affect a referent level.
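One way to picture referent levels and the region of termination is a small check on projected cost and schedule overruns. This is only a sketch under a strong assumption: the boundary between "continue" and "terminate" is modelled as a straight line between the two referent levels, which may not match the curve in fig 6.4. All names and numbers are hypothetical.

```python
import math


def termination_status(cost_overrun, schedule_slip, cost_referent, schedule_referent):
    """Classify a projected (cost, schedule) point against the referent levels.

    Assumes a linear boundary: each overrun is expressed as a fraction of its
    referent level, and the fractions are added together.
    """
    load = cost_overrun / cost_referent + schedule_slip / schedule_referent
    if math.isclose(load, 1.0):
        return "referent point"  # continue and stop are weighed equally
    return "terminate" if load > 1.0 else "continue"


# Hypothetical referent levels: $100k cost overrun, 6 months of slippage.
status = termination_status(50, 3, cost_referent=100, schedule_referent=6)
```

Step 4 in the list above would correspond to summing the projected overruns of several risks before calling the check.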

 

Risk Mitigation, Monitoring and Management

                               ~ All of the above were to develop a strategy for dealing with risk.

                               ~ 3 points to an effective strategy: risk avoidance, monitoring and management.

                               ~ RISK MITIGATION: avoiding risk.

                                                             ~ best idea for a proactive strategy against risk.

                                                             ~ ex of mitigation failure from Trumbull in a china shop by William Hugh Murray (v22n1p17)

                                                                When a squirrel got into a transformer and brought down the external power supply at the infrastructure computer center the UPS kicked in, engine generators came on line and the center operated for about an hour and a half. Then the external power was restored. The external power, UPS and engine generators went into a deadly embrace. The whole system went down and wouldn’t come back up. The operators had tested to make sure the auxiliary system would come on. They never bothered to check what would happen when the primary system came back up.

                               ~ RISK MONITORING: monitoring the likelihood of a risk (whether it’s going up or down) and the effectiveness of risk mitigation. Also determines origin of problem.

                               ~ RISK MANAGEMENT: once mitigation has failed. Develop a contingency plan. Decide which RMMM steps are cost-feasible.

                                                             ~ from Sun Valley ski area forgets to back up access database

                                                                (David Kipping ACM SIGSOFT v23n3p25)

                                                               Sun Valley ski area used an electronic season discount pass. All ticket information is stored in a computer database. The hard disk failed and the ticketing database was destroyed. There was no backup for the ticketing database. All pass-holders (several thousand) are asked to come to the Sun Valley offices for reregistration.

                               ~ RMMM steps cost money.

                               ~ 80% of the project risk can be accounted for in 20% of the risks considered.
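The 80/20 observation suggests a simple way to pick which risks get RMMM effort: rank them by weight and stop once the chosen ones cover 80% of the total. The sketch below assumes each risk already has a single numeric weight (for example probability times estimated cost); that weighting and the sample numbers are assumptions, not part of the notes.

```python
def pareto_subset(weights, share=0.8):
    """Return the highest-weight risks that together reach `share` of the total."""
    total = sum(weights.values())
    chosen, running = [], 0
    ranked = sorted(weights.items(), key=lambda item: item[1], reverse=True)
    for name, weight in ranked:
        if running >= share * total:
            break  # enough of the total risk is already covered
        chosen.append(name)
        running += weight
    return chosen


# Hypothetical weights for ten risks (they sum to 100).
weights = {"R1": 50, "R2": 35, "R3": 3, "R4": 3, "R5": 2,
           "R6": 2, "R7": 2, "R8": 1, "R9": 1, "R10": 1}
top = pareto_subset(weights)
```

In this made-up data two of the ten risks carry 85% of the weight, so only those two would need full RMMM steps.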

 

Safety Risks and Hazards

                               ~ Some risks can happen after the product has been delivered.

                                                             ~ from AOL Long Distance (non)billing (Steve Klein ACM SIGSOFT v23n4p22)

                                                               A long distance service provider began to market its services through AOL. They don’t mail out bills; instead, a link to your bill appears when you sign on to AOL. This doesn’t work for Mac users who connect via an ISP. Their solution is to have the customer call every month and request that the bill be sent via email. The bill is sent in HTML format, which doesn’t work for people whose email doesn’t support HTML. Don’t know how they will fix it. (See also contingency plan and reactive strategy)

                               ~ usually the value of a computer-based system outweighs the risks in the field.

                               ~ SOFTWARE SAFETY and HAZARD ANALYSIS uncover and fix these bugs.

 

The RMMM Plan

                               ~ Documents all steps taken towards resolving possible risks.