Here is a COG identified through the Problem BrainStorm that needs supporting evidence.
Individuals who provide facts will earn points if the COG is validated by the Steering Team.
COG: The problem with the OES is that the current system relies too heavily on skewed qualitative data, leading to bias, inflation, and an inaccurate depiction of an Airman's performance.
PROVIDE EVIDENCE
NPS graduate paper, Stephane L. Wolfgeher (2009): "Inflation of USAF Officer Performance Reports: Analyzing the Organizational Environment"
Per discussion with the Air Force Board Secretariat (anonymous, personal communication, October 23, 2009), there is no average time spent reviewing records. While the magnitude may be daunting (for example, approximately 6,000 records to be reviewed and scored by 25 officers over the span of 3 weeks), they are directed to take as much, or as little, time as is necessary in order to accurately score the record. While there is no set amount of time, the magnitude of the task guarantees that informal processes will be relied on to conquer the task. As was mentioned in the OPR and PRF sections, informal practices such as no white space, stratification, and hard-hitting bullets on the first and last line are all taken into account by the board members. If there is a lot of wasted space (white space), that gives an “impression” to a board member. Stratification, no stratification, and type of stratification implies something. Board members tend to read the “opening line” and “closing line” of OPRs and PRFs to get an overall impression. These informal practices mean one can inflate an OPR or PRF and yet still send a “message” to the promotion board about the officer.
Pg 37-38
http://www.dtic.mil/dtic/tr/fulltext/u2/a514309.pdf
Recommendations from the same NPS paper:
C. RECOMMENDATION
In summary:
- Military structure leads to inflation based on the lack of control at lower levels
- Military promotions and reward systems support inflation
- In military culture, “average” is not good; culture supports inflation
- Human nature and rational choice theory has a dominant strategy: inflation

To counter inflation:
- Structure: eliminate the “up or out” system; and/or make promotion decisions at lower levels (but this has a low chance of implementation)
- Rewards: reward accuracy/punish inflation; reward alternative career paths
- Culture: train officers to give and accept accurate evaluations; demonstrate through word and deed that meeting high standards is acceptable
- People: hold raters accountable through profiles; provide incentives for raters to comply with the stated system
- Tool: institute some method of measurement (such as BARS) that supports statistical analysis; based on the heterogeneous mix in the LAF competitive category, the tool should allow for qualitative explanations (essay)

Pg 79-80
The culture of the military is one that celebrates excellent performance. This culture is directly reflected in the OES in the form of inflation. If a culture highlights only excellent performance as acceptable, then the tendency is for the average to become excellent, and the excellent to become outstanding. This creates inflation. Under the current evaluation system, it is accepted that records get promoted, not necessarily people: promotion boards only see the records, not the person. This type of inflation tells the promotion board that an individual who receives an honest evaluation of “average” is actually below average, because they did not get an “excellent.”
“As stratification is seen as a discriminator, officers are taught to find some way to be able to discriminate their subordinates. This is effectively ‘creative stratification’.” Raters also know that promotion boards have a limited amount of time to review officer packages, so they write stratification information in numbers rather than words, because numbers catch the eye. (Wolfgeher)
Because the OES is open to subjective evaluation, “it lacks structure and standardization in a standardized and centralized system.” Some would argue that there are positions or specialties where even the bottom individual in one organization is better than the number one individual in another organization when considering obligations and responsibilities necessary for the next higher rank (Wolfgeher 60). This stems from an environment where Air Force officers are compared to each other despite having drastically different jobs, experience, and expertise.
Wolfgeher, Stephane L. “Inflation of USAF Officer Performance Reports: Analyzing the Organizational Environment.” Naval Postgraduate School, 2009.
This study is old (1965); however, I believe it brings to light the issues we are looking to get at. It is one of the only empirical studies within the USAF that is specifically about the OES.
USAF Officer Evaluation System Survey: Attitudes and Experience
Analysis of OER rating trends reveals many relationships that suggest different groups of officers are evaluated differently. When many of the associated influences and factors are considered, these group differences appear to be logically generated and thus not the result of systematic biases. Group attitudes, however, reflect rating concepts not realized in practice. Almost all officers in the sample believe no differences in rating level should occur for groups because of regular/reserve status (Item 30), aeronautical rating (Item 41), and grade (Items 35, 36), for example. Yet a systematic difference in average rating level is noted by grade, i.e., the higher the grade the higher the average rating level. This may be a function of actual performance, but each officer is being compared only to others of the same grade. Theoretically at least, and concurred in by about two-thirds (67%) of the officers, second lieutenants should on the average have an OER as high as colonels; this does not occur (Item 36). At the same time officers (65%) disagree that comparison within grade on an Air Force-wide basis results in fair evaluations (Item 20). If a change occurs, they would prefer that comparisons be made within grade and within each career field separately (43% of the total sample, Item 26).
Pg 10
www.dtic.mil/cgi-bin/GetTRDoc?AD=AD0628551
Directly from the Air Force Core Values "little blue book": "just as they help us to evaluate the climate of our organization, they also serve as beacons vectoring us back to the path of professional conduct; the Core Values allow us to transform a climate of corrosion into a climate of ethical commitment." We have the Core Values to guide the entire force, and the little blue book provides a lot of good information from which to draw standards applicable across all AFSCs. Providing a common measurement baseline can help reduce the need for subjective judgments.
In the last year as the wing exec I reviewed and provided feedback on over 100 OPRs. Not a single one of them stated that an officer was “adequate” or “had room for improvement.” With all raters stating that their officers are “outstanding” or “excellent,” it is impossible to create a record that accurately depicts an Airman’s performance and potential. The Officer Evaluation System Training Guide (2009), certified by HQ AFPC, states that “The Officer Evaluation System (OES) is performance-based. It focuses on how well an individual does his/her job and the qualities he or she brings to the job.” Why then does it send a huge message if a rater were to refer to an officer’s skills as “sufficient”? The training guide goes on to say “Raters must honestly observe, evaluate, and document individual accomplishment in preparing performance evaluations. The OPR is the official record of an officer’s performance…”. Having not seen a single OPR during my time as exec that hailed officers as anything but “excellent,” I believe the inflated OES has created a culture that forces raters to skirt or even cross the lines of integrity in order not to harm their members’ careers.
I concur with this point -- and while I don't have an article to back it (though I'd say it is an accepted part of statistics), performance ratings, no matter how you decide to measure them, should fall within a normal distribution (bell curve). If a goal of the evaluation system is transparency, a reformed system needs to be clear about where people fall within the normal distribution.
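As a rough sketch of the bell-curve point above (illustrative only: the labels, cutoffs, and scores below are hypothetical, not any official Air Force scheme), a transparent system could report each member's percentile within their peer group and map it onto a forced distribution:

```python
# Hypothetical forced-distribution buckets: bottom 10% / next 20% /
# middle 40% / next 20% / top 10%. Cutoffs are illustrative only.
BUCKETS = [(0.10, "below average"), (0.30, "developing"),
           (0.70, "average"), (0.90, "excellent"), (1.01, "outstanding")]

def percentile_rank(score, peer_scores):
    """Fraction of the peer group scoring strictly below `score`."""
    return sum(s < score for s in peer_scores) / len(peer_scores)

def forced_rating(score, peer_scores):
    """Map a score to a rating label via its percentile in the peer group."""
    p = percentile_rank(score, peer_scores)
    for cutoff, label in BUCKETS:
        if p < cutoff:
            return label

peers = [72, 75, 78, 80, 81, 83, 85, 88, 91, 95]
print(forced_rating(95, peers))  # → outstanding (only the top scorer)
```

Under such a scheme the distribution of labels is fixed by construction, so a rater cannot award the top label to everyone; the trade-off is that peer-group composition becomes the contested variable instead.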
"Weaknesses in the nature of rater judgments are generally considered to compromise the utility of workplace-based assessment (WBA)." ...
"Raters make and justify judgments based on personal theories and performance constructs. Raters’ information processing seems to be affected by differences in rater expertise. The results of this study can help to improve rater training, the design of assessment instruments and decision making in WBA."
Govaerts, M. J. B., et al. "Workplace-based assessment: raters’ performance theories and constructs." Advances in Health Sciences Education 18.3 (2013): 375-396.
http://link.springer.com/article/10.1007/s10459-012-9376-x
After being an exec I have also reviewed more than 60 OPRs. The OES is very skewed and rewards past accomplishment rather than what the officer has done during the current reporting period. I have seen far too many "protected persons," i.e., officers who came from another command/assignment with a strat saying they are a god and who are automatically given that same strat, even though if you actually read the other 8 bullets, they have done little for the current organization or even themselves. This leads to bias and favoritism as viewed by others in the organization.
This status quo needs to be challenged by the current leadership. That means they truly need to be out and know their Airmen to form a more objective view. Raters need to use their core values and make honest assessments of the people they are rating. Another aspect is the unwritten code used in writing a report: to the untrained eye it looks good, yet one word can kill a career. This also needs to be addressed and reinforced through proper leadership and audits.
On the culture of inflation: just this week I heard these comments and wrote them down.
O-4 - "The only time in the Air Force you need to embellish is on your OPR"
O-5 - "Don't believe your own OPR"
O-5 - "You have to be okay with playing the game when it comes to your OPR and don't worry about a little embellishment"
Another quote from today. I recognize it is anecdotal, but I believe it counts as evidence when it is a direct quote from someone who works with the OES: "Our wing does strats based on what an officer needs to progress his career; if he is up for promotion or wants some kind of special assignment, then we will make sure he gets a high strat in some area. Wing leadership just takes turns giving high strats based on the assignments people want."
"Bleeding Talent: How the US Military Mismanages Great Leaders" by Tim Kane deals with the culture of inflation in Chapter 9, "Measuring Merit." It can be read on Google Books. Very good summary of the problem, documented evidence, and some good recommendations.
Quote from Bleeding Talent (p204): "Unfortunately, the inflation of quantitative measures in OPRs pushed raters during the last four decades to distinguish their subordinates using the written assessment - a small block of white space to summarize work, performance, and potential. The write-ups were soon filled with code words, which evolved in each service and tribe, suffering their own inflation over the years. 'Extraordinary' might mean 'normal,' and 'one of my best' officers could be code for 'one of my worst.' ...Remember that each officer file is considered for one-two minutes by promotion-board members, typically scanning the first line...Consider some of the implications. Because careers hinge on the writing ability of supervisors, pity the young officer with a gruff commander focused on simply fighting the enemy. On the other hand, supervisors who worry that their prose may not be very good frequently allow subordinates to draft their own written assessment. This may seem unethical, but it is a rational response to a broken system. Officers across the US military are nudged into a moral gray area, pretending to write objective assessments that are actually amplified self-promotional advertisements. Perhaps the greatest indictment of the evaluation system is what we might call the integrity critique. As a cornerstone of the supposedly values-driven, service-above-self culture, the US military's performance evaluations in truth pressure officers to sell out their integrity by making dishonest assessments of subordinates. Even minor distortions are difficult to accept for young officers who have been disciplined by strict honor codes during Academy and basic training. Integrity, however, has a steep price when the cultural norm means that honest assessments of subordinates will ruin all of their careers."
ReplyDeleteThis paper was written for ACSC in 1999 as a basic overview and evaluation of the different branches OES.
ReplyDeletehttp://www.dtic.mil/dtic/tr/fulltext/u2/a395121.pdf
(p15) "The Air Force’s officer evaluation system is perhaps the most convoluted and easiest manipulated evaluation system in all of the military services."
JQP.
From the US Air Force NCO Page:
----
FROM THE INBOX:
"One of my troops, a SSgt, received numerous awards and is clearly a top performer in our squadron. He won NCO of the quarter twice among various annual awards. He received a 'promote' on his EPR. Leadership then sends an email to me saying that they know that he is clearly a top performer and that he couldn't have done anything different. But they said he is a relatively new SSgt so they are giving the 'must promote' and 'promote now' to more seasoned NCOs although he outperformed them. Does this seem fair? Isn't this why they are getting rid of TIS/TIG points? They also are saying that the promotion recommendation was based off of the past 3 EPRs, and not just this past year. Any guidance would be great."
----
Demonstrating that the intent of the new program wasn't effectively sold at squadron level, and that no one knows how the hell it is supposed to work. This makes it basically arbitrary. Airmen will be even less able to predict their success based on performance than they were before.
The principle underlying the new EES is worth fighting for: performance, and nothing else, should drive promotions. But by a combination of bad communication and failure to effectively win commanders and SNCOs over to the new program, that principle is getting submerged in a stew of other motivations.
One of the major knocks on the OES is that stratifications are awarded because it's "someone's turn" or because "timing" or because someone is the love child of a 3/4-star. Looks like we're starting to see the same nonsense injecting itself into the new EES.
Is this really why we went to the trouble of changing it?
An interesting case from AFRL in the late 1990s: their rewrite of their performance evaluations created specific categories, written by the lab employees themselves, outlining the traits of what a true master in the field looked like. They had the same exact problem with their system that we see with the OES today: rampant overinflation and no standardization, resulting in widely subjective performance feedback.
Our new EES has elements of this, and the USMC's Fitness Report has this element as well.
"Performance Appraisal Reappraised" by Dick Grote, Harvard Business Review
https://hbr.org/2000/01/performance-appraisal-reappraised
Inflation is an issue, especially in the area of stratification. “In the United States Air Force (USAF), raters have either been pressured to stratify (rank amongst a set or subset of individuals [Milkovich & Boudreau, 1994, p. 177]) an individual or have been told that stratification would not be given; stratification would be given to the individual coming up on a promotion board, regardless of actual rankings… When methods to combat inflation, such as quotas or secret scoring based on rater’s selection of items on the evaluation form, were implemented, resistance and non-acceptance of the methods were often encountered (Syllogistics, 1987, pp. I-2 – I-5). Finally, one article (Wayland, 2002) suggested that attempting to compare functionally different groups (such as operations, logistics, operations support, intelligence, maintenance, etc.) increased pressure to inflate ratings within the functions to ensure competitiveness at a central promotion board. This suggests that the structure of the military as a large, bureaucratic organization and its processes for individual advancement within the organization may influence individuals to inflate evaluations. In the review of previous research on the inflation of evaluations, it became clear that each military service has experienced, or is experiencing, inflation.”
http://www.dtic.mil/dtic/tr/fulltext/u2/a514309.pdf
AFI 36-2406:
3.1.2. Evaluation ratings are used to determine selections for promotions, job and school recommendations, career job reservations, reenlistments, retraining, and assignments. Therefore, evaluators at all levels must use caution to prevent inflation; it is important to distinguish performance among peers and is a disservice to ALL Airmen when OPR/EPR ratings are inflated.
Clare's comment above shows how this AFI is not being met.
"Toward a Better Promotion System" by Kyle Byard, Ben Malisow, and Col. Martin E.B. France, Air and Space Power Journal, July-August 2012:
'the inflated ratings of the OES system not only devalue positive reports but also emphasize negative—or insufficiently laudatory—comments. The system assumes that no officer, at any time over the course of his or her career, will experience even a short period of less than stellar performance or conflict with a supervisor. If the latter does not wish to write effusively enough on the OPR, future promotion boards will note this lack of enthusiasm. In such cases, the rated officer has little recourse. One cannot appeal a favorable performance report simply because it wasn’t sufficiently laudatory. Gen David C. Jones, Air Force chief of staff from 1974 to 1978, described the rating problem this way: “The effectiveness report system has become so inflated that far more people get perfect effectiveness reports than can be promoted. The promotion board is faced not so much in finding out who should be promoted, but who shouldn’t be promoted. It’s very difficult if somebody has a bad knock on his record to promote that person and not to promote somebody who doesn’t have a bad knock on his record.”'
http://www.airpower.maxwell.af.mil/digital/pdf/articles/2012-Jul-Aug/F-Byardetal.pdf
Subjectivity has trended toward outrageously inflated comments as the norm. This is inaccurate and restricts the candidate from appealing what is written. The article goes on to suggest that honesty, to the point of identifying some areas in which the candidate is incongruous with their leadership, may result in identifying them as a potential harbinger of positive change. This is in contrast to labeling them "less perfect" than their peers, which may identify them as an individual less desirable to promote.
http://users.polisci.wisc.edu/kmayer/904/Visibility%20Theory%20of%20Military%20Promotion.pdf
This is a link to a great read on this topic and the structure of the OES for the military.
Some points from the text to highlight:
1) Problems with the evaluation system stem from either the nature of the tasks or performance report inflation.
2) Ultimately, the criteria of performance are perceived by the military to be related to combat capability; however, the military operates mostly in a peacetime state. This leads the military to define tasks in diffuse rather than specific terms.
3) Diffuse definitions lead raters to evaluate on a set of activities that are only tangentially related to combat performance, such as personal appearance, style, ability to relate well to other people, and social acumen.
4) These tangential factors are nonstandard and conflicting. They result in unreliable and nondiscriminating indicators of future combat performance.
5) Nonstandard factors invite subjective interpretation and inflation.
6) Inflation would not be "faulty" if it were uniform across raters and sufficient differences remained between officers to discriminate among them.
7) Without access to the wider range of reports, raters do not know how much to inflate their evaluations so that a "true" picture of their subordinates can be presented.
8) Inflation is necessary if their subordinates are to remain competitive with officers receiving inflated ratings.
9) Changing the forms doesn't fix inflation; there will be new inequities that will be taken advantage of.
10) While the services treat inflation as a consequence of inadequate report forms, the causes of this inflation are probably endemic to the structure of the organization itself.
Military Advancement: The Visibility Theory of Promotion
Author(s): David W. Moore and B. Thomas Trout
Source: The American Political Science Review, Vol. 72, No. 2 (Jun., 1978), pp. 452-468
Published by: American Political Science Association
Stable URL: http://www.jstor.org/stable/1954104
During my time as an ADO, I saw several changes in leadership at various levels. With each change came new guidance on what words (action verbs, acronyms, adjectives, etc.) were now "correct" for use on feedbacks. The changes were never substantial, nor a hard-and-fast rule likely to remain in place for any substantial period. If one commander didn't like the use of the word 'extraordinary', they would excise its use within the organization and replace it with a thesaurus-searched term like 'outstanding' or 'superb'. Since there was never a concrete correlation between a word and an order of merit (numbered ranking), and since there's no standardization between commands/agencies, such descriptors have become character fillers intended to reduce/remove white space.
If the AF is going to keep a list of approved words, it has to be maintained at a force-wide level only. Anything less allows subjectivity to creep back into the system. Switching to a more empirical accounting would reduce that subjectivity and enable raters to provide accurate feedback to their personnel.
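If the AF did keep a single force-wide descriptor list (a hypothetical here; the specific words below are made up for illustration, not any actual AF list), enforcing it mechanically would be trivial, which is part of the argument for maintaining it at one level only:

```python
# Hypothetical force-wide policy: flag descriptors excised by the central
# list. The word sets are illustrative only, not an actual AF list.
BANNED = {"extraordinary", "phenomenal", "unmatched"}

def flag_descriptors(bullet):
    """Return any banned descriptors appearing in an OPR bullet."""
    words = {w.strip(".,;:!()").lower() for w in bullet.split()}
    return sorted(words & BANNED)

print(flag_descriptors("Extraordinary leader; superb mentor to 12 Amn"))
# → ['extraordinary']
```

A locally maintained variant of such a list is exactly what the comment describes: each commander's edits shift the vocabulary without changing the underlying order of merit.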
Team,
I apologize for the late reply. Just speaking from personal experience, it would be far less confusing if there were a standard abbreviation list across bases and AFSCs. That way, folks from different walks of life could relate and understand how an accomplishment fits in the grand scheme of AF life and how it goes above the norm for the peers in the group.
v/R
While the Marine Corps does use subjective data, the use of a comparative assessment allows the rater to rank the Marine against a set group of standards. Upon selection and submission, every person the rater has ever rated shows up on the FITREP, and the Marine Corps can see how the Marine compares against all the Marines that rater has previously rated. It makes it borderline impossible for the rater to rank the Marine anywhere other than where he belongs, as doing otherwise will actually hurt his top Marines and make him look bad as a commander.
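A minimal sketch of that comparative idea (an assumption-laden illustration, not the actual USMC FITREP algorithm or its real scales): score each new report relative to the rater's own rating history, so a rater who marks everyone at the top renders his own top marks meaningless:

```python
from statistics import mean

def relative_value(new_score, rater_history, lo=80.0, hi=100.0):
    """Scale a report against this rater's full rating history.

    A score at the rater's historical average maps to `lo`; a score at
    the rater's historical maximum maps to `hi`. If the rater has given
    identical marks to everyone, no report can stand out.
    """
    history = rater_history + [new_score]
    avg, top = mean(history), max(history)
    if top == avg:              # rater marks everyone identically
        return lo
    return lo + (new_score - avg) / (top - avg) * (hi - lo)

honest = [3.0, 3.4, 3.8, 4.2, 4.6]    # rater who spreads marks out
inflated = [5.0, 5.0, 5.0, 5.0, 5.0]  # rater who maxes everyone out
print(relative_value(5.0, honest))    # a 5.0 stands well above this rater's average
print(relative_value(5.0, inflated))  # a 5.0 is indistinguishable here
```

The design choice worth noting is that inflation becomes self-defeating: the more a rater inflates, the closer his average sits to his maximum, and the less any single subordinate can stand out.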
I was a group exec a few years ago, and one of our jobs was going over majors' and Lt Cols' PRFs, getting them ready for review by the Wing, and assisting the Wing/CC in picking his DPs and promotes. The thing I noticed was how many of the strats were from when these officers were lieutenants, placed side by side into the few bullets that were allowed on the PRF. What the Wing/CC ended up with was a bullet which read 2/3 Lts, 1/6 CGOs, 1st in Grp, etc. This he had to compare across over a dozen field grade officers to determine his rankings. The difference between a DP and a P largely determined an almost-sure promotion versus a ranking which left the officer in the large pool of officers. I wonder if this system gives leaders the best view of which officer to promote. With strats making up a large portion of those PRFs, it leaves a lot on commanders who may or may not have deep day-to-day interactions with the officers. With a smaller force, it begs the question: can we do 360-degree feedback with supervisors being closer to raters? This would give a better view of the overall performance of the officer and give Wing/CCs a better view come time for promotion.
Quantitative measures of Executive Intelligence exist: https://hbr.org/2005/11/hiring-for-smarts