Maximizing Training ROI: Effective Evaluation Strategies for Business Impact
Explore why evaluating training effectiveness is crucial, learn best practices, and discover metrics to measure ROI, ensuring training adds real business value.
Evaluating Training Effectiveness and ROI

Speaker 1: Hello and welcome to the SIOP Professional Practice Committee's webinar on Evaluating Training Effectiveness and ROI. I'm Mark Morris, Chair of the SIOP Professional Practice Committee on Learning Resources. I spent 10 years leading training functions at companies like Lockheed Martin and JCPenney. Here's what we have for you today. First, we're going to talk about why we're evaluating training at all. Why is there a seminar on training effectiveness and ROI? The answer is that there's a lot of money being spent on it. According to Jack Phillips, $164 billion is spent annually on training, and senior executives are demanding, and should be demanding, payback on this investment. Combine that with the fact that, according to ATD, only 18% of companies are doing a good job measuring learning effectiveness, and we have a potential issue. SIOP and the IO community need to get on top of this to do training effectively and allocate our resources efficiently, to show that we're adding real value to the business with a huge piece of the HR spend, the training function. Let's look at the learning objectives. The IEA always likes us to have good learning objectives, and these should be tied to our strategic impact. Training, as I said, is a big piece of the HR budget, so it should have a big piece of the impact on company strategy. So we're going to review our current approaches and the best practices for evaluating training effectiveness. Second, we're going to know and be able to construct basic training metrics, and I'll show you some aspirational training metrics as well. And third, we'll understand how training was evaluated for business impact and compute ROI for some specific examples.

The needs analysis is the important first step of any good training project. You always want to identify: what are we solving for with this training? Who asked for it? What do they need? Who needs to attend? What does the content look like, and how does it link to a business goal or problem? Much of the credibility of our function depends on the ability of our training to be directly useful to the business. I'll give you an example with customer service, something we certainly dealt with at JCPenney quite frequently. Let's say you have a sudden spike in complaints. You might even have a service level agreement that caps the number of customer service complaints, and if you go beyond that, you might even have to pay a fine, as MV Transportation has done in the past. Everybody wants to keep this down because it can directly relate to sales and profit impact. Training will typically get called in by a leader in the business or operations who says, hey, we're getting these complaints, can you guys train these people so we don't have them anymore? Then you have to do a good needs analysis to figure out exactly what the problem is. Are we taking too long to respond? Do the people not know what they're talking about from a product or service standpoint? Is it an issue with the number of people staffed at the call center? Is it a courtesy or politeness issue? Is it a cultural issue? Until you figure that out and build the training content to address it, you're not going to be able to fix the problem. But once you have the training content in place and get everybody through it, then you want to see if the complaints actually dropped. How fast did they drop? Did they drop quickly for attendees versus non-attendees? How long did it take after the training? Could you have gotten by with a shorter amount of training? Did this actually reduce the fines from the client? All these things are part of your evaluation of the training for customer service.
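To make that concrete, here is a minimal sketch of the kind of pre/post, attendee-versus-non-attendee comparison just described, assuming a hypothetical extract of complaint counts per agent; the column names and numbers are illustrative, not from the webinar:

```python
import pandas as pd

# Hypothetical extract: complaints per agent, before and after the class.
# "attended" marks agents who went through the customer service training.
data = pd.DataFrame({
    "agent_id":  [1, 2, 3, 4, 5, 6],
    "attended":  [True, True, True, False, False, False],
    "complaints_pre":  [12, 9, 15, 11, 10, 14],   # 3 months before training
    "complaints_post": [7, 5, 9, 10, 11, 13],     # 3 months after training
})

# Average change in complaints for attendees vs. non-attendees.
summary = (
    data.assign(change=data["complaints_post"] - data["complaints_pre"])
        .groupby("attended")["change"]
        .mean()
)
print(summary)

# A simple difference-in-differences style estimate of the training effect:
# how much more complaints dropped for attendees than for non-attendees.
effect = summary[True] - summary[False]
print(f"Estimated training effect on complaints: {effect:+.1f} per agent")
```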
The classic model, since 1959, is the Kirkpatrick model for training evaluation. There are four main levels. Level one, the reaction measures, sometimes called smile sheets, looks at the learner's reaction to the training: did they have a positive experience, did they feel good about it, were they personally satisfied and feel it was a good use of their time? By the way, you've probably all experienced the reaction measures many times when you get those forms at the end of a training class, a webinar, or a seminar. Level two, the learning, you've no doubt experienced in college if you took a final exam or some sort of end-of-course measure that looked at whether or not your knowledge in a particular domain actually increased from the beginning of the class to the end of the class. If it did increase, you're showing a good result from a learning standpoint. Level three, the behavioral level of analysis, looks at whether or not you can actually apply that new knowledge and those new skills on the job. Typically, this is measured with competency ratings, whether as part of a performance appraisal process or something else. Level four, which is usually the holy grail and can be the toughest to measure, is the results for the business. How did it help the business? What metrics did it help that might be directly linked to business outcomes, whether that's the business KPIs or key performance indicators, some sort of dashboard, or perhaps a performance appraisal of some sort? Much of the future evaluation for level four will depend on currently changing performance management systems and norms. Our performance management process in IO psychology is undergoing a pretty significant transformation, a major evolutionary inflection point. What comes out of that will be a key factor in determining what level four looks like. For example, we can move to adoption rates, likes, followers, shares, comments, page views. These could all be deemed level four outcomes.

For the types of training and the level of evaluation, I'll give you some examples of where they might fall. Let's say you're doing training on a new policy. It could be a new dress code policy that you're sending out. It could be a new policy on sexual harassment in the workplace. It could be regulatory rules that you're looking to apply. Any time you're looking at something that is basically "I need people to know this" (policies, rules), a level two type evaluation would be appropriate. Safety or compliance training is typically not seen as motivational training, but it's required. That's typically an acknowledgement of some sort, and maybe a level two type analysis. For motor skills, for example, at Lockheed Martin Aeronautics we would teach people to drill, ream, and countersink. It was actually our number one class. If you're taking a huge drill and drilling a hole in the carbon fiber side of an aircraft with very good accuracy, keeping it perfectly straight and level in exactly the right spot, you're looking at level three. You really need to evaluate that training in terms of: can they demonstrate that skill after the class in the workplace? Are the error rates for inaccurately drilled holes going down? That would be an important metric for that one.
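As a small illustration of a level two check, here is a sketch of a pre-test/post-test comparison on hypothetical quiz scores; the values are illustrative, and the 80% threshold simply echoes the certification pass level mentioned later in the webinar:

```python
# Hypothetical level two check: did knowledge scores improve from pre-test to post-test?
pre_scores  = [55, 60, 48, 72, 65]   # percent correct before the class
post_scores = [78, 85, 70, 90, 82]   # percent correct after the class

gains = [post - pre for pre, post in zip(pre_scores, post_scores)]
avg_gain = sum(gains) / len(gains)

# Share of learners who cleared a pass threshold on the post-test.
pass_rate = sum(score >= 80 for score in post_scores) / len(post_scores)

print(f"Average knowledge gain: {avg_gain:.1f} points")
print(f"Post-class pass rate at 80%: {pass_rate:.0%}")
```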
Attitudes or culture change. A lot of companies are struggling with changing their culture and getting change skills into their workforce. Level one is important here because you want people to feel positive about the change, so getting that attitudinal measure is important for culture change. And then, of course, level three: can they apply those skills, or is the culture crushing and destroying their ability to assimilate the new mindset? Competencies and skills, whether that's conflict resolution, collaboration, communication, any of those types of competencies and skills, would typically be considered level three; that would be the appropriate evaluation for them. And then leadership development can be across all those levels; it's pretty widespread.

If the training has design issues, you're going to get good feedback from level one and level two. The trainer will typically adjust the training delivery, adjust the training design, maybe add some blended learning, maybe change the participant selection criteria to make sure we have the right people getting benefit from the class. That's where level one and level two can give you good feedback. If the training is sound, well designed, well delivered, with clear job relevance from the needs analysis, but you're having issues with level three or level four, the trainer will need to work with the manager who requested the training and/or the leader of the group attending it, because you may have a job design issue where a process has to be changed or improved. Maybe there's a missing tool. Maybe there's a systems issue. Maybe there's a reward conflict going on. But there's probably some performance consulting that needs to take place. Sometimes this can be rectified pretty easily with a job aid, in which case the trainer can continue to be of help. Level four issues may also involve a criterion deficiency. A lot of times it's difficult to find metrics for level four type issues, and that's going to tie back, again, to the performance management evolution we talked about a few minutes ago.

The fifth level that's been added to the Kirkpatrick model by Jack Phillips, who is without question the most prestigious and prominent voice and face of ROI in training for the last 10 or 15 years, is return on investment, so number five, ROI. Jack is very interested in quantifying the financial payoff of training investments, much like the CHROs and CFOs that I talk to. Where's the money? Convince me. Let's make sure this isn't Latham and Whyte and the futility of utility, with some crazy high amount of money that no one really believes. The basic steps to do ROI, per Jack: you've done your needs analysis. You're collecting data pre-training. You're collecting data post-training. You're trying to eliminate confounds and other outside effects on the outcomes. You're converting the data to dollar impact, and that's where you've got to be careful and conservative. Then you sum up all the costs of training, whether that's the labor costs of the attendees, the labor costs of the trainer, any guest speakers, materials, venue rental; all that has to be added up. Then you subtract the costs from the benefits to find out how much you got, and give it a little check for intangibles and context factors. For example, if you're looking at training effectiveness but your company's in the middle of a merger or acquisition, people could be pretty stressed out, and it's going to be difficult to get an accurate measure of training. Or if you're looking at a place that's going through a union organizing effort at the very time you're assessing training impact there, it's going to be very difficult to disentangle those effects.
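Here is a minimal sketch of that ROI arithmetic, with illustrative cost line items and a hypothetical dollar benefit; the benefit-cost ratio and ROI percentage formulas are the standard ones used in the Phillips approach, but the numbers are made up:

```python
# Hypothetical, fully loaded costs of the program.
costs = {
    "attendee_labor":  42_000,   # wages for time spent in class
    "trainer_labor":    8_000,
    "guest_speakers":   2_000,
    "materials":        1_500,
    "venue":            3_500,
}
total_cost = sum(costs.values())

# Monetary benefit after converting the outcome data to dollars
# (be conservative here, as the webinar stresses).
monetary_benefit = 95_000

net_benefit = monetary_benefit - total_cost
bcr = monetary_benefit / total_cost            # benefit-cost ratio
roi_pct = net_benefit / total_cost * 100       # ROI as a percentage

print(f"Total cost:   ${total_cost:,.0f}")
print(f"Net benefit:  ${net_benefit:,.0f}")
print(f"Benefit-cost ratio: {bcr:.2f}")
print(f"ROI: {roi_pct:.0f}%")
```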
The Center for Creative Leadership also talks about experiential training and some metrics that might be related to the CCL model for how people learn. In the leadership development space, as opposed to the other types of training, this is the classic model. There's been some discussion about modifying it for group differences, but just so you have the fundamental model: 70% of learning is supposed to occur on the job, 20% from others, whether it's a mentor, a coach, or a peer, and 10% in formal training classes, which is really the lion's share of the evaluation we're talking about. So how do you evaluate the learning that occurs in the other 90% of the leader's development? This is, again, specific to the leadership space, where much of the learning is not occurring in a formal class. Well, here are some examples. By the way, Cindy McCauley's 2014 book would be a great resource to follow up with if you want more on this. She talks about case studies of leading organizations and how they do experience-based leadership development. They're looking specifically at strategic assignments and rotational assignments. We looked at this at Lockheed quite a bit as well. One of the metrics you can use is: what percent of those assignments were actually successful? If you put 10 people on a rotation, did it work out for eight of them, for nine of them? How many failed in the rotational assignment? That would be one thing you can look at. You can also just plot the trajectory of these high potentials. Then again, it could be a selection issue as well. If you're looking at the learning that's occurring by leadership folks with mentors or other people, you can look at perceived organizational support as a potential metric, as well as mentor-mentee fit indices, or you can look at just the percent of people who actually have a mentor or mentee, or the percent of people who actually have a current development plan in your LMS.

A method you might not be familiar with is Brinkerhoff's success case method. This is a different approach, much more in-depth and detailed. This would be a case where you have some successful and some unsuccessful people who've gone through an important training class; you had some winners and some people who weren't as successful. What happened? Why did it work that way? Typically, this will involve in-depth discussions, interviews, surveys, and focus groups with attendees to map out an impact model, to figure out where the root causes were and why this training had such a good impact on some and not as much on others, from a qualitative perspective. Then you'll create findings, analyze them, and make recommendations for how you might redo the training, the participant flow, or the participant preparation. This can involve things like pre-reads, post-class assignments, or communities of practice. There are all kinds of possibilities. It's probably only justified in cases where the training is really high value add.
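Going back to the experience-based metrics for a moment, here is a small sketch computing a rotation success rate, the percent of leaders with a mentor, and the percent with a current development plan from a hypothetical talent extract; the field names and values are invented for illustration:

```python
# Hypothetical extract of high-potential leaders on rotational assignments.
leaders = [
    {"name": "A", "rotation_successful": True,  "has_mentor": True,  "has_current_idp": True},
    {"name": "B", "rotation_successful": True,  "has_mentor": False, "has_current_idp": True},
    {"name": "C", "rotation_successful": False, "has_mentor": True,  "has_current_idp": False},
    {"name": "D", "rotation_successful": True,  "has_mentor": True,  "has_current_idp": True},
    {"name": "E", "rotation_successful": False, "has_mentor": False, "has_current_idp": True},
]

def pct(flag: str) -> float:
    """Percent of leaders for whom the given boolean field is True."""
    return 100 * sum(person[flag] for person in leaders) / len(leaders)

print(f"Rotation success rate:           {pct('rotation_successful'):.0f}%")
print(f"Percent with a mentor:           {pct('has_mentor'):.0f}%")
print(f"Percent with a current dev plan: {pct('has_current_idp'):.0f}%")
```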
And of course, the ADDIE model of instructional design, clearly the dominant model in the instructional design world: the E in ADDIE stands for evaluation, so let's talk about ADDIE's evaluation criteria as well. Typically, they'll look at the summative evaluation, which is Kirkpatrick's levels, but they'll also talk about formative evaluation. They're focused heavily on training design here, because it's an instructional design model, but they'll look at things like: did you include graphics, videos, storytelling, analogies, metaphors? Was it an interactive design? Did you get sufficient feedback and attention from the attendees? Did you use examples at all? Did you make the case for the payoff? How did you connect it to other learning they might have gotten, for example, other webinars, other courses they've had, required courses, 101 courses, college courses? Was it interesting enough, from a level one standpoint, to gain and keep their attention? And did it make sense for the learning styles of attendees at that particular company? So that's the ADDIE model, and its evaluation component is based on Scriven's work. Role playing and case studies are all ways to blend it up that ADDIE likes.

If you're thinking, well, Mark, I think I have some training metrics, and I think they're useful, but I'm not sure, use this little scorecard to assess the quality of your training metrics. Do the training metrics you're using actually help your L&D team improve the training? Are they an objective assessment of training quality, as opposed to the opinions of your trainers or the level one reaction measures? Do they show variability in their outcomes? If every training class is always an A, there's no variability, and as IO psychologists, we know you need variability to do a correlational analysis. Do they offer reliable measurement criteria, or are they unreliable? In other words, do different attendees have widely different ratings for the same class? That would be an example of unreliable measurement criteria. And of course, do they connect to important business outcomes, which we keep hammering on throughout this webinar?

Here are some common metrics that you'll often see. Hours of training completed annually per employee: across the United States, this is typically about 40 hours a year. That includes compliance training as well as leadership training, competency training, professional skills training, technical training, customer training, etc. So typically a week a year is spent per employee. If your company is spending less than a week a year, according to your LMS reports, you might be underinvesting in development. Maybe that makes sense for your business, your industry, or your employee base; if you have a very high turnover workforce, for example, it might modify how many hours of training you invest in. Completion rates by course is another very common one, and then the percent of completions that are done on time. For example, in California, you have to do sexual harassment training every two years. Were those completed in a timely fashion? Did you have a set of deadlines that you had to meet? Pass rates for certification tests or final exams: typically these are at the 80% level, but whatever it is, are you meeting it? The cost per learning hour would include the cost of the trainer, the trainees, the supplies, the materials, the venue, all those things. And then the average participant evaluation; this is the level one smile sheet. The percent of exempt workers who actually have a development plan in your LMS; this could be all workers, just exempt salaried workers, whatever you want to do. And then the percent of people trained by department. Maybe you have some departments or some regions that are really not pulling their people off the line to do training. This will give you a good sense of whether or not you have a culture of learning in your organization, or if there's an area that needs more focus, especially if they're also short on things like bench metrics for succession planning, or if they're having problems with turnover, which is often going to be related, especially for millennials.
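Here is a minimal sketch of how a few of those common metrics could be computed from LMS-style completion records; the records, cost figure, and deadline are hypothetical, for illustration only:

```python
from datetime import date

# Hypothetical LMS completion records for one course.
completions = [
    {"employee": "A", "hours": 2.0, "completed_on": date(2024, 3, 10)},
    {"employee": "B", "hours": 2.0, "completed_on": date(2024, 4, 2)},
    {"employee": "C", "hours": 2.0, "completed_on": None},  # not completed
]
deadline = date(2024, 3, 31)
enrolled = len(completions)

completed = [c for c in completions if c["completed_on"] is not None]
completion_rate = len(completed) / enrolled
on_time_rate = sum(c["completed_on"] <= deadline for c in completed) / enrolled

# Cost per learning hour: fully loaded delivery cost divided by total learner hours.
delivery_cost = 1_200.0                      # trainer + materials + venue (illustrative)
learner_hours = sum(c["hours"] for c in completed)
cost_per_learning_hour = delivery_cost / learner_hours

print(f"Completion rate:        {completion_rate:.0%}")
print(f"On-time completion:     {on_time_rate:.0%}")
print(f"Cost per learning hour: ${cost_per_learning_hour:.2f}")
```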
Now for some aspirational training metrics; these are a little more advanced. Look at the average number of hours of training by level of potential: who should have the most training? High potentials are typically very thirsty to learn and would normally be expected to have more training, and you should be particularly investing in your backups to make sure they're ready now. The percent of people promoted: if you completed a leadership class, you should be more likely to be promoted than a control group. Is that actually occurring? Technical outcomes in productivity, safety, or quality pre- and post-class: did you have fewer safety defects, fewer quality errors, fewer mistakes? Did you improve productivity versus the group who didn't complete the training? At Lockheed, this was a really critical one we looked at. The deltas for competencies or engagement pre- and post-class. The count of the most commonly selected IDP needs: this is very interesting to look at anyway, to see what needs people are most often going after, because it could be different from the needs that are scored the lowest. You might have a lowest average competency of collaboration, but the most commonly selected competency by level might be different than that, and that can give you good information. The percent of movement into new jobs and new roles by job level: you have more movement occurring in middle management than you do in senior management, so this is very useful information to have. And then, as I mentioned before, the count of likes, comments, and shares, whether they're going out on YouTube, through your internal LMS, or through a massive open online course, which is more and more commonly part of your learning portfolio.
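As one example of those aspirational comparisons, a promotion-rate check might look like this sketch, with hypothetical counts for class completers versus a comparison group:

```python
# Hypothetical 12-month promotion outcomes.
completers_promoted, completers_total = 14, 60   # finished the leadership class
control_promoted, control_total       = 9, 75    # similar employees, no class

completer_rate = completers_promoted / completers_total
control_rate   = control_promoted / control_total

print(f"Promotion rate, class completers: {completer_rate:.1%}")
print(f"Promotion rate, comparison group: {control_rate:.1%}")
print(f"Difference: {completer_rate - control_rate:+.1%} "
      "(worth checking for selection effects before crediting the class)")
```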
One more example that I'll close with is an example of ROI that we saw in transit, at MV Transportation, a 20,000-person bus transportation company with hundreds of contracts across the U.S., including corporate shuttle, campus shuttle, city bus, paratransit, school bus, etc. We had a situation with overtime costs being very, very high, so we brought in a few dozen general managers, in three different classes, and they spent almost an entire day with one of our senior finance people going over an income statement, working as part of a group. They did a case study, they looked at their own income statement, they did peer group work in table exercises, and at the end of the day, they were able to identify how to predict, anticipate, and address overtime issues before they became as onerous. As a result, if you looked at their overtime costs for their sites three months prior to the class and three months after the class, and the three classes were held at different times of the year, so there was no seasonal effect, we saw an average 10% reduction in overtime costs, which of course is money directly to the bottom line. So there's an example that I was directly involved with recently.

Some specific recommended actions. Your LMS needs to be a metric generator. Everything coming out of it, from course completions to when they were completed, who attended, the level of the person, the hours of each class, the cost of each class: your LMS needs to be your metric generator. Your HRIS is also a powerful resource for data and metrics, so I strongly encourage you to figure out what data you can get from it, whether it's performance management data, competency rating data, pay data, job title data, level and pay grade data, anything you can get that you can then tie to the LMS and start showing interesting relationships, correlations, and ratios. And then integrate data beyond HR, bringing in customer data, operational productivity data, sales and financial data. All of that is going to be very, very powerful from a linkage analysis standpoint. Every culture has metrics that matter; they're the ones the CEO pays attention to. If you can find out how you impacted them and connect them to the learning data, your $164 billion in spend should be not only protected, but grown. So those are our recommended actions. Good luck taking training evaluation to the next level, and thank you for your attention today. Thank you.
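As a closing illustration of that "LMS as metric generator" recommendation, here is a minimal sketch that joins a hypothetical LMS extract to a hypothetical HRIS extract and checks one simple linkage; the table and column names are invented for the example, and any real integration would depend on your systems:

```python
import pandas as pd

# Hypothetical LMS extract: training hours completed per employee this year.
lms = pd.DataFrame({
    "employee_id":    [101, 102, 103, 104, 105],
    "training_hours": [45, 12, 60, 8, 30],
})

# Hypothetical HRIS extract: most recent performance rating (1-5 scale).
hris = pd.DataFrame({
    "employee_id":        [101, 102, 103, 104, 105],
    "performance_rating": [4.2, 3.1, 4.5, 2.8, 3.6],
})

# Join on the shared employee key and look at a simple linkage:
# do training hours and performance ratings move together?
merged = lms.merge(hris, on="employee_id")
correlation = merged["training_hours"].corr(merged["performance_rating"])

print(merged)
print(f"Correlation between training hours and performance rating: {correlation:.2f}")
# Correlation is only a starting point; confounds (role, tenure, manager)
# would need to be addressed before claiming any causal impact.
```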
