Competency-Based Training for Biomolecular Researchers: Insights from BioAccel (Full Transcript)

Vera Matzer discusses competency-based training for biomolecular researchers, focusing on high-performance computing needs and the BioAccel project.

Speaker 1: Matzer from EBI, who's going to be talking about training biomolecular researchers.

Speaker 2: Thank you very much. I'd like to thank the organizers for allowing me to come. Can you hear me well enough? I'm aware that my voice doesn't project very well, so I'll try and speak up. So I'm going to tell you today about competency-based training, specifically for biomolecular researchers with high computational needs, in the context of a project called BioAccel. I'm Vera Matzer, I work at EMBL-EBI, and I'm the work package leader for dissemination and training on this project. I'm going to tell you a little bit about BioAccel, just so that you have the context of this project. Then I'm going to tell you about the competency profile, what our aim was with it, and the implementation. And mainly: what have we learned, what would we do differently if we started over again, and what are we going to do next? There's one example that I'm particularly going to highlight, and that's our HPC readiness, so that's high-performance computing readiness. Now, life science and HPC have become increasingly connected over the last years. And this is where the EU has created a new instrument called the Centers of Excellence. The idea behind it is to connect the scientific communities better to the high-performance infrastructure that's available. In the case of BioAccel, that's the computational biomolecular research community, but there are a number of other Centers of Excellence as well, for instance on biomedicine, materials science, and weather prediction. So they've looked at areas that particularly need high-performance computing, and that to some extent are currently limited by the computing available. Now, BioAccel particularly looks at the atomistic to cellular level, and while life science becomes more dependent on HPC, there are a lot of new challenges. There are challenges around data storage and data analysis, and the data sets are getting much more complex as well.
For us, the main challenge was the skills gap that currently exists. We've already talked about it a little bit. A lot of people get through their degrees without the right skills to be able to work in HPC, for the moment at least. There are three main goals that the BioAccel project was written around. The aim is to become a central hub for biomolecular modeling and simulations. We try to improve the excellence in biomolecular science by improving the performance, efficiency, and scalability of some of the key codes that are in the project. So for instance, this is GROMACS for molecular dynamics, HADDOCK for docking, and CPMD for QM/MM. But we've also tried to improve the usability. And this is one of the ways that we're trying to approach this skills gap: by building workflow environments where the data is integrated, to make the codes easier to use. A lot of this software is very, very difficult to install, especially on different HPC centers that are all a little bit different. And then the third one is competence building among academia and industry. So we try to promote best practice and train users to make best use of both the software and the infrastructure available. Now the process that we've used: as I've already mentioned, we use a competency framework. We originally looked at existing frameworks, one of them being the ISCB one. And with a small working group, we built a draft version, an initial version that we could discuss at a community meeting. So we had a meeting where we brought the project together, but also people from outside the project, for instance to make sure we had a good representation of industry. And we started to work on our draft profile. It got changed quite a lot, actually, at that stage. And we defined what the required competencies are to work in this particular domain.
We then tried to refine that a little bit more by adding, for each competency, the knowledge, the skills, and the attitudes that you need. And then we also looked at the different roles that we have within our community, and what is actually essential for those roles, because there is a little bit of variation. And then we sent it out for consultation to ask: what's missing, what needs fixing, and do you feel that the way we've assigned roles and what is necessary for them fits? The other thing that we wanted to gain by this was to look at what training is already available, training materials that map onto these competencies. And then we went through a refining process, so we integrated some changes. And I think you've already heard some people say it today: this is a living thing. We expect this to change over time. We don't think these competencies will stay the same. So we've made it available as a living document. So what's the structure? When I speak about the competency profile, I mean the whole thing, not just the competencies, but also the KSAs. So we have domains, four overarching domains where a number of related competencies cluster together. And then under the competencies, we have the knowledge, skills, and attitudes. Actually, one of the main things we've learned throughout this process is that it's at that KSA layer that we found the highest usability. And now why is there a tiger on there? The tiger is just to illustrate that this got very big, very quickly. I've got an example competency just to show what it looks like. One of the ones we've got is: write his or her own scripts to perform tasks in the context of biomolecular research.
And then as an example of knowledge, we've got knowledge of existing commands and libraries for research, for reuse; skills to write and debug scripts; and behaviors to use the appropriate scripting language. It says behavior here; we've since changed it to attitudes to better comply with terminology. So I mentioned the tiger. Don't worry, you don't actually have to be able to read the text on there. This is just to show how big it got. So we have the four domains, and we add the competencies to that, with a slightly varying number of competencies per domain. Then we add the knowledge, we add the skills, and we add the attitudes. So you can see the huge number of entities we're already working with at this point. You can also see it varies quite a lot. Now I have to immediately point out, I suspect that's also because, when we started, we didn't anticipate how important the KSAs were. So I suspect a little bit of this is also down to the community meeting: how closely did that competency align with who was in the room, and so how well did the discussion get at what actually needed to be in it? So this is something we definitely need to refine over time, to look at whether we need this many under all of them, whether we can condense it a little bit, and whether there are some that are missing. Now then, we also added a little bit of additional information. So we added some proto-personas. These are broad user groups that we have. We have an entry-level user, we have a specialist user, and then we have systems administrators and application experts. So these are not very detailed personas, but what we use them for is things like strategic planning. So have we, in our training program, addressed all of these groups, or are there actually groups that we haven't worked with that much?
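
As an illustration outside the talk itself, the profile structure described above (domains, then competencies, then knowledge, skills and attitudes) can be sketched as a small data model. This is a minimal sketch, not the project's actual schema: the class names, the `profile_size` helper and the domain name are assumptions made for the example, and only the scripting competency and its three KSAs are taken from the talk.

```python
from dataclasses import dataclass, field


@dataclass
class Competency:
    """One competency with its knowledge, skills and attitudes (KSAs)."""
    name: str
    knowledge: list[str] = field(default_factory=list)
    skills: list[str] = field(default_factory=list)
    attitudes: list[str] = field(default_factory=list)

    def ksa_count(self) -> int:
        """Number of KSA entries attached to this competency."""
        return len(self.knowledge) + len(self.skills) + len(self.attitudes)


@dataclass
class Domain:
    """An overarching domain that clusters related competencies."""
    name: str
    competencies: list[Competency] = field(default_factory=list)


def profile_size(domains: list[Domain]) -> int:
    """Count every entity in the profile: domains, competencies and KSAs."""
    return sum(
        1 + sum(1 + c.ksa_count() for c in d.competencies)
        for d in domains
    )


# The example competency quoted in the talk.
scripting = Competency(
    name="Write own scripts to perform tasks in the context of biomolecular research",
    knowledge=["Existing commands and libraries for reuse"],
    skills=["Write and debug scripts"],
    attitudes=["Use the appropriate scripting language"],
)

# The domain name below is hypothetical; the talk does not name the four domains.
profile = [Domain(name="Scripting and software (hypothetical)", competencies=[scripting])]
```

Counting entities this way makes the "tiger" visible: each extra competency adds one entry plus all of its KSAs, so the total grows quickly even with only four domains.
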
And one of the things we know is that the systems administrators group is actually a really hard group to reach, so in the project we've mainly been focusing on the first two, because the third one is really difficult. We've used the personas to define our audience, to make sure that when we create a course, we have it very clearly in our mind who it's meant for, and then we've mapped these to the competencies as well. What we've done is to look at, for each competency: is it not applicable, do people need awareness of it, do they need working knowledge, or do they need specialist knowledge? When we did this exercise, there was mostly consensus. Some might say working knowledge, some specialist knowledge, but it wasn't wildly different. And there are also KSA-level differences. For instance, when you look at something like software development or scripting, the entry-level user might be amending things that are out there already, while the specialist user might be writing their own. So once we'd created this competency profile, we then mapped the competencies to the existing training. That was training that was developed within the project and any third-party training that we were aware of; we literally just went looking for things. In no way did we get it all, obviously, but we tried to see what we could easily find to give a snapshot in time of what was available and how well it mapped. And then we did a gap analysis: where have we not been able to find a lot of material? Some of this information went into our knowledge resource center. This was something that we tagged on to the project at a late date; it wasn't really planned, but we were sitting on this data set of lots of training materials that we'd found that were nicely mapped to our competencies, and we really wanted to expose that.
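
The persona mapping and gap analysis described above can be sketched in a few lines, again as an illustration rather than the project's actual tooling. The four levels come from the talk; the competency names, the course names, the required levels and the `find_gaps` helper are all hypothetical.

```python
from enum import IntEnum


class Level(IntEnum):
    """Depth of a competency required for a proto-persona (levels from the talk)."""
    NOT_APPLICABLE = 0
    AWARENESS = 1
    WORKING_KNOWLEDGE = 2
    SPECIALIST_KNOWLEDGE = 3


# Hypothetical required level per proto-persona and competency.
required = {
    "entry-level user": {
        "scripting": Level.AWARENESS,
        "parallel computing": Level.AWARENESS,
    },
    "specialist user": {
        "scripting": Level.WORKING_KNOWLEDGE,
        "parallel computing": Level.SPECIALIST_KNOWLEDGE,
    },
}

# Hypothetical snapshot of training resources and the competencies they address.
training = {
    "Course A": ["scripting"],
    "Course B": ["scripting"],
}


def find_gaps(required, training, min_resources=1):
    """Return competencies some persona needs that fewer than
    `min_resources` training resources address."""
    counts = {}
    # Collect every competency that is applicable to at least one persona.
    for levels in required.values():
        for competency, level in levels.items():
            if level != Level.NOT_APPLICABLE:
                counts.setdefault(competency, 0)
    # Count how many resources map onto each of those competencies.
    for competencies in training.values():
        for competency in competencies:
            if competency in counts:
                counts[competency] += 1
    return sorted(c for c, n in counts.items() if n < min_resources)


gaps = find_gaps(required, training)
```

With this made-up snapshot, `gaps` contains only "parallel computing", since no listed resource addresses it. Raising `min_resources` turns the same function into a check for thinly covered competencies rather than entirely missing ones.
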
We wanted to show, you know, if you're looking for something, we've already done some of the hard work for you. So we actually ended up building this knowledge resource ourselves, because we couldn't really find anything out there that quite fitted; most of it wouldn't have supported the fact that we've mapped the training to these competencies. So this is still a little bit in beta and needs a little bit of work, but it allows you to search by competency and then see all of the training that we found. And then the competency profile, the mapping, and the gap analysis all fed into the training program that we've developed for this project. Now what did we learn? As I already mentioned, we felt that the KSAs were the part that was most usable, which is something that we didn't entirely anticipate, and we would have done things a little bit differently. But it helped us define what knowledge, skills, and attitudes are actually needed to be able to use the services within our field. And we also felt that there was really an incomplete mapping between courses and competencies. So sometimes a course would address a little bit of several competencies, but it didn't really address any of them all the way. Ideally the way to solve that would be to actually map to the KSAs, but this is difficult. Sometimes you get a really good quality description of what a course or a resource is about, and then you can do that. But sometimes, beyond a title and a two-line description, you don't really get a lot. So we also had to balance the effort, to see how much time we can actually spend digging into this to do that mapping. Now going forward, we do plan to do some of that, but we'll probably do it for courses or resources where that information is available. And this will then also show that we feel the quality level, at least of the description, is higher.
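
The incomplete mapping between courses and competencies mentioned above can be made concrete by mapping a course to individual KSAs and computing, per competency, the fraction it covers. This is a minimal sketch under stated assumptions: the `ksa_coverage` function, the competency names and every KSA label below are hypothetical and only illustrate the idea of KSA-level mapping.

```python
def ksa_coverage(competency_ksas, course_ksas):
    """For each competency, return the fraction of its KSAs a course addresses.

    competency_ksas: dict mapping competency name -> list of KSA labels
    course_ksas: iterable of KSA labels the course description claims to cover
    """
    covered = set(course_ksas)
    return {
        competency: len(covered & set(ksas)) / len(ksas)
        for competency, ksas in competency_ksas.items()
        if ksas  # skip competencies with no KSAs defined yet
    }


# Hypothetical KSA labels for two competencies.
competency_ksas = {
    "scripting": [
        "knows existing libraries",
        "writes scripts",
        "debugs scripts",
        "chooses appropriate language",
    ],
    "data management": [
        "knows storage options",
        "transfers data between systems",
    ],
}

# A hypothetical course that touches both competencies but completes neither.
course = ["writes scripts", "debugs scripts", "transfers data between systems"]

coverage = ksa_coverage(competency_ksas, course)
```

Here the course would count as "mapped" to both competencies at the competency level, while the KSA-level view shows it covers only half of each. This kind of mapping is only feasible when the course description is detailed enough to identify the KSAs it addresses.
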
And I already mentioned it, the beast: it's become really, really big. And we also felt that it's something that you have to be careful about. Is it essential to show all of this information when you're using it? Does the user really need to know this, or do they just need to profit from the structure that you've put in it? So that's why we came up with this tiger, kitten, nothing model. The tiger is when you really need the full profile, and you need the profile to be visible. So for instance, this is internal use. When we do course development, we use the whole thing. For strategic planning as well, we use the whole beast. And in my case, it lives in my head, so that's fine. But I don't expect people to be able to immediately grasp all of that. I've been working with it for three years, and I'm sometimes still surprised by some bits. The kitten is small doses, where you do want to show it a little bit. For instance, when we do a short course that's more based on skills development, you want to show people what skills or knowledge this course is actually giving them and what else is out there. And then the empty one: it doesn't mean it's not there, it's underneath, but we've chosen to hide it. For instance, we hide it in the learning outcomes for a course. Learning outcomes have become a lot easier to write now that I've got a competency profile to fall back on. Because when we've decided what kind of course to build, and you then have the KSAs that you've decided to address, actually, that's your learning outcomes almost written. And it means that people who go on the course don't actually have to grapple with this entire competency profile. They have learning outcomes, and that tells them where they're going. Now on the implementation side, as I said, for internal use, we use the whole thing. Externally, an example of the kitten is our knowledge resource center.
If you want to engage with the profile, it's there, and we will develop this further over time so that people can actually rate themselves on a scale for the competencies and see where they want to develop. But this is for when BioAccel hopefully gets follow-on funding, so it won't be immediate. For internal use, it's the course development. And I want to show you one example, and that's HPC training. What we actually found is that most of the training resources that are available are at the intermediate and advanced level, and most of the time not very well aimed at life scientists. So the life scientist wants to get to HPC, and they can often see the need for it, but they don't really know how to get there. They don't have the required background skills. So they might try, and either they fail or they get really discouraged, and they decide that a different field or a different specialty is better for them. Or it just takes a lot longer than it needed to. So what we've done is we've developed two courses. It doesn't mean we lead them all the way there; there's still a lot of work they have to do, but we want to make it easier. We want them to not get discouraged along the way. So two courses: one is foundation skills for HPC, and one is a hands-on introduction to HPC. Obviously, this is what we're hoping to achieve, and we're doing some impact analysis, also long-term, to see if we get there. So the first one was a summer school, and we really addressed the generic computing competencies here. We had a brainstorm with the trainers to see what KSAs we really wanted to cover, and we really focused on basic computing components. We actually, in the training room, ripped some of the old computers open and really showed them, okay, when someone says cooling fan, this is what they're doing, hard disk, this is what they're doing, to really get them to think about what's the bottleneck in their particular biological problem.
So everything was project-based, and it mapped onto the KSAs. We used day-to-day examples, such as obtaining data and installing software, especially software that wasn't that easy to install, and it ended with an introduction to HPC. The faculty was also the same, in the sense that the supporting faculty was the faculty that ran the second course, to make sure that we had the right people involved. The second one is a PRACE course, and we really amended it. We added an extra day to make it a little bit easier to digest, took all the examples out, and replaced them with biological examples to make sure that participants were in their comfort zone. And this course addressed the parallel computing competencies. Now, that's the end of my talk. There are a couple of people that I really want to thank: Kath, who's in the audience and leads the training team, and the entire training team, actually, because developing this has been, at points, quite difficult, and having such a brilliant team around me has really made it a lot easier. And then the BioAccel Center of Excellence. Specifically, in the competency and HPC readiness work, there were a couple of people who were more heavily involved, and Lee Larkin as well, who isn't part of either one, but still is quite important to us. Any questions?

Speaker 1: Yes, thank you, I won't go to the mic. What's the plan for maintaining the KSAs as the world changes?

Speaker 2: Yeah, that's definitely something that's important. For the moment, we're doing it on two tracks. One, we're hoping for follow-up funding for the particular project, in which case, projects always change a little bit. There'll be new partners, so we're really planning a proper overhaul at that point. The other thing is, within our team, we are working on a kind of a new home for the competency profile, because we don't only have this one, we have a number of them from different projects. We're looking at how does that work, who owns the profile, who maintains it, and to make sure there is some cohesion across the end of the project, absolutely.
