Mastering Journalism Archive Management: Best Practices for Digital Content Preservation
Learn essential steps for effective journalism archive management, including creation, labeling, storage, and repurposing of digital content for future access.
File
Personal Digital Archiving
Added on 10/01/2024

Speaker 1: Let's get started. Thanks for joining me. As you can see on the screen, we're going to talk about what we've called journalism archive management. That comes from the broader field of personal digital archiving, or PDA. It doesn't matter what you call it. What matters is whether you do it or not. If you want to keep your digital content around so that you can access it in the future, we're going to talk about some best practices and ideas for how you can do that. So, can people raise their hand on Zoom? If you have lost digital content of some kind, photos or papers or anything like that, could you raise your hand? Yeah. Okay. I know I sure have. What happens is we've gotten into a paradigm where we think of digital as being permanent, but nothing could be further from the truth. If you look online, you'll see what I mean. The average life of a webpage today is about 100 days. So whatever information was created there can disappear easily, and it's virtually irretrievable. Some of that happens because of link rot, where the URLs change. Some of it happens because of changes in content management systems. But there is a whole series of factors in technology that make content ephemeral and fragile. So, let's talk about five practical steps that you can use. You'll see at the bottom there's also a link to a library guide, libraryguides.missouri.edu/jam, that will take you there. I think you can also just go to jam.missouri.edu. I think that might take you there, too. So, first, we'll review the five steps. You're all creators of content. You're creating text documents all the time. You are possibly creating photographs, graphics, videos, podcasts, audio content. Of course, there are all kinds of content. So, your first step is creation. 
The next step, after you've created something, is labeling. You don't want to label it untitled. You need to label it something that you can find. You also need to store it properly, in several places, so that you don't lose it. Then, the next part of the cycle is that you can find it. Once you find it, you can actually repurpose it. You can reuse it. Leverage your digital assets wisely. So, the first step is creation of video, audio, stories, assignments, and things like that. I don't want to dwell much on creating, because you guys are all good at that. But let's talk about labeling. Think of labeling as a note to your future self about what's important in this content. Because once it's stored on a hard drive, if it just has some vague descriptor on it, like math paper or midterm, some generic term like that, that's not very helpful. I also think about journalism. Let's say you have a number of assignments, and you keep running into Brian Smith. One Brian Smith is a photographer. Another Brian Smith is an accountant. Another Brian Smith is a baseball player. And you're taking all these pictures. If you name all of these files Brian Smith, you're not going to be helping yourself much. You need to add a descriptor: is this the baseball player? Is he the accountant? Is he the photographer? Et cetera. Put enough in there. And we often use the date. I like to start with the date, because I can generally think about when things happened, and between the date and a few other descriptors, I can get pretty close to what I'm looking for pretty easily when I'm searching. So, labeling: I can't overemphasize how important it is to label things and to think about what's important, what's different about this that's going to stand out in a year or five or 10 years. If you're going to keep your stuff around, you may be able to use it again, but only if you can find it. Storage. 
We recommend that you store your project in three places. Of course, you have it on your computer, whether that's a laptop or desktop. We recommend that you also store it on an external hard drive and that you use Box. And if you haven't used Box, you, as a student here, should be able to get a free account. Well, actually, it's not free. You've paid for it. So, use it. Moving on. Finding. If you've labeled it and stored it, finding should not be a problem. You can use a variety of methods to search your files and your content, narrow it down, and get what you need. And then the last part: remember, you can reuse things. Talking about journalists in particular, let's say you've done a profile on a person, and maybe this is a multifaceted person, or maybe you have a lot of material. You can take the same material, repurpose it, and write another article from it without going to much more trouble than that. And then that circles back to create. So, it's a cycle, a life cycle for your digital content. So, let's say we're talking about creation. We'll review a few different kinds. We create text, photo, audio, video, blogs. Blogs are a little different from the others. As I said before, label content as a note to your future self. Be consistent about how you name things, even if you need to change it later. If you're consistent, then you can go back and make those changes more easily than if you are sometimes putting the date first and sometimes putting the date last, sometimes writing the date as day, month, year, and sometimes as year, month, day. If you keep changing it, it's going to be hard for you to clean that up later. So, just be consistent. That helps. And name your folders according to, well, for students, a class. Put your files in the appropriate folder. That will help, even if you just put a number there and name it. Labeling. 
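The naming advice above (start with the date, add a few descriptors, and keep one consistent pattern) can be sketched in Python. This is only an illustration under stated assumptions: the function name archive_name, the underscore and hyphen separators, and the example date are hypothetical choices, not a convention given in the talk.

```python
import re
from datetime import date


def archive_name(created: date, *descriptors: str, ext: str) -> str:
    """Build a date-first file name from a date, descriptors, and an extension.

    Hypothetical helper: the separator choices are one possible convention.
    """
    # ISO year-month-day order means an alphabetical sort is also chronological.
    parts = [created.isoformat()]
    for text in descriptors:
        # Lowercase, then collapse every run of non-alphanumerics into a hyphen.
        slug = re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")
        if slug:
            parts.append(slug)
    return "_".join(parts) + "." + ext.lower().lstrip(".")


# Three different Brian Smiths stay distinguishable (example date is made up).
for role in ("photographer", "accountant", "baseball player"):
    print(archive_name(date(2024, 3, 15), "Brian Smith", role, ext="JPG"))
```

Because every name is produced by the same function, the pattern stays consistent, and changing the convention later means changing one place rather than renaming files by hand in several different styles.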
So, let's say you're working on text files. If you're working in something like Microsoft Word, well, I don't know that Google Docs has been around long enough to tell what's going to happen. Hopefully, they'll be able to migrate one version to the next fairly seamlessly. But I know that in the past, with programs like Word, if you create a document in one version and try to open it in a later version, it looks different. Things have changed. The fonts will change. Spacing will change. Sometimes more dramatic effects happen. What you can do about that is save it as a PDF. I think on most machines now, if you go to print, there's an option to save it as a PDF. The PDF is a nice stable format that shows exactly what the document looked like when you created it. It was created especially for this purpose, whereas Word and other word processing programs don't necessarily have the future in mind. We'll talk a little bit about formats. For most media, there are different ways to store the file, different formats you can save it in. Say, for photos, a full-resolution, uncompressed TIFF has got all the information in there. It's high fidelity, it's got bit depth, all these different things. But those files take up a lot of space. So, if you're trying to save space, or for some purposes, like a web browser, you need to save it as a JPEG. JPEG is what's called a lossy format. It's like squeezing out the excess pixels or the excess information in a photo file. To do that, a lossy format, which saves a lot of space, will actually throw out some information. The thing to know is that if you repeatedly edit and save something in a lossy format, it will eventually destroy the integrity of the image or the file. It's not made for multiple sessions of editing. 
You might have to open it and recompress it 20 times, but it would eventually mush it up pretty badly. So, you've got the uncompressed TIFF and the compressed JPEG. You have a similar situation with audio, with WAV files being the full-resolution audio file. I think most of us are familiar with MP3s as a compressed format. I think MP3s are all a bit lossy. You can change the resolution on them. But again, if you're saving them and resaving them, there's a chance that you're going to degrade the audio. So, save uncompressed audio files as WAVs, compressed files as MP3s. Likewise, video has its uncompressed and compressed versions. AVI and MXF are full-sized files. They're big and take up a lot of space on your hard drive. MP4 is a compressed format that saves space, and it's also useful for display: if you have a website, you can stream an MP4 pretty easily. So, let's talk a little bit about storage. We talked about three places. There are different ways you can do this. If you have plenty of room in Box, which is a cloud service, you could store your uncompressed version in Box. You could store the compressed copy on your computer, because I often run out of space on my computer, so it's kind of a premium place. Or, if you have a big enough external drive, you could put your uncompressed copy on the external drive and another compressed copy in Box, just to allocate your space. But it's up to you. You can strategize that. Let's see. You can sign up for an authorized Box account. I think if you just search Mizzou Box, Google that, you'll find it, and if you're a student, you sign up for that. I believe you get 50 gigs to start. If you want to keep that after you're a student, you will have to pay for it. 
But presumably, you will be making so much money, it won't matter. Finding content is easy if you have labeled and stored it. There are some exercises we put in here. Again, the point is that you can access your files and repurpose them. If you, like many journalism students, have to put a portfolio together, you would want to be able to access your work and maybe customize it. Depending on what set of skills they were looking for, you could pull up various examples and customize a portfolio, because you stored lots of different things that you've done in the past. So I think I'll skip this. We're back to the life cycle, those basic steps. I would say the two key steps here that will help you are labeling things properly and storing them properly. The rest of it should flow from there. All right, do we have any questions? Anybody have an issue or a situation that they need help with? Yes?

Speaker 2: Can a JPEG file be changed to a TIFF file?

Speaker 1: It can. You can convert back to TIFF. Now, the pixels that were compressed or thrown out won't come back. They're not reconstituted. You can't put the genie back in the box. But if you only had JPEGs and you wanted to keep those permanently, you could convert them to TIFFs, and then you could do your editing on the TIFFs and there would be no further loss of quality. Right. Other questions?

Speaker 2: Yeah. I see a cloud.

Speaker 1: Right.

Speaker 2: Maybe all you guys know about this stuff, but I'm kind of skeptical about it. Like where is it? Who owns it? How long is that going to be around? What exactly is it?

Speaker 1: The cloud. Oh, the question was: what is the cloud? And what are the concerns about issues like permanence, privacy, ownership, that kind of thing? So, the cloud just refers to a network of computer servers that could be located pretty much anywhere on the planet nowadays. They're connected by networks. There are companies, obviously Amazon is a big player, that run what are referred to as server farms. They're gigantic buildings. They locate them close to power plants, like hydroelectric power plants, because they want an uninterrupted stream of power 24 hours a day, seven days a week, 365 days a year. As for privacy, whenever you're on the internet, I would say there's a risk of losing privacy, of people hacking or snooping. Just about any time we're using a search engine, except for DuckDuckGo, if you're using Google or Bing or some of the other ones, they're monitoring everything you do. And if you have a cell phone, in this day and age, I would say we don't have much privacy if we're connected to the internet. Personally, I'm not too worried that anything bad is going to happen to my stuff because of this, but I certainly would advocate being aware of what's happening. And if you want to stay out of that, there are measures you can take. Like I say, if you're doing a search, you could use an engine like DuckDuckGo, which doesn't track you all the time, but your ISP still knows what you're doing.

Speaker 2: My question's more directed at, for example, if I put something on CNN or BirdEye, and it's something that's in my profession that I can go back to, and so the suggestion is that the servers are so redundant and there's stuff snuck in. I mean, aren't they subject to crashing and losing data? So, how safe is your content on one of these big servers?

Speaker 1: It's safer than it would be on your computer or your hard drive, because a computer or a hard drive that you own is basically a single point of failure. If you were to lose it, if you were to drop it, if there was a pipe that burst over it and got water all over it, there's a risk that you can lose your content that way. Now, optical disks are not as vulnerable, and it's not a bad idea to store things on different kinds of media like that. But in practical terms, uploading it to the cloud means that it is distributed, and they do monitor it. That's all they do: they have people and computer systems that monitor the uptime of the computers that make up the cloud. The disks fail, but they're redundant. So, as one disk fails, they swap it and put another disk in, and it rebuilds itself from the other disks. If you look at the statistics, it's like 99.9 percent, with I don't know how many digits behind it, reliable that that stuff's going to be there. Now, will the business model last forever? Well, I don't know of anything that lasts forever, but for the foreseeable future, yeah. And we're not even talking hundreds of years. With digital stuff, we're talking about five to ten years out. So, you're right, with books, we've figured out how to handle ink on paper. We've done a pretty good job of keeping that around. It doesn't last forever, and if there's a fire or a flood, or insects or mold, it's all subject to physical problems. And, ultimately, digital stuff is stored on physical media, too. So, it doesn't last forever. The way that people are keeping it is kind of like what the monks did back in the Dark Ages, medieval times: they made a copy. It took them a lot longer. 
But with digital stuff, having multiple copies of things and storing them in different places is currently about the best strategy. It's called LOCKSS, lots of copies keep stuff safe, L-O-C-K-S-S. Other questions, concerns, things that you run into? Anything from our Zoom audience? Everybody happy? Well, that's all I have for you today.

Speaker 2: Thank you.
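
The three-places advice and the "lots of copies" idea from the talk can be sketched in Python as copying a file to several locations and confirming that each copy is byte-for-byte identical. This is a rough sketch under stated assumptions: the function names and the folder names external_drive and box_sync are hypothetical stand-ins for an external hard drive and a cloud folder, and comparing SHA-256 checksums is one common way to verify copies, not a method described by the speaker.

```python
import hashlib
import shutil
import tempfile
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def copy_and_verify(original: Path, destinations: list[Path]) -> dict[str, bool]:
    """Copy `original` into each folder and report whether each copy matches it."""
    expected = sha256_of(original)
    results = {}
    for folder in destinations:
        folder.mkdir(parents=True, exist_ok=True)
        copy = folder / original.name
        shutil.copy2(original, copy)  # copy2 also keeps file timestamps
        results[str(copy)] = sha256_of(copy) == expected
    return results


# Demo in a throwaway directory; the folder names are hypothetical placeholders.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    story = root / "laptop" / "2024-03-15_brian-smith_profile.txt"
    story.parent.mkdir()
    story.write_text("Draft profile text.")
    report = copy_and_verify(story, [root / "external_drive", root / "box_sync"])
    for location, matches in report.items():
        print(location, "OK" if matches else "MISMATCH")
```

Rerunning sha256_of on each copy from time to time and comparing the digests is a simple way to notice when one copy has been damaged, so it can be replaced from another.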
