Chad Chelius and Dax Castro during an Accessibility Podcast with the Chax Chat Logo between them.

Acrobat Action Wizard and Dealing with Scanned Content

Accessibility Podcast Topic Links

Accessibility Podcast Transcript

Dax Castro
Welcome to another episode of Chax Chat. Join Chad Chelius and me, Dax Castro, where each week we wax poetic about document accessibility topics, tips, and the struggle of remediation and compliance. So sit back, grab your favorite mug of whatever, and let’s get started.

Chad Chelius
Welcome, everyone. Today’s podcast is sponsored by AbleDocs, makers of axesWord, axesPDF, as well as document remediation services. So we want to thank them for being our sponsor once again on today’s podcast. My name is Chad Chelius. I’m an Adobe Certified Instructor, an Accessible Document Specialist, as well as consultant.

Dax Castro
And my name is Dax Castro. I’m an Adobe certified PDF Accessibility Trainer, as well as certified as an Accessible Document Specialist by the International Association of Accessibility Professionals. Hey Chad, how is it going, man?

Chad Chelius
Good, man. How are you doing today?

Dax Castro
Not too bad. I love the time we get to spend recording Chax Chat. I wanted to tell you that someone made a comment on a post earlier this week. And they said, “It isn’t Chax Chat if it doesn’t start out with an update on the alpaca.”

Chad Chelius
Yeah.

Dax Castro
I worry because our intros are always, usually, filled with personal stuff. But it was fun to see someone’s comment about that. But her comment was actually about the fact that on our transcript page, we have now started putting headings between the different paragraphs to kind of separate content. So as people are reading the content in the transcript, they can kind of skim through it and see what paragraphs are talking about what. It’s still kind of a work in progress. I’m still feeling out how detailed to be. But her comment was, “Hey, could you put all the links together in one section, so that they weren’t dispersed throughout the paragraphs?” So there was a section of links for the panel. What do you think about that?

Adding Accessibility Show Links

Chad Chelius
Well, you know what, I mean, it makes a whole lot of sense. Because I can’t tell you how many times I’ve been scouring a web page for a link to something. I know I’d seen it there before, but I have to kind of scour the whole thing to try to figure out where the link is. And that suggestion makes quite a bit of sense. You know what I mean? If you put all the links together, you can easily find them. And what do I always say, Dax? “Accessibility benefits everyone.” And so I’m sure that would be a very convenient thing to have in our transcripts.

Dax Castro
Well, thank you to our listener who brought that up, and we’re going to implement that on this podcast. I will say, I’m not sure if we’re gonna be able to go back to the other 25 [episodes] – we are on episode 26 – and fix those, but definitely, from this point forward, I’ll make sure that I try to put those links in the link section at the top.

Chad Chelius
Right. Well, this is a familiar conundrum, isn’t it, Dax? Because we work with companies all the time. And one of the most common questions we get is like, “Okay. This is great. Now we know how to remediate a file. But what about the years of documents that exist on our website?” And it certainly is a challenge. And we want to make everything accessible to everyone.

Chad Chelius
But in all honesty, a lot of companies don’t have the bandwidth to go back in time and remediate all those existing files. So, we’re kind of in the same position right now. And I guess what we can say is, “We want to know your suggestions.” And what we can do is make changes moving forward. And if for some reason we find ourselves with a lot of time on our hands, maybe we’ll go back at some point and change the existing ones. But we could definitely make improvements moving forward, and we definitely will based on your feedback.

Accessibility Basics

Dax Castro
Absolutely! We do take it seriously. We want to make sure we get things… You know, it’s an evolution. It’s like accessibility itself. It’s a journey, not a destination. And so as we continue to improve, we’re trying to do things better and make things good for you. And then it goes back to the accessibility journey, that we’re all on. For those of our listeners who are new to the journey, you may be struggling with some of the more advanced concepts, but that doesn’t mean you can’t make sure that things have alt-text, that your titles are set right, that your headings are in order.

Dax Castro
I mean, some of the basic things. If you can get that far, you’re going to create a much easier user experience for the person on the other end. Now, maybe you’ve got some obscure error that PAC 3 is throwing at you, you’ve got some untagged annotations or something else, you’ll get there. You know, it’ll come and you’ll figure it out. But if you can put your headings in order and add alt-text – just those two things – you’re going to be leaps and bounds ahead in creating a much easier user experience for someone reviewing your document.
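
The two basics mentioned here, headings in order and alt-text on figures, can be illustrated with a minimal Python sketch. It is only an illustration under stated assumptions: the flat list of dictionaries with `type` and `alt` keys is a hypothetical stand-in for a document's tags, not the format used by Acrobat, PAC 3, or any other tool, and `check_basics` is a made-up helper name.

```python
from typing import Dict, List


def check_basics(tags: List[Dict[str, str]]) -> List[str]:
    """Report skipped heading levels and figures without alt-text.

    `tags` is a hypothetical, simplified reading-order list such as
    [{"type": "H1"}, {"type": "Figure", "alt": "..."}].
    """
    problems: List[str] = []
    previous_level = 0  # 0 means no heading has been seen yet
    for index, tag in enumerate(tags):
        kind = tag["type"]
        is_heading = len(kind) == 2 and kind[0] == "H" and kind[1].isdigit()
        if is_heading:
            level = int(kind[1])
            if previous_level == 0 and level != 1:
                problems.append(f"tag {index}: first heading is {kind}, expected H1")
            elif previous_level > 0 and level > previous_level + 1:
                problems.append(
                    f"tag {index}: {kind} follows H{previous_level} and skips a level"
                )
            previous_level = level
        elif kind == "Figure" and not tag.get("alt", "").strip():
            problems.append(f"tag {index}: Figure has no alt-text")
    return problems


# Hypothetical sample: H3 directly after H1, and one figure with empty alt-text.
sample = [
    {"type": "H1"},
    {"type": "P"},
    {"type": "H3"},
    {"type": "Figure", "alt": ""},
    {"type": "Figure", "alt": "Chax Chat logo"},
]

for problem in check_basics(sample):
    print(problem)
```

A check like this only catches the mechanical part. Whether the alt-text actually describes the image, or whether a heading is the right level for its content, still needs a person to review.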

Chad Chelius
Yeah. I mean, I always say, “You’re better than where you started.” So it’s all progress. I mean, we all want it to be perfect. We want to see all those green checkmarks as we always point out, but at the end of the day, anything you do is an improvement. So keep that in mind. But hey Dax, speaking of Facebook posts, I wanted to talk about one [post] that popped up. I think it was yesterday or maybe it was even this morning, where…

Dax Castro
That’s on our PDF Accessibility Facebook group by the way.

Adobe Acrobat Accessibility Wizard

Chad Chelius
Yeah. Thank you for pointing that out. But she had said she was working in InDesign. Although, it doesn’t really matter. It could be any program. But she was working in a source application, and she was making a PDF, and she was saying that when she took it into Acrobat, she was running the Accessibility Wizard, and she was getting all kinds of problems.

Dax Castro
The Accessibility wizard?

Chad Chelius
Yes! And I can’t tell you how many times I have run into people who are doing this.

Dax Castro
Well, it’s interesting, because the accessibility wizard is such a hard thing to even find. For those of you listening, if you go to View, and then Tools, and then Action Wizards at the bottom – yeah, I guess it is Action Wizards at the bottom – or you can go on the tools bar right on the right hand side and hit the add. It’s more tools. That’s right, more tools. And then at the bottom, under the Customize, there is the action wizard. And it just looks like a little check mark. So it’s very obscure. You wouldn’t know it’s there, unless you were really looking for it.

Chad Chelius
Yeah, in the tools category, it’s under the Customize category, if you’re looking for it. And actually, I don’t want you to look for it.

Dax Castro
That’s where we are going with this, right?

Chad Chelius
So, the Action Wizard is an interesting little tool that they added inside of Acrobat. And to be quite honest, the goal of the Action Wizard is very, very bare bones. At some level, you can just take any old PDF file and improve the accessibility of the document. Now, when I say improve the accessibility, I’m giving it probably more credit than I should, because it’s a very, very rudimentary tool. You know, if you’ve been listening to our podcast for any length of time, we go into some significant detail sometimes regarding the remediation process, techniques, tips and tricks, things like that. The Action Wizard is a very, very bare bones tool. So we just got done talking about how anything is an improvement. And I guess we could argue that if you’ve got a PDF with no tags whatsoever, running the Action Wizard is arguably an improvement.

Chad Chelius
But if you have a PDF that is already tagged, run like the wind – run, Forrest, run – because it will mess up your document. You’re literally taking one step forward and 10 steps backwards, because all the work that you’ve done in the source application, you’re now getting ready to nuke it all. Because the Action Wizard doesn’t understand what’s there, and it just kind of does a steamroller process, and just kind of plows through everything. So I ran the Action Wizard on a document that I had open, that was not tagged at all, and the Action Wizard literally took everything on the page, put it in a figure tag, and asked me to add alt-text to the figure. That was the extent of the accessibility that it was doing in this document. And Dax…

Problems with Automated Accessible Forms

Dax Castro
Well, you know, I had one. And it had form fields in it. And it was grabbing the content from the form fields above, from the lines above, and grouping it into the next ones. And it did really well on a couple of the tables that were regular, but once they started to become irregular tables with merged cells and other things, it was just not good. So my suggestion is, if you’ve got a basic Word document or a basic form letter or something pretty simple, the wizard is not bad. If that’s where you’ve got to go with it, that’s fine. I think the biggest thing for me, though, is that it kind of walks you through. If you look at the three different sections here, the first is Prepare, and it tells you what it’s going to do: add the document description, set the open options.

Dax Castro
And those are good things. You want to start with that with almost every document. And then it does Recognize Text using OCR, so it’s going to be looking for scanned content. The problem I have with this is that if you’re using the wizard to run OCR on scanned images, you don’t have any fine control, versus the enhanced scan capabilities in the OCR workflow that Acrobat uses. This is just the basic level of that tool. And there’s much more that you can actually do with that tool if you go and actually run Recognize Text rather than using the wizard. I think the wizard is just kind of a basic level. But then I’m looking at the rest of this and going, “Okay. Set the language tags, auto tag the document, and set alternate text.” Auto tag the document! We all know, when you try to auto tag a document just using Acrobat, it really doesn’t do a great job a majority of the time. And the problem is, if you don’t understand tag structure and you use auto tag, how do you know what’s not tagged correctly?

Chad Chelius
Yeah. You’re absolutely right. I mean, PDF remediation, certainly, is harder than it should be. But there is a certain skill set required as you’re doing this. And it kind of comes back to, we use checkers and they’re great tools. But at the end of the day, we need a human being to evaluate this document to look for things like headings are not tagged or bulleted items are not tagged correctly. You know what I mean. Like we need a human being to find those problems.

Dax Castro
Well, one of the other things, though, is that even if you did get that part, you use this wizard and run the Set Alternate Text wizard. Right now, there’s a bug in that wizard: if you check the box to mark a figure as decorative, it actually adds an artifact tag to the tags tree, rather than marking the content as an artifact in the Content panel, which is a violation of PDF/UA. So you might look at that and go, “Yeah, oh that’s great. It set an artifact tag.” And that’s just fine unless you actually are trying to remediate this to the PDF/UA standard. And in fact, it’s funny that Acrobat itself doesn’t even catch that the artifact tag exists in the wrong place. So yeah, it can be problematic if you don’t know what you’re doing.
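
The problem described here, an artifact tag sitting inside the tags tree, is the kind of thing a simple recursive walk can spot. The sketch below is hypothetical: the nested dictionaries with `type` and `children` keys are an assumed stand-in for a tags tree, not something Acrobat exports, and `find_artifact_tags` is an illustrative name rather than a real API.

```python
from typing import Any, Dict, List


def find_artifact_tags(node: Dict[str, Any], path: str = "") -> List[str]:
    """Return the path of every 'Artifact' node found in a simplified tag tree."""
    here = f"{path}/{node['type']}"
    found = [here] if node["type"] == "Artifact" else []
    # Recurse into children; leaf nodes simply have no "children" key.
    for child in node.get("children", []):
        found.extend(find_artifact_tags(child, here))
    return found


# Hypothetical tree: one decorative figure was left as an Artifact tag.
tree = {
    "type": "Document",
    "children": [
        {"type": "H1"},
        {"type": "Sect", "children": [{"type": "P"}, {"type": "Artifact"}]},
    ],
}

print(find_artifact_tags(tree))
```

Any path this returns points at a tag to remove from the tree, with the underlying content marked as an artifact in the Content panel instead, which is the fix described above.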

Chad Chelius
For sure. So I think for most of our listeners, who are here… You know, you guys are listening to this because you’re trying to increase your knowledge and learn new information, new techniques. For all of you, I think, we’re gonna say, “Do not use the Make Accessible Action Wizard.” Just do not touch it. Run away and avoid using it. And in the case of the person who made that post, I just kind of explained to her that running the make accessible action wizard is actually undoing a lot of the beneficial things that she had already done in the source file. But again, that’s part of the learning process. I mean, I also keep running into people who use InDesign that are tagging their content using XML tags.

Dax Castro
And they use that Tags panel.

Chad Chelius
Yeah.

Dax Castro
I started that way. I made that mistake early on. I’m like, “Oh look, it’s a tags panel. I can use that.” And I start, “Oh look, everything is color coded.” And I thought I was doing a great thing. And then I realized, “Oh, this is XML stuff.” That’s not helping me out here.

OCR – Optical Character Recognition for Accessibility

Chad Chelius
But again, you had said it, you hit the nail on the head: accessibility is a journey. And so we’ve all done things incorrectly on that path, on that journey, and that’s part of us all getting better at what we do. So, cool. Now the one thing that I would like to talk about, kind of related to that Action Wizard, is that one of the steps in there was to run OCR, which is interesting because Acrobat itself no longer refers to that process as OCR. When I say OCR, I’m referring to Optical Character Recognition. It was a big source of confusion with one of the classes I was teaching, because to them, OCR meant the Office of Civil Rights. And they were confused as to why I kept saying OCR. Like, “What does the Office of Civil Rights have to do with PDF remediation?” And so it took us a few minutes to work through that one. But OCR in my world and in Acrobat has typically stood for Optical Character Recognition. Now in more recent versions of Acrobat, they have changed the name to simply text recognition. And actually, you know what, I’m going to lie here, because they did…

Dax Castro
I’m looking at the Tools panel here inside Acrobat. It does say Scan & OCR.

Chad Chelius
But the last version did not. The last version said, “Text Recognition”. So they keep going back and forth on me here. So I apologize. I take that back. They still do use the term OCR. In some versions, you may see it say, “Text Recognition”. But at the end of the day, what this tool does is take a PDF of a scanned page or pages and run it through a process, so that it can detect the actual text that makes up that image. So, we all kind of know that when an image exists in a PDF, the best you can do is add alternate text to that image. Scan & OCR, or text recognition, will actually evaluate that image and turn it into actual text.

Chad Chelius
Now, in Acrobat, there are two ways you can do that. There is one setting that makes the document a searchable image. And this is important for people who do legal documents, because they’re not allowed to visually change the look of that document for legal purposes. It has got to stay exactly the way that it looked. So what the searchable image does is it actually kind of puts another layer in the document, and has actual text underneath the image that is searchable, selectable, and taggable. And then the other setting that they have is called editable text and images. Acrobat used to refer to this as ClearScan. I don’t know if that’s still a terminology that they use. But it basically renders the image as text. And it kind of still looks like the image, but it’s been vectorized. It’s actually text, so you can change the formatting of it, you can change the font, you can do pretty much whatever you want, but it will visually change the look of the document. And so for legal purposes, that may not be an option for some people. But regardless of which option you use, you can then go into Acrobat and tag that content and make the document accessible. So it’s really a powerful tool. Now, Dax, you’ve got experience with another product that does this called Abbyy FineReader.

Dax Castro
I will tell you that at the company I worked for, we got a lot of reference material – appendices. And they were almost always scanned. It’s just, “Hey, here’s the 50 pages that have to deal with this soil content over time or whatever.” And they would just get stuck in the back of these 800-page environmental reports as backup for why we came up with this conclusion. The problem is, when we have to post those documents, they have to be accessible. And so whenever we could, we tried not to include those appendices, so we could avoid having to make them accessible, but oftentimes we needed them to be included. The client wanted all of it there.

Dax Castro
And so we struggled with Acrobat. And now you can go into Acrobat and go under… You know, when you do a scan and OCR, you’ve got enhance, and underneath the enhanced drop down, you can do scan document and you can go through the steps of recognizing the document and it does an okay job. But when it comes to tables with shading and maybe there’s symbology and lots of other little things, it can be very wrong. And so we did some research and Abbyy FineReader did an amazing job at scanning the content. I mean, talk about third generation tables with shading in the header cells, so you could barely read it, it did a really great job. And we were really impressed. It’s got a lot of features: automated document batching, and hot folders, and stuff like that. But the nice thing was it de-skewed the content. And it actually recognized the content correctly, which was really, really amazing. And so I will tell you, if you are doing OCR, if it’s any part of your workflow in a meaningful way repeatedly, Abbyy FineReader is definitely a great tool to use.

Chad Chelius
Awesome! I’ve never actually used it. I mean, in my work, I don’t run into a high volume of scanned documents. I actually try to avoid them, to be quite honest. But I know in my LinkedIn Learning course, I do show that. And to be fair, I used a fairly simple document to demonstrate that. And part of the reason for that is because, as you pointed out, in Acrobat, the Scan & OCR feature reaches a point where it just fails. And that’s where it becomes… And I mean, let’s be honest. Tagging tables in Acrobat, period, is grueling. I mean, it’s interesting, because when you’re in the Reading Order panel, which is what you use in Acrobat to actually tag content, there is a button for table. And you’re like, “Oh wow, I should be able to select a table, hit that button, and it’s going to do the work for me.” Let me tell you, that does not work.

Dax Castro
Yeah. So everything you know…

Chad Chelius
Yeah. I mean, when I show this… And nobody likes to do this. Nobody likes the way I show it. But I literally have to tag each cell individually. And then…

Abbyy FineReader

Dax Castro
I will tell you, Abbyy FineReader actually tags your content and it does a good job. It really does.

Chad Chelius
That’s awesome.

Dax Castro
You check a box that says Create PDF/UA compliant – well, it says Create PDF/UA – when you select the PDF document type, and it literally tags the document for you. Now, it’s not perfect, but it does tag the objects and then you can go back… And you know, the nice thing about Abbyy FineReader is it allows you to review suspects. Like, it defines the suspects. And it says, “Hey, what do you think this is supposed to be?” and that type of thing. And the other thing it allows you to do is draw regions in your document. So actually before you run the OCR, it shows you the image in your document, and then you can draw a box around the logos or the signature or whatever. And then it treats those as images, whereas Acrobat kind of guesses at what it thinks. And sometimes it catches it and sometimes not. Now, in fairness, when you have a signature that goes over top of text, it tends to not get it as well, because obviously it’s trying to separate the two, but I think there’s no good solution there. But if you’re doing scanning at any repetitive level, Abbyy FineReader is really just a great program. And we’d love to have them on as a sponsor.

Chad Chelius
Yeah, honestly.

Dax Castro
By the way, if anybody from Abbyy FineReader is listening, we’d love to have you guys on to talk. We’d actually love to have them on to talk about their process, about how they go about doing that accessibility tagging, because it really is a great thing. So I might actually just reach out to them. That would be a great guest to have here.

Chad Chelius
Here we are promoting them and they are not even… But one of the things I want to mention, Dax, is you had mentioned the suspects. And Acrobat does that as well. Acrobat kind of allows you to go through and find any suspects. And it’ll highlight the letter or letters that it doesn’t understand. And then you can type in what it’s supposed to be. And this kind of goes back to our conversation in a previous podcast about the auto tag feature. You know, there are things that Scan & OCR does pretty well. There are things that auto tag does really well. I wish both of those features would give me a selective option. Let me select a region and just say auto tag or run OCR on it and do it that way. But who am I? I don’t have any…

Dax Castro
You are Chad Chelius. The infamous and famous. Chad, it is time for… We’re at the end of our session here. But before we go, I want to talk about “Who’s on Twitter?”

Who’s on Twitter?

Chad Chelius
Yes. Who do we have on Twitter today? I think you had said it was Accessibility Scotland.

Dax Castro
That is right. They are @a11yscotland. And I stumbled across these people as I normally do – trolling through Twitter looking for accessibility content and topics about accessibility. And it was interesting. There’s a post on here that they actually retweeted from Christine Moorhouse (her Twitter handle is @christine_ moo). She says, “Another alt-text question from your fave marketer trying to do better on her own socials: why have I seen more and more @instagram profiles putting their alt text in the caption or directing people to the comments instead of using the inbuilt feature?” And I thought that was interesting. We don’t typically talk about social media here on Chax Chat, because it’s about document accessibility. But one of the good things that Instagram has done is they have implemented alt text. But I don’t think there’s enough awareness about accessibility on the platform. You know, I think people don’t use the alt-text the way it’s intended, because they’re not really sure what it’s intended for. And so they’re just like, “Hey, go look at the comment. I put the description in the comments.” You’re like, “But that’s not what alt-text is for.” And I think there’s just a disconnect there.

Accessibility for Social Media

Chad Chelius
And Dax, I mean, is that really the extent of accessibility on social media, basically adding alt-text to images? Or does it go beyond that?

Dax Castro
There are some controls. So Twitter has done a great job at improving their accessibility. Matter of fact, there’s a sneaky way to actually… So, there is no character limit for alt-text inside Twitter. So you can literally put a single image and you know, your Twitter post is only how many characters? 240. So, you put part of your text in your Twitter post, and then the rest of the text could go in the alt-text. And you can get away with adding a lot more content, which some people are doing, which is not really the use of alt-text. Alt-text is just to describe the image that’s there.

Dax Castro
But there are some good controls within Twitter. Instagram has added captions. If you are on Instagram and you do a video, you can go in and add the sticker that says captions onto your video. And it will actually add caption overlays to your video. And it’s really great. That’s a really good feature. So you know, TikTok as well has got some of the ability to auto-caption, although it doesn’t do a great job. Sometimes, it does a decent job. So yeah, different platforms have different abilities. LinkedIn has the ability to add alt-text to a post, but if you reply to a post with an image, you don’t have the ability to add alt-text to the image you reply with, which is really odd.

Chad Chelius
That is.

Dax Castro
I don’t understand why you get it in one place, but not the other. So if you reply to a LinkedIn post with an image, always put your image description in your post as well.

Chad Chelius
Interesting!

Dax Castro
Yeah. And so that’s Accessibility Scotland.

Chad Chelius
Yeah. Also, I wanted to correct myself. It’s 280 characters. I didn’t want our listeners to start firing bullets at me: “How can you not know how many characters?” I’m like, “I am not a Twitter guy.” So I just wanted to correct myself on that. Cool! Well, listen Dax, I mean, I think that was a pretty good amount of information there. I think we covered a lot of interesting topics and a little bit on the obscure side, but I think definitely worthy and hopefully our listeners enjoyed that.

Dax Castro
Yeah, definitely. So again, if you are thinking about someone you know that can benefit from our podcast, please do share our podcast: make a post, forward the link to chaxchat.com, or send them… I don’t know how you send someone your podcast list if you’re on a streaming service. If you’re like on Spotify… I don’t know how that works, but…

Chad Chelius
And I just want to remind everybody, and this is relatively new, but we do have a Chax Chat Facebook page.

Dax Castro
Yeah, we do.

Chad Chelius
So feel free to go on there, and make any comments, and give us your feedback. We want to hear from you.

Dax Castro
Absolutely.

Upcoming Guest David Blatner

Chad Chelius
Alright guys, well listen, one of the things we have in the works here: we have planned a guest on our podcast. David Blatner of CreativePro has agreed to be a guest on our podcast. And if you do anything in the design world, you probably know David Blatner. But he is a wealth of information. He runs CreativePro Week, the Design and Accessibility conference, the Illustrator Photoshop conference. I mean, he does a lot of great stuff for the community. So we’re really excited to have him. And stay tuned. It’s going to be in an upcoming podcast. And we hope you all enjoy that one as well.

Chad Chelius
So that brings us to the end, Dax. Once again, we want to thank AbleDocs for being the sponsor of our podcast. As a reminder, AbleDocs are the makers of axesWord and axesPDF, and they also offer document remediation services. My name is Chad Chelius.

Dax Castro
And my name is Dax Castro and together we are Chax Chat, where each week we unravel accessibility.

Chad Chelius
Thanks, guys.