Listen to this post: (podcast audio is at the bottom of the page.)
Lately, there’s a growing push to make accessibility more creative, and I think it’s just rad! In fact, some of the major players at FWD-Doc (Documentary Filmmakers with Disabilities) are presenting a panel at this year’s SXSW called Creativity Enhanced: Documentary Tools for Smarter, More Accessible Storytelling. What I love about this panel by Day Al-Mohamed, Amanda Upson, and Lindsey Dryden is that the title sounds kind of formal and serious, but they’re planning to give examples of newer, more creative, more cutting-edge stuff.
If you want your film or visual arts to be creative, interesting, and unique, why wouldn’t you want the language-based translation of it to be creative and interesting and unique?
I’m hearing blind people and non-visual learners talk about how exciting and inviting it is to hear or read a description that’s pleasurable. If the world of visuals isn’t someone’s main point of reference, why do sighted audio describers spend so much time making references to only visuals? I’m not saying it’s easy. Even I’m still grappling with a question Thomas Reid asks me all the time, “Yeah, that’s what it looks like, but how did it make you feel when you looked at it?” It’s the perfect question to counter the old-school “say what you see” method of Audio Description that sounds like it’s trying to pretend to be objective. As a describer, you have the power to pick which things you saw that you want to say something about and how you want to say it. Neutrality is impossible. Your culture, your values, your politics, and your desires to be creative or not are always at play.
The New York Times recently published an article about the importance of image descriptions for pictures, paintings, and other visual art. They gave some examples of AI-generated descriptions that were outrageously inaccurate and/or useless, and even gave some lovely examples of descriptions written by people for contrast. I don’t subscribe, so I can’t open it anymore to give some examples, but it’s well worth the read if you can get it. The Verge had a great article about how when you don’t use image descriptions, it creates a hellscape of confusion, a serious lack of information, or an overload of Unicode character descriptions in place of text you can actually just read. You can’t make every single thing accessible to all people, but improving access starts with understanding where the problems are, what the barriers are, and what effect these barriers have on people.
So, today, have a listen to, or a read of, me sharing some of the bullshit auto-generated image descriptions I’ve come across in my endless time at my computer.
Here’s a downloadable transcript of Pigeonhole Podcast 38.
Transcript
[bright ambient music]
CHORUS OF VOICES: Pigeonholed, pigeonhole, pigeonhole, pigeonhole, pigeonhole, pigeonhole, pigeonhole, pigeonhole.
CHERYL: I’m trying to convince more people to add written image descriptions and/or alt text to their pictures on social media. Audio describers especially, I’m pointing my finger in your direction right now. Please, describe your photos. It’s not like the people who love your audio description finish an audio described film, then log onto social media and say to themselves, “Wow, I love having accessibility only part-time!” It doesn’t come naturally or easily for everyone to translate a picture into words, I know. So, team up. Work with someone who can help out.
[bright ambient music fades out]
Today, me reading the AI-generated descriptions I’ve found in Microsoft Office products and on social media when my images fail to load. I’m only going to translate one by telling you what’s actually in the picture. If you find yourself at any time wondering what the pictures actually are or wanting more, welcome.
Now, to be fair, some of the things I’m gonna read are kind of true. Many of them are completely wrong, but the things that do have some truth in them, really, um, I’m not sure that it even matters that they got some of the details accurate because you just still have no idea what’s going on. Please start adding image descriptions or alt text to your posts, on your blog, and in your e-newsletters!
[awkwardly distorted old-timey waltz]
May be an image of 1 person.
May be an image of 2 people, tree, and text.
May be an image of 4 people, people standing, and outdoors.
No photo description available.
A picture containing dog, grass, mammal.
No photo description available.
A picture containing food, plate, indoor, rice.
A person and person sitting at a table with food and drinks.
A cat lying on a chair.
A person with glasses and a cat.
A picture containing text, monitor, indoor, person.
Graphical user interface, website.
A collage of people.
A picture containing photo, different, piece, chocolate.
May be an image of text that says, “ROXBURY International FILM FESTIVAL.”
May be a cartoon of text that says, “When the teacher lets the class pick their own groups: made with mematic A II.”
A picture containing drawing, plate, cup. [music stops abruptly]
This is a PowerPoint slide where I placed the logos for closed captions and audio description on a blank slide for a presentation on, well, closed captions and audio description. Note to self: prepare presentations on drawing, plate, and cup for next time because the work of writing the alt text is already complete!
[awkward old-timey waltz starts up again]
A person wearing a garment. Description automatically generated with medium confidence.
[waltz plays a bit longer, then fades into an upbeat, jazzy, lounge-y number]
Check out the Audio Description in the Making online exhibit from the AIM Lab at Concordia University at AudioDescription.AccesInTheMaking.ca for some innovative, creative work. The students chose an artwork (or video game, book cover, mead fermenting, a binder, a logo, whatever they wanted) and described it. Then, many of them basically tossed the original artwork and created a new type of story starting from the audio description. I gobbled it up. I hope you will too.
And check out Andy Slater. He’s a blind sound artist and musician who did a whole exhibition of paintings that was presented as only alt text, meaning you had to use a screen reader to hear or read descriptions of the paintings. No visuals were even presented. It’s called Invisible Ink, and if you don’t know how to use a screen reader, that’s OK. Instructions are provided on the website at ThisIsAndySlater.net/InvisibleInk.
[lounge fades into bright ambient theme music]
Every episode is transcribed. Links, guest info, and transcripts are all at WhoAmIToStopIt.com, my disability arts blog. I’m Cheryl, and…
TWO VOICES: this is Pigeonhole.
CHERYL: Pigeonhole: Don’t sit where society puts you.
Music in the episode: “Dark Eyes” by Teddy and Marge. (Source: FreeMusicArchive.org. Licensed under an Attribution-Noncommercial-Share Alike 3.0 United States License.) “Helges Friend Woke Up (ID 1268) – Remastered” by Lobo Loco. (Source: FreeMusicArchive.org. Licensed under an Attribution-NonCommercial-ShareAlike 4.0 International License.)