Computer vision is a massive field in computer science. I guess you could say it’s a subfield of AI. If it’s unfamiliar to you (wikipedia), it’s basically about how computers see things. There are many applications for computer vision. How many? Hmmm. Think of the many ways you use your eyes and then multiply that by whatever you consider to be a very big number and then you’ll at least have a visceral sense of how many.
In the realms of digital humanities and media studies, there’s much interest in using the tools of computer vision to carry out research. This recent (2020) article by Lev Manovich represents this interest well. To be clear, I have no more essential objection to using the tools of computer vision than I have essential objections to using the tools of printing. In fact, I’ll likely find reasons to use both.
That said, as I lurch into sabbatical for the fall, my interests in computer vision are less in using such technologies to carry out research than in studying the technologies themselves. For my sabbatical proposal, I couched this interest in terms of deep fakes, but really that’s just a salient (and salacious) starting point. There’s no doubt that computer vision is part of the military-industrial-entertainment complex, with all the moral-ethical-political questions that follow. In fact, there’s no doubt that computer vision is at the leading edge of that complex. (And I would hope my readers recognize “complex” as complex.)
My interests are not in passing judgment on computer vision. It would be the height of hubris for me to do so. It would be equally foolish for me to attempt to advance a technical understanding of computer vision. I am not a computer scientist!
So what is left? If I am not advancing the technical knowledge of computer vision and I am not offering an interpretation/critique/argument on the cultural value of computer vision, what is left? It’s an interesting quasi-Latourian question. As you’ll recall from prior episodes, in We Have Never Been Modern, Latour laid out the coming together of scientific and political representations. And yet this would be something else: in between? a third space? whatever?
To invert and totally bastardize TS Eliot, this is how the world goes, not with a bang but a … (*click*?) Computer vision doesn’t “represent” the world; it is part of the world. It no more represents the world than my dog does, than I do. We’re all constructing the world, such as it is and wherever it’s going. There seems to be a weird certainty, shared by engineers and critical cultural studies alike, about how computers should see the world. It is a cybernetic certainty, one that they use in an effort to steer computer vision toward particular endpoints.
I have to say I’m not interested in an argument about how computers should see the world. I also have no investment in insisting on how humans (e.g., my students) should see the world.
I am interested in the ontological-epistemological operation of seeing. I am interested in the rhetorical-cultural operation of seeing. I am interested in how the nonhuman “sight” of computers intersects with our own vision both to expand/alter our capacities and to help us understand more about them.
I’ll end by circling back to the “ripped from the headlines” version of this I offered in my sabbatical request (i.e., deep fakes). Why do we care about what we see on a screen? We know it’s not real. At the very best, what we see on a screen is a limited representation of real-time events. At worst, it’s pure fiction. A deep fake is a fiction constructed from digital representations of real events, so I suppose in some sense it’s in-between. I suppose we could say the same about Hollywood movies: those are real people saying those lines on a set in front of a camera. I’m not saying we shouldn’t care about these things. To the contrary, I am very interested in this question. These assemblages of technologies participate in our shared understanding of the world. How might we understand not simply their technical function, but their ontological and epistemological function? How might we produce understandings that expand our capacities as agents?