Designing a Secure Desktop Operating System

by Timo Sirainen, July 2004, comments to tss@iki.fi.

Status in November 2005: Nothing has happened to this idea. I've thought about implementing a prototype of it once in a while, but it requires time, and I already have other projects that I'd also like to finish.

Abstract

This document aims to prove that it is possible to design a user interface which lets the user freely use his computer without having to worry much about the potential security consequences of his actions. The user doesn't even need to know that any security measures exist to prevent bad things from happening. This should be true for nearly all software the user might want to install and use.

It's of course still possible to shoot yourself in the foot. If it wasn't, the operating system wouldn't be usable. There's however a huge difference between "oops, I ran that program and now it destroyed all my files" and "oops, I selected all my files and gave this program write permissions to them and it destroyed them".

The kernel modifications required to implement this kind of operating system are not discussed. There are several ways to implement it using existing technologies such as capabilities or mandatory access controls (see links below).

Note that this document doesn't explain the full design of the system. It's rather about "ideas how the design should be done". The full design needs a lot of thinking and some things may need different solutions than what I've described here. Don't get stuck on minor details or inconsistencies.

I hope other people get interested enough in this to actually start designing and implementing such an operating system. If not, I'll probably slowly continue with it myself.

Definitions

Process
Normal UNIX-like process with private address space.
Component
One or more processes with private permanent data storage space (filesystem), i.e. a component is fully sandboxed.
Interface
Components may communicate with other components using interfaces (via some IPC method).
Element
Graphical object, possibly containing other embedded elements inside it. Element is owned by a component.
Window
Top level window with frames and titlebar. Windows are constructed from elements.

Safe Operating System

  1. Protect own data: Design components so that they allow other components access to their private data only in the way the user intended.
  2. Protect other systems: Don't allow components to connect to other computers without permission.
  3. Protect privacy: Don't allow components to notify their presence to other computers without permission.

Making these usable and user friendly is the hard part. There have to be some compromises between 2), 3) and user friendliness.

Input

Input is divided into three categories:

  1. Trusted: Input from physical devices (eg. keyboard and mouse) is fully trusted. Interfaces doing potentially harmful actions should be allowed to be executed only via trusted input.
  2. Non-automated: Input that originated from a physical device. It may have gone through multiple components to get here, but it couldn't have been generated automatically. This is because a single physical event can be sent only once to each component as non-automated input. The component can get the time when the event occurred, so it may also check that the event hasn't been deliberately delayed for too long.
  3. Untrusted: The input may have come from anywhere.

Interfaces can be marked to accept only trusted and/or non-automated input.

Privileged components can produce trusted input from non-physical devices as well. This input is of course only as safe as the software, but it's pretty much required for remote use of the computer.
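
The three categories and the "only once per component" rule can be sketched roughly as follows. This is only an illustration: the names Trust, Event, InputRouter and call_interface are made up for this sketch, and the age check is folded into the router for brevity, whereas the text above leaves it to the receiving component.

```python
from enum import IntEnum


class Trust(IntEnum):
    UNTRUSTED = 0
    NON_AUTOMATED = 1
    TRUSTED = 2


class Event:
    """A single physical input event, e.g. one mouse click."""

    def __init__(self, event_id, timestamp):
        self.event_id = event_id
        self.timestamp = timestamp


class InputRouter:
    """Hypothetical OS-side bookkeeping for forwarded events."""

    def __init__(self, max_age=2.0):
        self.max_age = max_age
        self._seen = set()  # (event_id, component) pairs already delivered

    def classify(self, event, component, now, from_device=False):
        """Return the trust level `component` may assign to `event`."""
        if from_device:
            return Trust.TRUSTED
        key = (event.event_id, component)
        fresh = (now - event.timestamp) <= self.max_age
        if key not in self._seen and fresh:
            # First forwarded delivery of this physical event to this component.
            self._seen.add(key)
            return Trust.NON_AUTOMATED
        # Replayed or stale events carry no guarantee at all.
        return Trust.UNTRUSTED


def call_interface(required, level, action):
    """Run `action` only if the input's trust level is high enough."""
    if level < required:
        raise PermissionError("input trust too low for this interface")
    return action()
```

A component that replays the same click to the same target a second time would get it classified as untrusted, which is what prevents automation.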

User Interface

Windows are constructed from elements. Elements are owned by components. One window can contain elements from multiple components. Only the element owner can "see" what is inside an element; others can manipulate the element only via component interfaces.

Operating system prevents components from hiding other components' elements by moving them or placing other elements on top of them.

Only the component having focus sees keyboard or mouse events. Components can however register keyboard shortcuts which can be triggered from any component. Keyboard shortcuts cannot be overridden.
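
One possible reading of "cannot be overridden" is that the first registration of a shortcut wins and later attempts are rejected. A minimal sketch under that assumption (ShortcutRegistry and its methods are hypothetical names):

```python
class ShortcutRegistry:
    """Hypothetical system-wide table of keyboard shortcuts."""

    def __init__(self):
        self._owners = {}  # key combination -> owning component

    def register(self, keys, component):
        """Claim a shortcut; fails if another component already owns it."""
        if keys in self._owners:
            return False
        self._owners[keys] = component
        return True

    def dispatch(self, keys, focused):
        """Return the component that should receive this key press."""
        # A registered shortcut goes to its owner no matter who has focus;
        # everything else goes only to the focused component.
        return self._owners.get(keys, focused)
```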

Some problems and solutions:

Shared Files

Normally components cannot open files not created by themselves. This is fine for configuration and private data, but not if user wants to share the file between multiple components. Typically this is done with Open file and Save file operations.

The shared files component allows the user to give an untrusted component access to files matching given search criteria - usually one or more selected files. The access is granted by giving the untrusted component a token which can be used to access the file until the token is invalidated. The token is invalidated when:

The token can be permanently saved as well, either internally for the component or attached to the saved shared file as metadata. If it's attached to the file, then any component opening the file gets access to the tokens.

Permanent tokens could be implemented by making the shared files component sign them with a cryptographic signature. This would allow storing the tokens in untrusted storage.
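
As a rough sketch of how such a token could survive untrusted storage, the example below uses an HMAC rather than a public-key signature, which is enough as long as the same component both issues and verifies the tokens. The token layout, the key and the function names are assumptions made for this sketch only.

```python
import hashlib
import hmac
import json

# Placeholder key; in this sketch only the shared files component knows it.
SECRET_KEY = b"known only to the shared files component"


def issue_token(path, mode):
    """Create a token granting `mode` access to `path`."""
    payload = json.dumps({"path": path, "mode": mode}, sort_keys=True)
    sig = hmac.new(SECRET_KEY, payload.encode(), hashlib.sha256).hexdigest()
    return payload + "." + sig


def verify_token(token):
    """Return the granted access as a dict, or None if the token is forged."""
    payload, _, sig = token.rpartition(".")
    expected = hmac.new(SECRET_KEY, payload.encode(), hashlib.sha256).hexdigest()
    # Constant-time comparison so the check itself leaks nothing useful.
    if not hmac.compare_digest(sig.encode(), expected.encode()):
        return None
    return json.loads(payload)
```

A component can store the returned string anywhere it likes; changing the path or mode inside it makes verification fail.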

Connecting to Other Computers

Programs making connections could be divided into three categories:

  1. Clients connecting to servers, where the server name is written by user at least once in program's lifetime.
  2. Programs connecting to a predefined set of servers.
  3. Programs starting connections with 1) or 2), but depending on input they may connect to more computers.

The first one is simple to solve. We'll have a connect element where the user can type the server(s) to connect to. The element returns to the calling component an authorization token which can be used to connect to the specified servers.

This works very similarly to opening files. The authorization token can be saved with the configuration file, so once the user types the server name the first time in initial configuration, he also grants the program access to the server for as long as the configuration file exists.

The second and third ones are tricky. We would rather not allow spyware to announce its presence to the outside world, or run a DDoS client flooding some poor computers. But we would also rather not have to specifically grant access to every program that wants to connect outside.

This should be configurable, but the default should probably be to allow toplevel components to connect outside (and pass that privilege to their child components if needed). Creating toplevel components requires special privileges. Typically you'd create them only using the system's built-in components, such as when you double-click a program on the desktop or in the file browser.

Also, because of spamming, no one would be able to connect to SMTP ports automatically (but see the mail client example below).

Example: Web browsing

Typically you start browsing by typing a web site name in the location bar and pressing Enter to load it. Here the location bar would be the connect element which would grant access to the given web site.

Bookmarking site would save the connect token and opening a bookmark would use it.

Links would be shown using link elements. A link element would get as input the screen areas where the link is located and the URL the link points to. When the user moves the mouse over the link, the mouse pointer changes and the URL is printed in the status bar. By clicking the link the user would grant access to the web site in the URL.

"Web site access" means access to the domain of the web site and its subdomains (or possibly to everything under the top domain?).

The problematic part is when a single web page consists of elements from multiple domains, for example images, frames or applets. This can actually be a privacy issue, and preventing it completely may be a good idea in any case. For example, some message boards allow users to add image links, which lets the poster see who reads the board by looking at the logs of who accessed the image.

The non-loaded elements could still be shown as boxes containing some "not loaded" icon and right-clicking them would give an option to (permanently) load elements from that domain. This could be done using link element.

Alternatively, the web browser could just be configured to allow connecting anywhere. With default settings that would be allowed if the web browser was started as a toplevel component.
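
The notion of web site access used in this example (the granted domain plus its subdomains) could be checked along these lines. The function name site_allows is hypothetical, and the broader "everything under the top domain" variant is not covered.

```python
def site_allows(granted_domain, host):
    """True if `host` is the granted domain or one of its subdomains."""
    granted = granted_domain.lower().rstrip(".")
    host = host.lower().rstrip(".")
    # Require a dot boundary so "badexample.com" does not match "example.com".
    return host == granted or host.endswith("." + granted)
```

With this check, an image embedded from a different domain would stay unloaded until the user grants access to that domain as well.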

Example: Mail client

Mail clients usually need to access SMTP and POP3 or IMAP servers. Their addresses would be configured using connect elements, so the mail client gets access to connect to wanted servers automatically while it's configured. No extra privileges needed or questions asked.

If you're running an SMTP server yourself, that would work somewhat differently. It would obviously need special privileges to be able to connect to any SMTP server in the world. That privilege could have been given when the system was installed with the default mail server, and if you really want to run your own mail server which didn't come with the system, you should know enough to give it access yourself.

Note that sending mails to local users wouldn't be restricted. There's no point in preventing that. Nor would anything prevent mails for local users from being forwarded to another server (in the aliases table).

The mail client could also provide a mail component to let other software send mails interactively. This component would provide interfaces for other components to display destination address fields (To, Cc, Bcc) and a "Send mail" button. Mail would be sent only to the addresses specified in the address fields. The fields could be placed somewhat freely, but not too far apart or invisibly, so that e.g. the Bcc field couldn't be hidden and mail be sent there as well. Alternatively the mail component could just be called to display a "Compose mail" window with predefined content.

Example: Address Book

Address book is usable for many different programs, so it should be available to everyone. However, it shouldn't be possible for a program to corrupt your address book or steal the data in it.

Interaction with other components would typically be:

  1. Request user to select one or more addresses and return them to caller.
  2. Automatically expand an address while it's being typed.
  3. Add, remove or modify address book records.

First is simple - it would work just as Open file dialog.

Second should also be quite simple. Trusted and non-automated events could call expand interface which would modify the input field, or possibly show a dropdown list showing available matches and after selecting one of them it would modify the input field.

Interactively adding, removing and modifying records would be simple. Just request the address book component to show a specific record with predefined values. User would then decide if he wants to actually do the changes.

Batch processing (syncing with a PDA, etc.) couldn't be done safely, so it would require extra privileges.

Example: Spreadsheet

Say you want to create a spreadsheet which gets its input from company database and draws pretty charts from it.

So, once again you specify the database location using connect element which grants access to connect to the database. You do your calculations and finally save the spreadsheet, with the database connection token. Next time the database connection is opened automatically.

Say you want to quickly email the file. You simply select "Send as email" from the menu and your mail client window pops up with the spreadsheet file already attached to the message.

Extra Privileges

What exactly does it mean then to require extra privileges? It's simply that a component needs access to interfaces in another component that aren't available to untrusted components because they could be misused.

User can decide on a per-component basis the trust-relationships. A component may offer different levels of trust, such as read-only access. Component can list the trust-relationships that it can take advantage of, and explanations what they're useful for. The target component on the other hand can also provide a description why giving the trust would be harmful.

For example a trust configuration for PDA syncing program would look like:

Trusted Output

Component could be given more privileges by letting it drop the possibility to communicate with any component unless the interface is marked as trusted output.

Trusted output devices would be for example the video card, sound card, a local/trusted printer, all kinds of USB devices such as a camera, etc.

The point is that once a component gets more privileged access to data, it must not be able to send it to untrusted locations. That means it cannot do network connections, talk to untrusted components or even write files (except temporary which get deleted once the component is destroyed). Saving files interactively via shared files component could also be treated as trusted output.
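
A small sketch of this rule: a component first gives up every untrusted output channel, and only then can it receive privileges without the user being asked. The Component class and its method names are hypothetical, and the sink's trust flag is simply passed in here rather than looked up from the system.

```python
class Component:
    """Hypothetical component that can confine itself to trusted output."""

    def __init__(self, name):
        self.name = name
        self.confined = False
        self.privileges = set()

    def confine_to_trusted_output(self):
        """Give up all untrusted output channels; there is no way back."""
        self.confined = True

    def grant_automatic(self, privilege):
        """Grant a privilege without asking the user; needs confinement."""
        if not self.confined:
            raise PermissionError("confinement to trusted output required first")
        self.privileges.add(privilege)

    def send(self, sink_name, sink_trusted, data):
        """Deliver data to a sink, refusing untrusted sinks once confined."""
        if self.confined and not sink_trusted:
            raise PermissionError(f"{self.name} may not write to {sink_name}")
        return (sink_name, data)
```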

This would allow for example screen magnifier program to get full access to screen input and ability to write the magnified part back to screen without requiring extra privileges.

Another example would be the PDA syncing program. If PDA was marked as having trusted output, the address book component could allow read-only access to all data. This could be taken even further by having a concept of privileged device where if a user has given component access to such device (simply via "Select device" dialog the first time program is run), such component might have also more privileged access to other components, such as write-access for address book.

Privilege Separation

Complex software contains lots of bugs, and some of them are security related. For example, if an HTML page renderer had a buffer overflow, it would be an instant remote code execution vulnerability. With today's systems that's the worst that can happen: such code could do almost anything with your system.

Now, what if the web browser was designed so that each web page was executed in its own process with very strict permissions about what it can do? Very much like Java applets work, except instead of a virtual machine doing the checks it would be your operating system doing the checks. Once you close the web page, the process dies and anything it was just doing would die as well. There would be no need to actually even try to prevent buffer overflows in the HTML renderer - all the attacker could do would be to use your CPU cycles.

The same could be applied almost anywhere - you'd have highly restricted renderer components rendering web pages and emails or doing previews of files in the file manager. Word processors and spreadsheets would separate documents from each other - in such an environment it's impossible to have macro viruses as they could only harm themselves.

Command Line Interface

Above I've only been talking about graphical user interfaces. How would this work with CLIs then? To get full use out of it, it'd probably require a completely new shell language. However in general the programs would still take input and produce output, where input could be separated into trusted, non-automated and untrusted.

Below are some thoughts about modifying normal UNIX shell.

Consider for example the following script called bug.sh:

#!/bin/sh

to_address=$1
file=$2
subject=$3

if [ "$subject" = "bug" ]; then
  subject="Bug report"
fi

# Quote the variables so that "Bug report" stays a single argument.
mail -s "$subject" "$to_address" < "$file"

Pretty simple script, but it'd work. The user could start it with "bug.sh bugs@example.com /tmp/crash bug". The to_address and file variables came from trusted input and they're unmodified, so they get passed to the mail program as trusted input too. The subject variable was modified, so it's passed as non-automated input. The mail program itself sends the mail only if the destination email address came from trusted input. The script can open the file only because its name came from trusted user input.

Now consider a spammer.sh:

#!/bin/sh

for word in `cat /usr/share/dict/words`; do
  echo "Buy our stuff" | mail -s "spam" "$word@domain.org"
done

This simply wouldn't work. First, it doesn't have access to the words dictionary file. Second, the parameters for mail would come from untrusted input, so mail would refuse to send the mail.

Now, how can we trust /bin/sh to keep the trust status correct? One way would be to just make it a privileged program that tries to remember the trust status internally. Probably a better way would be to allow parameters and environment to contain file descriptors rather than just strings. Trust status would be attached to file descriptors, so passing them around would be guaranteed to keep the trust status correct. This would all work transparently in scripts (if input was modified, the new value would be written to the file descriptor, thus degrading its trust status to non-automated input).
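
The idea of trust travelling with a value can be sketched without real file descriptors. Here Tagged stands in for a descriptor carrying a trust status, and mail for a mail program that checks it; both are hypothetical names, and the formatting of the returned message is made up for the sketch.

```python
from enum import IntEnum


class Trust(IntEnum):
    UNTRUSTED = 0
    NON_AUTOMATED = 1
    TRUSTED = 2


class Tagged:
    """A value together with the trust status it was received with."""

    def __init__(self, value, trust):
        self.value = value
        self.trust = trust

    def modified(self, new_value):
        """Changing the value degrades trusted input to non-automated."""
        return Tagged(new_value, min(self.trust, Trust.NON_AUTOMATED))


def mail(to_address, subject):
    """Refuse to send unless the destination came from trusted input."""
    if to_address.trust < Trust.TRUSTED:
        raise PermissionError("destination address is not trusted input")
    return f"To: {to_address.value}\nSubject: {subject.value}"
```

In terms of the scripts above, bug.sh passes its first argument through unchanged and so keeps it trusted, while spammer.sh builds every address itself and therefore only ever has untrusted values to offer.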

But can scripts then do anything useful? The good thing is that the user is always able to give permissions to scripts manually. The bad thing is that it's probably required. There could however be script editors which would allow you to grant privileges as you type the scripts, e.g. right-clicking a path could offer a "grant privileges to this file" menu item. Or the editor could do it automatically as you type, assuming it understands enough about the language (and that you actually wrote it, not simply opened an existing script which contained the data!).

The best initial plan is probably to just allow xterm and its child components (/bin/sh) to do anything.

Reusing Existing Code

My implementation plan has been to build this on top of the Linux kernel and much of its userspace. It's possible to get full binary compatibility with old software by giving it enough privileges. It should also be possible to quite easily modify existing programs to use other components to perform tasks rather than doing them themselves. For example, simply modifying the open_file_dialog() function of the GTK, Qt, GNOME and KDE libraries to use the shared files component gives binary-level compatibility with tons of applications which don't need more than that.

I think it's still possible to use the X11 protocol, although restrictions need to be added on who gets to access what data. X input events also need to contain trust status.

The Weakest Link: Users

The point of my Secure OS design is to make it easy for users to prevent computer from doing dangerous things without permission. But is it enough?

Home users are in full control of their system, so they can always override any security measures. If a user really wants to install some program that requires full access to the operating system, he will do it no matter how big the warnings the operating system gives. There are legitimate programs that require it, so it shouldn't be completely prevented.

Today, most home users don't really care about security problems. That's because viruses of today don't really do anything harmful to the user. They might just slow things down a bit or do annoying things like reboot the computer, but no permanent damage. It's also because it's very difficult to be secure running today's operating systems. You can't install any new program or you're potentially infected with a virus. You can't even safely browse the web unless you keep constantly installing the latest security fixes for your web browser.

I think this problem will solve itself in the not so distant future. Once it's common for viruses to steal money (web banks, credit cards) or destroy important files, users will start caring and be more paranoid when a program requests extra privileges. Then users will also start caring about operating systems which make it easy to install new programs safely.

In a corporate environment the user could simply be locked out of doing anything potentially dangerous, without the ability to override it.

Links

Mandatory Access Control

Capability Systems

X Server Security

Other