Arnd
|
Do you think there are certain risks or attacks which have already occurred that could
be addressed with a hypervisor-like approach?
|
Chris
|
In general, I think the answer is “yes”. Code is always going to have bugs that can potentially be exploited. The big thing that virtualisation brings is the ability to
compartmentalise, to contain environments. This puts boundaries around that code. You
accept that some code can be compromised, but you have, if you like, a ring-fence around it to
defend against those compromises. An example would be a platform with Vista running on
top of the virtualisation layer. Even if somebody manages to compromise Vista, which is
likely, given its size – 50 million lines of code I think – there is a chance that you could have supplementary controls within the virtualisation layer itself. Even if Vista is compromised,
those controls are still effective.
|
Arnd
|
If there were an attack on Vista, how would virtualisation help?
|
Chris
|
Some of the attacks that we are aware of involve compromise of an end-user’s platform. Data
from that end-user system are then ferried away to a remote location.
If you had network controls that were outside Vista, but enforced by the virtualisation layer, an attacker would then have to compromise both Vista and your virtualisation layer. In that regard – as a simple example – virtualisation could be quite useful for setting up a network policy which remains in place even if Vista is compromised.
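As a rough illustration of the idea, here is a minimal sketch of a per-VM egress policy evaluated outside the guest, for instance in the virtualisation layer or its management domain. The addresses, ports and policy shape are assumptions for illustration only, not an OpenTC design.

```python
# Illustrative sketch: a default-deny egress policy enforced outside the guest OS.
# All names, addresses and ports here are hypothetical.
from ipaddress import ip_address, ip_network

# Only traffic to the corporate mail server and web proxy is allowed for this VM.
ALLOWED = [
    (ip_network("10.0.1.0/24"), 25),    # mail
    (ip_network("10.0.2.10/32"), 8080), # web proxy
]

def egress_allowed(dst: str, port: int) -> bool:
    return any(ip_address(dst) in net and port == p for net, p in ALLOWED)

# Data being "ferried away" to an external host is dropped, regardless of what
# the guest OS itself has been tricked into doing.
print(egress_allowed("10.0.2.10", 8080))   # True
print(egress_allowed("203.0.113.7", 443))  # False
```

Because the check lives below the guest, compromising Vista alone is not enough to switch it off.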
Moving forward, users want to use applications. They do not care so much about the actual
operating system. For example, if you are using a web browser, I don’t think you particularly
care whether it is running under Windows or Linux. Similarly with something like iTunes, it
would be nice if it were running on Linux as well as Mac/Windows. The user is interested in
the application, not the operating system. With virtualisation, you could provide the
applications separately. For example, you could have one virtual machine running your
iTunes application, one running your web browser, and another running your office suite. In that way, again, you manage to partition the amount of code that is in any
particular environment. Hopefully, then you can contain any damage if your office suite is
compromised.
The big challenge – and this is something we have been struggling with at HP for a number of
years – is that isolation itself isn’t that useful. You need isolation plus controlled sharing, and
I think this is still an issue. At the moment with mainstream operating systems, at first
everything is wide open and then controls are put in. Virtualisation is a better approach, where
at the start “nothing is allowed” and those controls are selectively relaxed. I think that’s a
better starting point, although unfortunately you may end up in the same place. To me this is still a research question, but isolated environments running the applications that people want to use, combined with controlled sharing, are potentially a more robust starting point. But then we have to worry
about ease of use. That’s the killer.
One of the reasons that Vista has grown so big is that Microsoft is trying to satisfy users’
expectations, i.e., give them the experiences they want. When we introduce security controls,
we can’t afford to compromise those user experiences: that’s a challenge.
|
Franco
|
Is this feasible on a technical level? Do you need a separate OS for every set or
family of applications? If you have 20 applications, that sounds like you need 40 GB of RAM?
|
Chris
|
You end up with groups of applications within a VM. You do not necessarily need to run
them all at the same time. You might have, say, a Skype application, which you only need
when making a call.
Certainly on the server side over the past few years, hardware has started to accommodate the
needs of virtualisation. For example, the number of memory chips you can put in has
increased. The hardware is evolving to meet those needs, and on the client side that will also happen.
|
Arnd
|
You mentioned that “data were ferried away”. I remember hearing on the news that
there were organised crime attacks on home-banking solutions, and I have read various
reports of a kind of organised crime where a competitor tries to get hold of confidential
company data. Are you referring to one of these two or to a third option of which I am
unaware?
|
Chris
|
I am certainly referring to the first two. I can give you some public examples (see references).
From the references, it seems that state-organised industrial espionage is something to be
concerned about too, though it is less of a concern for an individual using their home banking.
|
Arnd
|
Can you close the channel back to the espionage organisation?
|
Chris
|
In general, it seems that the attacks are quite specific, for example targeted against a particular
Windows vulnerability. The problem with the current security solutions is that they typically
only work against those specific attacks.
With virtualisation, our feeling is that one can deal with more general classes of attacks as opposed to specific ones. Let us take the network controls as an example: Windows might have a particular vulnerability that allows data to be sent over a particular network port by exploiting a particular application. There may be a patch that fixes that application, but with the controls at the virtualisation layer, the user is immune to this whether or not the patch has been applied.
There are different classes of attacks that virtualisation is much better at preventing than
currently existing security technology, which is more reactive. The latter copes with known
threats and attacks, whereas virtualisation provides a certain degree of ability to cope with
unknown attacks and threats. The general class of attacks may be known, but not the
specifics. I think this is where virtualisation has an advantage in terms of security.
The downside is that if virtualisation becomes more prevalent, people will start attacking the
virtualisation layer. In principle, the virtualisation layer can be quite small and simple. If you
look at lines of code and bugs per line of code, it should be more robust and more easily
maintainable, in theory.
In practice, current virtualisation layers are still pretty big. Hopefully that will change.
What we have seen in the virtualisation space is that there has been more concentration on virtualisation features in general than on security.
Because the threats are not well known at the moment, or because people do not care about the current threats, it may be that more can be done within the main operating system than is done at the moment, and this might be effective for a short while. Under, say, XP, a number of incoming ports were left open and, for a few years, there were a number of attacks based on that. But still, ultimately, having independent controls around an operating
system is a good thing. I still do worry about the robustness of that underlying layer.
|
Arnd
|
Now that the OpenTC project is almost over, how far did we get in producing a
relatively secure hypervisor?
|
Chris
|
I think the project has reached a good understanding of how a trustworthy or trusted virtualisation layer should look, in terms of privilege separation and manageability, for instance, and, theoretically, of which security services are required by a guest virtual
machine. Where there is still work to be done is on the underlying robustness of the particular
implementations on the one hand, and the practicality of implementations on the other. If we
look, for example, at L4, it has good robustness and is a worthy placeholder for trust. The
problems are its practicality in terms of the guest operating systems it can support, e.g.
Windows, and also the overall manageability of an L4-based system. Enterprises have their
own standard management tools and expect to be able to use them; they are not going to run
under L4, or not going to run easily.
XEN, the other implementation choice, is usable, since it supports Vista and Windows 7, for example. With its domain 0, it has a management partition where standard
management tools can run. As an underlying code base, however, it is still pretty huge.
There is still some distance to go. Theoretically, we have an architecture that should be able to
merge the two, but in practical terms the OpenTC project has not managed to implement this.
|
Arnd
|
What is the problem here? Improving XEN, or making L4 more manageable? Or merging the two?
|
Chris
|
In the L4 case, there is other work going on outside OpenTC to make it a more general
platform. It may be just a matter of time for this to happen.
On the XEN side, it is large because of its feature set and the need to be competitive in the marketplace. L4 does not really have that problem, because it does not have a large share of
the virtualisation market. People expect XEN to have all the features of VMware, which
means more code, and more potential vulnerabilities.
There are some research angles surrounding a modular hypervisor. An example for XEN
might be a pretty static environment which does not need migration functionality: don’t have
that code on the system. By restructuring in this way, people could have a cut-down system.
On the plus side, and this will benefit both L4 and XEN, we are seeing more features being
provided in hardware that allow a reduction in the complexity of the hypervisor. We have
already seen the early Intel VMX extensions, and AMD have similar ones, which allow unmodified operating systems to be run without the need for all the complicated code that VMware had to use (they had to do binary re-writing). With modern hardware, this is unnecessary. Around memory management for guest virtual machines, for example, there is a lot of scope for hardware support to reduce the complexity.
Having said that, for a commercial product such as XEN, there is the problem of supporting
old hardware, which is quite a big issue. If we said that all the hardware that will be used had
to have certain features, that would allow us to take out a whole chunk of code.
|
Arnd
|
But if we are thinking of an OpenTC-like hypervisor, users will have to use hardware
with a new TPM anyway?
|
Chris
|
But the problem from OpenTC’s perspective is that the project has been working with the mainline XEN source tree and has not made any large deviations from it. You could take the XEN
source tree and cut it to match the exact hardware being used by OpenTC, but OpenTC has
not done that. The difficulty in taking the XEN tree and cutting it down is: Who is going to
maintain it? If we take the standard tree, maintenance is taken care of. I think it is too much
for OpenTC to take on the maintenance of their own version of it.
This is the main problem for OpenTC: We have a theoretical architecture, but no real means
of supporting that practically in an ongoing fashion. Our prototypes are very useful in terms
of putting systems in front of people and saying: What do you think about this? But in terms
of maintainable systems, it is not practical for OpenTC to support them.
|
Arnd
|
What are good ways to have a future virtualisation layer on PCs? Pursue the TC idea,
pursue the open source idea, or is a third path more realistic?
|
Chris
|
Both of the first two constitute the right approach. My hope is that, as more threats emerge,
more emphasis will be placed on security in general. There will be more willingness to accept
the re-architecting of virtualisation layers. The problem is that proper implementation of the
architecture developed by OpenTC would require large-scale restructuring of something like
XEN.
|
Arnd
|
In the German media, there have been reports of Asian competitors hacking into
corporate systems. Do too few people take the threats seriously? Is this a motivation for Citrix
and the developers of XEN? Are the attacks small in economic terms?
|
Chris
|
I think people will become more aware of the threats out there. To take XEN as an example,
there is also a XEN branding issue. If XEN becomes the Microsoft Windows of the
virtualisation world, they are going to worry more about their system being compromised.
They don’t have to worry about that at the moment.
|
Franco
|
Do you expect this change to occur fairly soon or might it just go on as it is today for
years to come?
|
Chris
|
In the next 2 or 3 years.
|
Franco
|
Are you saying we need some kind of catastrophic attack?
|
Chris
|
I think more attacks are occurring than we hear about. And I think we can expect this to get
worse. If you look at some of the developments in the US (see reference), you see that the US
government is starting to tackle this at a national level. You can expect this to take place in the rest of the Western world, too. The problem is that, by and large, companies are stuck with the
products that are out there that they can buy. OpenTC is research, XEN is an open source
project, and it takes time to move research results into actual products. Certainly over the next
year or so there are not going to be any products out there that people can buy that could
really help. Certainly over the next 2 or 3 years, I think people are going to become gradually
more aware of the threats, and what they should be protecting against.
|
Franco
|
When you say “people”, do you mean IT security people?
|
Chris
|
It certainly has to start at the top. It has to be the board.
|
Arnd
|
We have sometimes discussed the issue of runtime protection. How do you view this
issue today?
|
Chris
|
What TC provides is a set of properties that are verified at boot time. This is very useful for establishing that your trustworthy hypervisor is running. Trusted Computing gives you that first
anchor.
There are more mechanisms that can be built into software that would allow you to do runtime
integrity checks, but these are not going to be tied into the TC hardware. Since the hardware is essentially just a simple recording chip, costing less than one US dollar, it does not do very much. Hardware mechanisms like Intel TXT exist that attempt to build on this. But even so, you still end up with an initial piece of software running on the system and are
dependent on that software doing the right thing. It is very hard to get guarantees from the
hardware that the software is going to do the right thing. I don’t see that changing.
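As a rough sketch of what that recording chip does at boot time, the following illustrates the measure-and-extend chain; the boot components are represented as in-memory blobs purely for illustration, and a real TPM performs the extend operation inside the chip rather than in host software.

```python
# Minimal sketch of TPM-style boot-time measurement (illustrative only).
import hashlib

def extend(pcr: bytes, measurement: bytes) -> bytes:
    # PCR_new = SHA-1(PCR_old || measurement): the chip only records, it never judges.
    return hashlib.sha1(pcr + measurement).digest()

# Hypothetical boot chain; in reality each stage measures the next stage's binary
# before handing over control to it.
boot_chain = [b"hypervisor image", b"management-domain kernel", b"security services"]

pcr = bytes(20)  # a TPM 1.2 PCR is 20 bytes, reset to all zeroes at power-on
for component in boot_chain:
    pcr = extend(pcr, hashlib.sha1(component).digest())

# A verifier compares the final (signed) PCR value against the value expected for a
# known-good boot chain; a change in any component changes the final value.
print(pcr.hex())
```

Nothing in this chain tells you what the measured software does at runtime, which is the limitation described above.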
|
Franco
|
Assuming the software is compromised after an initial check: at the next boot, would it be possible to determine that the system has been compromised?
|
Chris
|
Yes, but the problem is that it might be compromised again while it is running. You can always get your system to boot up in a known good state, but a server platform, for example, might be running for months.
TPM and associated mechanisms allow you to record measurements of a known good state. Some of that state may change legitimately, some may change illegitimately. It is very hard to distinguish between the two in terms of data. In terms of binaries, it is easier. Let us assume there is a version of Office on the system; it should always have the same measurements every time you boot. But if you start to include data in your known good state, you may have a macro virus in there, and Trusted Computing would not be effective at detecting it. You cannot just measure all your data; it would become too inflexible.
Trusted Computing is a good foundation for getting your system up and running. You then
have to apply additional mechanisms, from your operating system or hypervisor, to enforce
runtime security.
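To make the binaries-versus-data point concrete, here is a small illustrative sketch; the file names, contents and “known good” values are hypothetical, and real integrity measurement happens via the boot chain and PCRs rather than in a script like this.

```python
# Illustrative sketch of why measuring binaries works better than measuring data.
# File contents are simulated as in-memory blobs; names and values are hypothetical.
import hashlib

def measure(content: bytes) -> str:
    return hashlib.sha1(content).hexdigest()

# A binary such as an office suite should produce the same measurement on every boot.
office_binary = b"office suite build 12.0"
known_good = {"office.bin": measure(office_binary)}

# Data files change legitimately all the time, so a fixed "known good" digest cannot
# tell a normal edit apart from, say, an embedded macro virus.
document_yesterday = b"quarterly report, draft 1"
document_today = b"quarterly report, draft 2 (or a macro virus?)"

print(measure(office_binary) == known_good["office.bin"])       # True: binary unchanged
print(measure(document_yesterday) == measure(document_today))   # False: but is that bad?
```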
|
Arnd
|
What other research issues remain now that the research project is almost over? For
example, potential weaknesses, Trojans, etc., in the hardware, problems with the methods or evaluation techniques used, or the graphics output which we try to control. What are the big
open issues?
|
Chris
|
All of these. The big open issue is moving from a theoretically strong architecture to a
practically strong architecture without compromising user experiences. This includes user experiences surrounding performance, resource usage and supported functionality. Can
you still run Windows?
In terms of the graphics, the work that HP here and Cambridge University are doing around
Gallium will give us a good answer on how to convey trust to somebody on the screen. That
has been quite successful for OpenTC. It was a big issue when we started. The challenge was
performance: giving users fancy 3D graphics and, at the same time, confidence that what they are seeing is what they are getting. I think the big challenge is moving OpenTC into something that people can really use, buy commercially, and get access to and support for.
Research-wise: A lot of hardware support allows a simpler software layer. That pushes trust
into the hardware. Of course, there is less visibility into the hardware: that is an issue. I don’t
know what we can do about that. In general, hardware tends to have better quality control than software, I guess primarily because it is harder to fix hardware bugs once the hardware has been released. Pushing things to hardware may be a good thing, but how do you build people’s confidence in that hardware other than saying: “That’s Intel. I trust them”, or “That’s AMD. I trust them”?
We have talked about a theoretical architecture, but a practical implementation of, say, XEN that cannot follow that architecture for commercial reasons is an open issue. They know the code
is big. They know it should be smaller. But how do you do that? They want to be widely used.
We know that the implementation of the underlying hypervisor is not perfect, but that will improve over time. The most important thing that OpenTC has done is to have developed scenarios that can be tested with people, for example the PET prototype (see newsletter of January 2008) and the Virtual Data Center, and to ask whether different customers and users see value in these scenarios. Assuming that the virtualisation layer underneath is strong, i.e., robust and trustworthy, are these particular scenarios of interest to you? That then motivates the strengthening of the underlying technologies. Without scenarios, and without customers or enterprises asking for the technology, I don’t think it is going to develop by itself, in particular for scenarios where isolation is important.
|