- Problem statement: what kind of problem is presented by the authors and why this problem is important?
- Approach & Design: briefly describe the approach designed by the authors
- Strengths and Weaknesses: list the strengths and weaknesses, in your opinion
- Evaluation: how did the authors evaluate the performance of the proposed scheme? What kind of workload was designed and used?
- Conclusion: your own judgement of the work.
Experiences Building PlanetLab
Larry Peterson, Andy Bavier, Marc E. Fiuczynski, Steve Muir
Department of Computer Science, Princeton University
Abstract. This paper reports our experiences building PlanetLab over the last four years. It identifies the requirements that shaped PlanetLab, explains the design decisions that resulted from resolving conflicts among these requirements, and reports our experience implementing and supporting the system. Due in large part to the nature of the “PlanetLab experiment,” the discussion focuses on synthesis rather than new techniques, balancing system-wide considerations rather than improving performance along a single dimension, and learning from feedback from a live system rather than controlled experiments using synthetic workloads.
1 Introduction

PlanetLab is a global platform for deploying and evaluating network services [21, 3]. In many ways, it has been an unexpected success. It was launched in mid-2002 with 100 machines distributed to 40 sites, but today includes 700 nodes spanning 336 sites and 35 countries. It currently hosts 2500 researchers affiliated with 600 projects. It has been used to evaluate a diverse set of planetary-scale network services, including content distribution [33, 8, 24], anycast [35, 9], DHTs [26], robust DNS [20, 25], large-file distribution [19, 1], measurement and analysis [30], anomaly and fault diagnosis [36], and event notification [23]. It supports the design and evaluation of dozens of long-running services that transport an aggregate of 3-4TB of data every day, satisfying tens of millions of requests involving roughly one million unique clients and servers.

To deliver this utility, PlanetLab innovates along two main dimensions:
• Novel management architecture. PlanetLab administers nodes owned by hundreds of organizations, which agree to allow a worldwide community of researchers—most complete strangers—to access their machines. PlanetLab must manage a complex relationship between node owners and users.
• Novel usage model. Each PlanetLab node should gracefully degrade in performance as the number of users grows. This gives the PlanetLab community an incentive to work together to make best use of its shared resources.
In both cases, the contribution is not a new mechanism or algorithm, but rather a synthesis (and full exploitation) of carefully selected ideas to produce a fundamentally new system. Moreover, the process by which we designed the system is interesting in its own right:
• Experience-driven design. PlanetLab’s design evolved incrementally based on experience gained from supporting a live user community. This is in contrast to most research systems that are designed and evaluated under controlled conditions, contained within a single organization, and evaluated using synthetic workloads.
• Conflict-driven design. The design decisions that shaped PlanetLab were responses to conflicting requirements. The result is a comprehensive architecture based more on balancing global considerations than improving performance along a single dimension, and on real-world requirements that do not always lend themselves to quantifiable metrics.
One could view this as a new model of system design, but of course it isn’t [6, 27].

This paper identifies the requirements that shaped the system, explains the design decisions that resulted from resolving conflicts among these requirements, and reports our experience building and supporting the system. A side-effect of the discussion is a fairly complete overview of PlanetLab’s current architecture, but the primary goal is to describe the design decisions that went into building PlanetLab, and to report the lessons we have learned in the process. For a comprehensive definition of the PlanetLab architecture, the reader is referred to [22].
2 Background

This section identifies the requirements we understood at the time PlanetLab was first conceived, and sketches the high-level design proposed at that time. The discussion includes a summary of the three main challenges we have faced, all of which can be traced to tensions between the requirements. The section concludes by looking at the relationship between PlanetLab and similar systems.
2.1 Requirements
PlanetLab’s design was guided by five major requirements that correspond to objectives we hoped to achieve as well as constraints we had to live with. Although we recognized all of these requirements up-front, the following discussion articulates them with the benefit of hindsight.

(R1) It must provide a global platform that supports both short-term experiments and long-running services. Unlike previous testbeds, a revolutionary goal of PlanetLab was that it support experimental services that could run continuously and support a real client workload. This implied that multiple services be able to run concurrently, since a batch-scheduled facility is not conducive to a 24×7 workload. Moreover, these services (experiments) should be isolated from each other so that one service does not unduly interfere with another.

(R2) It must be available immediately, even though no one knows for sure what “it” is. PlanetLab faced a dilemma: it was designed to support research in broad-coverage network services, yet its management (control) plane is itself such a service. It was necessary to deploy PlanetLab and start gaining experience with network services before we fully understood what services would be needed to manage it. As a consequence, PlanetLab had to be designed with explicit support for evolution. Moreover, to get people to use PlanetLab—so we could learn from it—it had to be as familiar as possible; researchers are not likely to change their programming environment to use a new facility.

(R3) We must convince sites to host nodes running code written by unknown researchers from other organizations. PlanetLab takes advantage of nodes contributed by research organizations around the world. These nodes, in turn, host services on behalf of users from other research organizations. The individual users are unknown to the node owners, and to make matters worse, the services they deploy often send potentially disruptive packets into the Internet. That sites own and host nodes, but trust PlanetLab to administer them, is unprecedented at the scale at which PlanetLab operates. As a consequence, we must correctly manage the trust relationships so that the risks to each site are less than the benefits they derive.

(R4) Sustaining growth depends on support for autonomy and decentralized control. PlanetLab is a worldwide platform constructed from components owned by many autonomous organizations. Each organization must retain some amount of control over how their resources are used, and PlanetLab as a whole must give geographic regions and other communities as much autonomy as possible in defining and managing the system. Generally, sustaining such a system requires minimizing centralized control.

(R5) It must scale to support many users with minimal resources. While a commercial variant of PlanetLab might have cost recovery mechanisms to provide resource guarantees to each of its users, PlanetLab must operate in an under-provisioned environment. This means conservative allocation strategies are not practical, and it is necessary to promote efficient resource sharing. This includes both physical resources (e.g., cycles, bandwidth, and memory) and logical resources (e.g., IP addresses).

Note that while the rest of this paper discusses the many tensions between these requirements, two of them are quite synergistic. The requirement that we evolve PlanetLab (R2) and the need for decentralized control (R4) both point to the value of factoring PlanetLab’s management architecture into a set of building block components with well-defined interfaces. A major challenge of building PlanetLab was to understand exactly what these pieces should be.

To this end, PlanetLab originally adopted an organizing principle called unbundled management, which argued that the services used to manage PlanetLab should themselves be deployed like any other service, rather than bundled with the core system. The case for unbundled management has three arguments: (1) to allow the system to more easily evolve; (2) to permit third-party developers to build alternative services, enabling a software bazaar, rather than rely on a single development team with limited resources and creativity; and (3) to permit decentralized control over PlanetLab resources, and ultimately, over its evolution.
2.2 Initial Design
PlanetLab supports the required usage model through distributed virtualization—each service runs in a slice of PlanetLab’s global resources. Multiple slices run concurrently on PlanetLab, where slices act as network-wide containers that isolate services from each other. Slices were expected to enforce two kinds of isolation: resource isolation and security isolation, the former concerned with minimizing performance interference and the latter concerned with eliminating namespace interference.

At a high level, PlanetLab consists of a centralized front-end, called PlanetLab Central (PLC), that remotely manages a set of nodes. Each node runs a node manager (NM) that establishes and controls virtual machines (VMs) on that node. We assume an underlying virtual machine monitor (VMM) implements the VMs. Users create slices through operations available on PLC, which results in PLC contacting the NM on each node to create a local VM. A set of such VMs defines the slice.

We initially elected to use a Linux-based VMM due to Linux’s high mind-share [3]. Linux is augmented with Vservers [16] to provide security isolation and a set of schedulers to provide resource isolation.
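To make the model concrete, the following minimal Python sketch mirrors the relationships just described: a centralized PLC asks the node manager on each node to create a per-slice VM, and the resulting set of VMs constitutes the slice. All class and method names here are illustrative assumptions, not PlanetLab’s actual code.

```python
# Illustrative sketch only: names and structure are assumptions, not the real system.
from dataclasses import dataclass, field

@dataclass
class VM:
    """A virtual machine created by a node manager (NM) for one slice."""
    node: str          # hostname of the PlanetLab node
    slice_name: str    # slice this VM belongs to

@dataclass
class Slice:
    """A network-wide container: one VM per participating node."""
    name: str
    vms: dict = field(default_factory=dict)   # node hostname -> VM

class NodeManager:
    """Per-node NM: establishes and controls VMs on its node."""
    def __init__(self, node: str):
        self.node = node
        self.vms = {}

    def create_vm(self, slice_name: str) -> VM:
        vm = VM(self.node, slice_name)
        self.vms[slice_name] = vm
        return vm

class PLC:
    """Centralized front-end: creates a slice by contacting the NM on each node."""
    def __init__(self, node_managers: dict):
        self.node_managers = node_managers   # hostname -> NodeManager

    def create_slice(self, name: str, nodes: list) -> Slice:
        s = Slice(name)
        for n in nodes:
            s.vms[n] = self.node_managers[n].create_vm(name)
        return s
```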
2.3 Design Challenges

Like many real systems, what makes PlanetLab interesting to study—and challenging to build—is how it deals with the constraints of reality and conflicts among requirements. Here, we summarize the three main challenges; subsequent sections address each in more detail.

First, unbundled management is a powerful design principle for evolving a system, but we did not fully understand what it entailed nor how it would be shaped by other aspects of the system. Defining PlanetLab’s management architecture—and in particular, deciding how to factor management functionality into a set of independent pieces—involved resolving three main conflicts:
• minimizing centralized components (R4) yet maintaining the necessary trust assumptions (R3);
• balancing the need for slices to acquire the resources they need (R1) yet coping with scarce resources (R5);
• isolating slices from each other (R1) yet allowing some slices to manage other slices (R2).
Section 3 discusses our experiences evolving PlanetLab’s management architecture.

Second, resource allocation is a significant challenge for any system, and this is especially true for PlanetLab, where the requirement for isolation (R1) is in conflict with the reality of limited resources (R5). Part of our approach to this situation is embodied in the management structure described in Section 3, but it is also addressed in how scheduling and allocation decisions are made on a per-node basis. Section 4 reports our experience balancing isolation against efficient resource usage.

Third, we must maintain a stable system on behalf of the user community (R1) and yet evolve the platform to provide long-term viability and sustainability (R2). Section 5 reports our operational experiences with PlanetLab, and the lessons we have learned as a result.

2.4 Related Systems

An important question to ask about PlanetLab is whether its specific design requirements make it unique, or if our experiences can apply to other systems. Our response is that PlanetLab shares “points of pain” with three similar systems—ISPs, hosting centers, and the GRID—but pushes the envelope relative to each.

First, PlanetLab is like an ISP in that it has many points-of-presence and carries traffic to/from the rest of the Internet. Like ISPs (but unlike hosting centers and the GRID), PlanetLab has to provide mechanisms that can be used to identify and stop disruptive traffic. PlanetLab goes beyond traditional ISPs, however, in that it has to deal with arbitrary (and experimental) network services, not just packet forwarding.
Second, PlanetLab is like a hosting center in that its nodes support multiple VMs, each on behalf of a different user. Like a hosting center (but unlike the GRID or ISPs), PlanetLab has to provide mechanisms that enforce isolation between VMs. PlanetLab goes beyond hosting centers, however, because it includes third-party services that manage other VMs, and because it must scale to large numbers of VMs with limited resources.

Third, PlanetLab is like the GRID in that its resources are owned by multiple autonomous organizations. Like the GRID (but unlike an ISP or hosting center), PlanetLab has to provide mechanisms that allow one organization to grant users at another organization the right to use its resources. PlanetLab goes far beyond the GRID, however, in that it scales to hundreds of “peering” organizations by avoiding pair-wise agreements.
PlanetLab faces new and unique problems because it is at the intersection of these three domains. For example, combining multiple independent VMs with a single IP address (hosting center) and the need to trace disruptive traffic back to the originating user (ISP) results in a challenging problem. PlanetLab’s experiences will be valuable to other systems that may emerge where any of these domains intersect, and may in time influence the direction of hosting centers, ISPs, and the GRID as well.
3 Slice Management

This section describes the slice management architecture that evolved over the past four years. While the discussion includes some details, it primarily focuses on the design decisions and the factors that influenced them.
3.1 Trust Assumptions

Given that PlanetLab sites and users span multiple organizations (R3), the first design issue was to define the underlying trust model. Addressing this issue required that we identify the key principals, explicitly state the trust assumptions among them, and provide mechanisms that are consistent with this trust model.

Over 300 autonomous organizations have contributed nodes to PlanetLab (they each require control over the nodes they own) and over 300 research groups want to deploy their services across PlanetLab (the node owners need assurances that these services will not be disruptive). Clearly, establishing 300×300 pairwise trust relationships is an unmanageable task, but it is well understood that a trusted intermediary is an effective way to manage such an N×N problem.

PLC is one such trusted intermediary: node owners trust PLC to manage the behavior of VMs that run on their nodes while preserving their autonomy, and researchers trust PLC to provide access to a set of nodes that are capable of hosting their services.
Figure 1: Trust relationships among principals (Owner, PLC, Service Developer (User), and Node; relationships 1-4 are enumerated below).
Recognizing this role for PLC, and organizing the architecture around it, is the single most important aspect of the design beyond the simple model presented in Section 2.2.

With this backdrop, the PlanetLab architecture recognizes three main principals:
• PLC is a trusted intermediary that manages nodes on behalf of a set of owners, and creates slices on those nodes on behalf of a set of users.
• An owner is an organization that hosts (owns) PlanetLab nodes. Each owner retains ultimate control over their own nodes, but delegates management of those nodes to the trusted PLC intermediary. PLC provides mechanisms that allow owners to define resource allocation policies on their nodes.

• A user is a researcher that deploys a service on a set of PlanetLab nodes. PlanetLab users are currently individuals at research organizations (e.g., universities, non-profits, and corporate research labs), but this is not an architectural requirement. Users create slices on PlanetLab nodes via mechanisms provided by the trusted PLC intermediary.
Figure 1 illustrates the trust relationships between node owners, users, and the PLC intermediary. In this figure:
1. PLC expresses trust in a user by issuing it credentials that let it access slices. This means that the user must adequately convince PLC of its identity (e.g., affiliation with some organization or group).
2. A user trusts PLC to act as its agent, creating slices on its behalf and checking credentials so that only that user can install and modify the software running in its slice.
3. An owner trusts PLC to install software that is able to map network activity to the responsible slice. This software must also isolate resource usage of slices and bound/limit slice behavior.
4. PLC trusts owners to keep their nodes physically secure. It is in the best interest of owners not to circumvent PLC (upon which they depend for accurate policing of their nodes). PLC must also verify that every node it manages actually belongs to an owner with which it has an agreement.
Given this model, the security architecture includes the following mechanisms. First, each node boots from an immutable file system, loading (1) a boot manager program, (2) a public key for PLC, and (3) a node-specific secret key. We assume that the node is physically secured by the owner in order to keep the key secret, although a hardware mechanism such as TCPA could also be leveraged. The node then contacts a boot server running at PLC, authenticates the server using the public key, and uses HMAC and the secret key to authenticate itself to PLC. Once authenticated, the boot server ensures that the appropriate VMM and the NM are installed on the node, thus satisfying the fourth trust relationship.
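The HMAC step lends itself to a short sketch using Python’s standard hmac module. The message format and key handling below are assumptions for illustration; the real boot manager’s wire protocol differs, and the public-key authentication of the boot server is only summarized in a comment.

```python
# Illustrative sketch of node-to-PLC authentication; message format is an assumption.
import hmac
import hashlib

def node_hmac(node_id: str, node_secret_key: bytes, challenge: bytes) -> bytes:
    """Node side: compute an HMAC over a server-supplied challenge using the
    node-specific secret key loaded from the immutable boot file system."""
    return hmac.new(node_secret_key, node_id.encode() + challenge, hashlib.sha1).digest()

def plc_verify(node_id: str, known_key: bytes, challenge: bytes, mac: bytes) -> bool:
    """PLC side: recompute the HMAC with the key it holds for this node and
    compare in constant time."""
    expected = hmac.new(known_key, node_id.encode() + challenge, hashlib.sha1).digest()
    return hmac.compare_digest(expected, mac)

# The boot server itself is authenticated separately: the node checks the
# server against the PLC public key shipped on the immutable file system
# (e.g., by validating a certificate or signature); that step is elided here.
```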
Second, once PLC has vetted an organization through an off-line process, users at the site are allowed to create accounts and upload their public keys. PLC then installs these keys in any VMs (slices) created on behalf of those users, and permits access to those VMs via ssh. Currently, PLC requires that new user accounts are authorized by a principal investigator associated with each site—this provides some degree of assurance that accounts are only created by legitimate users with a connection to a particular site, thus satisfying the first trust relationship.

Third, PLC runs an auditing service that records information about all packet flows coming out of the node. The auditing service offers a public, web-based interface on each node, through which anyone that has received unwanted network traffic from the node can determine the responsible users. PLC archives this auditing information by periodically downloading the audit log.
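The kind of query this auditing service answers can be sketched as a lookup over a per-node flow log: given the time and destination of a complaint, return the slice that generated the traffic. The log schema used here is a hypothetical simplification, not the service’s actual format.

```python
# Hypothetical flow-log lookup; the record layout is an assumption for illustration.
from datetime import datetime
from typing import Iterable, NamedTuple, Optional

class FlowRecord(NamedTuple):
    start: datetime
    end: datetime
    src_ip: str
    dst_ip: str
    dst_port: int
    slice_name: str     # slice responsible for the flow

def responsible_slice(log: Iterable[FlowRecord], when: datetime,
                      dst_ip: str, dst_port: int) -> Optional[str]:
    """Return the slice whose logged flow matches a complaint, if any."""
    for rec in log:
        if (rec.dst_ip == dst_ip and rec.dst_port == dst_port
                and rec.start <= when <= rec.end):
            return rec.slice_name
    return None
```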
3.2 Virtual Machines and Resource Pools
Given the requirement that PlanetLab support long-lived slices (R1) and accommodate scarce resources (R5), the second design decision was to decouple slice creation from resource allocation. In contrast to a hosting center that might create a VM and assign it a fixed set of resources as part of an SLA, PlanetLab creates new VMs without regard for available resources—each such VM is given a fair share of the available resources on that node whenever it runs—and then expects slices to engage one or more brokerage services to acquire resources.

To this end, the NM supports two abstract objects: virtual machines and resource pools. The former is a container that provides a point-of-presence on a node for a slice. The latter is a collection of physical and logical resources that can be bound to a VM. The NM supports operations to create both objects, and to bind a pool to a VM for some fixed period of time. Both types of objects are specified by a resource specification (rspec), which is a list of attributes that describe the object. A VM can run as soon as it is created, and by default is given a fair share of the node’s unreserved capacity.
When a resource pool is bound to a VM, that VM is allocated the corresponding resources for the duration of the binding.

Global management services use these per-node operations to create PlanetLab-wide slices and assign resources to them. Two such service types exist today: slice creation services and brokerage services. These services can be separate or combined into a single service that both creates and provisions slices. At the same time, different implementations of brokerage services are possible (e.g., market-based services that provide mechanisms for buying and selling resources [10, 14], and batch scheduling services that simply enforce admission control for use of a finite resource pool [7]).
As part of the resource allocation architecture, it was also necessary to define a policy that governs how resources are allocated. On this point, owner autonomy (R4) comes into play: only owners are allowed to invoke the “create resource pool” operation on the NM that runs on their nodes. This effectively defines one or more “root” pools, which can subsequently be split into sub-pools and reassigned. An owner can also directly allocate a certain fraction of its node’s resources to the VM of a specific slice, thereby explicitly supporting any services the owner wishes to host.
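The per-node interface described above can be summarized in a small sketch: VMs and resource pools are both created from rspecs, and a pool can be bound to a VM for a fixed period. The method names and rspec attributes are illustrative assumptions rather than the NM’s real API.

```python
# Illustrative sketch of the NM's two abstractions; not the actual interface.
import time

class ResourcePool:
    """A collection of physical and logical resources, described by an rspec."""
    def __init__(self, rspec: dict):
        self.rspec = rspec          # e.g. {"cpu_share": 10, "net_kbps": 1500}

class VirtualMachine:
    """A slice's point-of-presence on one node."""
    def __init__(self, slice_name: str, rspec: dict):
        self.slice_name = slice_name
        self.rspec = rspec
        self.bindings = []          # (ResourcePool, expiry timestamp)

class NodeManager:
    def __init__(self):
        self.vms = {}
        self.pools = {}

    def create_vm(self, slice_name: str, rspec: dict) -> VirtualMachine:
        # A VM may run as soon as it exists; by default it receives a fair
        # share of the node's unreserved capacity.
        vm = VirtualMachine(slice_name, rspec)
        self.vms[slice_name] = vm
        return vm

    def create_pool(self, pool_id: str, rspec: dict) -> ResourcePool:
        # Only the node owner may create "root" pools; sub-pools can later be
        # split off and handed to brokerage services.
        pool = ResourcePool(rspec)
        self.pools[pool_id] = pool
        return pool

    def bind(self, pool_id: str, slice_name: str, duration_s: float) -> None:
        # Allocate the pool's resources to the VM for a fixed period.
        expiry = time.time() + duration_s
        self.vms[slice_name].bindings.append((self.pools[pool_id], expiry))
```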
3.3 Delegation

PlanetLab’s management architecture was expected to evolve through the introduction of third-party services (R2). We viewed the NM interface as the key feature, since it would support the many third-party creation and brokerage services that would emerge. We regarded PLC as merely a “bootstrap” mechanism that could be used to deploy such new global management services, and thus, we expected PLC to play a reduced role over time.

However, experience showed this approach to be flawed. This is for two reasons, one fundamental and one pragmatic. First, it failed to account for PLC’s central role in the trust model of Section 3.1. Maintaining trust relationships among participants is a critical role played by PLC, and one not easily passed along to other services. Second, researchers building new management services on PlanetLab were not interested in replicating all of PLC’s functionality. Instead of using PLC to bootstrap a comprehensive suite of management services, researchers wanted to leverage some aspects of PLC and replace others.
To accommodate this situation, PLC is today structured as follows. First, each owner implicitly assigns all of its resources to PLC for redistribution. The owner can override this allocation by granting a set of resources to a specific slice, or divide resources among multiple brokerage services, but by default all resources are allocated to PLC.

Second, PLC runs a slice creation service—called pl_conf—on each node. This service runs in a standard VM and invokes the NM interface without any additional privilege. It also exports an XML-RPC interface by which anyone can invoke its services. This is important because it means other brokerage and slice creation services can use pl_conf as their point-of-presence on each node rather than have to first deploy their own slice. Originally, the PLC/pl_conf interface was private, as we expected management services to interact directly with the node manager. However, making this a well-defined, public interface has been a key to supporting delegation.

Third, PLC provides a front-end—available either as a GUI or as a programmatic interface at www.planet-lab.org—through which users create slices. The PLC front-end interacts with pl_conf on each node using the same XML-RPC interface that other services use.
Finally, PLC supports two methods by which slices are actually instantiated on a set of nodes: direct and delegated. Using the direct method, the PLC front-end contacts pl_conf on each node to create the corresponding VM and assign resources to it. Using delegation, a slice creation service running on behalf of a user contacts PLC for a ticket that encapsulates the right to create a VM or redistribute a pool of resources. A ticket is a signed rspec; in this case, it is signed by PLC. The agent then contacts pl_conf on each node to redeem this ticket, at which time pl_conf validates it and calls the NM to create a VM or bind a pool of resources to an existing VM. The mechanisms just described currently support two slice creation services (PLC and Emulab [34], the latter uses tickets granted by the former), and two brokerage services (Sirius [7] and Bellagio [2], the first of which is granted capacity as part of a root resource allocation decision).
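The delegated path can be pictured as a two-step exchange: obtain a ticket (a signed rspec) from PLC, then redeem it at pl_conf on each node over XML-RPC. The URL, port, and method names below are assumptions made for illustration; they do not correspond to the real PLC or pl_conf endpoints.

```python
# Hedged sketch of delegated slice creation; endpoints and methods are assumptions.
import xmlrpc.client

def create_slice_delegated(plc_url: str, credentials: dict,
                           slice_name: str, rspec: dict, nodes: list) -> None:
    plc = xmlrpc.client.ServerProxy(plc_url)
    # 1. Ask PLC for a ticket: an rspec signed by PLC that encapsulates the
    #    right to create a VM (or redistribute a pool of resources).
    ticket = plc.GetTicket(credentials, slice_name, rspec)

    # 2. Push the ticket to pl_conf on every node in the slice. A real
    #    implementation would issue these calls in parallel.
    for node in nodes:
        pl_conf = xmlrpc.client.ServerProxy("https://%s:8080/" % node)  # hypothetical port
        # pl_conf validates the ticket's signature, then calls the local NM
        # to create the VM and bind any resources named in the rspec.
        pl_conf.RedeemTicket(ticket)
```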
Note that the delegated method of slice creation is push-based, while the direct method is pull-based. With delegation, a slice creation service contacts PLC to retrieve a ticket granting it the right to create a slice, and then performs an XML-RPC call to pl_conf on each node. For a slice spanning a significant fraction of PlanetLab’s nodes, an implementation would likely launch multiple such calls in parallel. In contrast, PLC uses a polling approach: each pl_conf contacts PLC periodically to retrieve a set of tickets for the slices it should run.

While the push-based approach can create a slice in less time, the advantage of the pull-based approach is that it enables slices to persist across node reinstalls. Nodes cannot be trusted to have persistent state since they are completely reinstalled from time to time due to unrecoverable errors such as corrupt local file systems. The pull-based strategy views all nodes as maintaining only soft state, and gets the definitive list of slices for that node from PLC. Therefore, if a node is reinstalled, all of its slices are automatically recreated.
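The pull-based, soft-state behavior amounts to a periodic reconciliation loop, sketched below against the hypothetical NodeManager from the earlier sketch in Section 3.2: anything PLC lists but the node lacks is recreated, and anything PLC no longer lists is removed.

```python
# Illustrative reconciliation loop run by pl_conf; function names are assumptions.

def reconcile(node_manager, tickets_from_plc):
    """tickets_from_plc: iterable of (slice_name, rspec) pairs that PLC says
    this node should currently be running."""
    wanted = {name: rspec for name, rspec in tickets_from_plc}

    # Create any slice PLC lists but that is missing locally
    # (e.g., after a from-scratch reinstall of the node).
    for name, rspec in wanted.items():
        if name not in node_manager.vms:
            node_manager.create_vm(name, rspec)

    # Remove local VMs that PLC no longer lists for this node.
    for name in list(node_manager.vms):
        if name not in wanted:
            del node_manager.vms[name]
```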
Delegation makes it possible for others to develop alternative slice creation semantics—for example, a “best effort” system that ignores such problems—but PLC takes the conservative approach because it is used to create slices for essential management services.

3.4 Federation

Given our desire to minimize the centralized elements of PlanetLab (R4), our next design decision was to make it possible for multiple independent PlanetLab-like systems to co-exist and federate with each other. Note that this issue is distinct from delegation, which allows multiple management services to co-exist within a single PlanetLab.
There are three keys to enabling federation. First, there must be well-defined interfaces by which independent instances of PLC invoke operations on each other. To this end, we observe that our implementation of PLC naturally divides into two halves: one that creates slices on behalf of users and one that manages nodes on behalf of owners, and we say that PLC embodies a slice authority and a management authority, respectively. Corresponding to these two roles, PLC supports two distinct interfaces: one that is used to create and control slices, and one that is used to boot and manage nodes. We claim that these interfaces are minimal, and hence, define the “narrow waist” of the PlanetLab hourglass.
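This two-halves split can be pictured as two minimal interfaces, sketched below as abstract classes. The method names are purely illustrative assumptions; they are not PLC’s actual federation API.

```python
# Illustrative sketch of the slice-authority / management-authority split.
from abc import ABC, abstractmethod

class SliceAuthority(ABC):
    """Creates and controls slices on behalf of users."""
    @abstractmethod
    def create_slice(self, slice_name: str, users: list) -> None: ...

    @abstractmethod
    def issue_ticket(self, slice_name: str, rspec: dict) -> bytes: ...

class ManagementAuthority(ABC):
    """Boots and manages nodes on behalf of owners."""
    @abstractmethod
    def boot_node(self, node_id: str) -> None: ...

    @abstractmethod
    def audit_node(self, node_id: str) -> dict: ...

class PLCInstance(SliceAuthority, ManagementAuthority):
    """A single PLC embodies both roles; federation lets independent
    instances invoke these interfaces on each other."""
    ...
```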
Second, supporting multiple independent PLCs implies the need to name each instance. It is PLC in its slice authority role that names slices, and its name space must be extended to also name slice authorities. For example, the slice cornell.cobweb is implicitly plc.cornell.cobweb, where plc is the top-level slice authority that approved the slice. (As we generalize the slice name space, we adopt “.” instead of “_” as the delimiter.) Note that this model enables a hierarchy of slice authorities, which is in fact already the case with plc.cornell, since PLC trusts Cornell to approve local slices (and the users bound to them).

This generalization of the slice naming scheme leads to several possibilities:
• PLC delegates the ability to create slices to regional slice authorities (e.g., plc.japan.utokyo.ubiq);
• organizations create “private” PlanetLabs (e.g., epfl.chawla) that possibly peer with each other, or with the “public” PlanetLab; and

• alternative “root” naming authorities come into existence, such as one that is responsible for commercial (for-profit) slices (e.g., com.startup.voip).
The third of these is speculative, but the first two scenarios have already happened or are in progress, with five private PlanetLabs running today and two regional slice authorities planned for the near future.
Service           Lines of Code   Language
Node Manager      2027            Python
Proper            5752            C
pl_conf           1975            Python
Sirius            850             Python
Stork             12803           Python
CoStat + CoMon    1155            C
PlanetFlow        5932            C

Table 1: Source lines of code for various management services
Note that there must be a single global naming authority that ensures all top-level slice authority names are unique. Today, PLC plays that role.
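A short sketch shows how such hierarchical names might be interpreted: every proper prefix of a name identifies a slice authority, and the final component names the slice itself. This parsing rule is an illustrative assumption consistent with the examples above, not a specification.

```python
# Illustrative interpretation of hierarchical slice names.

def parse_slice_name(name: str):
    """Split a dotted slice name into its chain of authorities and the slice."""
    parts = name.split(".")
    authorities = [".".join(parts[:i]) for i in range(1, len(parts))]
    return authorities, parts[-1]

# parse_slice_name("plc.cornell.cobweb")
#   -> (["plc", "plc.cornell"], "cobweb")
# i.e., the top-level authority "plc" delegated to "plc.cornell",
# which approved the slice "cobweb".
```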
The third key to federation is to design pl_conf so that it is able to create slices on behalf of many different slice authorities. Node owners allocate resources to the slice authorities they want to support, and configure pl_conf to accept tickets signed by slice authorities that they trust. Note that being part of the “public