Additional Publications
Abstract
Hierarchical decomposition is a fundamental principle that encourages the organization of program elements into nested scopes of access, instead of treating all as ``global.'' This paper offers a foundational study of heap decomposition inference, the problem of statically extracting the decomposition hierarchy latent in the runtimes of object-oriented programs, henceforth revealing the compositional nature of the heap. The centerpiece of the paper is Cypress, a sound, precise, and scalable constraint-based ownership type inference coupled with a novel application of linear programming over integers. All constraints in Cypress are linear, and the precision of decomposition -- placing objects to scopes as non-global as possible -- can be reduced to a linear programming problem. Cypress has been implemented as an open-source tool that can decompose real-world Java applications of more than 100K LOC and up to 6000 statically distinct instantiations.
Abstract
This paper introduces Green Streams, a novel solution to address a critical but often overlooked property of data-intensive software: energy efficiency. Green Streams is built around two key insights into data-intensive software. First, energy consumption of data-intensive software is strongly correlated to data volume and data processing, both of which are naturally abstracted in the stream programming paradigm; Second, energy efficiency can be improved if the data processing components of a stream program coordinate in a "balanced" way, much like an assembly line that runs most efficiently when participating workers coordinate their pace. Green Streams adopts a standard stream programming model, and applies Dynamic Voltage and Frequency Scaling (DVFS) to coordinate the pace of data processing among components, ultimately achieving energy efficiency without degrading performance in a parallel processing environment. At the core of Green Streams is a novel constraint-based inference to abstract the intrinsic relationships of data flow rates inside a stream program, which uses linear programming to minimize the frequencies -- hence the energy consumption -- for processing components while still maintaining the maximum output data flow rate. The core algorithm of Green Streams is formalized, and its optimality is established. The effectiveness of Green Streams is evaluated on top of the StreamIt framework, and preliminary results show the approach can save CPU energy by an average of 28% with a 7% performance improvement.
Abstract
Atomicity enforcement in a multi-threaded application can be critical to the application’s safety. In this paper, we take the challenge of enforcing atomicity in a multilingual application, which is developed in multiple programming languages. Specifically, we describe the design and implementation of
JATO, which enforces the atomicity of a native method when a Java application invokes the native method through the Java Native Interface (JNI). JATO relies on a constraint-based system, which generates constraints from both Java and native code based on how Java objects are accessed by threads. Constraints are then solved to infer a set of Java objects that need to be locked in native methods to enforce the atomicity of the native method invocation. We also propose a number of
optimizations that soundly improve the performance. Evaluation through JATO’s prototype implementation demonstrates it enforces native-method atomicity with reasonable run-time overhead.
- Michael Cohen, Haitao Steve Zhu, Senem Ezgi Emgin, Yu David Liu, "Energy Types," OOPSLA, 2012.
Abstract
This paper presents a novel type system to promote and facilitate energy-aware programming. Energy Types is built upon a key insight into today’s energy-efficient systems and applications: despite the popular perception that energy and power can only be described in joules and watts, real-world energy management is often based on discrete phases and modes, which in turn can be reasoned about by type systems very effectively. A phase characterizes a distinct pattern of program workload, and a mode represents an energy state the program is expected to execute in. This paper describes a programming model where phases and modes can be intuitively specified by programmers or inferred by the compiler as type information. It demonstrates how a type-based approach to reasoning about phases and modes can help promote energy efficiency. The soundness of our type system and the invariants related to inter-phase and inter-mode interactions are rigorously proved. Energy Types is implemented as the core of a prototyped object-oriented language ET for smartphone programming. Preliminary studies show ET can lead to significant energy savings for Android Apps.
Abstract
In multithreaded object-oriented programs, locality refers to the dynamic
scoping relation among threads and objects in a potentially
shared-memory context. This paper calls for type-theoretic investigations
into two forms of locality -- thread locality and aggregate
locality -- and proposes a unified type inference algorithm to reason
about them over unannotated real-world Java programs.
Abstract
This paper addresses energy consumption in
multi-threaded programs. In particular, it demonstrates why
synchronizations
-- a fundamental fabric of multi-core software
-- may lead to unnecessary energy consumption, and proposes
a pattern-based compilation technique to improve energy
efficiency. The key insight is that energy efficiency may be
improved by adjusting the relative speed of individual threads
participating in a synchronization, and different synchronization patterns can offer clues on how adjustments should be
made.
Abstract
This paper describes an operational semantics for futures, with the primary target on energy efficiency. The work in progress is built around an insight that different threads can coordinate by
running at different "paces," so that the time for synchronization and the resulting wasteful energy consumption can be reduced. We exploit several inherent characteristics of futures to determine how
the paces of involving threads can be coordinated. The semantics is inspired by recent advances in
computer architectures, where the frequencies of CPU cores can be adjusted dynamically. The work is
a first-step toward a direction where variant frequencies are directly modeled as an essential semantic
feature in concurrent programming languages.
Abstract
This paper proposes an energy-efficient approach for programming languages that support work stealing. The key insight is that thieves and victims in the work stealing algorithm can coordinate their execution paces for more energy efficiency, through dynamic adjustment of CPU frequencies.
Abstract
Staging is a powerful language construct that allows a program at one stage
of evaluation to manipulate and specialize a program to be executed at a later stage.
We propose a new staged language calculus, <ML>, which extends the programmability
of staged languages in two directions. First, <ML> supports dynamic type specialization:
types can be dynamically constructed, abstracted, and passed as arguments, while
preserving decidable typechecking via a System F-sub style semantics combined with a
restricted form of Lambda-omega-style runtime type construction. With dynamic type specialization
the data structure layout of a program can be optimized via staging. Second, <ML>
works in a context where different stages of computation are executed in different process
spaces, a property we term staged process separation. Programs at different stages
can directly communicate program data in <ML> via a built-in serialization discipline.
The language <ML> is endowed with a metatheory including type preservation, type
safety, and decidability as demonstrated constructively by a sound type checking algorithm.
While our language design is general, we are particularly interested in future
applications of staging in resource-constrained and embedded systems: these systems
have limited space for code and data, as well as limited CPU time, and specializing code
for the particular deployment at hand can improve efficiency in all of these dimensions.
The combination of dynamic type specialization and staging across processes greatly
increases the utility of staged programming in these domains. We illustrate this via
wireless sensor network programming examples.
Abstract
Cyber-physical systems are a coordinated combination of computational and physical elements. This position paper calls for the design of a unified object model that blends the boundary of the two. The unified object model has the benefits of bringing classic software engineering technologies and tools -- such as UML -- to the new application domain of cyber-physical systems, and further equipping programs written for these systems with the traditional strengths of object-oriented languages, such as encapsulation, code reuse
and customization, and strong guarantees for avoiding run-time errors.
-
Aditya Kulkarni, Yu David Liu, Scott Smith, "Task Types for Pervasive Atomicity," Proceedings of the 25th ACM Conference on Object-Oriented
Programming, Systems, Languages and Applications (OOPSLA 2010). Reno, Nevada, USA, October 2010.
Abstract
Atomic regions are an important concept in correct concurrent
programming: since atomic regions can be viewed
as having executed in a single step, atomicity greatly reduces
the number of possible interleavings the programmer
needs to consider. This paper describes a method for building
atomicity into a programming language in an organic fashion.
We take the view that atomicity holds for whole threads
by default, and a division into smaller atomic regions occurs
only at points where an explicit need for sharing is needed
and declared. A corollary of this view is every line of code
is part of some atomic region.We define a polymorphic type
system, Task Types, to enforce most of the desired atomicity
properties statically.We show the reasonableness of our type
system by proving that type soundness, isolation invariance,
and atomicity enforcement properties hold at run time. We
also present initial results of a Task Types implementation
built on Java.
Abstract
Today's cloud computing platforms are typically "opaque": Amazon EC2 users only receive virtual units of CPU and memory, and physical details of the platform are hidden. Such opacity prevents programs from online optimizations and deployment adjustment, and is penalizing the very applications cloud computing attempts to attract: high-performance software. On the other extreme, a completely transparent design of clouds would lead to severe security and reliability concerns. In this position paper, we take the middle-of-the-road approach, proposing a language model for well-defined programmable interactions between the cloud platform and the client program. Our proposed ideas draw from the previous work of Classages, where first-class interactions can be dynamically established and terminated for two parties with mutually satisfiable contracts. In this paper, we justify how our design can help cloud programs to fine-tune performance, and how it may impact on other important issues of cloud computing, such as security, scheduling, and pricing.
Abstract
Staging is a powerful language construct that allows a program at
one stage to manipulate and specialize a program at the next. We
propose <ML> as a new staged calculus designed with novel
features for staged programming in modern computing platforms such
as embedded systems. A distinguishing feature of <ML> is a
model of process separation, whereby different stages of computation
are executed in different process spaces. Our language also supports
dynamic type specialization via type abstraction, dynamic type
construction, and a limited form of type dependence. <ML> is
endowed with a largely standard metatheory, including type
preservation and type safety results. We discuss the utility of our
language via code examples from the domain of wireless sensor
network programming.
- Yu David Liu, Scott Smith. Pedigree
Types. In Proceedings of the 2008 International Workshop on Aliasing, Confinement and Ownership in Object-Oriented Programming (IWACO).
Paphos, Cyprus. July 2008.
Abstract
Pedigree Types are an intuitive ownership type system requiring
minimal programmer annotations. Reusing the vocabulary of human
genealogy, Pedigree Types programmers can qualify any
object reference with a pedigree -- a child, sibling, parent, grandparent, etc -- to indicate what
relationship the object being referred to has with the referant on the
standard ownership tree, following the owners-as-dominators convention.
Such a qualifier serves as a heap shape
constraint that must hold at run time and is enforced statically.
Pedigree child captures the intention of encapsulation,
i.e. ownership: the modified object reference is ensured not to
escape the boundary of its parent. Among existing ownership type
systems, Pedigree Types are closest to Universe Types. The former can be viewed as extending the latter with
a more general form of pedigree modifiers, so that the relationship
between any pair of objects on the aforementioned ownership tree can be named and --
more importantly -- inferred. We use a constraint-based type system
which is proved sound via subject reduction. Other technical
originalities include a polymorphic treatment of pedigrees not
explicitly specified by programmers, and use of linear diophantine
equations in type constraints to enforce the hierarchy.
Abstract
This paper introduces a new language model, Coqa, for deeply embedding
concurrent programming into objects. Every program written in our
language has built-in the desirable behaviors of atomicity, mutual
exclusion, and race freedom. Such a design inverts the default mode of
e.g. Java, where programmers may write programs highly vulnerable to
surprising runtime errors if they do not take advantage of
synchronized blocks and other advanced concurrency features. Coqa
takes advantage of basic object-oriented notions to express important
concurrent programming primitives, resulting in a powerful but concise
model that includes thread creation, shared object access and open
nesting. A key property of our model is the notion of quantized
atomicity: all concurrent program executions can be viewed as being
divided into quantum regions of atomic execution, greatly reducing the
number of interleavings to consider. The language design draws from
different schools of concurrent language design, especially Actors,
and constructs a unified design close to mainstream object-oriented
programming practice. We justify our approach both from a theoretical
basis by showing that a formal representation, CoqaKernel, has
provable quantized atomicity properties, and by implementing CoqaJava,
a Java extension incorporating all of the Coqa features.
Abstract
Software deployment is a complex process, and industrial-strength
frameworks such as .NET, Java, and CORBA all provide explicit support
for component deployment. However, these frameworks are not built
around fundamental principles as much as they are engineering efforts
closely tied to particulars of the respective systems. Here we aim to
elucidate the fundamental principles of software deployment, in a
platform-independent manner. Issues that need to be addressed include
deployment unit design, when, where and how to wire
components together, versioning, version dependencies, and
hot-deployment of components. We define the
application buildbox as the place where software is developed
and deployed, and define a formal Labeled Transition System (LTS) on
the buildbox with transitions for deployment operations that include
build, install, ship, and update. We establish formal properties of the LTS,
including the fact that if a component is shipped with a certain
version dependency, then at run time that dependency must be satisfied
with a compatible version. Our treatment of
deployment is both platform- and vendor-independent, and we show how
it models the core mechanisms of the industrial-strength deployment
frameworks.
Abstract
This paper presents Classages, a novel
interaction-centric object-oriented language. Classes and objects in
Classages are fully encapsulated, with explicit interfaces for all
interactions they might be involved in. The design of Classages
touches upon a wide range of language design topics, including
encapsulation, object relationship representation, and object
confinement. An encoding of Java's OO model in Classages is provided,
showing how standard paradigms are supported. A
prototype Classages compiler is described.
Abstract Module systems are well known
as a means for giving clear interfaces for the static linking of code. This
paper shows how adding explicit interfaces to modules for 1) dynamic linking
and 2) cross-computation communication can increase the declarative nature
of modules, and build a stronger foundation for language-based security and
version control. We term these new modules Assemblages. We additionally
develop a sound constraint-based type system particularly suited to a module
system supporting bounded type parametricity, cross-module type recursion,
and polymorphic type binding during dynamic linking and cross-computation
communication.
Abstract In this paper we describe Ensemble, a proposed language framework for sensor network programming. Our goal is to provide a programming framework to scientists and engineers that will allow them to directly code sensor network applications, without the need for expertise in low-level device programming. The key concepts in Ensemble are high-level communication protocol connectors, and the ability for systems programmers to define new communication protocols as metaprotocol extensions.
Abstract This paper proposes a component programming
language that supports an integrated notion of both compile-time and run-time
component. The centerpiece of this paper is the static, compile time notion
of assembly, complementing our previous work on the dynamic, runtime
notion of cell. An assembly is a declarative, stateless piece of
code that facilitates code combination. It offers explicit typed interfaces
to outsiders, called linkers, which can be used to link smaller
assemblies into bigger, compound assemblies. Each assembly may in turn be
loaded at run-time, producing a cell in the runtime environment.
A cell is a dynamic, stateful component that interacts with other cells via
explicit runtime interfaces. Thus, the static assemblies and the dynamic
cells are fully integrated.
Abstract This paper defines a security infrastructure
for access control at the component level of programming language design.
Distributed components are an ideal place to define and enforce significant
security policies, because components are large entities that often define
the political boundaries of computation. Also, rather than building a security
infrastructure from scratch, we build on a standard one, the SDSI/SPKI security
infrastructure.
|