ISMP 2022 consisted of virtual invited talks. They were streamed live using Zoom during the week August 14-19.
Recordings will be accessible to anyone with an MOS membership later this year.
||Sun, Aug 14
||Mon, Aug 15
||Tue, Aug 16
||Wed, Aug 17
||Thu, Aug 18
||Fri, Aug 19
|9:00 AM||Simge Küçükyavuz||Alexandre d’Aspremont||Dick den Hertog||Georgia Perakis||Coralia Cartis,|
closing and business meeting
|12:00 PM||Stefanie Jegelka||Juan Pablo|
|Amitabh Basu||Rekha Thomas|
|9:00 PM||opening |
|Yinyu Ye||Yu-Hong Dai||Chung Piaw Teo||Sebastien Bubeck|
August 14, 9 pm EDT: Opening Ceremony
August 14, 9.30 pm EDT: Paul Tseng Memorial Lecture
August 14, 9.30 pm EDT: Yin Zhang, Chinese University of Hong Kong
Seeking Consensus on Subspace in Distributed PCA Calculations
Variable-splitting followed by consensus-seeking is a popular strategy in constructing distributed algorithms in optimization. When objective functions are invariant under change of basis, seeking unanimous consensus on a single basis is a demanding task, leading to slow convergence. In this work, we propose a subspace-splitting strategy and build an effective algorithm to accelerate convergence for distributed principal component analysis (PCA) calculations. Unlike existing methods that rely on distributed matrix multiplications, the proposed algorithm has a built-in safeguard against leakages of private data. Theoretically, a convergence guarantee to stationarity is obtained. Empirically, the proposed algorithm is shown to have a high communication-efficiency on a range of test problems. We will also discuss extending the approach to regularized PCA problems, in particular, to the sparse PCA problem.
August 15, 9 am EDT: Simge Küçükyavuz, Northwestern University
New Perspectives on Mixed-Integer Convex Optimization with Applications in Statistical Learning
Mixed-integer linear optimization (MILO) solvers have come a long way in the past couple of decades, enabling the solution of practical large-scale problems that were previously deemed unsolvable. Mixed-integer convex optimization (MICO) is the next frontier; however, it is not clear how polyhedral theory that has fueled the revolution in MILO solvers, can be leveraged for mixed-integer convex sets. To this end—motivated by recent applications in statistical learning—we consider a broad class of MICO problems which include indicator variables associated with continuous variables, and arbitrary constraints on indicators. First, we give a convexification scheme that simultaneously considers a nonlinear non-separable objective and combinatorial constraints on indicators. Second, for the convex quadratic case, we give a convex hull description of the epigraph formulation, which consists of a quadratic number of additional variables, a single positive semidefinite constraint, and linear constraints describing a polyhedral set in an extended formulation. The new theory presented here unifies and generalizes previously established results for special cases, e.g., perspective reformulation for separable functions. It also paves the path toward employing polyhedral methods to strengthen MICO models, thereby bridging the gap between MILO and MICO. We conclude with a novel decomposition method for convex quadratic optimization problems with indicator variables when the matrix defining the quadratic term in the objective is sparse. Our computational experiments with sparse structured regression problems demonstrate the effectiveness of the proposed approaches.
August 15, 12 pm EDT: Stefanie Jegelka, MIT
Deep Learning for Combinatorial Optimization, from a Machine Learning Perspective
Recently, there has been growing interest in exploiting machine learning for discrete algorithms and, conversely, interest in learning algorithmic tasks in the machine learning community. In general, machine learning aims at estimating an approximation of a target function based on observed data samples. To ensure this process succeeds, a number of questions are in place from a machine learning perspective: Is the machine learning model able to approximate the function of interest, or are there any fundamental limitations? Which neural network architectures are suitable for guiding the learning process? How robust will the resulting model be to shifts in the input distribution, and what kinds of shifts matter? This talk will summarize some emerging results related to these questions, with a focus on popular graph neural networks.
August 15, 9 pm EDT: Yinyu Ye, Stanford University
Online Linear Programming: Applications and Extensions
A natural optimization model that formulates many online resource allocations and dynamic decision-making problems is online linear programming (OLP) where the constraint matrix, along with the objective coefficients and decision variables, are revealed and decided column by column sequentially. We review the near optimal algorithms and theories for solving this surprisingly general class of online problems under the assumption of random order of arrivals and/or stationary distributions of the input data. Then we present few recent applications of the model/algorithm, including a fast online algorithm as a pre-solver for solving large-scale offline (binary) LPs, an interior-point online algorithm to address “fairness” for resource allocation, a provable logarithmic regret bound for the Bandits with Knapsacks (BwK) problem, an extension to online Fisher markets with a geometric aggregation of individual utilities, and how to deal with non-stationary data distributions in online learning.
August 16, 9 am EDT: Alexandre d'Aspremont, ENS
Approximation Bounds for Sparse Programs
We show that sparsity-constrained optimization problems over low dimensional spaces tend to have a small duality gap. We use the Shapley-Folkman theorem to derive both data-driven bounds on the duality gap, and an efficient primalization procedure to recover feasible points satisfying these bounds. These error bounds are proportional to the rate of growth of the objective with the target cardinality k, which means in particular that the relaxation is nearly tight as soon as k is large enough so that only uninformative features are added.
August 16, 12 pm EDT: Juan Pablo Vielma, Google
Modeling and Duality in Domain Specific Languages for Mathematical Optimization
Domain specific languages (DSL) for mathematical optimization allow users to write problems in a natural algebraic format. However, what is considered natural can vary from user to user. For instance, JuMP’s interface makes a distinction between conic formulations and nonlinear programming formulations whose constraints are level-sets of nonlinear functions. Tradeoffs between such alternative modeling formats are further amplified when dual solutions are considered. In this talk we describe work related to these tradeoffs in JuMP and OR-Tools. In particular, we consider modeling using a wide range of non-symmetric cones (and their solution with Hypatia.jl) and precise dual contracts for two-sided constraints (and other non-conic convex constraints).
August 16, 9 pm EDT: Yu-Hong Dai, Chinese Academy of Sciences
Optimization with Least Constraint Violation
Study about theory and algorithms for nonlinear programming usually assumes the
feasibility of the problem. However, there are many important practical nonlinear programming
problems whose feasible regions are not known to be nonempty or not. This leads to a class
of problems called optimization with least constraint violation.
Firstly, the optimization problem with least constraint violation is proved to be an Lipschitz
equality constrained optimization problem and an elegant necessary optimality condition,
named as L-stationary condition, is established. Properties of the classical penalty method for
this Lipschitz minimization problem are developed and the proximal gradient method for the
a penalized problem is studied.
Secondly, the optimization problem with least constraint violation is reformulated as an
MPCC problem and a local minimizer of the MPCC problem is proved to an M-stationary
point. The smoothing Fischer-Burmeister function method is constructed and analyzed for
solving the related MPCC problem.
Thirdly, the solvability of the dual of the optimization problem with least constraint
violation is investigated. The optimality conditions for the problem with least constraint
violation are established in term of the augmented Lagrangian. Moreover, it is proved that the
augmented Lagrangian method can find an approximate solution to the optimization problem
with least constraint violation and has linear rate of convergence under an error bound
Finally, the constrained convex optimization problem with the least constraint violation is
considered and analyzed under a general measure function. Several other related works on
the optimization problem with least constraint violation will also be mentioned.
August 17, 9 am EDT: Dick den Hertog, University of Amsterdam
Robust nonlinear optimization
Practical optimization problems often contain uncertain parameters. Robust optimization is a methodology that obtains solutions that are robust against uncertainties. For robust linear optimization generic and efficient methods have been developed. In this talk we will describe generic and efficient methods for robust nonlinear optimization. We distinguish two cases of robust nonlinear constraints. The first case is where the uncertain parameter occurs in a concave way. We show, by using Fenchel Duality, how such a nonlinear constraint can be equivalently reformulated into a set of computationally tractable constraints. The second case is where the uncertain parameter occurs in a convex way. Such robust constraints are difficult to handle, since finding the worst-case scenario is equivalent to maximizing a convex function. We propose a new approach to deal with such constraints that unifies approaches known in the literature and extends them in a significant way. The extension is either obtaining better solutions than the ones proposed in the literature, or obtaining solutions for classes of problems unaddressed by previous approaches. Our solution is based on an extension of the Reformulation-Linearization-Technique. It generates a sequence of conservative approximations which can be used to obtain both upper- and lower-bounds for the optimal objective value. We illustrate the numerical benefit of our approach on a robust control and robust geometric optimization example.
August 17, 12 pm EDT: Amitabh Basu, Johns Hopkins University
Complexity of optimizing over integers
We will survey classical and new results in the complexity of minimizing a convex function with convex constraints in the presence of integer constrained decision variables. The talk will include both information theoretic and algorithmic bounds on the complexity, with essentially tight bounds in some scenarios. We shall aim to show how diverse techniques from convex geometry, geometry of numbers and polyhedral theory come together to answer these questions. A second goal of the talk is to present a unifying perspective on the different approaches coming from the continuous and discrete optimization communities, hoping to contribute to the trend of building bridges between them. We will close with a discussion of future research themes in this fundamental problem in mathematical optimization.
August 17, 9 pm EDT: Chung Piaw TEO, National University of Singapore
Market Share Analytics - Theory and Applications
Market share analysis is often used by companies to track how well a firm is doing in the marketplace, compared to its competitors. In this talk, we explain how such information can be embedded in an analytical model to enhance the quality of prescriptive actions, and to improve performance.
More specifically, we consider a random utility model for customers' choice problem, and show that the choice model can be reformulated into a perturbed utility model (PUM) over the convex hull of the feasible solutions. Furthermore, we demonstrate how we can obtain a good approximation to the PUM using an additive perturbed utility model (APUM). This allows us to establish a set of closed-form relationships between prices, expected market shares, and interestingly, expected slacks in the constraint matrix of the customer choice problem. This opens up a new way to calibrate the APUM using market share shift information obtained from varying the prices of the products and services. Using piecewise linear approximation, we show that the resulting data-driven pricing problem can be solved as mixed integer linear programs. We finally showcase how this approach can be extended to address competition, and discuss a wide range of other applications of this approach. The talk will also showcase how some of these optimization tools have been deployed in the new SIA-NUS Digital Aviation Corp Lab.
August 18, 9 am EDT: Georgia Perakis, MIT
The Role of Optimization in Some Recent Advances in Data-Driven Decision-Making
Data-driven decision-making has garnered a growing interest due to the increase in data availability in recent years. With that growth many opportunities as well as challenges arise. Optimization can play an important role to address these challenges in both predictive and prescriptive tasks. In this talk, we review some recent advances that highlight the role of optimization in data-driven decision-making when learning is handled in an offline manner. We discuss some of our contributions in this area that aim to advance both predictive and prescriptive models in the following ways. First, we describe how clustered models can benefit from prediction tasks through optimization. Next, we consider how we can optimize over objective functions that arise from tree ensemble predictive models in order to obtain better prescriptions. Finally, we discuss how to improve prescriptions in order to learn optimal solutions directly from the data when possible.
August 18, 12 pm EDT: Rekha Thomas, University of Washington
Graphical designs on undirected graphs were defined by the analyst
Stefan Steinerberger and are discrete analogs of spherical
designs. They provide quadrature rules on graphs in the sense that a
design consists of a subset of vertices with prescribed weights so
that the weighted average of a class of graph functions on these
vertices is also the global average of the functions on the
graph. Depending on the allowed weights, and class of functions to be
averaged, one obtains different types of designs. An important
question about designs is how to compute them and optimize over
them. In this talk I will explain how positively weighted designs can
be organized on the faces of a polytope and using this
connection, one can compute the smallest designs in several families of
graphs. Designs also connect to random walks on graphs and other
well-studied graph entities.
August 18, 9 pm EDT: Sebastien Bubeck, Microsoft Research
Set Chasing, with an application to online shortest path
Since the late 19th century, mathematicians have realized the importance and generality of selection problems: given a collection of sets, select an element in each set, possibly in a "nice" way. Of particular importance in computer science is the scenario where the ground set is a metric space, in which case it is natural to ask for *Lipschitz* selection (with Hausdorff distance between sets). In this talk I will describe a far-reaching extension of this classical Lipschitz selection problem to an *online* setting, where sets are streaming to the selector. I will show how Riemannian gradient descent (aka mirror descent) can be used to approach this type of problems. I will illustrate the power of the framework by solving a long-standing problem in online shortest path known as layered graph traversal (introduced by Papadimitriou and Yannakakis in 1989).
August 19, 9 am EDT: Coralia Cartis, Oxford University
Dimensionality reduction techniques for nonconvex optimization problems
Modern applications such as machine learning involve the solution of huge scale nonconvex optimization problems, sometimes with special structure. Motivated by these challenges, we investigate more generally, dimensionality reduction techniques in the variable/parameter domain for local and global optimization that rely crucially on random projections. We describe and use sketching results that allow efficient projections to low dimensions while preserving useful properties, as well as other tools from random matrix theory and conic integral geometry. We focus on functions with low effective dimensionality, a common occurrence in applications involving overparameterized models and that can serve as an insightful proxy for the training landscape in neural networks. We obtain algorithms that scale well with problem dimension, allow inaccurate information and biased noise, have adaptive parameters and benefit from high-probability complexity guarantees and almost sure convergence.
August 19, 10 am EDT: Closing ceremony and MOS Business meeting