As is often the case, the first step consisted of a detailed behavioral analysis of the user population. User behavior varied not only by day and time of day (shown at left), but also by customer type.
I’ve been involved in dozens of projects in many capacities, from lead scientist to software developer to visualization specialist. I have also managed many teams, though I find hands-on work more rewarding.
In terms of tools, I’ve worked a lot in Java, along with a bit of JS/HTML/CSS, though I now spend most of my time in Python or R, using other tools like Bash or SQL when appropriate. I love learning new languages.
Some samples of my work are provided, with a few redactions to protect confidential client names and/or implementation details.
Other Work
Optimization in Content-Delivery Networks
I worked for a leading French telecom vendor on strategies for designing the layout of their content-delivery network (used for things like on-demand video).
We segmented the population into several groups with well-defined behavior patterns, e.g., light watchers, midnight bingers, etc.
We were given a reference network architecture in which to evaluate traffic, with content cached at certain nodes and accessed by users at others. Based on the six behavioral clusters we identified in the user population, we built a system for generating synthetic network traffic, in which simulated users chose movies and shows to produce network loads. We also built a user-facing simulation tool for designing and comparing caching strategies under varying assumptions about user makeup and demand.
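A minimal Python sketch of how cluster-driven synthetic traffic of this kind could be generated. Everything specific here is an assumption made for illustration: the two cluster profiles, their shares and session rates, the `peak_hour` field, the Zipf-like title popularity, and the function names `simulate_day` and `load_by_hour`. None of it is taken from the actual project.

```python
import random
from collections import Counter

# Hypothetical cluster profiles: population share, mean sessions per day,
# and the hour of day around which viewing is concentrated.
CLUSTERS = {
    "light_watcher": {"share": 0.7, "sessions_per_day": 0.5, "peak_hour": 20},
    "midnight_binger": {"share": 0.3, "sessions_per_day": 3.0, "peak_hour": 0},
}

# Hypothetical catalog with Zipf-like popularity (a few titles dominate).
CATALOG = [f"title_{i}" for i in range(100)]
POPULARITY = [1.0 / (rank + 1) for rank in range(len(CATALOG))]


def hour_weights(peak_hour, width=3):
    """Triangular weights around peak_hour, wrapping past midnight."""
    weights = []
    for hour in range(24):
        distance = min((hour - peak_hour) % 24, (peak_hour - hour) % 24)
        weights.append(max(width + 1 - distance, 0.1))
    return weights


def simulate_day(n_users, rng):
    """Return a list of (hour, user_id, cluster, title) requests for one day."""
    names = list(CLUSTERS)
    shares = [CLUSTERS[name]["share"] for name in names]
    requests = []
    for user_id in range(n_users):
        cluster = rng.choices(names, weights=shares)[0]
        profile = CLUSTERS[cluster]
        # Whole sessions plus one extra with probability equal to the fraction.
        mean = profile["sessions_per_day"]
        n_sessions = int(mean) + (rng.random() < mean - int(mean))
        weights = hour_weights(profile["peak_hour"])
        for _ in range(n_sessions):
            hour = rng.choices(range(24), weights=weights)[0]
            title = rng.choices(CATALOG, weights=POPULARITY)[0]
            requests.append((hour, user_id, cluster, title))
    return requests


def load_by_hour(requests):
    """Count requests per hour, a crude proxy for network load."""
    return Counter(hour for hour, *_ in requests)
```

Changing the cluster shares and rerunning `simulate_day` is the kind of "what if the user makeup shifts" comparison described above. Mapping requests onto cache nodes is left out of this sketch.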
Dashboards showed summaries of network traffic, problems, and network hot spots (right).
Nymbler
In this fun project, we built a website to help prospective parents navigate the space of baby names. We used an evolutionary algorithm to find names, with crossover and mutation operators bridging the gaps across name styles and spelling variations.
The site let you collect names you liked, and gave you a continuously tuned stream of new names derived from the ones you liked and distinct from the ones you blocked.
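To make the crossover-and-mutation idea concrete, here is a toy Python sketch. It is an illustration under loud assumptions, not the site's actual method. The operators work on raw letters (splicing a prefix onto a suffix, swapping one vowel), whereas the real operators worked across name styles and spelling variations in ways not detailed here. The function names `crossover`, `mutate_name`, and `suggest_names` are hypothetical, and "distinct from blocked names" is reduced to simple exact-match exclusion.

```python
import random

VOWELS = "aeiouy"


def crossover(a, b, rng):
    """Splice a prefix of one name onto a suffix of another.

    Assumes both names have at least two letters.
    """
    cut_a = rng.randint(1, len(a) - 1)
    cut_b = rng.randint(1, len(b) - 1)
    return (a[:cut_a] + b[cut_b:]).capitalize()


def mutate_name(name, rng):
    """Swap one vowel for another, a crude stand-in for a spelling variant."""
    positions = [i for i, ch in enumerate(name) if ch.lower() in VOWELS]
    if not positions:
        return name
    i = rng.choice(positions)
    return (name[:i] + rng.choice(VOWELS) + name[i + 1:]).capitalize()


def suggest_names(liked, blocked, n, rng, mutation_rate=0.3, max_tries=1000):
    """Breed up to n new candidates from liked names, skipping known ones.

    Assumes liked contains at least two names.
    """
    seen = {name.lower() for name in liked} | {name.lower() for name in blocked}
    suggestions = []
    for _ in range(max_tries):
        if len(suggestions) == n:
            break
        parent_a, parent_b = rng.sample(liked, 2)
        child = crossover(parent_a, parent_b, rng)
        if rng.random() < mutation_rate:
            child = mutate_name(child, rng)
        if child.lower() not in seen:
            seen.add(child.lower())
            suggestions.append(child)
    return suggestions
```

For example, `suggest_names(["Eleanor", "Matilda", "Beatrix"], ["Elenor"], 5, random.Random(1))` returns up to five blends of the liked names. A real system would also filter or score candidates against a data set of actual names.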
I got to wear many hats on this project:
- interviewed prospective parents about the factors they consider when choosing names
- scraped many websites for name data, origins, and meanings, then cleaned and merged the results into a new name data set
- worked with the visual design team on the design elements
It was a fascinating experience having to scrape and distill our own data from disparate sources, cross-validating it and integrating it with 100+ years of US Social Security data on name popularity.
Human-centered Route Optimization
In contrast with most route-optimization work, we built a tool that gave the postal workers a voice in the process. This tool was adopted in post offices across the whole of France for producing the découpages, the segmentation of the routes in a district.
The tool we developed was based on the concept of interactive evolution, in which we iterated the following steps:
- Produce a few sets of route segmentations
- Postal workers rate each set according to their subjective preference.
- Generate new route sets that take into account both worker preferences and traditional metrics like time and fuel. Return to the rating step and repeat until convergence.
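The loop above can be sketched in Python as follows. This is a simplified illustration, not the deployed tool. It assumes a district is an ordered list of stops split into contiguous routes by cut points, uses the longest route as a stand-in for time and fuel, replaces the human rating step with a `rate` callback returning a value between 0 and 1, and runs a fixed number of rounds instead of testing for convergence. All function names and parameters are hypothetical.

```python
import random


def random_segmentation(n_stops, n_routes, rng):
    """Pick n_routes - 1 distinct cut points splitting stops into routes."""
    return sorted(rng.sample(range(1, n_stops), n_routes - 1))


def route_lengths(cuts, n_stops):
    """Number of stops on each route implied by the cut points."""
    bounds = [0] + cuts + [n_stops]
    return [b - a for a, b in zip(bounds, bounds[1:])]


def mutate_cuts(cuts, n_stops, rng):
    """Shift one cut point by one stop if the result stays valid."""
    new = list(cuts)
    i = rng.randrange(len(new))
    candidate = new[i] + rng.choice([-1, 1])
    if 1 <= candidate < n_stops and candidate not in new:
        new[i] = candidate
    return sorted(new)


def interactive_evolution(n_stops, n_routes, rate, rounds, rng,
                          pop_size=4, weight=0.5):
    """Evolve segmentations using worker ratings plus a workload metric.

    Assumes n_routes >= 2 and pop_size >= 2.
    """
    def score(cuts):
        # Blend the (simulated) worker rating with a balance metric:
        # a shorter longest route stands in for less time and fuel.
        balance = 1 - max(route_lengths(cuts, n_stops)) / n_stops
        return weight * rate(cuts) + (1 - weight) * balance

    # Step 1: produce a few sets of route segmentations.
    population = [random_segmentation(n_stops, n_routes, rng)
                  for _ in range(pop_size)]
    for _ in range(rounds):
        # Step 2: collect ratings (inside score) and rank the candidates.
        ranked = sorted(population, key=score, reverse=True)
        # Step 3: keep the best half and derive new sets from them.
        parents = ranked[: pop_size // 2]
        children = [mutate_cuts(rng.choice(parents), n_stops, rng)
                    for _ in range(pop_size - len(parents))]
        population = parents + children
    return max(population, key=score)
```

In an interactive setting, `rate` would present a segmentation to the postal workers and return their score. The `weight` parameter controls how much that preference counts against the conventional metric.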