Summary

I am an applied mathematician/data scientist with extensive experience in stochastic modeling, simulation, statistics and machine learning. I extract actionable information from large volumes of messy data, and present it in compelling ways to all audiences.


Education

Ph.D. in Probability and Statistics, Carleton University, Ottawa, 2012

  • Thesis was titled “Uniform and Mallows Random Permutations: Inversions, Levels & Sampling.”

M.Sc. in Information and Systems Sciences, Carleton University, Ottawa, 2000

  • Thesis was titled “Statistical Estimation of Effective Bandwidth.”

B.A. in Mathematics, Concordia University, Montreal, 1992


Experience

Data for Good Ottawa, Ottawa (2014-Present)

  • Co-organizer of “Data For Good Ottawa” (http://www.meetup.com/DataforGood-Ottawa).
  • Grew the team from five members to its current complement of about fifty active participants.
  • Recruited members from academia and industry.
  • Led the team through completion of its first two projects: “Understanding Demand” for the Ottawa Food Bank, and “Understanding Donor Behaviour” for the Youville Centre.
  • Initiated projects by soliciting clients, brainstorming project ideas, presenting at various meetups, etc.
  • Developed a relationship with the University of Ottawa so students can contribute to a project as course work.
  • Kick-started the “Data for Good Montreal” effort.

Senior Performance Engineer at Akamai, Ottawa (2015-Present)

  • Developed an empirical Bayesian methodology to assess differences in perceived performance across experiments, based on real user measurements.
  • Developed on-line models to predict the next web page a user will visit in order to optimize page delivery and user satisfaction. Developed an empirical relationship between the cost and benefit of such a scheme.
  • Reviewed existing experimental plans and proposed changes to analysis procedures to ensure the maximum amount of information is extracted.

Senior Infrastructure Architect at BlackBerry, Ottawa (2012-2015)

  • Developed models of alternative system architectures to evaluate system reliability and availability.
  • Applied topic models to understand segmentation of social media users and forecast growth.
  • Developed a Bayesian model to evaluate the likelihood that the user of a device is its owner to allow for an improved user experience.
  • Developed a methodology to allocate traffic to data center clusters to reduce inter-cluster traffic, and therefore cost. This involved using simulated annealing to minimize graph conductance, subject to many other constraints arising from business requirements.
  • Analyzed multi-gigabyte data sets in order to extract information about user behavior on social networks, including inter-messaging time, message read times via survival analysis, connected component sizes, degree distributions, effects of geography, evolution of user behavior, etc.
  • Analyzed potential next generation systems including equipment requirements, traffic behavior, performance, system cost, etc.
  • Developed tools to analyze sensitivity of service cost to various input assumptions in order to prioritize system developments.
  • Analyzed production data center characteristics to assess performance of load balancing algorithms, and the impact of any load imbalance.
  • Analyzed queueing models of system infrastructure in order to improve performance.

Systems Architect at Research in Motion, Ottawa (2008-2012)

  • Applied statistical techniques to understand performance bottlenecks.
  • Applied data mining techniques to massive datasets in order to derive actionable knowledge.
  • Developed queueing theoretic models of infrastructure components, using Mathematica to encode the models in an executable form that could be used to analyze what-ifs, in order to better understand system performance bottlenecks under varying scenarios.
  • Developed a machine learning prototype to identify anomalies in network traffic and determine whether they are shifts in the process, or one-time glitches.
  • Developed a prototype tool and specification to triage misbehaving source/destination pairs in order to allow analysts to investigate the most important flows first.
  • Developed a methodology to ensure the validity of the data entering the data warehouse.
  • Developed an experimental test plan and analyzed the results of the experiments to determine the accuracy and potential biases of an internally developed performance monitoring system.
  • Developed methodologies to ensure browsing testing was as efficient as possible, including which pages to test against, appropriate randomization, etc.
  • Developed an algorithm for real-time analysis of the index of the tail of the delay distribution. This involved developing the initial concept, running simulations and analysis to assess its validity, creating user guidelines for interpreting results and operating scenarios, and writing the algorithm specification and sample code. Also liaised with developers to ensure the code was correct.
  • Provided statistical expertise to many projects, including analyzing the effects of device location in the lab on device performance, correlation of cell/wifi signal strength, adaptive flow control algorithms, etc.
  • Presented to audiences of up to four hundred people, of varying technical sophistication, on topics such as “System Design for Heavy Tails”, “Critical Paths in Random Networks”, “Hierarchical Bayesian Models” and “#$%^ My Data Says”.
  • Prepared and presented a one-day statistics refresher course to 50 engineers.

Senior Research Scientist at Alcatel-Lucent’s Bell Labs, Ottawa (2001-2008)

  • Applied techniques from the intersection of probability, statistics, simulation and algorithms to solve large, complex problems.
  • Applied data mining techniques in real time to network traffic streams to quickly identify malware and to allow resource sensitive applications to be allocated more network resources, thus improving the end-user’s experience.
  • Prototyped systems to automatically characterize new traffic types, after being triggered by a change in the statistical behavior of traffic flows, in an environment with minimal processing capabilities (e.g., 1:1000 sampling).
  • Developed simulation models of various router components, and their interactions, in order to ensure that future products will perform as required.
  • Developed online traffic management, measurement and engineering algorithms to allow the router to adapt quickly to changing network demands, including traffic matrix estimation, call admission control, online effective bandwidth estimation, etc.
  • Designed alternative router architectures that are lower cost, simpler, more adaptable, or have higher performance.
  • Led collaborative projects with academia on various topics, including Network Calculus Applied to Datapath Performance (with Carleton University), Traffic Prediction (with Stanford University) and P2P Detection (with University of Ottawa), and supervised many graduate student internships.
  • Developed intellectual property for the organization in all of the above areas, leading to over a dozen submissions in various stages of the patent process.

Datapath Designer at Sedona Networks, Ottawa (2000-2001)

  • Simulated and analyzed QoS algorithms in order to make recommendations for system design.
  • Developed algorithms that allowed for low cost systems to approximate the performance of high cost ones.

Manager, Technical Services at the Liberal Party of Canada, Ottawa (1995-2000)

  • Managed the MIS department and set information technology standards and direction for the organization.
  • Developed client/server software system for managing all aspects of registration for three national conventions.
  • Managed several large IT projects involving multiple external organizations, including the organization’s first web site.

Consultant at ISA Corporation, Ottawa (1993-1995)

  • Consulted on the full life cycle of software projects for clients such as Nortel, Frontec, PWGSC, etc.

Software Engineer at MDS Aero Support Corporation, Ottawa (1992-1993)

  • Developed software for a real-time data acquisition system for gas turbine jet engines in C, SQL, HyperScript.

Information Services Assistant at Lafarge Canada Inc., Montreal (1987-1992)

  • Served as the sole source of user support for a staff of eighty.
  • Performed simulations and analyses in order to assist sizing of plant components, and assess Lafarge and competitive cost structures.

Strategic Planning Analyst at Canada Cement Lafarge, Montreal (1986-1987)

  • Modeled, analyzed and optimized distribution networks.
  • Forecasted construction materials consumption.

Other

Programming Languages/Systems

  • R (incl. tidyverse), Python, Mathematica, SQL, Visual Basic, MATLAB, C/C++, OPNET, OMNeT++, SimPy
  • Familiar with basics of Hadoop, Spark, H2O

Relevant graduate level courses

  • Probability Theory; Statistical Inference; Time Series Analysis; Nonlinear Dynamics and Chaos; Stochastic Models of Broadband Networks; Cryptography; Algorithm Design and Analysis; Queueing Theory; Genome Rearrangement Algorithms; and Topics in Stochastic Processes which surveyed Markov Random Fields and Stochastic Differential Equations.

Publications & Presentations

  • “Uniform and Mallows Random Permutations: Inversions, Levels & Sampling”, Ph.D. thesis, Carleton University, 2012.
  • “On the Length of the Longest Common Subsequence”, Applied Probability Seminar, Carleton University, Ottawa ON, 2008.
  • “From ‘How should I structure my code?’ to the Euler Pentagonal Numbers: A Problem in Resequencing Buffer Sizing”, CANQUEUE 2008, Ottawa ON.
  • “Algorithms for Identification of Network Data Streams” with Jun Li, Second Canada-France Congress poster session, June 2008.
  • “Expected Length of the Longest Common Subsequence: A Survey”, course project, Carleton University, 2007.
  • “A TCP Connection Establishment Filter: Symmetric Connection Detection” with Brad Whitehead and Chung-Horn Lung, ICC 2007 Technical Papers.
  • “Classification of Peer-to-Peer Traffic Using Neural Networks” with Bijan Raahemi and Ahmad Hayajneh, International Conference on Artificial Intelligence and Pattern Recognition (AIPR-07), Orlando, Florida, USA, July 9-12, 2007.
  • “Peer-to-Peer IP Traffic Classification using Decision Tree and IP Layer Attributes” with Bijan Raahemi and Ahmad Hayajneh, International Journal of Business Data Communications and Networks, Vol. 3, Issue 4, pp. 60-74, August 2007.
  • “Sampling Constrained Contingency Tables and Traffic Matrix Estimation”, 13th INFORMS Applied Probability Conference, Ottawa, Canada, July 6-8, 2005.
  • “The Hydra Switch Architecture: a Simple, High Performance Alternative to VOQ” with Wladek Olesinski, IEEE INFOCOM 2005 Poster/Demo Session, Miami, Florida, March 9-11, 2005.
  • “What is Effective Bandwidth?” with Danny De Vleeschauwer, unpublished, 2004.
  • “Surrogate Data and Fractional Brownian Motion” in “Recent Advances in Statistical Methods: Proceedings of Statistics 2001 Canada, the 4th Conference in Applied Statistics, Montreal, Canada, July 6-8, 2001”, Yogendra P. Chaubey (ed.).
  • “Statistical Estimation of Effective Bandwidth”, M.Sc. thesis, Carleton University, 2002.

Patent Submissions

  • Software Configurable Cluster-Based Router Using Stock Personal Computers as Cluster Nodes
  • Software Configurable Cluster-Based Router Using Heterogeneous Nodes as Cluster Nodes
  • Agent Based Router Monitoring, Diagnostics and Maintenance
  • Method and Apparatus for Closed Loop, Out-of-Band Backpressure Mechanism
  • Communication Session Admission Control Systems and Methods
  • Network Service Level Agreement Arrival Curve Based Conformance Checking
  • Simulated Annealing for Traffic Matrix Estimation
  • A Fast Simulated Annealing for Traffic Matrix Estimation
  • Statistical Trace-Based Methods for Real-Time Traffic Classification
  • Method for Estimating the Fan-In and/or Fan-Out of a Node
  • Worm Detection by Trending Fan Out
  • Method and System for Counting New Destination Addresses
  • A Method, System and Service for Structured Data Filtering, Aggregation, and Dissemination
  • Personalized Commercial Cache

Service

  • Frequent contributor to “Open Data Ottawa” (http://www.meetup.com/Open-Data-Ottawa).
  • Occasional blogger (http://datascienceottawa.wordpress.com).
  • Served on the technical program committee for the 3rd International Workshop on QoS in Multiservice IP Networks (QoS-IP 2005).
  • Was instrumental in setting up the MITACS internship program.
  • Was a member of the Alcatel Technical Academy.
  • Organized a session on “Telecom Research: An Industrial Perspective” at the INFORMS APS conference, July 2005.