A senior engineer with experience at every layer of the
technology stack from the OS to the front-end. A leader,
both technical and cultural, passionate and opinionated
about best practices yet seasoned enough to recognize when
Done beats Done Right. Take advantage of twenty+ years of
- designing, building, and deploying complex scalable
cloud-native systems that perform well, degrade gracefully,
and fail rarely, to help solve the complex problems that
drive the business.
- driving adoption of patterns and processes that manage
the complexity of scaling the business both
technologically and as a team.
- leading by example and mentoring other engineers to
help them grow into the technical virtuosos you need as
the core of your engineering team.
- Intermediate to expert capacity planning skills paired
with significant experience managing costs for a technical
- Expert-level understanding of the AWS cloud computing
platform and related services.
- Expert understanding of software development methodologies
and how they break down in real teams; significant experience
creating functional development processes that work for the
team(s) I'm responsible for.
- Expert-level Python programming skills, particularly
development of scalable web services. A deep understanding
of the power and limitations of the Python standard
- Expert-level understanding of the performance
characteristics of the Python interpreter.
- Expert-level understanding of scalable infrastructure
- Expert-level understanding of source-code management
principles and systems, particularly git.
- Expert understanding of distributed data storage and
processing technologies, particularly Cassandra and
- Expert-level understanding of Linux/Unix administration
- Intermediate to expert understanding of performance
tuning of Python software.
- Intermediate to expert programming skills in several
programming languages; developed non-trivial applications
and utility scripts in Scala, Go, Java, Common Lisp, Ruby,
Node.js, and bash.
- Intermediate understanding of relational database
systems, including scaling techniques.
- Comprehensive high-level understanding of programming
concepts and talent for applying those concepts to rapidly
develop proficiency in unfamiliar programming
- Experience scaling technical operations teams from a
single engineer to over fifteen. Experience advising
engineering teams in general on applying those scaling
techniques to their teams.
- Experience providing technical leadership across several
mid-size engineering teams (30-40 engineers).
- Experience managing a small team (2 direct
- Developed complex software in Python; worked with common
Python frameworks such as Tornado, Django, and
- Developed complex realtime stream processing software in
- Transitioned from full-time Operations to full-time
Development and back. Gained a deep understanding of both
sides of the "DevOps" divide.
- Built and led an Operations team from the ground up;
ensured Operations stayed highly integrated with
Engineering as both teams grew.
- Designed, built and automated several multi-AZ
infrastructures on top of Amazon Web Services.
- Participated in design and implementation of custom
cluster management software, used to manage 1000 servers
across bi-coastal datacenters.
- Implemented Puppet modules to automate configuration of
a broad range of services.
- Contributed to the maintenance and improvement of
Spectrum Labs, Inc.
- Designed, built, and deployed a distributed
ad-hoc data processing system using Spark,
Kubernetes, and Argo.
- Wrote a Terraform module for deploying the
Kubernetes autoscaler and node termination
handlers for EKS.
- Participated in code reviews of significant
changes across several large pieces of machine
- Improved performance of on-premises machine
learning classifier data ingestion by several
orders of magnitude.
- Developed several utility Terraform modules
to codify best practices and provide reusable
Software Engineering Architect
2018/02 to 2020/06
- Designed and led implementation of system(s) for
deep operational observability of the Heroku platform.
Drove conversations about observability across the
broader Salesforce architecture organization.
- Led architecture for Heroku's Production Engineering
department. Helped drive creation and evolution of the
architecture organization inside of both Production
Engineering and the broader Heroku org.
- Led architecture for next-generation hybrid
deployment model, focusing on cloud coherence and
Software Engineering PMTS
2017/02 to 2018/02
- Introduced Embedded Platform Engineering model to
Salesforce engineering teams.
- Read and absorbed Salesforce Architecture Strategy
documentation, began work on applying to Salesforce DMP
(formerly Krux) infrastructure.
- Mentored new Platform Engineering technical leads,
helped transition team leadership of both Core and
- Architected and executed transition of data collection
architecture (hundreds of thousands of req./s.) to
new-generation platform with zero downtime and zero data
- Led lossless transition from legacy continuous
integration systems to new CI/CD platform.
- Led architecture of standard libraries for metrics,
logging, and command-line applications.
- Authored light-weight processes and policies to address
frictions involved in scaling a team and integrating a
team into an enterprise company.
- Mentored platform engineers in learning Python and
Software Engineering LMTS
- Led initial integration efforts between Krux systems and
- Documented culture and standard operating procedures of
Krux platform engineering for sharing with wider Salesforce
Krux Digital, Inc.
2015/07 to 2016/11
- Developed and drove adoption of standards for service
lifecycle, handoff, and documentation across entire
Platform Engineering team; standards subsequently adopted
across multiple engineering teams.
- Developed and deployed automated build & release
pipeline from verifying unit tests on pull requests
through packaging, promotion, deployment to staging, and
execution of integration & regression tests.
- Developed and deployed a distributed user matching
webservice in Scala + Play! currently serving 13k QPS in
- Divided and scaled a large-scale Kafka cluster with
minimal impact to dependent services. Improved capacity by
50% while simultaneously decreasing costs by 11%
- Re-architected an internally-developed Java application
using open-source components (Logstash &
Elasticsearch), completely eliminating the need to
maintain a custom codebase while improving throughput by
an order of magnitude, and improving uptime.
- Provided technical leadership across several engineering
teams. Drove adoption of standard patterns and processes
- Established and led the Embedded Platform Engineering
team; chartered to be embedded domain experts in systems
and operational concerns within the development team and
to guide the development and delivery of scalable,
reliable software following best patterns and
practices. Similar to the Google SRE model, but
independently developed and adapted for Krux.
Sr. Infrastructure Engineer
Krux Digital, Inc.
2012/03 to 2015/07
- Designed, advocated, and implemented the reorganization
of the DevOps team to scale the organization. Drove
adoption of new patterns and processes to support the new
- Led the daily tactical operations of the DevOps team for
a year. Assisted with vendor negotiation. Managed 2 direct
reports; gave direction to a team of five engineers.
- Re-designed and re-wrote a core service that is central
to the functioning of most other Krux services. Improved
performance by at least 40% and as much as 60% in some
cases. Improved stability, reliability, and
- Built a standard library for Python applications at Krux
to enable developers to build applications the "Krux Way"
without re-implementing standard components.
- Architected and led the implementation of a real-time
data processing system scalable to thousands of requests
- Built a data collection service capable of ingesting
2000 data points per second per process using the Tornado
framework and Python.
- Built a websockets service for delivering real-time data
to a thick web client application.
- Contributed bug fixes to front-end AngularJS code for
Krux' primary web application.
- Developed internal administration tools for Krux'
primary web application using the Django framework.
- Introduced Kanban process to improve team workflow and
- Evaluated several "Big Data" storage engines; chose and
deployed DataStax Enterprise to serve as the back-end data
store for Krux' user data service.
- Mentored teammates in best practices for Python
development and deployment.
- Mentored teammates in the use of git for version
- Tracked down persistent performance difficulties with
Tornado-based services; implemented solutions enabling
continued scaling of those services.
- Refactored Puppet manifests to reflect best
practices. Wrote Puppet modules for:
- Managing cowbuilder environments for building
- Installing, configuring, and monitoring DataStax
- Installing and managing Java versions.
- Upgrading the Linux kernel.
- Managing persistent SSH tunnels.
- Gathering system metrics via sysstat.
- Wrote Python scripts to monitor a variety of
- Led standardization efforts for deployment
Contract DevOps Engineer
2013/08 to 2014/03
- Wrote Puppet manifests for deploying, configuring, and
managing collectd for metric collection and monitoring.
- Wrote a Python plugin for collectd to write metrics to
statsd (the native statsd plugin was unavailable for
the version of collectd used at Lyft).
Contract Operations Engineer
2011/05 to 2013/09
- Led a zero-downtime migration between server
- Managed infrastructure for production website, including
MySQL master/slave replication.
- Researched and resolved several performance problems
caused by poor query optimization.
- Advised developers on technology choices for full
rewrite of site.
- Advised founders on recruitment of new developers.
- Researched hosted Drupal platform; assisted in migration
to hosted Drupal platform and hand-off to hosted solutions
Sr. Linux Systems Administrator
2011/10 to 2012/03
- Introduced Kanban process to improve team workflow and
- Built automatic provisioning system with kickstart and
2010/06 to 2011/10
- Assisted in the design of an "Operations API" which we
will use to control our systems programmatically for ad-hoc
tasks like restarting services, running consumption tests,
kicking off Puppet runs, etc.
- Built and
a system for remotely controlling services in a pluggable
and programmatic fashion.
- Created a system which fully records incoming requests
and spools a rolling window of requests to disk in order
to allow us to replay requests in the event of a failure
where we would otherwise lose data.
- Contributed to development of user-facing API
- Responded to a malicious attack on infrastructure by a
privileged user; prevented attack from destroying critical
systems. Led recovery from attack and implementation of
new security measures. Led investigation of attack which
ended in prosecution of responsible party.
- Built Puppet manifests and bootstrap scripts to allow us
to bootstrap instances to various roles without having to
- Built Operations team and acted as team lead:
represented operational interests to the development team
leads, established processes (scrum, kanban), set
priorities in response to business needs, delegated tasks
to team members.
- Migrated configuration management from Chef to
Senior Systems Engineer
2008/03 to 2010/06
- Actively worked to close gap between engineering and
operations, foster transparency,
engineering to iterate rapidly while maintaining site
- Participated in development of internal cluster
- Wrote command-line application to automatically allocate
and provision servers, including options to specify
minimum hardware requirements.
- Redesigned Digg infrastructure to take advantage of heavy
automation & configuration management.
- Developed Puppet modules to automate deployment,
configuration, and lifecycle management of key
- Developed FAI scripts which bootstrap systems from 'bare
metal' to functioning Puppet clients.
- Assisted in migration from Subversion to Git as primary
source code management system.
- Responsible for ensuring reliable operation of
production, staging, and development systems.
- Performed code pushes and maintained change
Contract Infrastructure Architect
2007/10 to 2009/12
- Built an infrastructure which ran smoothly for over two
years with minimal intervention.
- Automated system configuration using Puppet.
- Firewall design, implementation, and maintenance.
- Created, deployed, and managed Xen virtualized servers.
- Patch management and server maintenance.
- Proposed and implemented automated backup system.
- Deployed & maintained split-horizon DNS services.
- Replaced Zenoss monitoring system with Nagios.
- Implemented SNMP infrastructure.
- Deployed Zenoss monitoring system, notifications, and
- Created extensive documentation, including
straightforward how-to procedures for common
- Configured & maintained MySQL database systems.
Senior Systems Administrator
Kapor Enterprises, Inc.
2007/02 to 2008/03
- Created TWiki-based project management application.
- Identified key areas of network and process improvement,
- Proposed, planned, and implemented single sign-on
solution and corporate directory service.
- Provided desktop support for a heterogeneous network of
Mac OS X, Windows, and Linux desktops.
- Responsible for research and procurement of
best-of-breed equipment to implement the needs of
2005/09 to 2007/02
- 24x7 pager support for critical production systems.
- Planned and implemented migration from SiteScope to
Nagios network monitoring, implemented custom service
plugins, distributed & redundant architecture,
performance metrics, and custom reporting interface.
- Planned and implemented OpenLDAP directory service;
researched and proposed site-wide integration of
applications with directory service.
- Administered heterogeneous network of several hundred
nodes spread across several sites; maintained WAN links,
VPN, remote administration between locations; maintained
two data center locations with over 125 servers.
- Performed audit of poorly maintained RAID systems,
implemented automated maintenance and reporting of RAID
array performance and issues.
- Proposed, planned, and test-deployed Linux desktop
solution for sales representatives to replace costly
- Performed extensive documentation of server and network
- Researched, planned, and assisted deployment of Novell
eDirectory and ZENworks infrastructure.
- Assisted in migration from Checkpoint firewalls to
- Assisted in software license compliance audit.
- Oversaw replacement of aging desktop-class hardware with
- Researched and proposed migration plan from NT-style
domain to Active Directory infrastructure.
- Researched and proposed deployment of enterprise XMPP
chat service for internal messaging.
- Researched and proposed replacement of overloaded
internal mail architecture with distributed, scalable
2004/04 to 2005/09
- Planned and implemented migration from archaic
Sendmail-based email system to a database-backed Postfix
mail server, delegating specific tasks to system
administration staff. Supervised programming of customized
email administration software for new system.
- Replaced several outdated firewalls with OpenBSD
firewalls. Wrote sophisticated firewall rule sets to
filter and monitor network traffic. Created automatic log
processing scripts to analyze firewall logs and produce
- Created standardized server configuration procedures and
documentation. Oversaw redeployment of servers to comply
with standardized configuration.
- Created and administered network access/security
policies and procedures.
- Maintained and administered heterogeneous intra-office
network, including Windows 2000/XP and Mac OS X
workstations, Unix file and web servers, wireless access
points, and network printers.
- Planned migration of intra-office network to Windows
2003 Active Directory infrastructure.
- Remotely administered and maintained DNS, web, and
database servers in several co-location facilities.
- Provided Tier II technical support.
- Oversaw system backup and network maintenance as well as
server monitoring (Nagios) and IDS.
- Performed software license audit and inventory.
- Maintained relationships with hardware, software, and
- Installed and administered DHCP server for automatic
- Automated a variety of administrative tasks through
shell scripts and custom programming.
Jr. Systems Administrator
- Administered web, DNS, and database servers running
FreeBSD, NT 4, and Windows 2000 Server.
- Administered MySQL and MS-SQL 2000 databases.
- Improved uptimes by as much as 29%.
- Created, implemented, and enforced security policies on
network, server, and individual workstation levels.
- Managed server migrations to/from various operating
- Planned and installed internal networks: CAT5 cabling,
network ports, high-bandwidth switches, wireless access
points, routers, firewall, and VPN.
- Secured wireless network segment against unauthorized
access and packet monitoring.
- Proposed, planned, and implemented conversion of aging
collection of 486 PCs into Citrix MetaFrame thin-client