Sasha Goldshtein is the CTO of Sela Group, a Microsoft C# MVP and Azure MRS, a Pluralsight author, and an international consultant and trainer. Sasha is the author of “Introducing Windows 7 for Developers” (Microsoft Press, 2009) and “Pro .NET Performance” (Apress, 2012), a prolific blogger and open source contributor, and author of numerous training courses including .NET Debugging, .NET Performance, Android Application Development, and Modern C++. His consulting work revolves mainly around distributed architecture, production debugging and performance diagnostics, and mobile application development.
Workshop: Crunching Big Data with Apache Spark
Spark is a rising star in the big data world, usually surpassing Hadoop MapReduce performance by at least an order of magnitude, and recently also surpassing Hadoop in terms of adoption! It is also more intuitive to use because you don’t have to force every problem into the confines of map and reduce operations. At the same time, Spark is compatible with the Hadoop distributed filesystem (HDFS), execution engine, and deployment tools. In this workshop you will learn how to use Spark for one-off data analysis and investigations, how to build and submit Spark jobs, and how to use some higher-level libraries such as Spark SQL. We will also cover the basics of the Scala programming language required for working with Spark.
Spark is one of those things you can’t learn without a lot of hands-on work: it involves some new concepts, probably a new language (Scala or Python), and a slightly different mindset where you have to carefully deconstruct operations so that they can be performed efficiently in a distributed manner. This is why this workshop is accompanied by multiple hands-on labs: using basic Spark transformations and actions, parsing logs and data files, compiling and submitting Spark programs, and many others. Overall, expect to spend 50% of the time building Spark programs and discussing your work with the group.
Session: Case Studies – Investigating Production Issues in the Field
After a decade of investigating, chasing, catching, and mutilating production issues, I have a lot of stories to tell. This session brings you along for a tour of some of the most interesting performance optimization and production debugging engagements from my career. We will talk about the tools I use — Sysinternals, ETW, PerfView, WinDbg, etrace, msos — and how to apply these tools quickly to figure out what’s wrong and how to fix it. We will review detailed scenarios of production problems and describe a methodical approach for diagnosing them: figuring out which resource is the bottleneck, measuring and instrumenting the system for accurate traces and performance statistics, and presenting your findings to stakeholders in order to make the best decision for the product.