As a generalist, what must I know regarding ‘performance’

I believe, a generalist, or a team of generalists, must offer these skills, related to the performance of software systems.

We are either going to be working with new software, which we write, or we will be working on a preexisting, complex and heterogeneous system.

A new system

On a new system, I am going to need the following skills.

  • Create a flexible profiling infrastructure.
    • Be able to log the turnaround time on relevant functionality, and at relevant layers of the system.
    • Be able to turn these logs on and off, increase and decrease the granularity of these logs, etc.
    • Create log output that is easy to analyze.  Bottlenecks must be easy to find.
  • Architect the system such that, bottlenecks can be replaced, without affecting any other part of the system.
  • Load test the system.  Throw large amounts data, and users at the system, in order to expose its fault-lines.

An existing system

More often than not, you will have to deal with preexisting systems which you cannot instrument.   Hence performance measurement will essentially have to be a black-box affair.  In such situations, these skills are needed.

  • Load test the system from whatever interface is exposed to you.
    • Throw users, and data at the available GUI.
    • Throw users and data at available APIs.
  • Specialized knowledge on profiling the various parts of the existing system will help.  For instance, say you have a PHP app running in Apache.   Knowing how to manage Apache profiling, will help.   But you cannot anticipate what tools a client will use.  So this expertise is something we have to be able to acquire quickly, as needed, and on the job.

What platforms and tools exactly?

My interest is in two platforms – the Java eco-system, and node.js.  Much of my programming experience has been in the Java world, and I am newbie to node.js.    In any event, the above requirements translate into the following concrete skills, I believe.

  • Implement profiling (logging by another name) which can be configured at run-time.
    • In a system that runs on the JVM, which includes these languages – Java, Scala, Groovy
      • Through common logging frameworks – log4j, JDK logging, SL4J
      • Through aspects – AspectJ
      • Through meta-programming – Groovy, Scala
    • In node.js
  • Break the system into modules whose boundaries are milestones that are relevant to performance.  Implement these modules such that we can look into and change the performance on any one of them, without affecting anything else.
    • For instance, presentation, service, data access layers.
    • Or web service access to an external site, a computation performed in a rule engine, a search query against a text-based index, a MapRequce query against a NoSQL db, etc.
  • Get turnaround times out of at least commonly used tools, which are listed below.  Clients may use other tools.  So this is often going to be something you figure out on the job.  Also, getting profiling information is only one problem.  Actually administering these tools to improve performance is often a complex, and specialized job.   It is almost impossible for a generalist to master all these tools.  At most, when the need arises, we need to know how and where to look for a solution.
    • Web / App Servers
      • Apache HTTP server
      • Jboss Application server
    • At the client
      • Rendering engines, and Javascript engines on web browsers (Chrome, Safari, Firefox, IE, Opera)
      • A mobile app – Android, and iOS
    • Data repositories
      • Oracle
      • MySQL
      • Lucene
    • Others
      • JBPM (Business process management)
      • Jboss Rules (Rule engine)
      • CXF, Axis (Web service frameworks)
  • Load (data and users) test.  There are plain old test automation tools, and there are specialized load and performance testing tools.  I, in fact, do not know what is adequate for this purpose exactly.   I have used simple Junit based tools in the past for this.
    • By driving a web UI that is in some browser.
    • Through a mobile app – Android, and iOS
    • By driving an API (JVM language, or node.js) directly, without going through a UI.


Now what?

On second glace, this is no trivial list of skills.  If nothing else, there should be little room for boredom.