As a generalist, what must I know regarding ‘performance’?

I believe a generalist, or a team of generalists, must offer the following skills related to the performance of software systems.

We will be working either on new software that we write ourselves, or on a preexisting, complex, and heterogeneous system.

A new system

On a new system, I am going to need the following skills.

  • Create a flexible profiling infrastructure.
    • Be able to log the turnaround time on relevant functionality, and at relevant layers of the system.
    • Be able to turn these logs on and off, increase and decrease the granularity of these logs, etc.
    • Create log output that is easy to analyze.  Bottlenecks must be easy to find.
  • Architect the system such that bottlenecks can be replaced without affecting any other part of the system.
  • Load test the system.  Throw large amounts of data, and many users, at the system in order to expose its fault lines.
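The profiling items above can be sketched with nothing more than the JDK's own java.util.logging.  This is a minimal sketch and not a finished design: the class name TurnaroundTimer, the "profile." logger prefix, and the key=value line layout are all my own assumptions.  Each layer gets its own logger, so the granularity of each layer can be changed at run time.

```java
import java.util.function.Supplier;
import java.util.logging.Level;
import java.util.logging.Logger;

/** Hypothetical helper (not from any library): logs turnaround time per layer. */
final class TurnaroundTimer {
    private final Logger logger;

    TurnaroundTimer(String layer) {
        // One logger per layer, so each layer can be tuned independently.
        this.logger = Logger.getLogger("profile." + layer);
    }

    /** Changes granularity at run time; Level.OFF disables profiling. */
    void setLevel(Level level) {
        logger.setLevel(level);
    }

    /** Runs the work and, if the level is enabled, logs how long it took. */
    <T> T time(Level level, String operation, Supplier<T> work) {
        if (!logger.isLoggable(level)) {
            return work.get(); // profiling is off here, so skip the clock
        }
        long start = System.nanoTime();
        try {
            return work.get();
        } finally {
            long micros = (System.nanoTime() - start) / 1_000;
            // A fixed key=value layout keeps the output easy to grep and sort.
            logger.log(level, "op=" + operation + " elapsed_us=" + micros);
        }
    }
}
```

A call might look like new TurnaroundTimer("service").time(Level.FINE, "lookup", () -> findCustomer(42)), where findCustomer is a stand-in for real work.  Note that the JDK's default console handler only prints INFO and above, so FINE records need a handler configured to show them.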

An existing system

More often than not, you will have to deal with preexisting systems which you cannot instrument.   Hence performance measurement will essentially have to be a black-box affair.  In such situations, these skills are needed.

  • Load test the system from whatever interface is exposed to you.
    • Throw users, and data at the available GUI.
    • Throw users and data at available APIs.
  • Specialized knowledge of how to profile the various parts of the existing system will help.  For instance, say you have a PHP app running in Apache.  Knowing how to profile Apache will help.  But you cannot anticipate what tools a client will use, so this expertise is something we have to be able to acquire quickly, as needed, on the job.
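Black-box load testing through an API can be sketched as follows.  This is only an illustration under my own assumptions: the LoadDriver class is hypothetical, and the request is passed in as a plain Runnable, which in practice would wrap whatever call the exposed interface allows.  Each simulated user runs on its own thread and records one latency per request.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

/** Hypothetical black-box load driver: each simulated user issues N requests. */
final class LoadDriver {

    /** Returns every request latency in nanoseconds, sorted ascending. */
    static List<Long> run(int users, int requestsPerUser, Runnable request) {
        ExecutorService pool = Executors.newFixedThreadPool(users);
        try {
            List<CompletableFuture<List<Long>>> pending = new ArrayList<>();
            for (int u = 0; u < users; u++) {
                pending.add(CompletableFuture.supplyAsync(() -> {
                    List<Long> latencies = new ArrayList<>();
                    for (int i = 0; i < requestsPerUser; i++) {
                        long start = System.nanoTime();
                        request.run(); // the only contact with the system under test
                        latencies.add(System.nanoTime() - start);
                    }
                    return latencies;
                }, pool));
            }
            List<Long> all = new ArrayList<>();
            for (CompletableFuture<List<Long>> user : pending) {
                all.addAll(user.join()); // rethrows a failure from any simulated user
            }
            Collections.sort(all);
            return all;
        } finally {
            pool.shutdown();
        }
    }

    /** Nearest-rank percentile over a sorted, non-empty list. */
    static long percentile(List<Long> sorted, double p) {
        int rank = (int) Math.ceil(p / 100.0 * sorted.size());
        return sorted.get(Math.max(rank, 1) - 1);
    }
}
```

Reporting a high percentile, for example LoadDriver.percentile(latencies, 95), tends to expose fault lines that an average would hide.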

What platforms and tools exactly?

My interest is in two platforms – the Java ecosystem, and node.js.  Much of my programming experience has been in the Java world, and I am a newbie to node.js.  In any event, I believe the above requirements translate into the following concrete skills.

  • Implement profiling (logging by another name) which can be configured at run-time.
    • In a system that runs on the JVM, which includes these languages – Java, Scala, Groovy
      • Through common logging frameworks – log4j, JDK logging, SLF4J
      • Through aspects – AspectJ
      • Through meta-programming – Groovy, Scala
    • In node.js
  • Break the system into modules whose boundaries are milestones that are relevant to performance.  Implement these modules such that we can look into, and change, the performance of any one of them without affecting anything else.
    • For instance, presentation, service, data access layers.
    • Or web service access to an external site, a computation performed in a rule engine, a search query against a text-based index, a MapReduce query against a NoSQL db, etc.
  • Get turnaround times out of at least the commonly used tools, which are listed below.  Clients may use other tools, so this is often going to be something you figure out on the job.  Also, getting profiling information is only one problem.  Actually administering these tools to improve performance is often a complex and specialized job.  It is almost impossible for a generalist to master all these tools.  At most, when the need arises, we need to know how and where to look for a solution.
    • Web / App Servers
      • Apache HTTP server
      • JBoss Application Server
    • At the client
      • Rendering engines, and JavaScript engines, in web browsers (Chrome, Safari, Firefox, IE, Opera)
      • A mobile app – Android, and iOS
    • Data repositories
      • Oracle
      • MySQL
      • Lucene
    • Others
      • jBPM (Business process management)
      • JBoss Rules (Rule engine)
      • CXF, Axis (Web service frameworks)
  • Load test, with both data and users.  There are plain old test automation tools, and there are specialized load and performance testing tools.  I do not, in fact, know exactly what is adequate for this purpose.  I have used simple JUnit-based tools for this in the past.
    • By driving a web UI that is in some browser.
    • Through a mobile app – Android, and iOS
    • By driving an API (JVM language, or node.js) directly, without going through a UI.
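Timing calls at a module boundary, with a switch that can be flipped at run time, can be sketched with a JDK dynamic proxy.  To be clear, this is a JDK-only stand-in for what AspectJ would do and not AspectJ itself.  The names TimingProxy and CustomerStore are hypothetical, and the sink that receives the timings is left to the caller.

```java
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Proxy;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.BiConsumer;

/** Hypothetical stand-in for an aspect: times each call across a module boundary. */
final class TimingProxy {
    /** Flip at run time (say, from an admin command) to toggle profiling. */
    static final AtomicBoolean ENABLED = new AtomicBoolean(true);

    @SuppressWarnings("unchecked")
    static <T> T wrap(Class<T> boundary, T target, BiConsumer<String, Long> sink) {
        return (T) Proxy.newProxyInstance(
            boundary.getClassLoader(),
            new Class<?>[] {boundary},
            (proxy, method, args) -> {
                long start = System.nanoTime();
                try {
                    return method.invoke(target, args);
                } catch (InvocationTargetException e) {
                    throw e.getCause(); // surface the module's own exception
                } finally {
                    if (ENABLED.get()) {
                        // Report method name and elapsed nanoseconds.
                        sink.accept(method.getName(), System.nanoTime() - start);
                    }
                }
            });
    }
}

/** Hypothetical data access boundary; any implementation can be swapped in. */
interface CustomerStore {
    String findName(int id);
}
```

Because callers only ever see the CustomerStore interface, a slow implementation can be replaced, or wrapped with TimingProxy.wrap(CustomerStore.class, realStore, sink), without touching the other layers.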

     

Now what?

On second glance, this is no trivial list of skills.  If nothing else, there should be little room for boredom.

User experience in an enterprise system

I have a decided fixation on ‘user experience‘.  However, I couldn’t tell you what I am looking for exactly.  This is an attempt to remove the fuzziness.

It seems to me that there are two primary questions.

  • Who are the users anyway, of an enterprise system?
  • What should these users be able to expect from a system?

The users of an enterprise system

I consider anyone who interacts with the enterprise system, in any way at all, to be one of its users.

  • Employees who use the system to conduct the business of the enterprise.
  • Tech folks who create, maintain, enhance, and run the system.
    • Managers
    • Programmers
    • Testers
    • DevOps (Software configuration folks, System Admins)
    • Software Operators
  • Business partners who use the system
  • Customers who use the system

All of these folks, in their various capacities, must have good ‘user experience‘.   What does ‘user experience‘ mean to each of these folks?

Common to all users

In this post I will only list behavior that I believe applies across all classes of users.  In subsequent posts, I’ll go more into what might make sense for various individual classes of users.

Business functions must exhibit the following characteristics.

  • They must be performant – they must execute as fast as necessary, regardless of the load on the system.
  • Failure of a function must not leave the system in an incorrect state that might require invasive, manual cleanup.  Roll back as necessary.
I've been involved in exercises where dealing with the errors that 
production code left in its wake became a full-time job.  There are 
many reasons why a development task ends up here, including holes in 
business analysis, code construction, and testing.  None of this is 
rocket science.  The relevant skills can be acquired, and these potholes
can, and must, be avoided.
  • Every business command must be available through three avenues – a GUI, a DSL that you use at the command line, and as pluggable units in external systems like schedulers, ESBs, and workflow management systems.
A single API must support all these various methods of invoking business
functionality.

You are taught to create interfaces that are appropriate to all levels 
of expertise.  Stepping through 8 web pages in order to make one edit 
on the 9th page might be acceptable for a customer service rep who is 
not tech-savvy.  However, a developer who is dealing with a customer 
support issue should not have to deal with the overhead of all that UI.  
Two lines of script ought to do the same job just as nicely.  You could 
use SQL and change data directly in the system's database.  But wouldn't 
you rather go through the application's infrastructure, which presumably 
has essential checks and balances in place?  That is where the domain 
specific language (DSL) comes in.

A DSL would also enable functional testing, while avoiding the drudgery of 
dealing with the UI.
  • You must be able to access the business function from the device of your choice – desktops, laptops, tablets, phones.
This is where the world seems to be heading.  Either get on the bandwagon, 
or be left behind.
  • Command line access must be in the form of a domain specific language (DSL) with which you can write scripts to tie together various functions in a workflow.
  • Long-lived commands must have these behaviors.
    • Run them in the background, and bring them into the foreground as necessary.
    • Expose current status, and progress.
    • Allow you to abort in the middle of execution, and return the system safely to the previous state.
    • Allow you to stop and resume where you left off.
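The behaviors listed for long-lived commands can be sketched with the JDK's CompletableFuture.  This is a minimal sketch under my own assumptions: the LongCommand class is hypothetical, work is modelled as a fixed number of independent steps, and rolling back to the previous state is not shown because it depends on what each step changes.  It also assumes that start() is not called again while a previous run is still active.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.IntConsumer;

/** Hypothetical long-lived command that works through numbered steps. */
final class LongCommand {
    private final int totalSteps;
    private final IntConsumer step; // does the work for one step index
    private final AtomicInteger done = new AtomicInteger();
    private final AtomicBoolean stopRequested = new AtomicBoolean();

    LongCommand(int totalSteps, IntConsumer step) {
        this.totalSteps = totalSteps;
        this.step = step;
    }

    /** Runs in the background, continuing after the last completed step. */
    CompletableFuture<Integer> start() {
        stopRequested.set(false);
        return CompletableFuture.supplyAsync(() -> {
            while (done.get() < totalSteps && !stopRequested.get()) {
                step.accept(done.get());
                done.incrementAndGet(); // checkpoint only after the step succeeds
            }
            return done.get();
        });
    }

    /** Asks the command to pause once the current step finishes. */
    void stop() {
        stopRequested.set(true);
    }

    /** Fraction of steps completed so far, between 0.0 and 1.0. */
    double progress() {
        return totalSteps == 0 ? 1.0 : (double) done.get() / totalSteps;
    }
}
```

Calling join() on the returned future brings the command into the foreground, and calling start() again after a stop() resumes from the checkpoint instead of starting over.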

Required skills 

Not every project, at every client, will ask for all the features described above.  However, I believe they represent a set of skills that an engineering outfit can reasonably be expected to have.  I want to be able to deliver this functionality.

There is little here that is exotic.   I only see three or four main streams of skills, all of which are well understood, and well supported.

  • Learn to measure, manage, and deliver performance.
  • Learn to architect business functions such that they lend themselves to manipulation from both the command line and a GUI.  This mostly consists of creating well-designed APIs.
  • Learn to create GUIs in a browser-neutral fashion, and for various mobile devices.
  • Learn to create well-engineered and expressive DSLs.
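The idea of one API behind both a GUI and a command line can be sketched as a small command registry.  This is an illustration only: CommandRegistry, the command name in the usage note, and the one-command-per-line script syntax are all hypothetical, and a real DSL would need proper parsing, quoting, and error reporting.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

/** Hypothetical registry: one API that a GUI, a script, or a scheduler can call. */
final class CommandRegistry {
    private final Map<String, Function<List<String>, String>> commands =
        new LinkedHashMap<>();

    void register(String name, Function<List<String>, String> command) {
        commands.put(name, command);
    }

    /** Direct entry point, as a GUI handler or a scheduler job would use it. */
    String invoke(String name, List<String> args) {
        Function<List<String>, String> command = commands.get(name);
        if (command == null) {
            throw new IllegalArgumentException("unknown command: " + name);
        }
        return command.apply(args);
    }

    /** Script entry point: one whitespace-separated command per line. */
    String runLine(String line) {
        String[] words = line.trim().split("\\s+");
        return invoke(words[0], Arrays.asList(words).subList(1, words.length));
    }
}
```

With a command registered under the name rename-customer, the script line "rename-customer 42 Asha" and a GUI handler calling invoke("rename-customer", ...) go through exactly the same code, so the application's checks and balances apply to both.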

Is it practical for a small, competent team to develop these competencies, and employ them successfully?  I would have to think so.

 

After trying to conserve bandwidth for a few days ….

I looked around for help on slowing down the rate at which I was eating up my monthly allotment of bandwidth, and found that many other people have trodden this path before. Here is one pretty nice reference at Million Clues.

As various references suggested, I switched to Firefox, and installed two add-ons, AdBlock Plus, and ImgLikeOpera.

I don’t have any scientific proof that this has reduced my bandwidth consumption, but I can certainly see that both banner ads (AdBlock Plus), and images (ImgLikeOpera) are not fetched.   Pages are loading faster.

ImgLikeOpera has the more significant impact on my user experience.  I have set it to block all images by default.   It has been interesting going about my business on the web without any images.   Some types of sites accommodate the missing images much better than others.

ESPNcricinfo looks bad without images.  They use images for the backgrounds of menus.  Without the background image, the menu items simply get mixed up with the other text.

Shopping sites were hard to deal with, which I suppose is natural.  After about 10 minutes on the Croma site, I gave up and turned on all images.   Large shopping sites like Zappos, and Amazon, will not work very well on low bandwidth.   On the other hand, I imagine that their mobile sites are designed with bandwidth constraints in mind.

News sites did much better.   The New York Times site was lovely.  Without images it continued to look very familiar.   Missing images were clearly marked, and stayed out of the way of the text.  The text itself was in exactly the same places that it always is.  Washington Post was a little uneven.  Talking Points Memo was clean.   Political Wire never has news images, and isn’t that a nice decision.

I have a renewed appreciation for the tech and design folks at the New York Times.   They probably wouldn’t even hire me as a janitor.

Facebook was interesting.   At first, it was very hard to use with all images turned off.   After a couple of days, it is beginning to feel natural.  I guess the brain adjusts.  It gets used to the visual cues that the images provide.   After a couple of days without images, you begin to learn the cues that the other content provides.

I have deliberately stayed away from many of the Indian sites.   My experience these past few weeks at these sites has not been very pleasant.   Their attention to user experience, to put it mildly, is uneven.   Dedicated e-commerce sites (Flipkart, Infibeam, Snapdeal, etc.) have been generally much better than enterprise sites (Tamil Nadu Electricity Board, Tata Docomo, Karur Vysya Bank, etc.).

This is all very interesting.  I have never had to take available bandwidth into account when coming up with the design for a web UI.   I wonder how seriously this is addressed by developers who cater to folks in the first world.   However, everybody is moving towards ‘mobile design first’.   One of the basic constraints of that world is limited bandwidth.  Decisions made for the mobile design will bleed into the desktop, and that will address the bandwidth shortage in a natural manner, I expect.