This is the start of a post I’ve added to my Pivotal Labs blog
What does an agile software development team look like?
At core, software engineers turn ideas into results. Of course we are not the only ones with this job description. We share it with many other creative professions. Writers, for example, turn ideas into words that inspire, educate and inform. That’s pretty much what we do too: we turn ideas into words that instruct a computer system to perform a desired behaviour.
Focusing on writers for a moment… there is a wide range of writing environments and styles. On one side of the spectrum are novelists who secrete themselves away in a quiet room, where they can find the time and space to breathe life into their intricate vision. On the other are journalists on the newsroom floor, sampling from an avalanche of information, responding quickly to what’s new and what’s important.
At Pivotal Labs, we work with many fast-moving companies, helping them to fashion the software engineering capability they need to succeed. Our approach is to create an environment that resembles a newsroom more than a writer’s hideaway.
Like most engineers, I do a lot of optimizing, often just for fun. When walking to work I seek the shortest route. When folding my laundry I minimize the number of moves. And at work, of course, I optimize all day long alongside all my engineering colleagues.
Optimization, by definition, requires an objective function as the basis for measuring improvement or regression. How short is the route? How many moves does it take to fold a shirt? But what is the objective function at work around which my team and I should optimize?
I’ve worked in many software engineering organizations where the objective function is an unstated confusion that evolved on its own over time. It’s often a bit of “don’t break things” mixed with a dose of “conform to the standards”. Sometimes more destructive objectives find their way into the culture: “get yourself seen,” or worse, “don’t get noticed.” And my least favorite, “hoard your knowledge.”
Recently, while working with a client, I had to state my views on a good objective function for a software engineering team. It’s to predictably deliver a continuous flow of quality results while minimizing dark time — the time between when a feature is locked down for development and when a customer starts using it.
Predictable: Your process has to be stable and sustainable. It’s not about sprinting to collapse; nor is it about quick wins followed by a slog through technical debt. It’s about a steady pace over a long time. Hence the volatility measure in Pivotal Tracker; a good team has low volatility, and therefore their rate of delivery is highly predictable.
Delivery: Code is not worth anything until it is in use by customers. Calling delivery anything else often leads to spinning wheels and wasted effort.
Continuous flow: Activities that could disrupt the flow are better dealt with in the moment, in the normal run of things. For example, I find mandatory code reviews disruptive and demoralizing. Gatekeeping steps like these, by definition, stop the normal flow and send things back for rework. In contrast, pair programming often achieves the same quality and consistency objectives in real time and without disrupting the flow.
Quality: This is a relative measure. The work needs to be done sufficiently to avoid rework (i.e. bugs) and to prevent the accumulation of technical debt. Spending more time trying to achieve “quality” beyond these measures is just waste.
Results: What it’s all about.
Minimizing dark time: Many software engineering organizations miss this one because it’s driven by the business rather than the needs and craftsmanship of the engineers themselves. And yet, minimizing dark time is perhaps the most critical contribution that an engineering team can make to a business.
Dark time is what the business experiences between when the engineers remove the business’s ability to re-spec a bit of work and when they hand back a working result. In this dark time the business can no longer refine its decision nor observe and learn from the results. They’ve ordered (and paid for) the new car, but are waiting for the keys. It’s dark because during this stage there is nothing for the business to do, with respect to that feature, but wait.
While coding I experience the same dark time when working on a slow code base or, worse yet, working with a slow test suite. My TDD cycle grinds to a crawl as I wait a minute or more (the horror!) between when I hand my code to the compiler/interpreter and when it spits back the red-green results.
If you hate twiddling your thumbs waiting for slow tests to run, think how frustrating it is for the business folks when their engineering team throws them into the dark for days, perhaps even weeks. Of course they pipeline their work and find ways to be useful, but the dark time still sucks.
When a software engineering team chops dark time down from a month to a week, business folks cheer. When the engineers chop dark time down to a day or less, the business folks do what we coders do when working with a fast test suite… we fly.
This post also appears on my blog at Pivotallabs.com.
Every once in a while I spend a bit of time reviewing and streamlining my GTD process. This time I hit on a pretty big win — automating a connection between Gmail and Remember the Milk so that collecting actions from across all of my Gmail accounts is one-click easy. This automation makes it a breeze to process email on my phone. Woohoo – GTD on the can!
Remember the Milk has been at the core of my GTD stack for several years now. I’ve looked into other systems, even trying Astrid for a month, but RTM still wins for my requirements set. It’s got a scriptable API, a command line interface, a solid Android app, and a nice web interface that’s keyboard friendly. Of course it has its quirks and annoyances (why can’t I move tasks between lists using the keyboard?), but it’s the best that I’ve found.
For email, Gmail has wormed its way into being my primary mechanism. It too has its quirks and annoyances, but the network effects are strong and the price is right (well, was right for business accounts). I still use Thunderbird too, but now primarily as a local Gmail client.
For this round of streamlining, the challenge I set was to enable keyboard-based creation of an RTM task from a Gmail email, with the task including a link back to the original thread for quick action. This is now possible using a Google Apps Script. Here’s the code:
Please feel free to fork the gist and improve it!
I’ve recently been consulting as a lean startup expert at the retail side of a large bank in the UK which is exploring how to increase their rate of innovation. The project has been challenging, inspiring and filled with lessons learned.
The bank hired me to help them create a new venture, and in doing so it became clear that the bank is struggling with a larger underlying challenge: how to drive innovation in and around their organization.
I went into the project expecting to find bank managers who had no interest in innovation – why would a bank that is “too big to fail” have much interest in shaking things up? To my surprise the situation was quite the opposite. Almost everyone I met, across a range of management levels, shared three views. First, they wanted to innovate. Second, they were frustrated at the inability to innovate within their organization. And third, they were proud of the ideas that the bank had managed to nurture and launch.
The bank, it turns out, has the potential to be what Steve Blank would call an earlyvangelist customer.
Earlyvangelists are a special breed of customers willing to take a risk on your startup’s product or service. They can actually envision its potential to solve a critical and immediate problem—and they have the budget to purchase it. Unfortunately, most customers don’t fit this profile. (source)
In this case:
- The bank management have a problem – lack of innovation.
- They are aware of the problem.
- They are actively looking for a solution.
- The problem is so painful that they cobbled together interim solutions.
- And they have even allocated budget to continue to tackle the problem.
With this in mind I used the project as an opportunity to iterate towards a business model for delivering an effective intervention for driving innovation in large banks. As you’ll see in the remainder of this post, my team and I have learned many lessons. We’ve also made considerable progress towards finding a model that is likely to work.
The rest of this post details the business models we tested and the resulting lessons we learned. It ends with a proposal for a new model, the Inside-Out Incubator, that’s centered around seeding an ecosystem of innovation in and around the bank. It uses an indirect approach that is more likely to succeed than trying to directly change the bank’s deeply engrained culture.
So, without further ado…
With the pandas library, if you have read in a csv file with a date field that is sometimes empty and are getting the error:
TypeError: can't compare offset-naive and offset-aware datetimes
It may be caused by dateutil.parser.parse which is the default function for parsing dates when the csv file is read. This function returns the current non-timezone aware date when given an empty string as input. According to the dateutil documentation “the default value is the current date, at 00:00:00am.” This causes confusion in the context of pandas for three reasons:
- the data in the DataFrame is not derived from the source CSV file
- the expected value of an empty field is numpy.NaN.
- the returned datetime object is not time zone aware.
Fortunately pandas.read_csv has the date_parser argument which allows you to include your own parsing function. One might think that the following function is the right fix…
def date_or_nan_parse(str):
    if not(str):
        return numpy.NaN
    return dateutil.parser.parse(str)
BUT NO. If you then try to compare a date against the pandas.Series of the dates the same “TypeError: can’t compare offset-naive and offset-aware datetimes” may get thrown. In fact, both of these will fail:
df['datefield'] > dateutil.parser.parse("January 1 1901 00:00 UTC")
df['datefield'] > dateutil.parser.parse("January 1 1901 00:00")
This time the problem is that the comparison fails on the numpy.NaN values.
My fix is:
OLD_DATE = dateutil.parser.parse("January 1, 1901 00:00 UTC")

def date_or_nan_parse(str):
    if not(str):
        return OLD_DATE
    return dateutil.parser.parse(str)
In this case I introduce a new value rather than (the more correct) numpy.NaN. The new value is a date so it doesn’t fail during comparison operations (assuming that OLD_DATE has the same time zone awareness as the rest of your data). And at least it is a date that I’ve explicitly chosen, rather than just the current date.
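The same behaviour can be reproduced in plain Python, outside pandas, which may help when debugging. The sketch below uses only the standard library and assumes ISO 8601 input strings; date_or_sentinel and the sample values are hypothetical names and data for illustration, not the function passed to read_csv above.

```python
from datetime import datetime, timezone

# Sentinel for empty fields; timezone-aware so it matches the aware dates.
OLD_DATE = datetime(1901, 1, 1, tzinfo=timezone.utc)


def date_or_sentinel(text):
    """Parse an ISO 8601 string; return OLD_DATE for an empty field."""
    if not text:
        return OLD_DATE
    return datetime.fromisoformat(text)


aware = date_or_sentinel("2013-05-01T12:00:00+00:00")
naive = datetime(2013, 5, 1, 12, 0)

# Mixing aware and naive datetimes raises TypeError.
try:
    aware > naive
    mixed_comparison_failed = False
except TypeError:
    mixed_comparison_failed = True

# In plain Python a float NaN cannot be ordered against a datetime either.
try:
    aware > float("nan")
    nan_comparison_failed = False
except TypeError:
    nan_comparison_failed = True

# The sentinel is an aware datetime, so ordering comparisons work.
cutoff = datetime(1950, 1, 1, tzinfo=timezone.utc)
parsed = [date_or_sentinel(s) for s in ["", "2013-05-01T12:00:00+00:00"]]
after_cutoff = [d > cutoff for d in parsed]
```

The empty field compares as older than the cutoff, so it drops out of any "newer than" filter without raising an error. Whether that is the behaviour you want depends on your data.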
I hope this saves you some confusion.
Going from data to action is a recurring challenge in a startup. And the process has never been easier, thanks to the wealth of amazing open source tools including Python (pandas, numpy, matplotlib), IPython Notebook, and D3.js.
I’ve recently worked on a project in the container shipping industry where we had a large database of information about repairs to shipping containers. The challenge was to find actionable opportunities based on insights gleaned from the data. Here’s how I went about the data analysis.
Munging and Probing
I started the project by flexing the data this way and that using pandas and the IPython Notebook (both amazing tools you should get to know). This took a few passes. First I got it loaded into a DataFrame. Then I altered the structure to make it easier to understand, such as replacing coded names with full text. With that out of the way it was time to explore. The most helpful chart I made was this Pareto chart, which reveals the relative significance of various drivers in the data. Below is the code to generate the chart for any data series.
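A minimal sketch of a Pareto chart helper, for readers who want to try the idea. It is not the code from the project: pareto_table and plot_pareto are hypothetical names, the sample figures are made up, and it assumes the input is a mapping from a label to a non-negative number. The table is computed with the standard library, and matplotlib is imported only when plotting.

```python
from itertools import accumulate


def pareto_table(counts):
    """Return (label, value, cumulative_percent) rows, largest value first."""
    # Sort drivers from most to least significant.
    ordered = sorted(counts.items(), key=lambda item: item[1], reverse=True)
    total = sum(value for _, value in ordered)
    if total == 0:
        return []
    running = accumulate(value for _, value in ordered)
    return [
        (label, value, 100.0 * cum / total)
        for (label, value), cum in zip(ordered, running)
    ]


def plot_pareto(rows, title="Pareto chart"):
    """Draw a bar per value plus a cumulative-percentage line."""
    import matplotlib.pyplot as plt  # imported lazily; only needed for plotting

    labels = [label for label, _, _ in rows]
    values = [value for _, value, _ in rows]
    cumulative = [pct for _, _, pct in rows]

    fig, ax = plt.subplots()
    ax.bar(labels, values)
    ax.set_ylabel("value")
    ax.set_title(title)

    # Second y-axis for the cumulative percentage line.
    ax2 = ax.twinx()
    ax2.plot(labels, cumulative, color="C1", marker="o")
    ax2.set_ylim(0, 105)
    ax2.set_ylabel("cumulative %")
    return fig


# Made-up repair costs by damage type, for illustration only.
example = {"rust": 15, "dent": 50, "other": 5, "hole": 30}
rows = pareto_table(example)
```

With a pandas Series, something like pareto_table(series.to_dict()) should produce the same rows, and plot_pareto(rows) draws the chart.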
Using these pareto charts, plus a variety of histograms and scatter plots, I was able to provide the team with an initial window into the data which we used to identify an avenue that was worthy of further investigation.
With a clearer destination in mind, my goal became creating a visualization that would reveal the opportunity within the data. The tool for this is D3.js. D3 is a little bit confusing to get one’s head around at first, but it is well worth figuring out because the things that you can do with it are amazing.
In our case, I wanted to let our team explore the impact of various interventions to curtail types of damage or to protect various parts of the containers. While the Pareto chart (above) provides insight into the cost of various damage types or container parts, it falls short when the two dimensions need to be considered together.
My solution is this interactive visualization (view full size). With it our team has been able to explore the data set without having to write more code. They are no longer dependent on me to “run the numbers”. And it didn’t even take too long to make.
I highly recommend adding data analysis and visualization tools to your toolkit. They aren’t hard to learn and they are amazingly powerful.
At the bottom of this post are instructions if you want to do the same:
Big Data about people = stereotyping and prejudice
Recently in my work for LevelBusiness I’ve been learning about big data. It’s powerful, amazing and fun tech, and it’s all about stereotyping. As we all learned in primary school, stereotyping and its flip side, prejudice, are generally bad things that often lead to bad ends.
Big data involves boiling down vast data sets into actionable conclusions. Big data gets dicey when the topic at hand is people rather than things. The conclusions about people are of the form:
- persons a,b,c…. are likely to be pregnant.
- this set of people are often left-leaning (this TED talk on filter bubbles is worth watching).
- the people in this neighborhood claim more frequently on their insurance.
Then the actions are respectively:
- promote baby products to this group.
- only show certain search results to this group.
- decline to insure this group.
The critical but often overlooked point is that these groupings are always just probabilities based on the underlying data set. For example, Amazon infers your religion based on the wrapping paper you buy. They don’t know for sure that you are Christian, Jewish, Muslim or Sikh, but they think that they have enough evidence to make it worth their while to treat you as if their assumption is true. This is the definition of prejudice:
Prejudice (or foredeeming) is making a judgment or assumption about someone or something before having enough knowledge to be able to do so with guaranteed accuracy, or “judging a book by its cover”.
Prejudice + Hearsay = No Good
Every company needs to get to know its customers. But when every internet-using person is your customer you have to take care to be responsible about what conclusions you draw from your vast data sets and what actions you take based on them. Google may or may not be up to this task; I think this remains an open question. What bothers me though is the following clause in the new Google Terms and Conditions:
We have a good faith belief that access, use, preservation or disclosure of such information is reasonably necessary to (a) satisfy any applicable law, regulation, legal process or enforceable governmental request…
Adam Levin’s analysis in the Huffington Post clarifies the risk:
Hold on, Bucky.
What exactly constitutes an “enforceable governmental request?” This sentence should read: “We will share information with a Governmental entity only when presented with a valid search warrant issued by a court of competent jurisdiction.” Such a provision would make it obvious that by giving information to Google, you do not intend to waive your constitutional rights, and it would make it clear that despite the fact that your information was shared willingly with a private sector entity, you reasonably retained an expectation of privacy against Government intrusion.
In other words, Google is stereotyping you, and then not only are they acting on that prejudice but they are saying that if a government comes calling then they will happily share what they think they know about you. If you have even the slightest distrust of government, your own or any other in the world, then this should worry you.
I know, this data-driven stereotyping and prejudice is happening all over the place, but that does not mean it’s good or safe. And, it certainly doesn’t mean that you have to be a willing sheep in the process. That’s why I’m switching away from Google as my default search service. I don’t want to feed more of my data into their prejudice machine.
Here are instructions if you want to do the same.
How to make DuckDuckGo your Default
Instructions from http://seodesk.org/address-bar-awesome-hacks/
- Chrome: Right-click the Chrome Omnibox » in the last entry fill the search engine name and keyword and copy/paste the URL http://duckduckgo.com/?q=%s » click on ‘make default’. (Alternatively you may add DuckDuckGo with suggestions to your search engines’ list and make it your default engine).
- IE: Enter DuckDuckGo homepage » click on the left arrow and select DDG from the sub-menu ‘Add Search Provider’ » check ‘Make this my default search provider’ » click on ‘Add’. (As in the previous example you may set DDG with suggestions as your default engine).
- Firefox: Type ‘about:config’ in the awesome bar and press ‘enter’ » confirm the declaration » type ‘Keyword.URL’ in the filter box » copy and paste the following URL https://duckduckgo.com/?q= » click ‘OK’ and close this tab or window. You can also install the DuckDuckGo search plugin here.
- Opera: Right-click the DDG search box » Select ‘Create Search’ » type your preferred keyword » check ‘Use as default’ » click ‘OK’. (Note: Adding DDG and suggestions is more complicated and described here).
- Safari: Enter DDG homepage » click on ‘Add to Safari’ » follow the instructions.