0:00:06.080,0:00:11.840 Software rarely works as intended while it's being  written: things go wrong, we know that. A standard 0:00:11.840,0:00:16.800 behavior in some software development is: see  the bug, swat the bug, and be done with it. 0:00:19.280,0:00:24.560 In contrast, one of the things that I've  observed over time is that for experts, error, or 0:00:24.560,0:00:28.960 more broadly things that go wrong or go amiss  during software development, is opportunity: 0:00:28.960,0:00:34.400 to understand better, to question assumptions,  to detect miscommunication or misconceptions, 0:00:34.400,0:00:38.960 to stumble onto insight. So experts  are not fearful of error but watchful. 0:00:39.760,0:00:45.360 They often hold off swatting the bug, instead  asking, "That's odd, why is that?" So indeed 0:00:45.360,0:00:50.720 error is seen as a useful input in the course of  progressive development. So this talk is meant to 0:00:50.720,0:00:58.320 summarize some of the insights about how experts  and high performing teams use that opportunity. 0:00:58.320,0:01:03.280 I've spent some 30 years studying experts  and high performing teams - at work, 0:01:03.280,0:01:07.200 in industry - in order to articulate  their strategies and practices, 0:01:07.840,0:01:13.520 and in effect I act as a mirror or a lens,  reflecting and focusing. I'm most interested in 0:01:13.520,0:01:18.320 articulating what successful software developers  actually do, not in dictating to them what they 0:01:18.320,0:01:25.840 should do. And I'm hoping that the talk will  have some resonance with your experience. 
0:01:28.160,0:01:33.600 Research in software engineering predominantly  considers error retrospectively, based on analysis 0:01:33.600,0:01:38.960 of software in operation, usually of massive  projects, usually in the context of flaws left 0:01:38.960,0:01:43.680 in the code that need to be fixed or of software  system failures that arise from a collection of 0:01:43.680,0:01:49.760 smaller flaws. We've been taking a more ecological  view of error during software development. 0:01:51.840,0:01:57.600 The psychology literature offers the concept  of active error: these are human errors during a 0:01:57.600,0:02:04.480 task that take the form of slips of action,  things like typos, or lapses of memory or attention, or 0:02:04.480,0:02:10.800 mistakes made in forming and executing intentions  during problem solving - so bad decisions. For 0:02:10.800,0:02:17.200 recovery, a person must know that an error has  occurred, must identify both what was done wrong 0:02:17.200,0:02:22.160 and what should have been done, and then must  understand how to undo the effects of the error. 0:02:22.160,0:02:27.760 So active errors can be caught in the act, or  they may be detected later during standard checks 0:02:27.760,0:02:34.080 and evaluation, by obstacles to progress, from  cues from the environment, or through unexpected 0:02:35.040,0:02:40.560 outcomes. So error detection and recovery unfold  in the course of progressive problem solving. 0:02:41.120,0:02:45.600 So, what is it that experts and high performing  teams do that gives them better results? 0:02:48.560,0:02:52.640 Experts mind the gaps. Rather than  just looking for what they expect, 0:02:53.280,0:02:58.160 they pay attention to the feedback and cues  that might alert them to something unexpected, 0:02:58.160,0:03:01.680 something amiss. 
They pay attention  to the spaces between things, 0:03:02.400,0:03:07.520 so for example to interfaces, interactions between  components, integration with other systems, 0:03:07.520,0:03:12.880 domain concepts hidden behind standard data  types. They pay attention to what isn't shown, 0:03:12.880,0:03:16.960 to what's missing, whether from the design  or from the information or from the reasoning 0:03:16.960,0:03:21.280 tool that they're using, and this minding  of the gaps promotes detection of flaws. 0:03:23.760,0:03:28.160 Whereas many people look for evidence  that things are working as expected, 0:03:28.160,0:03:32.080 experts and high performing teams are  more available to contrary evidence, 0:03:32.080,0:03:36.800 and indeed their practices prime them to look  for it. They seek evidence, they ask why, 0:03:36.800,0:03:43.360 they engage users, they de-correlate, eliciting  and contrasting different perspectives. 0:03:43.360,0:03:47.680 So they challenge themselves - they challenge  their assumptions, their models, their designs, 0:03:47.680,0:03:51.520 through mechanisms such as the skeptic  in the corner or pair debugging. 0:03:52.640,0:03:57.200 They seek falsification: they don't just  ask, "How would I know if this is right?" 0:03:57.200,0:04:01.120 but they also ask, "How would I know if this  were wrong?" and "How would I know if an 0:04:01.120,0:04:06.080 alternative were right?" Importantly, they  understand that code is read by people, 0:04:06.080,0:04:14.160 and they write comments about what is not in the  code, that is, their intentions and assumptions. 0:04:14.160,0:04:18.960 Understanding something by breaking it is a  form of analytic that's common in many branches 0:04:18.960,0:04:23.920 of engineering - introducing errors or flaws  deliberately can be a way of gaining insight into 0:04:23.920,0:04:29.600 how a system operates. 
Experts who have experience  doing that intentionally, to test their system, 0:04:29.600,0:04:34.240 also see unexpected breakage as a potential  analytic, and seize the opportunity to use it. 0:04:39.520,0:04:45.040 In contrast to eliminating bugs as quickly  as possible, experts reflect on the problem 0:04:45.040,0:04:50.160 and on the solution model. They recognize that  a small bug may signal something more. 0:04:50.160,0:04:53.520 Rather than dismissing simple bugs as  novice errors or one of those things, 0:04:54.080,0:04:56.560 they look around to detect  if there's a fuller story, 0:04:56.560,0:05:01.280 thereby often detecting other deeper issues  such as design flaws or misconceptions. 0:05:03.440,0:05:07.120 So, experts don't just fix the bug  - the one bug - they stand back 0:05:07.680,0:05:12.000 and look for the other bugs that hang out with  it. They consider dependencies, and reflect 0:05:12.000,0:05:16.400 on the code structure in order to understand  whether the bug is part of a bigger picture. 0:05:19.040,0:05:23.280 And this is all part of reassessing the  landscape and deliberately expanding the search 0:05:23.280,0:05:29.280 space - a way of examining barriers, understanding  constraints, revealing assumptions, looking beyond 0:05:29.280,0:05:34.160 the immediate issues, and hence potentially  admitting more potential solutions or broadening 0:05:34.160,0:05:39.520 the definition of the problem in a way that  provides insight and overcomes flaws. And they 0:05:39.520,0:05:43.520 do this periodically throughout the design and  development process, not just at the beginning. 0:05:44.400,0:05:49.840 Now this is at odds with many software development  methodologies which typically concern convergence 0:05:49.840,0:05:55.280 to a solution. And so sometimes the high  performing teams step away from a methodology. 0:05:55.920,0:05:59.680 This business of standing back and  reflecting on the landscape is crucial. 
0:05:59.680,0:06:05.840 We all know of examples where the software met  the spec but the specification was inadequate. 0:06:08.240,0:06:13.760 Software developers don't work in an ideal world  - we know that - but rather in an environment 0:06:13.760,0:06:19.360 dominated by conflicting demands and time  pressures, so bugs are understood in the context 0:06:19.360,0:06:25.760 of software use. Effective triage has to do with a  cost-benefit assessment of the relative impact of 0:06:25.760,0:06:31.680 the bug against the cost of fixing it. Bugs that  aren't important are often tolerated or deferred. 0:06:32.400,0:06:37.680 Brian Randell encompasses this in his concept  of dependability. His definition leaves room 0:06:37.680,0:06:43.440 for imperfection in the code if the imperfection  doesn't impair the software's dependability. So 0:06:43.440,0:06:49.280 tolerance is about managing the bug technically,  but also about managing the bug socially: leaving 0:06:49.280,0:06:54.480 compilation warnings in the code as reminders,  documenting the deferral and its rationale. 0:06:57.440,0:07:02.240 Similarly, developers have been shown to  compromise at times in order to keep the 0:07:02.240,0:07:07.680 work moving along. Their strategies may include  deliberate sub-optimal choices calculated to 0:07:07.680,0:07:14.240 serve immediate needs while enabling progressive  improvements. So deliberate compromising suggests 0:07:14.240,0:07:18.880 that the developer is actively managing the  issue over time, implementing incremental 0:07:18.880,0:07:24.000 pragmatic solutions as required to advance  the larger program of work. This strategy 0:07:24.000,0:07:29.840 allows the developer to explore the problem over  time and ultimately to find the better solution. 0:07:32.480,0:07:37.600 But in addition to this, developers have  safety nets, and one of them is pair debugging. 
0:07:37.600,0:07:42.880 Pair debugging is something most of the high  performing teams do and people don't talk 0:07:42.880,0:07:48.880 about much. They sit together and talk through  the code, often deliberately matching people of 0:07:48.880,0:07:52.800 notionally different levels of expertise or  who know different parts of the code base. 0:07:52.800,0:07:57.360 And this brings a fresh perspective to the code,  spreads the knowledge of the code among the team, 0:07:57.360,0:08:02.000 and has a tendency to expose assumptions,  misconceptions, and miscommunications. 0:08:04.240,0:08:07.760 Experts reflect on their  tools as well as their code. 0:08:09.200,0:08:13.200 How can you verify that an analysis tool  is doing what it's meant to? Well, experts 0:08:13.200,0:08:18.400 play methods against each other to increase the  likelihood of detections: for example, building 0:08:18.400,0:08:24.080 errors into code to test the test harness. Experts  address tool limitations by combining or swapping 0:08:24.080,0:08:28.880 among multiple tools; to quote one developer,  "often it's a mishmash of different ways of 0:08:28.880,0:08:33.760 thinking that gets you the answer." So multiple  techniques and tools imply more ways to think, 0:08:33.760,0:08:38.960 but they also require greater cognitive overheads,  and that requires intelligent coordination. 0:08:38.960,0:08:44.000 So the selection is not arbitrary: teams try  tools, assess their merits, assemble tool kits 0:08:44.000,0:08:50.960 that both fit their development culture and  span different perspectives. So, in summary, 0:08:50.960,0:08:56.320 experts use systematic, disciplined practices  that are socially embedded and reinforced. 0:08:59.120,0:09:03.360 Importantly, because there is a  disciplined culture, they're able 0:09:03.360,0:09:07.840 to rely on the team to catch slips, thereby  giving individuals the freedom to experiment. 
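The idea of building errors into code to test the test harness can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not something shown in the talk: the function `clamp`, its deliberately broken variant, the two case lists, and `run_harness` are all hypothetical names invented for the example. A harness that still passes when a known flaw has been seeded is telling you about a gap in the harness, not about the code.

```python
from typing import Callable, List, Tuple

Case = Tuple[Tuple[int, int, int], int]          # ((value, low, high), expected)
Failure = Tuple[Tuple[int, int, int], int, int]  # (args, expected, actual)


def clamp(value: int, low: int, high: int) -> int:
    """Reference implementation: restrict value to the range [low, high]."""
    return max(low, min(value, high))


def clamp_with_seeded_bug(value: int, low: int, high: int) -> int:
    """Deliberately broken variant (hypothetical): ignores the lower bound."""
    return min(value, high)


# A harness that exercises both bounds, and a weaker one that never
# goes below the lower bound. Both are assumptions made for this sketch.
CASES: List[Case] = [((5, 0, 10), 5), ((15, 0, 10), 10), ((-3, 0, 10), 0)]
WEAK_CASES: List[Case] = CASES[:2]


def run_harness(fn: Callable[[int, int, int], int],
                cases: List[Case]) -> List[Failure]:
    """Run fn over every case and return the list of mismatches."""
    failures: List[Failure] = []
    for args, expected in cases:
        actual = fn(*args)
        if actual != expected:
            failures.append((args, expected, actual))
    return failures


if __name__ == "__main__":
    # The real code should pass the harness.
    print("real code failures:", run_harness(clamp, CASES))
    # The seeded bug should be caught; if it is not, the harness has a gap.
    print("full harness on seeded bug:", run_harness(clamp_with_seeded_bug, CASES))
    print("weak harness on seeded bug:", run_harness(clamp_with_seeded_bug, WEAK_CASES))
```

In this sketch the full case list reports the seeded flaw, while the weaker list lets it through silently, which is the cue to strengthen the harness before trusting its verdict on real code.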
0:09:08.480,0:09:12.000 A study of high performing teams  makes it clear that the interplay 0:09:12.000,0:09:18.640 between developers plays a crucial  part in both nurturing creativity and innovation, 0:09:18.640,0:09:24.240 and in handling errors effectively and embedding  systematic practice and rigor. So the team culture 0:09:24.240,0:09:29.600 which leverages both individual strengths and  multiple perspectives provides the safety net. 0:09:33.600,0:09:38.560 There is a caveat to this approach to error,  which is that the focus is on fixing the error 0:09:38.560,0:09:42.880 rather than fixing the blame. The team  culture matters - it embodies the mindset 0:09:42.880,0:09:47.760 that sees error as opportunity, that  embraces multiple perspectives, that 0:09:47.760,0:09:52.720 reinforces practices such as triage or playing  methods against each other or pair programming, 0:09:52.720,0:09:56.640 that routinely challenge understanding and  assumptions. This helps strengthen and develop 0:09:56.640,0:10:00.480 the team as well as improving  the software. Put differently, 0:10:01.040,0:10:05.200 software expertise doesn't happen by accident.  These are practices that you can 0:10:05.200,0:10:10.320 understand and invest in. By making space in  your organizational culture and by investing 0:10:10.320,0:10:15.120 time for this mindset, these sorts of practices,  these dialogues, you're making space for 0:10:15.120,0:10:20.720 expertise to work and to grow and for expert  level software development to become possible. 0:10:20.720,0:10:31.840 So perhaps treat this as an opportunity to  reflect on your practice. Thank you for listening.