Monday, October 26, 2015

Testing uTest: or How I Learned to Stop Worrying and Love the Gig Economy

I’m a member of an online community of testers called uTest (the practitioner-facing side of Applause).  The company hosts a social network for testers and offers short-term gigs to its users (the clichéd Uber for testers).  I was called on this weekend for my first gig: testing a payment method in taxis.  It was pleasant, if stressful, and it made me think of ways a company could use such a service to expand its perspective on its products.

After the initial invite to the test cycle, I communicated with a project manager via Skype to ensure I was able to carry out the test scenarios.  I brought to the table a specific model of phone and a verified ability to pull debug logs from it (thanks Verizon for turning off the op codes on my S4; I resorted to adb logcat).  They provided technical assistance and reimbursement for the transactions, but the primary incentive was a reward for test cases completed.
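The log-pulling workaround mentioned above can be sketched roughly like this (a minimal sketch, assuming `adb` is installed and a device is attached with USB debugging enabled; the `PaymentService` tag is purely illustrative, not the actual component I was testing):

```python
import shutil
import subprocess

def pull_debug_log(tag="PaymentService"):
    """Dump the Android log buffer via adb and keep lines mentioning one tag.

    Assumes adb is on the PATH and a device is attached with USB debugging
    enabled; the tag name here is a made-up example.
    """
    if shutil.which("adb") is None:
        return []  # no adb on this machine; nothing to collect
    result = subprocess.run(
        ["adb", "logcat", "-d", "-v", "time"],  # -d: dump the buffer once and exit
        capture_output=True,
        text=True,
    )
    return [line for line in result.stdout.splitlines() if tag in line]
```

Saving the filtered lines to a file is then all it takes to attach evidence to a test-case report.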

Throughout, I felt like a skilled secret shopper rather than a functional tester.  I was asked more about the non-software components of the launch than about the app or phone functionality.  I reported on the advertisement of the feature, the education of the employees, and the capability of the equipment that supports the feature.  In spite of the participating companies' expectation that advertisement would match hardware would match training, this happened 0% of the time, and no employee had been told why their equipment had been updated.  I wasn't the only tester on the ground in this project; the other testers saw related issues, and no site had all its ducks in a row.  In all, the information they were most excited about was the boots-on-the-ground report I provided.  It was fascinating to see a live product launch from that perspective, and doubly so considering my history in this product space.

The last big chunk of time on this gig went to an experience report.  Complete with detailed feedback, photos of the equipment, and other evidence, this is where I was able to comment on the process as a whole.  From a testing perspective, I was able to provide detailed UI/UX feedback, evaluate the consistency of the process, and help them understand how much of their plans actually came off during the launch period.  There was some repetition in these reports: one was a document with pictures, another was a survey that linked to the doc, and a third was a JIRA-in-spreadsheets for tracking testers.  These reports were all submitted to the project manager for feedback, and I received an up/down "accepted" in less than 24 hours.  While there is definitely room for improvement on the platform that would reduce tester overhead, it wasn't enough of a burden to make me avoid future gigs.

Participation in this project taught me that coordination is hard, education of low-skill workers is even harder, and successful coordinated nation-wide launches like this are next to impossible.  This mirrored my experience with other companies.  There are always bits and pieces of projects that do not make their way down the chain to the workers that are face-to-face with customers, so having a boots-on-the-ground perspective is vital.

Overall, a company can leverage uTest services to add layers of testing to their launch activities over and above internal testing.  Specific and targeted verification is where they are strongest.  They can provide feedback on whether an uneducated but motivated customer can make it through the new process right now.  Specific errors and third-party verification can be farmed out for feedback through such a relationship: things the company wants to do but can’t do efficiently given a small staff or tight timelines.  This can provide a company with an enhanced view of the product and help challenge their assumptions before their customers do.

Note: edited to be a little more coy about who I was testing for and what I was testing.

Tuesday, September 1, 2015

Failing Faster, Succeeding...Soon?

I listened to a good podcast about having and executing on ideas.  Here was the gist of it:


  1. Have an Idea: Gather info directly from customers.  Implement now or Punt for later
  2. Once implemented, get a Minimum Viable Product to a Website, county fair, etc.  Fulfillment can be slow at first.  Persevere and refine or Punt
  3. Once it is selling, enter a Customer Validation Loop and handle their concerns first.  New ideas?  Start at top.  
  4. Once major customer concerns are addressed, enter a Product Design Loop: Change design or manufacturing as needed.

The core of the idea is to fail faster in the hopes that you succeed sooner.  Your backlog of unvalidated ideas is there to experiment on and validate.  Then you Implement, Persevere, Resolve and Redesign, or Punt and wait until you've churned through your good ideas.


Another formulation of this is the 2-2-2-2-2 method.  When you are trying to determine if an idea is feasible, first spend 2 minutes getting it down on paper.  If it still captures your interest, spend 2 hours fleshing it out.  As it grows, time box your commitment to the project.  See it through or bin it.  By the time you're spending 2 weeks or months on an idea, it should be clear whether it can bear fruit or not.  I cannot find an online version of this idea.  If you can place it, let me know in the comments.

While this applies to product development, it can also apply to hobbies, chores and other activities.  Have an idea for homemade Christmas presents?  Try it out on a small batch before you become consumed by a monster of a project with no practical timeline for delivery.  Have a request from a friend to help with a project?  Spend a few minutes talking logistics.  If it gets down to a trip to the hardware store, make sure you can finish that phase with results in an afternoon.  Re-evaluate before committing to future efforts: is the benefit still worth your collective time?


Monday, August 17, 2015

Magnetic Bottle Openers

In the tradition of doing something snazzy for the DEF CON Toxic BBQ, I created a bottle opener that would both mount magnetically as well as catch bottle caps with the same force. 

Amazon had a selection of sturdy bottle openers by Starr X, and a particularly helpful blog post by K & J Magnetics helped me pick out the featured magnet.  I'm relying on the interesting grain of the Indian Rosewood to give the piece character.  I didn't have the tools to do a fancy profile, and my router bits are sorely lacking, so I just went with dog-eared corners and a chamfered edge.  The burning visible in the pre-finishing shot below (accompanied by my favorite Wasatch brew) was due to the bit I used.


The magnet was epoxied in place after I cleared out a spot for it.  In order to prevent the opener from sliding on slick surfaces, I added slightly inset tiny rubber feet.  This also set the opener off from the fridge by just enough that you can get your fingers behind it to pry it off with ease. Lots of sanding from 100 to 600 grit made a great smooth base for some stain and spar urethane.  After three days of curing time, I plopped it on the post at the Toxic BBQ and had a pile of at least 50 caps by the time the night was through.  A great first run!

Tuesday, August 11, 2015

Responding to Vulnerabilities and Weaknesses in your Product: A Tester's Perspective

Compare and Contrast: Tesla's response to researchers: 

With Oracle's: 

To be fair, no company should have to sift through an automated report from a static analysis tool.  It’s not worth their time.  In fact, the one part of the Oracle blog’s tone that isn’t completely unproductive is, “Do the research for yourself!  Give me exploits or give me death!”  As a tester, this is the core of bug advocacy, and I want to destroy the trust lazy researchers put in automated scanners, the trust lazy managers put in automated checking, and the lack of human interaction endemic in development in general.


That being said, chiding someone for spending their own coin to find an exploit with, “But you really shouldn’t have broken the EULA.  Nanny Nanny Boo Boo,” is unproductive at best and an invitation to become the target of malicious actors at worst.  No one cares about your EULA.  Not even the government gives it the time of day.  Your tantrum just makes that many more people want to do things to piss you off.

Monday, February 16, 2015

The Uncanny Valley of User Interfaces

Of all the things to get peeved about, you wouldn't think commercials would be one of them.  In spite of avoiding commercials by cutting the cable cord, I have found myself employed at a place that runs ESPN 24/7.  While it's not as bad as commercials on Fox News, certain commercials have begun to irk me for reasons that only a software guy can understand.  In general, I believe this revulsion can teach us how to test user experiences.

To start, allow me to present two exhibits: Jillian Michaels for NordicTrack and Trivago Guy for Trivago.  Have you watched them?  Good.  Are ads for them already populating your sidebar?  Allow me to apologize.  So, did anything bother you?

Now, here is a second exhibit: Fiddler on the Roof.  I'm sorry if I ruin this for you, but the fiddler is not actually playing his instrument.  In high school, I saw this movie once in class, and again after I spent a year in orchestra.  The second time through, I knew as little about playing the violin as one can, and it still bothered me that the fiddler wasn't successfully playing in sync with the music.  The distraction caused by bad miming is well documented.  Recent Amazon series Mozart in the Jungle has it.  Other movies aren't immune either.  Even a talented musician can be affected by it without equal attention to production (try not to notice the out-of-sync miming and white truck in the background 52 seconds in).  The core problem is that you need to train someone who is good at acting to fake a highly skilled activity convincingly enough to keep the audience's focus.  If the audience knows even a little, their belief is unsuspended, and you can lose all credibility.

Normally, this is fine.  In a two-hour film or a season of a television show, the viewer is pulled in by the drama.  The story isn't about how well the artist can play the violin; it is about the character's story.  Also, most directors can rely on the majority of their audiences not knowing what skilled activities like playing the violin and hacking actually look like.  Camera tricks can make some instruments easier to fake than others.  Furthermore, some miming has become acceptable and celebrated through parody.  We know it's hard to convey on film.  Should that mean we don't tell this story?

Unfortunately, there is a class of film that cannot rely on the social contract to ignore non-experts acting out difficult skills.  Its short format leaves no time to pull a person back in after having been jarred from their witless stupor.  What pitiless medium is this?  Commercials.  In 30 seconds or less, the disingenuous spokesman is called out.  The thankless medium that pays for most of our entertainment is reviled for cheesy effects, leaps of credulity, and now one more heresy: poorly mimed user interfaces.

Let's return to our exhibits, shall we?  NordicTrack wants to sell us another piece of exercise equipment that we will only use as long as we are paying for it.  To do this, they've strapped on a touchscreen to navigate workouts.  What does Jillian Michaels use to navigate?  Two fingers, and she watches the screen as she swipes left.  Thinking back to your own interactions with iPads, touchscreen kiosks and mobile phones, when have you ever watched the old screen disappear?  And when have two fingers ever been used to navigate a touchscreen?

The second is Trivago Guy.  Not only has he drawn attention for being creepy, but his ham-handed interaction with the user interface makes me cringe.  Poking at points on a calendar, full-handed presses of the Search button, and similar miming make me look up from my desk and gag.  Similar interactions with after-effects like those in the Endurance Extended Warranty commercial make me wonder if anyone thought to proof the commercial before buying ad spots day after day, year after year.  An alternative explanation would be that the producers honestly think this is the way people interact with computers.  Either one disarms the viewer and casts the product unfavorably in their eyes.

I would like to propose that each of the above cases can be grouped together as potential examples of the Uncanny Valley.  As a movie viewer familiar with how a violin is played, I connect notes and the movement of bow in a way that the uninitiated cannot.  I reject the characterization as invalid for a brief period, but my emotions pull me back in with other human interactions elicited by the actor's performance.  For these commercials, this does not happen.  The terrible user interface interactions remove focus from the message of the commercial, and it is judged as unfit just long enough that I reject the product on offer.  Worse, subsequent viewings reinforce my first impressions.

Generalized lessons for the User Experience design space are many.  After testing a new user interface, I have found it helpful to let the uninitiated take it for a spin.  While I have been long desensitized to bad interaction by other considerations ("It actually works once you get used to it!" being one, "I just want to be done with this." being another), newcomers see the unnatural interaction, and one can read the revulsion in their faces even when they don't come right out and say it.  This commercial for the new product is a failure, and I don't wait around to see them lulled into the same sense of security.  It stinks, they know it, and now I do too.  While users might not be expected to master a new interface right away, something that is counter-intuitive from the start should be avoided.  Depending on the medium, counter-intuitive steps can be overcome, but not without good decisions elsewhere to draw the user back in.

Wednesday, February 11, 2015

Bike Rentals, An Adventure

The local subreddit received a post requesting one of us rent a bike to a visitor while their significant other was at a conference.  This spawned an interesting adventure, and I learned a ton.

The post provided me with a list of expectations, and we quickly moved to Private Messages to hash out the details.  The end result was $20 per day for a bike, helmet and tools needed to keep you going.

I quickly found out I wasn't as prepared to rent as I had presumed.  For the bike, I had a $150 Wal-Mart special with a good amount of wear, baskets on the pedals, and some upgrades like a headlight.  The back tire on the bike was completely shot, so any money from this venture was going to go right back into it.  I didn't have patches, a portable pump, and my bike tool was nowhere to be found.  A trip to JT's and I was set.

The renter was staying at the Green Valley Ranch, a local hotel/resort, and their bell desk was endlessly accommodating.  I dropped it off with a note for the person staying in the hotel.  I communicated the tag ID to the renter, and I was off.  I even did it on the way to work, so it was relatively painless.

Until I got the phone call.

As a software tester, you would think I would have learned to test my own stuff before I deploy.  Unfortunately, I forgot this step and ended up handing over a bike with a disabled chain.  I got the phone call the morning after the renter's arrival, and I was frantic and embarrassed.  I rushed over on an early lunch, fixed the mangled chain, gave it a spin around the parking lot, and kicked the tires for good measure.  Again, the bell staff was extremely accommodating, and the bike was stowed securely in time for lunch.  The rest of the experience was relatively painless.  I picked it up after the renter had flown home.  I paid the bell desk a tip on pick-up.  All the kit was there and intact.

Could I streamline and improve this service?  Here are some ideas:

  • The sign-up process could be accomplished online.  
  • Several waivers should be added to make sure the lawyers don't come calling after our first injury.  
  • Accident insurance and similar services could be added on as well.  Neither renter nor owner wants to be caught unawares.  
  • A service level could also be established: will work on delivery (oops), service calls available within X hours, deposits or charges for repairs, and so forth.  
  • Instructions for the bell desk, advertisements, and similar services could also be bolted on.  Making it easy for the staff engenders trust and is good advertising.
  • The kit was mostly good, but delivery could have been more glamorous (kit bag attached to bike instead of in a plastic grocery sack).  
  • I would make people bring their own helmets or have them available for purchase.  It is very hard to gauge whether a helmet has gone bad.  Why risk the lawsuit if an injury does occur?

So, was it worth it?  That is a definite no.  Could I make it worthwhile?  Maybe.

The cost to take the bike, if everything went smoothly, would be gas and time for delivery.  Spread over enough hotels, this could be accomplished relatively easily once the service hit critical mass.  The repair was a huge hit to profitability (driving there and back on lunch), but careful testing and integration with deliveries/pickups could make it something that could be priced in with some research.  Theft could be mitigated by insurance, but it would need to be managed carefully and included in the cost.  Finally, payment was through PayPal, which took a sizable cut.  Cash might be better, but since the ideal rental involves never meeting your customer, it is impractical.  Credit would slice the charges in half.
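As a rough sketch of the fee math (the rates below are assumptions for illustration, in the ballpark of PayPal's 2015-era 2.9% + $0.30 per transaction versus a lower hypothetical card rate, not quotes from either processor):

```python
def processor_fee(total, percent, flat=0.0):
    """Fee on a payment: a percentage cut plus any flat per-transaction charge."""
    return total * percent + flat

# A three-day rental at $20/day
total = 3 * 20

paypal_fee = processor_fee(total, 0.029, 0.30)  # assumed PayPal-style rate
card_fee = processor_fee(total, 0.015)          # hypothetical lower card rate

print(f"PayPal: ${paypal_fee:.2f}, card: ${card_fee:.2f} on a ${total} rental")
```

Under those assumed rates, the flat per-transaction charge is what stings on small rentals, which is why a cheaper percentage-only card rate cuts the fee roughly in half.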

An attractive alternative is to offer rental services to the hotels/resorts themselves and only deal with them.  It would be a simple way to attract business, and they could take advantage of existing infrastructure for payments, renting, waivers, etc.  With enough coverage, it might just make a profit.

Tuesday, January 6, 2015

Context-Driven Testing, An Education

Coming into my fifth year of Software Testing, I began to rethink it as a discipline. The current debate is between traditional methods of testing and more modern schools of thought:
  • Entrenched methods are represented by the International Software Testing Qualifications Board (ISTQB) and its many local equivalents. Certification is their path to expertise, with accompanying wares of training sessions, books, tests and standards.
  • Context-driven Testing focuses more on a set of tools and skills that typify testing.  Advocates offer classes at conferences, but "certs" and "best practice" are four-letter words.  They state that there are no best practices, and that a tester knows best how to apply the tools to explore, experiment and learn about the system under test.
The conflict seemed from the sidelines like a pitched battle over the future of testing.  The ISTQB and affiliated consultants had history and 300,000 certifications on their side.  The context-driven school was relatively young, but it had a few charismatic evangelists and professional results that could not be ignored.  It was plain that I needed to sort this out.

Something Just Didn't Feel Right

I strode onto this battleground in 2011 as a new manager and new tester.  Promoted from an application integration team, I was used to working with outside developers while using and abusing buggy product.  This did little to prepare me for the reality of testing: limited time and endless defects!  I dove into the online community in the hopes that it would help sort the good from the bad.  What I found to be the central influences were conferences, consultants and blogs.

At testing conferences, half the talks were advertisements for bug trackers, test case repositories and automation frameworks.  These were great for managing a project, but they didn't support the essence of testing: finding more defects.  Expensive tutorials before the conferences showed a similar taint: how to use this tool, best practices for testing this specific type of thing, certification in a process and not a portable technique.  The cost was the most surprising thing: thousands of dollars for something I couldn't justify to myself, the eager tester with a training budget to burn.

Delving into webinars brought more despair.  A demo of the latest automation tool invariably led to a pitch to purchase something.  The topics purported to discuss principles, but I ended up in a morass of industry jargon.  I learned how to write, automate, and justify my time, but I was no closer to actually finding bugs in a more efficient manner.  What's more, I was spammed by vendors for months after.  I found zero techniques that were universal.

Finally, the blogs showed a glimmer of hope.  Some people wrote about techniques.  Others wrote about test cases and how they managed their overhead.  Still others advocated that QA should be involved earlier in the development process.  Nowhere did anyone extol the virtues of their test management system, bug tracker or automation tool in helping them find bugs.  This was a breath of fresh air, but it still felt stunted and directionless.  My closest analogue, software development, spawned a generation of people applying agile to everything from families to manufacturing, but there wasn't a similarly powerful framework for testers.  I started to feel that no one was excited about my new career outside of the context of getting paid for it.

This muddled world left me questioning: Shouldn't there be some "ethos of testing" to unify our efforts just like in agile software development, lean manufacturing, and so forth?  Why do vendors have a place at the table at industry conferences?  Why isn't anyone embarrassed that the main competitor to major test management software is a spreadsheet?  Who cares about my expertise and not just my dollars?

"Answers" and Answers

For a long time, I thought the answer was in certification.  Surely, if ever there was an organization that could be a cheerleader for quality, it would be the ISTQB.  However, the reality is much different.  Manuals are filled with process, conferences host vendors rather than practice sessions, and training classes are about extracting fees, not teaching techniques.  The certification exam guarantees your resume goes to the top of the pile, yet the organization that proctors it is a laughingstock.

The alternative came through an unlikely route: Twitter.  Long a tool of celebrity publicists and companies looking to engage directly with individuals, Twitter also has a reputation as the way to communicate within a subculture.  Have an idea?  Publish it in under 140 characters.  Want to learn the pulse of an industry?  Follow its leaders.  Computer security wonks, hackers, and now testers joined my follow list, and I was soon introduced to a new debate: is certification a waste of time?  I'd found my people.

The new school touted something called Context-driven Testing.  Instead of best practice, effective testing was supposed to be driven by context.  Instead of test cases, a set of tools was taught that could be applied depending on the product (mainframes are tested differently than mobile devices).  Even among superficially similar products, the most effective testers would make judgement calls based on the needs of the customer and the time available.  Testing was not a set of rigid processes but a scientific exploration of the software.  The knowledge gained by testing increases the confidence of the organization in the software.  In other words, testing challenges the assumptions made by the developers that the product works, in the time we have available.  This sounded like the meat I was looking for, and the results were amazing too.

In an experiment, I had our work study group put down the ISTQB manual with its test cases, and I instead introduced them to exploratory techniques.  We first learned about the requirements and returned a bunch of questions to the developer.  Then we tested without scripts and tracked our coverage on a mind map.  It was the first time we had been prompted to field trial a technique in our study session.  The best part was that a person with very little previous experience in testing was able to pick up the technique almost organically.

This revelation about software testing was what we were all looking for, and it was delivered through experience instead of pronouncement.  James Marcus Bach, one of the proponents of Context-driven Testing, compared the ISTQB and certification organizations to the medieval medicine of Galen.  People wrote down what testing was and proceeded to bleed their employers without knowing why it wasn't finding bugs.  Testers were outsourced instead of valued because their techniques were old or ineffective.  Yet in spite of all this, the consultants and conferences kept making money printing outdated works of questionable value.  Once context-driven techniques come to light, the old ways start dying.  We can only hope this continues so that meaningless certs are no longer valued by testers, managers and HR alike.

Where Next?

After stumbling through the world of testing for a few years, I have abandoned certification as a path to expertise.  As with computer security, network administration and technical support, certifications are a poor way of communicating true expertise.  This revelation places testers firmly in the camp of indispensable elements of the development organization.  They are not monkeys running scripts but knowledge workers with a valuable investigative skill that challenge the product from all angles.  They cannot be outsourced if you hope to be successful, and they cannot be replaced by automation.

I am beginning a new training regimen with my testing colleagues based around Context-driven techniques.  We hope to learn the techniques and apply them to our current projects and continually grow our skills in this new framework.

Further Reading


  1. A Context-driven Testing manifesto of sorts
  2. Black Box Software Testing, Coursework in Context-driven Testing
  3. Rapid Software Testing, James Marcus Bach's courses on Context-driven approaches
  4. Exploratory Testing techniques explained
  5. Testing is not Automation.  Why automation is another tool and not a cure-all.