Research Mistakes

Last week, I lost a full day to a mistake. Not an idea that didn’t work out; a mistake. And it’s not the first time I’ve done that. It seemed like a dumb error, and my own fault as a researcher for not being careful, but I think the truth is that this is the untold human story of research. It deserves to be told.

So what happened?

I’m a robotics researcher, and I’ve been working for a few weeks on getting my robot to maintain its orientation in the water. It does this by wirelessly receiving its current orientation from a motion capture system, as well as the desired value (set by me). Then it spins the motor to use half of the body as a reaction wheel to adjust orientation, much like many satellites do.

Reaction wheels work by conservation of angular momentum: accelerating the wheel one way pushes the body the other way, so the control input is an angular acceleration. That makes the control algorithm simple enough - let the acceleration be proportional to the error in orientation.

α = K_p (θ_d - θ)

Great. Except there’s a problem. This system is a double integrator, which means I’m controlling an output (orientation) via an input that is its second derivative (angular acceleration), and control theory dictates that a double integrator can’t be stabilized by a purely proportional controller. The best you’ll get is oscillation about the desired value. Think about the accelerator on your car - if you only step off the accelerator when you reach the stop sign you’ll always overshoot it.
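To make that concrete, here is a minimal Python sketch of the proportional-only loop, assuming an idealized, frictionless double integrator. The function name, gain, and time step are illustrative choices, not values from the robot.

```python
def simulate_p_only(kp=1.0, theta_d=1.0, dt=0.001, steps=20000):
    """Integrate theta'' = kp * (theta_d - theta), starting at rest at theta = 0."""
    theta, omega = 0.0, 0.0
    history = []
    for _ in range(steps):
        alpha = kp * (theta_d - theta)  # proportional control law
        omega += alpha * dt             # semi-implicit Euler: velocity first,
        theta += omega * dt             # then position with the new velocity
        history.append(theta)
    return history


history = simulate_p_only()
late = history[len(history) // 2:]      # second half of the run
print(f"late-time swing: {min(late):.3f} to {max(late):.3f}")
```

With these made-up numbers the simulated orientation keeps swinging between roughly 0 and 2 and never settles on the target of 1.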

So we add a control input proportional to the derivative of the error. This lets the system slow down as it approaches the target, and should allow the orientation to stabilize around the correct value. In the car example - if you see you’re approaching the stop sign too fast, you start to brake. For completeness, let’s throw in a term proportional to the integral of the error, which will correct small errors that accumulate over time. This is a standard PID controller, and you can check out this excellent article by Tim Wescott for more details.

α = K_p (θ_d - θ) + K_d d(θ_d - θ)/dt + K_i ∫(θ_d - θ) dt
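In discrete time, the equation above can be sketched roughly as follows. This is a textbook-style illustration, not the code running on the robot: the class name, the gains, and the idealized double-integrator plant are all assumptions, and angle wrap-around is ignored. The check at the bottom runs it with the integral gain left at zero.

```python
class PID:
    """Textbook discrete PID; angle wrap-around is ignored for simplicity."""

    def __init__(self, kp, ki=0.0, kd=0.0):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.integral = 0.0      # running integral of the error
        self.prev_error = None   # no previous sample on the first call

    def update(self, theta_d, theta, dt):
        """Return the commanded angular acceleration for one time step."""
        error = theta_d - theta
        self.integral += error * dt
        # Finite-difference derivative of the error; zero on the first call.
        if self.prev_error is None:
            derivative = 0.0
        else:
            derivative = (error - self.prev_error) / dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative


def run_pd(kp=1.0, kd=2.0, theta_d=1.0, dt=0.001, steps=20000):
    """Drive an idealized double integrator with PD control; return the final angle."""
    controller = PID(kp, kd=kd)  # gains passed by name, integral gain stays 0
    theta, omega = 0.0, 0.0
    for _ in range(steps):
        alpha = controller.update(theta_d, theta, dt)
        omega += alpha * dt
        theta += omega * dt
    return theta


print(f"final orientation: {run_pd():.4f}")
```

With the derivative term switched on, the simulated orientation settles close to the target instead of oscillating around it.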

For most applications - and for this one in particular - small errors aren’t so important. So you don’t need the integral term, just the proportional and derivative ones. But it’s not that hard to add all three to the code, and it’s good to do for completeness, so that’s what I did when I programmed my bot. I also added a nice function to send the three coefficients wirelessly to easily test different values.

I tested the communication and it worked, and the basic control code worked as well, so I got excited. Now all that was left to do was to figure out the correct values for the proportional and derivative coefficients. As expected, proportional control alone gave me oscillation, so I dialed up the derivative coefficient to stabilize the response…

The boat spun out of control.

Maybe it’s too high? I dialed the coefficient way down.

The boat spun out of control.

Why wasn’t it behaving right? Derivative control should stabilize the response, but it was only making it worse… Over the next two hours (most of my half-day COVID-era shift) I racked my brain and tested various combinations of values. I debugged the controller and made sure the output had the right sign and a reasonable magnitude, but any non-zero value for the derivative coefficient caused the boat to spin out of control, no matter how small. I went home defeated…

Over the weekend I found a few more bugs in my code, and I convinced myself that they were the cause of the problem. I came back refreshed and re-invigorated and tried the new code.

The boat spun out of control.

Again, two hours’ worth of testing proved fruitless, and I was beginning to despair. This is a super basic control algorithm, and I couldn’t even get its implementation right. Impostor syndrome began to set in: “how am I supposed to complete a PhD in robotics if I can’t program a basic PID controller?” I got ready to pack up, go home, and give up for the week.

And then it hit me… The controller I described above is called “PID” - Proportional, Integral, Derivative. So when I wrote the neat function to send coefficients to the bot, what I programmed was:

boat.send_coefficients(Kp, Ki=0, Kd=0)

Notice the order. Neat and sensible - (1) Proportional, (2) Integral, (3) Derivative - to match the name. I’d tested the communication interface and made sure the coefficients were stored correctly on the robot. But I was only using the proportional and derivative coefficients, and not the integral. So the command I’d been sending was:

boat.send_coefficients(Kp, Kd, 0)

So the order was wrong. I thought I was sending a derivative coefficient, but instead I was sending the integral one. I tried flipping the order, and immediately it started working. Within five minutes I had a functioning controller, and within half an hour I’d found the right set of coefficients. But that dumb mistake cost me a full day’s worth of work, a weekend of mental agony, and a good-sized helping of feeling like I wasn’t cut out for the job I was doing.
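The difference between the two calls shows up even in a toy model. Below is a minimal sketch, assuming an idealized double integrator and made-up gains (none of these numbers come from the robot), that compares the gains I meant to send with the ones that actually arrived. In this model a proportional-plus-integral law has no damping term, so for any positive integral gain the oscillation grows instead of decaying, which would be consistent with the boat spinning out no matter how small the value.

```python
def simulate(kp, ki=0.0, kd=0.0, theta_d=1.0, dt=0.001, steps=30000):
    """Run a PID loop on an idealized double integrator.

    Returns the largest |error| seen in the final third of the run.
    The positional order (kp, ki, kd) mirrors the send_coefficients call.
    """
    theta, omega, integral = 0.0, 0.0, 0.0
    prev_error = theta_d - theta
    worst = 0.0
    for step in range(steps):
        error = theta_d - theta
        integral += error * dt
        derivative = (error - prev_error) / dt
        prev_error = error
        alpha = kp * error + ki * integral + kd * derivative
        omega += alpha * dt   # semi-implicit Euler integration
        theta += omega * dt
        if step >= 2 * steps // 3:
            worst = max(worst, abs(error))
    return worst


intended = simulate(1.0, kd=0.5)  # what I meant: 0.5 as the derivative gain
actual = simulate(1.0, 0.5, 0)    # what I sent: 0.5 landed in the integral slot
print(f"intended (P + D): late error up to {intended:.4f}")
print(f"actual   (P + I): late error up to {actual:.1f}")
```

Passing the gains by name, as in the first call, is one way to make it obvious which slot a value lands in.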

This seems like it’s nothing - a story of a dumb mistake that wasted a day. But I’ve made similar mistakes that have cost me months of work before I finally found them, and they weren’t ones with easy corrections. Mistakes like these made me feel like I had little aptitude for science and research - what kind of serious researcher messes up like that?

The version of the scientific story that we see in the news and in published works is polished and refined. The scientist finds a problem, divines the perfect answer, and performs the experiment that exactly proves his or her theory. As a researcher, I know this isn’t the full story - there are lots of false starts and dead ends, both in theory and experiment, that don’t make it into the final story because they don’t sell well. But these are legitimate scientific endeavors; you form a hypothesis, you test it, it turns out to be wrong, so you try again.

I hadn’t paid any attention, though, to the mistakes that are made along the way. Building a PCB wrong, soldering the wrong component, ordering the wrong part - these aren’t valid scientific endeavors that don’t work out, they’re just mistakes. I don’t know how many researchers and scientists make these mistakes, or how often, because no one talks about it. But I know it’s not just me, and they’re a real part of the scientific story.

I’m not saying every scientific paper should include a count of wasted solvent, misprinted or miscut parts, and trashed electronics. The storytelling that happens in scientific publication cleans up those details, and for good reason. But I think for young and early-career researchers it’s important to recognize that these things happen, and they’re not a mark of poor science or aptitude. They’re just a mark of working like a human.

Gedaliah Knizhnik, August 17, 2020