One of the sad ironies of my research (at least at this point) is that experts are an expensive, limited resource, while novices are readily available. So, which would you pick? We have a few strategies in mind to squeeze out a little more than purely novice results, but the results that will count as statistically significant will come from novices.
The sadness for me in this case comes from the view-dependent control I talked about in my previous post. I mentioned a hybrid joint-control/end-effector control that is not view-dependent, since the view-dependence seemed to confuse most people. People with a bit more experience working in a 3D world on a 2D screen seemed to like the view-dependent version more, but that's just anecdotal, and I might just be hoping. I certainly like it better. My impression is that people would perform better with the view-dependent control after a bit of training time (a couple of hours, not a couple of minutes).
To be clear, I'm talking about the control type I used for the first user study, which uses two separate joysticks. The single-joystick view-dependent control was just plain confusing with head tracking. With two sticks, one is view-dependent and the other always maps to up/down. That's my favorite configuration so far, but I understand the system pretty well, and I'm designing to my own preferences.
The answers to these questions really can only be found through science and testing. I really need to keep moving!
Tuesday, January 27, 2009
Monday, January 26, 2009
This robot is out of control!
One of the things we robo-manipulation guys have to deal with is how to design the user control for the robot arm. I'm not talking about the whole user interface, but in particular the controls to make the robot move. If there's some autonomy in there, you have a little bit more flexibility, and you can fairly effectively just use a mouse. But when you just want to allow the operator to teleoperate or "remote control" the arm, things are a little trickier.
Now, just driving a wheeled robot around is not so challenging, because people are accustomed to driving cars, and sometimes even from a remote perspective. With a robot arm, you have to make a control that operates in full 3D, and that can even include 3 axes of rotation. If you want to go really low level, then you need some way to control each joint of the arm.
My first attempt at it was to use the two "thumbsticks" on a modern video game controller. The type of control I'm going for is end-effector control, where you move a virtual target point for the robot's gripper to reach. One thumbstick controls motion in one plane, and the other stick controls motion along the remaining axis. This works OK with some training, but a lot of people still seemed to struggle with it. The next attempt was to reduce the control to one stick and change how the controls work depending on the view. From a top-down view, the end effector moves in the XY plane (where Z is vertical), and from a side view, the end effector moves in a plane parallel to the Z axis. The tricky part is views that are neither side nor top views. When the view is halfway between those, how should the end effector move? For now I have it "remember" what mode it's in, and you have to go almost all the way to the other view to switch modes. This ends up being rather confusing even for me after practicing for a while. Another option would be to simply move the end effector in the view plane. It's hard to say whether that would be good.
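To make the "remember what mode it's in" part concrete, here's a rough sketch of how I think about the mode switching, with some hysteresis on the camera's elevation angle. The names, thresholds, and axis conventions are all made up for illustration; the real code is structured differently.

```cpp
// Sketch of single-stick, view-dependent end-effector control with
// hysteresis between a "top view" mode (move in the XY plane) and a
// "side view" mode (move in a vertical plane). Names and thresholds
// are illustrative only.
#include <cmath>

enum ControlMode { TOP_VIEW, SIDE_VIEW };

struct Vec3 { double x, y, z; };

class ViewDependentControl {
public:
    ViewDependentControl() : mode_(TOP_VIEW) {}

    // stickX/stickY in [-1, 1]; camElevation in radians (0 = side view,
    // pi/2 = looking straight down); camAzimuth is the view's heading.
    Vec3 targetDelta(double stickX, double stickY,
                     double camElevation, double camAzimuth,
                     double speed)
    {
        // Hysteresis: only switch modes when the view gets close to the
        // other extreme, so the mapping doesn't flip at the halfway point.
        const double kEnterTop  = 1.2;   // ~70 degrees above horizontal
        const double kEnterSide = 0.35;  // ~20 degrees above horizontal
        if (mode_ == SIDE_VIEW && camElevation > kEnterTop)  mode_ = TOP_VIEW;
        if (mode_ == TOP_VIEW  && camElevation < kEnterSide) mode_ = SIDE_VIEW;

        Vec3 d = {0.0, 0.0, 0.0};
        if (mode_ == TOP_VIEW) {
            // Move the target in the horizontal plane, rotated so that
            // "up" on the stick means "away from the viewer".
            d.x = speed * (stickX * std::cos(camAzimuth) - stickY * std::sin(camAzimuth));
            d.y = speed * (stickX * std::sin(camAzimuth) + stickY * std::cos(camAzimuth));
        } else {
            // Move the target in a vertical plane facing the viewer.
            d.x = speed * stickX * std::cos(camAzimuth);
            d.y = speed * stickX * std::sin(camAzimuth);
            d.z = speed * stickY;
        }
        return d;
    }

private:
    ControlMode mode_;
};
```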
Some other ideas would be to make a non-standard end effector control approach. For instance, left and right on the stick could rotate the base of the arm, and forward and back would extend the arm. Instead of controlling individual joints, however, this mode would still be moving the end effector. There is a difference.
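As a sketch of what I mean (again with invented names and gains), the stick would edit a cylindrical-coordinate target around the arm's base, and that target still gets converted into a Cartesian goal for the end effector:

```cpp
// Sketch of the "rotate the base / extend the arm" style of end-effector
// control: the stick edits a cylindrical-coordinate target (radius, angle,
// height) around the arm's base, which still drives the end effector rather
// than any single joint. Names and rates are illustrative.
#include <cmath>

struct CylTarget { double radius, angle, height; };

void updateCylindricalTarget(CylTarget& t,
                             double stickX, double stickY, double stickZ,
                             double dt)
{
    t.angle  += stickX * 1.0 * dt;   // left/right rotates about the base
    t.radius += stickY * 0.2 * dt;   // forward/back extends or retracts
    t.height += stickZ * 0.2 * dt;   // other axis (or second stick) for height
    if (t.radius < 0.05) t.radius = 0.05;  // keep the target away from the base
}

// Convert to the Cartesian target the inverse kinematics expects.
void cylToCartesian(const CylTarget& t, double& x, double& y, double& z)
{
    x = t.radius * std::cos(t.angle);
    y = t.radius * std::sin(t.angle);
    z = t.height;
}
```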
The other day one of the people on my committee offered his Novint Falcon (3D force feedback controller) for controlling the robot arm. At first I thought this was totally the way to go and would solve all of my problems including world hunger. This would mean that I wouldn't have to use two separate joysticks or different modes... with a 3D controller up is up, left is left, and back is back.
Then I remembered the whole view-dependent thing. The trouble is that the interface has head tracking to adjust the view. So it's really easy to adjust the view. It's possibly good and important, but it also makes one wonder whether the controls should always do the same thing, or depend on the view. There's a paper by Jose Macedo called "The Effect of Automated Compensation for Incongruent Axes on Teleoperator Performance" that talks about this, and they basically say that people do better with the automatic compensation (or as I say, view-dependent) control than without.
I think my situation's a little different than theirs, though. They evaluate 2D control, and it's also static. By static I mean that once they have the control axes and the display axes determined, they remain in their particular (mis)alignment for the duration of the experiment. In my case, it's 3D control, and the alignment between axes is dynamic throughout the experiment. So I think it needs to be tested. Perhaps after I get this thesis done.
Tuesday, January 13, 2009
Filtering the 3D scan
For the 3D scan display, we don't necessarily want to show every point that the camera sees. The robot arm itself is a good example: we already display a graphical model of the arm, drawn based on the outputs from the arm, and when that model and the 3D points overlap, it's impossible to see exactly what's going on.
We can also remove things like the table the whole setup sits on and replace it with a plane. This will be easier to see and give more contrast to items of interest, such as the blocks or pipe cleaners we'll be using. The ground plane is simple enough to filter, since the program already works in the robot arm's coordinate system: anything below zero altitude is simply eliminated.
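A minimal sketch of that filter, assuming the points are already expressed in the arm's coordinate frame with Z up (the point struct here is just for illustration):

```cpp
// Sketch of the ground-plane filter: points are assumed to already be in
// the robot arm's coordinate frame with Z up, so anything at or near zero
// altitude is dropped and a rendered plane stands in for the table.
#include <vector>

struct ScanPoint { float x, y, z; bool valid; };

void filterGroundPlane(std::vector<ScanPoint>& points, float epsilon = 0.005f)
{
    for (ScanPoint& p : points) {
        if (p.z <= epsilon)       // small tolerance for sensor noise
            p.valid = false;      // don't draw this point; the plane covers it
    }
}
```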
In the case of the robot arm, though, it's a little bit trickier. There are two ways I can think of to filter out the points. The first would be to put bounding boxes around the larger parts of the arm, then move the bounding boxes along with the arm and test to see if points are inside. The second method is to use OpenGL to render my relatively detailed model of the robot arm from the viewpoint of the stereo camera and in that way determine which pixels correspond to the robot arm. We have a mapping between 3D points and image pixels, so that's not the hard part. The second way is a little bit more nebulous to me... I'm not sure how difficult it will be. I do think the second way is better, though. Certainly faster, and probably more accurate as well. We can get a "closer trim" with the second method.
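Here's roughly how I picture the second method, as a sketch rather than working code: the offscreen setup and the camera-matched projection are glossed over, and drawArmModel() is a stand-in for the arm model I already draw.

```cpp
// Sketch of the render-based arm filter: draw the arm model in a flat color
// from the stereo camera's viewpoint, read the result back, and mark any
// scan point whose image pixel landed on the arm. drawArmModel() and the
// camera projection setup stand in for code that exists elsewhere.
#include <GL/gl.h>
#include <vector>

struct ScanPoint { float x, y, z; int px, py; bool valid; };

void drawArmModel();  // existing arm model, posed from joint feedback

void filterArmPoints(std::vector<ScanPoint>& points, int width, int height)
{
    // Render the arm as white on black, with the projection and modelview
    // matrices set to match the stereo camera's calibration (not shown).
    glClearColor(0.0f, 0.0f, 0.0f, 1.0f);
    glClear(GL_COLOR_BUFFER_BIT | GL_DEPTH_BUFFER_BIT);
    glColor3f(1.0f, 1.0f, 1.0f);
    drawArmModel();

    // Read back a single-channel mask of which pixels the arm covered.
    std::vector<unsigned char> mask(width * height);
    glReadPixels(0, 0, width, height, GL_LUMINANCE, GL_UNSIGNED_BYTE, mask.data());

    // Any scan point whose pixel is "on the arm" gets culled.
    for (ScanPoint& p : points) {
        int row = height - 1 - p.py;  // glReadPixels starts at the bottom row
        if (mask[row * width + p.px] > 0)
            p.valid = false;
    }
}
```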
Dirty 3D rectangles!
It would be cool (and possibly useful) for the 3D scan in my interface to be dynamically updated in real time. This is no small task: the model is based on a 640x480 image, so about 300k points. In the camera API, these points are stored in a big image where each point takes up 16 bytes (int, float, float, float). That's nearly 5 megabytes of data to sift through every camera update, and this camera can update at 30 fps, which works out to roughly 150 megabytes per second.
That's alotta data.
Well, it's a lot for stuff that needs to be processed in software, anyway. The bottleneck isn't in getting the data from the camera, mind you; it's in uploading it to the video card. Copying 5 megabytes to the video card 30 times per second, or even 5 times per second, simply won't happen today.

So how do we solve this? I've been processing this at about 0.5% brainpower for the past several months, and today it dawned on me. I had been thinking all along of using vertex arrays, but I couldn't see a way to update only the vertices that have changed. It didn't really seem possible. The answer is to use vertex arrays, but lots of them instead of one. You break the image up into smaller pieces, say a 40x30 grid of sub-images that are each 16x16 pixels, and create a vertex array (or better, a vertex buffer object, or VBO) for each sub-image. Then, if nothing is happening in a part of the image, you don't bother uploading that portion of the data to the card.
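A rough sketch of the idea, with one VBO per 16x16 tile and a crude change test standing in for real change detection:

```cpp
// Sketch of the "lots of little VBOs" idea: the 640x480 scan is split into
// a 40x30 grid of 16x16-pixel tiles, each with its own vertex buffer object.
// A tile is only re-uploaded when its data appears to have changed; the
// depth-sum "checksum" here is a placeholder for whatever change test fits.
#include <GL/gl.h>
// glGenBuffers/glBindBuffer/glBufferData are GL 1.5+ entry points; in
// practice you'd load them through an extension loader such as GLEW.
#include <vector>

const int kImgW = 640, kImgH = 480, kTile = 16;
const int kTilesX = kImgW / kTile, kTilesY = kImgH / kTile;

struct Tile {
    GLuint vbo = 0;
    float checksum = 0.0f;      // crude change-detection value
    std::vector<float> verts;   // x,y,z per point in this tile
};

Tile tiles[kTilesX * kTilesY];

// points: kImgW * kImgH entries of (x, y, z), row-major, from the camera API.
void updateTiles(const float* points)
{
    for (int ty = 0; ty < kTilesY; ++ty) {
        for (int tx = 0; tx < kTilesX; ++tx) {
            Tile& t = tiles[ty * kTilesX + tx];
            t.verts.clear();
            float sum = 0.0f;

            // Gather this tile's points and accumulate a cheap checksum.
            for (int y = ty * kTile; y < (ty + 1) * kTile; ++y) {
                for (int x = tx * kTile; x < (tx + 1) * kTile; ++x) {
                    const float* p = &points[(y * kImgW + x) * 3];
                    t.verts.insert(t.verts.end(), p, p + 3);
                    sum += p[2];
                }
            }

            // Skip the upload if nothing in this tile appears to have changed.
            if (t.vbo != 0 && sum == t.checksum)
                continue;
            t.checksum = sum;

            if (t.vbo == 0)
                glGenBuffers(1, &t.vbo);
            glBindBuffer(GL_ARRAY_BUFFER, t.vbo);
            glBufferData(GL_ARRAY_BUFFER, t.verts.size() * sizeof(float),
                         t.verts.data(), GL_DYNAMIC_DRAW);
        }
    }
    glBindBuffer(GL_ARRAY_BUFFER, 0);
}
```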
This solution means that if the whole image changes, you have to upload the whole thing all over again, but for a camera that's sitting still, that should only happen if someone trips over the tripod, in which case you don't really care about the image anymore. If your camera is mounted on a mobile robot, the 3D display will only be dynamic while the robot is sitting still and looking at a static scene. For today's tasks and technology, those are reasonable constraints, I think.
Now that I've said all of this, I may or may not actually implement it. That all depends on how many grandkids I want to have by the time I graduate.
Monday, January 12, 2009
Second experiment

Because the results from my first study weren't jump-out-of-your-seat exciting or significant, we're working toward a second study to get at least a shift-your-weight-in-your-chair result.
The most important thing to fix is the alignment between the 3D scan and the robot arm. Just last week I managed to fully incorporate the new Videre/SRI STOC camera, including calibration, and the alignment is lots better than it was with the SwissRanger. Still, something's slightly off, so I might just cheat a little. I'm thinking of artificially inserting points into the scan where I know the object to be. That way the critical points will be aligned perfectly, and I won't have to spend weeks developing methods for a better alignment. It's not a perfect solution, because the robot arm model could still be off, but it does remove one source of error.
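The cheat itself would be tiny; something like this sketch, which just appends a little grid of synthetic points at the object's known position (the sizes and spacing are placeholders):

```cpp
// Sketch of the alignment "cheat": append a small grid of synthetic points
// at the object's known position (in the arm's coordinate frame) so that
// the critical part of the scan lines up with the arm model exactly.
// The block dimensions and spacing here are placeholders.
#include <vector>

struct ScanPoint { float x, y, z; };

void insertSyntheticObject(std::vector<ScanPoint>& scan,
                           float cx, float cy, float cz,   // known object center
                           float halfSize = 0.02f,         // roughly block-sized
                           float step = 0.005f)
{
    for (float dx = -halfSize; dx <= halfSize; dx += step)
        for (float dy = -halfSize; dy <= halfSize; dy += step)
            for (float dz = -halfSize; dz <= halfSize; dz += step)
                scan.push_back({cx + dx, cy + dy, cz + dz});
}
```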
Another thing that needs doing is smooth arm control. Currently the arm sort of jerks around, moving in a "vroom, screech" fashion. Today I worked up a solution that does well for the most part, but it feels hackish, and I've noticed a glitch that I think comes from noisy servo position feedback readings, one I never saw with the jerky control. So of course my adviser comments that I should be using a PD controller for this. But today I had equipped my t-shirt of minimum coding effort, so I followed the path I'd trod earlier to get this thing whipped up quick. Now that I've got a solution mostly working, I know my adviser's right, and I'll probably have weird problems crop up, or a lot of work to remove the quirks, unless I make a PD controller (sigh, grumble, sigh). It's sad to throw away code, but sometimes it just has to be done.
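For reference, the PD controller my adviser has in mind wouldn't be much code either. A minimal sketch, with made-up gains and a hypothetical servo interface:

```cpp
// Minimal PD controller sketch for smoothing the arm motion: instead of
// snapping a servo to its goal, command a velocity proportional to the
// position error plus a damping term on the error's rate of change.
// The gains and the sendServoVelocity() interface are placeholders.
struct PDController {
    double kp, kd;
    double prevError;

    // goal and measured are servo positions; dt is the control period in seconds.
    double update(double goal, double measured, double dt)
    {
        double error = goal - measured;
        double dError = (error - prevError) / dt;
        prevError = error;
        return kp * error + kd * dError;   // velocity (or effort) command
    }
};

// Usage, once per control cycle for each joint:
//   PDController pd = {2.0, 0.1, 0.0};              // placeholder gains
//   double cmd = pd.update(goalPos[i], feedbackPos[i], 0.02);
//   sendServoVelocity(i, cmd);                      // hypothetical servo call
```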
That's really all of the hard stuff... there are lots of little things, like improving the subjective user surveys (Likert scale and all of that), but the big stuff is getting to be under control. I can definitely see the study starting up again within a couple weeks (knock knock knock).