NTDEV member Ilya Konstantinov remarked recently on the list that sometimes it's necessary as a developer to break the rules. Even good developers do it. Sometimes, to do something cool, you just have to do it.
He asked, "Where do you draw the line?"
Darn good question. It made me stop and think. I'm not sure if Mr. Konstantinov was asking a rhetorical question, or if he really wanted my thoughts on the matter. But, in any case, I answered him on NTDEV and I'm expanding on what I wrote here in our Dev Blog.
In many cases, the very nature of innovation requires that you do something that Windows wasn't designed to support. Here at OSR, it's rare that a client will engage us to write a driver for some type of device that Windows supports well, such as a really cool, fast network card. I love writing those drivers. They're fun to write, and the tradeoffs don't usually keep me up at night. Sadly, I think the last driver like this that I wrote was in 1995. No, here at OSR we're often hired to help clients implement solutions that are very literally on the leading edge of the industry. This means that we're sometimes forced to venture outside the realm of things that are "documented and supported."
Having to break the rules to do something really special has its own rewards -- both for the engineers involved, and for the company. As an engineer working on such a project, you often get to really stretch your knowledge of the system and test your assumptions. And the company that has this type of software written has an automatic set of barriers to entry... assuming the software that gets written is stable and reliable.
So, breaking the rules can be fun. But it also brings its own special set of responsibilities. What are they?
Well, first, I think you have a responsibility to do a significant amount of "due diligence." You don't want to break the rules if you don't have to. If there's another acceptable means for achieving your over-arching goal, OTHER than resorting to some nasty undocumented hack... in almost all cases you should use it.
Next, I think you're responsible for doing a careful "risk vs benefit" analysis. If the risk (of screwing up the OS, crashing the system, writing code that isn't compatible across versions or service packs) is small and the benefit you're gaining is great... you have an EASY decision. To be able to do this, you need to have the technical knowledge (or access to those with the technical knowledge) to accurately gauge the risks. This is critical, right? Just throwing some code onto a target machine and saying "well, it works" does not mean the risk isn't high. It just means you haven't suffered the consequences of the risk... yet.
The real problem with doing the "risk vs benefit" analysis is that it's usually very difficult to accurately assess the risk involved when you stray from the approved way of doing things. The difficulty lies in the fact that you're not aware of all the things that you don't know (assuming anybody but me can even parse that sentence). For example, a driver dev might think "Oh, I'll just reach into this structure at a specific offset... what could happen?"... but he doesn't realize that the undocumented field in the undocumented structure into which he's reaching is part of a union. Or that the structure is in a LIST that has a lock to which he has no access. Or... any number of things.
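To make the union hazard concrete, here's a minimal user-mode sketch in C. Everything in it is invented for illustration: the `widget` structure, its `kind` discriminator, the guessed offset, and both helper functions are hypothetical, and none of it corresponds to a real Windows structure. It only shows how a read at a fixed offset can look right in one case and be quietly wrong in another.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* HYPOTHETICAL layout, made up for this example; not a real OS structure. */
enum { KIND_COUNT = 0, KIND_HANDLE = 1 };

typedef struct {
    uint32_t kind;                /* says which union member is valid */
    union {
        uint32_t ref_count;       /* meaningful only when kind == KIND_COUNT  */
        uint32_t handle_bits;     /* meaningful only when kind == KIND_HANDLE */
    } u;
} widget;

/* What the hack "knows": a 32-bit reference count sits at offset 4. */
#define GUESSED_REFCOUNT_OFFSET 4

/* The guess matches today's layout, which is exactly why the hack "works". */
_Static_assert(offsetof(widget, u) == GUESSED_REFCOUNT_OFFSET,
               "guessed offset no longer matches the layout");

/* The hack: copy raw bytes from a fixed offset, ignoring the discriminator. */
uint32_t hack_read_refcount(const void *opaque)
{
    uint32_t value;
    memcpy(&value,
           (const unsigned char *)opaque + GUESSED_REFCOUNT_OFFSET,
           sizeof value);
    return value;
}

/* What the structure's owner does: check the discriminator first.
 * Returns 1 and stores the count in *out, or 0 if no count is present. */
int owner_read_refcount(const widget *w, uint32_t *out)
{
    if (w->kind != KIND_COUNT) {
        return 0;
    }
    *out = w->u.ref_count;
    return 1;
}
```

With a `widget` whose `kind` is `KIND_COUNT`, both functions agree. With one whose `kind` is `KIND_HANDLE`, `hack_read_refcount` happily returns the handle bits as if they were a count, while `owner_read_refcount` reports that there is no count to read. The test machine may only ever show you the first case.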
Sometimes, doing the analysis is easier. There's a field that's undocumented. But, you happen to know that the OS internally uses it in certain places to do certain things. If you also use it in the same way, in pretty much the same places and at pretty much the same times as the OS would, you know your risk is lower. Not non-existent... but lower. Remember, you still don't know what you don't know.
Next, I think it's important that you accurately describe the risk to management. Your technical decision can have business impact. The people making business decisions need to understand the risks involved, and it's your responsibility to accurately represent the risks and put them in the type of context that manager types will understand.
Finally, I think you need to limit your bad behaviors to only what's absolutely necessary to achieve your goal. In other words, once you've decided to venture into the realm of the undocumented, don't be tempted to pile things on, to say "Oh well, I'm already doing X -- which is unsupported -- I might just as well add feature A, B, and C which do Y -- which is also unsupported and undocumented."