Calling KeBugCheckEx

OSR_Community_User · December 27, 2010, 6:59am

What is the “right” way to issue a CRITICAL_PROCESS_TERMINATION
(0xF4) bug check? Supposedly parameters 3 and 4 are strings, and even
though I pass the right addresses, the BSOD reports two access violations
in my driver (after the rest of the BSOD text it prints the driver name
twice, on two lines, once for each address).
From other 0xF4 dumps I see that each address is just a pointer to a
null-terminated ANSI string, which is what I used, but to no avail.

Also, is there a kosher way to define our own bug check codes that
would not collide with current or future Windows codes?

–
Kind regards, Dejan (MSN support: xxxxx@alfasp.com)
http://www.alfasp.com
File system audit, security and encryption kits.

Don_Burn_1 · December 27, 2010, 7:29am

Checking the Win7 source, the bug check is called either with both
arguments 3 and 4 NULL, or with argument 3 being the driver name and
argument 4 being a printf-style format string, “Critical process 0x%p
(%s) exited\n”.
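A minimal sketch of the call shape described above, under loud assumptions: the typedefs and the recording KeBugCheckEx stand-in exist only so the fragment compiles outside the WDK (in a driver both come from <ntddk.h> and the real routine never returns), and the helper name, the example image name, and the pass-through treatment of parameters 1 and 2 are hypothetical. It only illustrates passing two string addresses as ULONG_PTR values; it is not a diagnosis of the access violations reported in the question.

```c
#include <stdint.h>

/* Stand-ins so this sketch compiles outside the WDK. In a real driver the
 * types and KeBugCheckEx itself come from <ntddk.h>, and the call does
 * not return. */
typedef uint32_t  ULONG;
typedef uintptr_t ULONG_PTR;

#define BUGCHECK_F4 0x000000F4u    /* the 0xF4 code discussed in this thread */

static ULONG     g_code;           /* recorded by the stand-in only */
static ULONG_PTR g_param[4];

static void KeBugCheckEx(ULONG code, ULONG_PTR p1, ULONG_PTR p2,
                         ULONG_PTR p3, ULONG_PTR p4)
{
    g_code = code;
    g_param[0] = p1;
    g_param[1] = p2;
    g_param[2] = p3;
    g_param[3] = p4;
}

/* Static storage, so both strings stay valid for the whole bug check.
 * ASSUMPTION: in a driver they should also live in nonpaged memory,
 * because paged memory cannot safely be touched at bug check time. */
static const char g_name[]    = "example.exe";   /* hypothetical name */
static const char g_message[] = "Critical process 0x%p (%s) exited\n";

/* Hypothetical helper: parameters 1 and 2 are passed through untouched
 * because the thread does not say what they should contain. */
static void raise_f4(ULONG_PTR p1, ULONG_PTR p2)
{
    KeBugCheckEx(BUGCHECK_F4, p1, p2,
                 (ULONG_PTR)g_name,       /* argument 3: the name string   */
                 (ULONG_PTR)g_message);   /* argument 4: the format string */
}
```

The strings are narrow (char) arrays, matching the ANSI strings seen in the other 0xF4 dumps mentioned in the question.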

I know of no way to get a bugcheck code for a product’s use.

Don Burn (MVP, Windows DKD)
Windows Filesystem and Driver Consulting
Website: http://www.windrvr.com
Blog: http://msmvps.com/blogs/WinDrvr

OSR_Community_User · December 27, 2010, 9:41pm

I have always felt that a driver should not, under any imaginable
circumstances, call BugCheck(Ex). I have used the analogy that “The driver
is a guest in someone else’s operating system. Would you like it if a guest
came to your house, and because you were out of milk for the coffee, burned
your house down?” A driver should do everything imaginable to recover from
any error that could occur, and its first and foremost responsibility is to
keep the OS alive. Only if the integrity of the OS (not the driver) is
compromised is the OS entitled to call “Time!” and bugcheck. A driver may
cease to function, but it should not bugcheck unless it has compromised the
entire operating system integrity, which is both hard to do and hard to
detect. Yet I’ve seen drivers that, if ExAllocate or
MmGetSystemAddressForMdlSafe returns NULL, will call BugCheck, which is
nonsensical.
joe

—
NTFSD is sponsored by OSR

For our schedule of debugging and file system seminars
(including our new fs mini-filter seminar) visit:
http://www.osr.com/seminars

To unsubscribe, visit the List Server section of OSR Online at
http://www.osronline.com/page.cfm?name=ListServer

–
This message has been scanned for viruses and
dangerous content by MailScanner, and is
believed to be clean.

mm1 · December 27, 2010, 9:44pm

Not even during debugging?

OSR_Community_User · December 28, 2010, 3:02am

Especially during debugging! For example, which is easier: to just stop and
restart a driver, or reboot a system? I get tired of reboots and their
time, and it works a lot better if I just IoCompleteRequest (or the WDF
equivalent) which also tends to force my application (either the test app or
the early draft of the real app) to be more robust about error detection,
response, and recovery, which is not a bad thing, either.

Of course, it has always been my policy to write code that simulates device
failure and internal errors so I can test my recovery code. For example, if
a disk drive has a 1 in 2**15 failure rate, I once computed that there is one
of these failures every three minutes. (Look at the sales figures for
Seagate, Fujitsu, etc., add them up, do a little third-grade arithmetic, and
make some reasonably conservative assumptions for data transfer rates based
on what you see over the course of a minute on a server (via, for example,
Task Manager) and a workstation, and you get a figure that shows that there
is an error about every three minutes.) Code that can’t respond to these
errors is in trouble. Somewhere. All the time. So what I do is have
hidden ioctls that do things like set a bit mask for the next device
register read, OR the bits into the status bits, then clear the mask. So I
can poke down any kind of error any time. A master error test bitmask
allows me to check other parts of the code using the same scenario. I’ve
been writing drivers like this since the mid-1960s, shortly after my first
driver crashed the system when an error occurred and my (untested) recovery
code got executed, finally [OK, I was young and foolish then. Now I’m
older]. Helps with the code-coverage stats, too, but I’ve never worked in
an environment that required those.
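
A minimal sketch of the one-shot injection mask described above, written as plain C so it runs anywhere. Everything named here is hypothetical: the simulated register variable, the bit assignments, and the function that stands in for the hidden ioctl handler. A real driver would read the hardware register instead of a variable and would need to synchronize access to the mask.

```c
#include <stdint.h>

/* Simulated hardware status register (stands in for a real register read). */
static uint32_t g_hw_status;

/* One-shot mask armed by the (hypothetical) hidden test ioctl. */
static uint32_t g_inject_mask;

#define DEV_READY 0x01u   /* illustrative bit assignments */
#define DEV_ERROR 0x80u

/* What the hidden ioctl handler would do: arm the mask for the next read. */
static void test_ioctl_inject(uint32_t mask)
{
    g_inject_mask = mask;
}

/* Every status read in the driver goes through this wrapper. */
static uint32_t read_status(void)
{
    uint32_t status = g_hw_status | g_inject_mask;  /* OR in the fake bits */
    g_inject_mask = 0;                              /* one-shot: clear it  */
    return status;
}
```

Because the mask clears itself, only the next read sees the simulated error, so the recovery path runs once and normal operation resumes afterwards.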

Think of the debugging stage as the equivalent of “well, we’re thinking of
marriage, but we’re testing out living together, and the deal is if I do
anything irritating, she’s entitled to shoot me through the heart. But once
we’re married, that deal is off”. Not very nice for the debugging phase.
Also, like exit(0) in apps, its use encourages a fundamentally sloppy
approach to error recovery.

[I once worked on a debugger, dbx, the predecessor of gdb, whose response to
any error was exit(0). We would spend half an hour getting to the place in
the code where the bug was, with the right conditions, and then the debugger
would exit(0) and destroy all our work. BugCheck is the exit(0) of the
kernel, and calls like exit(0), ExitThread(), and BugCheck(Ex) are the first
resort of sloppy programmers. It takes serious care to do good error
recovery. I once used a database library whose response to a bad sequence of
calls from the application, or a request to seek to a nonexistent record
(specifically, a request to seek beyond EOF), was not to return an error
code but to do exit(0) on the app. Sloppy, sloppy, sloppy.]

The number of times I’ve had to rip BugCheck() out of what was called
“production” code is far above any rational threshold. The few times I’ve
been able to confront the programmer(s), the excuse was always along the
lines of “the driver is screwed up and we didn’t want to figure out how to
fix it”. My approach has been: the driver is screwed up, and after that,
every operation reports back a status that indicates “the driver is screwed
up”. I try hard to make the shutdown/unload sequence clean things up.
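
A minimal sketch of that “latched failure” approach, in plain C. The type names, status codes, and function names are all hypothetical; in a Windows driver the same idea would mean completing each request with an error NTSTATUS rather than returning an enum.

```c
#include <stdbool.h>

/* Illustrative status codes, not real NTSTATUS values. */
typedef enum { REQ_OK = 0, REQ_DEVICE_FAILED = -1 } req_status;

typedef struct {
    bool failed;     /* latched once the driver decides it is unusable */
    int  completed;  /* requests actually serviced */
} driver_state;

/* Called from any error path that cannot be recovered. */
static void mark_failed(driver_state *d)
{
    d->failed = true;
}

/* Every entry point checks the latch first and fails the request cleanly
 * instead of crashing the system. */
static req_status handle_request(driver_state *d)
{
    if (d->failed)
        return REQ_DEVICE_FAILED;
    d->completed++;
    return REQ_OK;
}

/* Unload path: reset the state so a later restart begins clean. */
static void unload(driver_state *d)
{
    d->failed = false;
    d->completed = 0;
}
```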

One driver required that I write the world’s most amazingly convoluted code
to figure out what state the device was in. It had eight possible states,
but there was no way to reset it back to state 0 (the “base state” at
power-up). Each state transition required a command to the registers that
was only valid in that state, and effectively said “go to next state”. A
horrible design. My S.O. came home one day and said “So I see you’ve been
playing all day” because what she saw on my desk looked like
a dungeon map. It was a state diagram of what values I could write to which
registers and what I’d see in the status registers, in each of the states,
so I could figure out what state I was in, and therefore figure out how to
issue the correct set of commands to get me back to state 0. This was so
the user could do the equivalent (in that OS) of “net stop driver” then “net
start driver” (no, it wasn’t Windows, and the command wasn’t ‘net’) and run
the program even after the hardware wedged itself (which it was prone to do;
it took six more releases of hardware to fix most of the bugs, but not all
of them…) When it failed, it would have made no sense, even during
debugging, to crash the OS.
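
The recovery walk can be sketched against a simulated device. This is only an illustration under assumptions not stated above: that the last state wraps around to state 0, that each state’s “go to next state” command can be represented by the state’s own number, and that the register probing is hidden behind a hypothetical identify_state() that here simply reads the simulated state.

```c
#include <stdbool.h>

#define NUM_STATES 8

/* Simulated device with eight states and no reset command. */
typedef struct { int state; } sim_device;

/* Each state accepts exactly one command (modelled as the state's own
 * number), meaning "go to next state". Any other command is rejected.
 * ASSUMPTION: state 7 wraps around to state 0. */
static bool issue_command(sim_device *dev, int cmd)
{
    if (cmd != dev->state)
        return false;
    dev->state = (dev->state + 1) % NUM_STATES;
    return true;
}

/* Stand-in for the real probing logic (writing registers and reading the
 * status registers back to work out the current state). */
static int identify_state(const sim_device *dev)
{
    return dev->state;
}

/* Walk the device forward until it reaches state 0. Returns the number of
 * commands issued, or -1 if the device rejected a command. */
static int return_to_base(sim_device *dev)
{
    int issued = 0;
    for (int s = identify_state(dev);
         s != 0 && issued < NUM_STATES;
         s = identify_state(dev)) {
        if (!issue_command(dev, s))
            return -1;
        issued++;
    }
    return issued;
}
```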
joe

OSR_Community_User · December 28, 2010, 4:22am

Let’s agree that I don’t really care, as the customer asks for this
sort of driver. I gave him the alternatives, he could not see them being
used in his target market, so he wanted this one.
I see lots of issues with his approach, but to date I have found
that only 1% of companies understand drivers, and even fewer understand
file systems (hey, how do I prevent a copy? - seen this question before),
and it takes less time to do what they ask, charge them for it, and then
do it the right way than to argue from the start.

–
Kind regards, Dejan (MSN support: xxxxx@alfasp.com)
http://www.alfasp.com
File system audit, security and encryption kits.

Peter_Viscarola_OSR · December 28, 2010, 11:58am

Wow… Sorry, but I can’t EXPRESS how strongly I disagree with that statement.

Not only do I find that a surprisingly narrow view of the world of Windows drivers, but I find it architecturally misguided. Let me explain why…

Ah, well. That’s simply not an accurate analogy. Windows isn’t Mach. Drivers are PART of the OS, not guests of the OS. A driver is nothing less than an extension of the OS provided through the I/O subsystem interface. Drivers don’t live in thrall to the OS, they cooperate as a full partner with the OS.

Well, we agree on one thing, anyhow…

Oh, that’s nonsense.

So a driver that does not transport data to/from its device but does so in a way that does not cause the OS to blue screen is fulfilling its “foremost responsibility”? C’mon…

A driver’s first and foremost responsibilities are to (a) ensure its device functions properly, (b) prevent unanticipated data loss on its device, and (c) work within the OS-defined guidelines to cooperate with other parts of the OS (including other drivers).

Lacking (a), there’s no point in having the driver. Having (a) but lacking (b), the driver does more harm than good. Having (a) and (b) but not (c) means that the system COULD do useful work with the device that the driver supports, but it would not be a happily working system.

I assume “keep[ing] the OS alive” is included as a part of my item (c). Note that I find this important, but not imperative.

But the driver IS PART OF the OS. It IS the OS.

Unless the entire raison d’etre for the system is to do the task the driver facilitates, right?

This is what I meant when I said “narrow view” above. Let’s say you have an embedded system that’s supposed to collect satellite data. Its entire job, its only purpose in life, is to collect data from the satellite.

The driver calls ExAllocate (to choose a trivial example of something that might fail) during the data collection, and the allocation fails with the result that satellite data can no longer be collected.

Given that the sole reason for the system to exist is to collect satellite data, and the fact that it can no longer serve its sole purpose, there is no point in keeping the system alive.

I would MUCH rather get a crash dump at this point, and have the system reboot/restart itself quickly, than simply HOPE that I can write an event log entry or write something to the log and try to continue running the system in a compromised state.

Now… before you say it… SURE: The driver SHOULD avoid doing things that can fail in the critical path. Obviously, it is not optimal to call ExAllocate in a “must not fail” situation and this should be avoided IF POSSIBLE. But, there are always cases where your design assumptions prove insufficient (changing operational situation, for example: When you built the system you KNEW that pre-allocating 20 packets for data collection use would ALWAYS be more than enough, but you know… they change the satellite firmware or something and now 20 is no longer sufficient).
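
A minimal sketch of the pre-allocation idea, in plain C: the packets are reserved once at start-up so the critical path never calls an allocator. The pool size of 20 comes from the example above; the packet size and all names are hypothetical, and the free list is deliberately not thread-safe (a driver would protect it with a lock or use an interlocked list).

```c
#include <stddef.h>

#define POOL_PACKETS 20      /* the "always enough" figure from the example */
#define PACKET_BYTES 256     /* hypothetical packet size */

typedef struct packet {
    struct packet *next;
    unsigned char  data[PACKET_BYTES];
} packet;

static packet  g_storage[POOL_PACKETS];  /* reserved up front, never freed */
static packet *g_free_list;

/* Run once at start-up, where a failure is still harmless. */
static void pool_init(void)
{
    g_free_list = NULL;
    for (size_t i = 0; i < POOL_PACKETS; i++) {
        g_storage[i].next = g_free_list;
        g_free_list = &g_storage[i];
    }
}

/* Critical path: no allocator call, but NULL is still possible when the
 * design assumption (20 is enough) turns out to be wrong. */
static packet *pool_get(void)
{
    packet *p = g_free_list;
    if (p != NULL)
        g_free_list = p->next;
    return p;
}

/* Return a packet to the pool once its data has been consumed. */
static void pool_put(packet *p)
{
    p->next = g_free_list;
    g_free_list = p;
}
```

Pre-allocation moves the failure rather than removing it: pool_get() still returns NULL on the twenty-first outstanding packet, which is the situation the paragraph above describes.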

Completely useless on an embedded system (for example). Now you have a device with a sole purpose of collecting your satellite data that does not work for an indeterminate period of time, and requires some human intervention? That’s terrible. The RIGHT solution here is to blue screen the machine and cause it to reboot.

Or, how about a file system? S’pose we encounter the rare condition in which maintaining the future consistency of user data is impossible. Shall we bravely soldier on and let the user think we’re saving their stuff, only to have them find out later that’s not so??

But back to the broader context of our discussion and not just the above examples: I almost always prefer to see the system crash than to let the system continue in an undefined, tenuous, state only to have it crash “down the road a bit” when the root cause has become lost or obscured.

If I’ve learned one thing in my (too many) years of writing Windows drivers it’s this: One can almost never make absolute statements. To ME, that’s what makes engineering what it is… and not, say, cooking from a recipe.

Peter
OSR

Ayush_Gupta-2 · December 28, 2010, 12:49pm

>> I have used the analogy that “The driver is a guest in someone else’s
>> operating system.”
>
> Ah, well. That’s simply not an accurate analogy. Windows isn’t Mach.
> Drivers are PART of the OS, not guests of the OS. A driver is nothing less
> than an extension of the OS provided through the I/O subsystem interface.
> Drivers don’t live in thrall to the OS, they cooperate as a full partner
> with the OS.

+1
That is why drivers become a “family” and are the “trusted” components and
are able to bypass a lot of things.

Regards,
Ayush Gupta
Software Consultant & Owner,
AI Consulting
http://in.linkedin.com/in/guptaayush

mm1 · December 28, 2010, 12:59pm

-1

You’re saying that since it’s really hard to determine if the os is FACKED,
let’s pretend everything’s cool, ship it and we’ll worry about it later.

This isn’t an embedded os.

Mm

Calvin_Guan-2 · December 28, 2010, 3:54pm

I usually
a) try to handle *all* failure conditions gracefully in the free build, and
b) have *zero tolerance* for abnormalities in the checked build.

The different strategies used in the chk and fre builds give me a good “safety margin”, and may yield vital clues when I need them.

Instead of using KeBugcheckXxx, I use dbg_bp_if(), which translates to a “__debug_break if true”. BTW, the stock ASSERT is very annoying.
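
The actual dbg_bp_if() definition is not shown in this thread, so the following is only a guess at its shape. CHECKED_BUILD, break_hook(), and process_length() are hypothetical names; the hook merely counts hits so the sketch runs anywhere, where a driver would typically break into the debugger (for example with DbgBreakPoint()).

```c
/* Hypothetical build switch standing in for "checked build". */
#define CHECKED_BUILD 1

/* Hypothetical break hook: counts hits instead of stopping in a debugger. */
static int g_break_hits;
static void break_hook(void)
{
    g_break_hits++;
}

#if CHECKED_BUILD
/* Checked build: stop as soon as the abnormal condition is seen. */
#define dbg_bp_if(cond) do { if (cond) break_hook(); } while (0)
#else
/* Free build: compiles away; the normal error path handles the condition. */
#define dbg_bp_if(cond) ((void)0)
#endif

/* Example caller: break in the checked build, fail gracefully in both. */
static int process_length(int len)
{
    dbg_bp_if(len < 0);   /* abnormality: catch it early while debugging */
    if (len < 0)
        return -1;        /* graceful failure path for the free build */
    return len * 2;
}
```

Keeping the graceful error path in both builds is what gives the two strategies described above: the checked build stops at the first sign of trouble, while the free build quietly returns an error.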

It’s very important to catch abnormalities as early as possible in the *debug* build. They can be anything from a simple programming error to the result of a fundamental design flaw. Break, break and break, use kd to analyze them carefully, and fix or remove them. As the code matures and gets closer to production, the breakpoints are almost all gone because most abnormalities have been brought to my attention and taken care of. However, some are out of my control, as every driver programmer has encountered.

A checked build driver must be debuggable. Don’t try to hide the valuable clues which are supposed to save my day. Debugging a customer-reported issue in the field is very difficult, especially for mature products which have shipped in volume for years. The last thing I want is to receive a call from Microsoft’s SQL team saying: “We were stressing your NIC, and it’s dropping 2 records every 14 days or so. We have set up a remote for you to look at; please investigate at your earliest convenience. By the way, the system is running our production Exchange mail server, so please be careful…”

Calvin

OSR_Community_User · December 29, 2010, 3:03pm

>- BTW, the stock ASSERT is very annoying.

I have my own ASSERTs for this.

–
Maxim S. Shatskih
Windows DDK MVP
xxxxx@storagecraft.com
http://www.storagecraft.com

OSR_Community_User · January 12, 2011, 7:18pm

>> I know of no way to get a bugcheck code for a products use.

Ask Microsoft. You may not get a bugcheck code for your exclusive use, but you can certainly get one that is specific to a class of device drivers and which has some specificity to your driver. I point, for example, to the bug code 0x108, THIRD_PARTY_FILE_SYSTEM_FAILURE, which ties parameter 1 in the blue screen to the file system that is reporting. This was created as a direct result of some lobbying of Microsoft for an error code that didn’t have PolyServe using what we had been using until that time, which was 0x59, PINBALL_FILE_SYSTEM (which should cause the old-timers on this list a hearty LOL). This became a problem when the online error reporting for Windows came into use.
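As a rough illustration of the convention described above (one shared code, with parameter 1 naming the reporting file system), here is a user-mode sketch that only assembles the arguments. `EXAMPLE_FS_ID`, `BugCheckArgs` and `make_fs_failure` are hypothetical; real identifiers are assigned by Microsoft, and the meaning given to the remaining three parameters here is an assumption, not something stated in this thread.

```c
#include <stdint.h>

/* Bug check code named in the post above. */
#define THIRD_PARTY_FILE_SYSTEM_FAILURE 0x00000108u

/* Hypothetical identifier; real values are assigned by Microsoft. */
#define EXAMPLE_FS_ID ((uintptr_t)0x1234)

/* Mirrors what KeBugCheckEx takes: a code plus four pointer-sized values. */
typedef struct BugCheckArgs {
    uint32_t  code;
    uintptr_t param[4];
} BugCheckArgs;

/* Packs the arguments; a, b and c are driver-defined details in this sketch. */
static BugCheckArgs make_fs_failure(uintptr_t fs_id,
                                    uintptr_t a, uintptr_t b, uintptr_t c)
{
    BugCheckArgs args;
    args.code     = THIRD_PARTY_FILE_SYSTEM_FAILURE;
    args.param[0] = fs_id;  /* parameter 1: which file system is reporting */
    args.param[1] = a;
    args.param[2] = b;
    args.param[3] = c;
    return args;
}

/* In a driver, the call itself would then look like:
 *   KeBugCheckEx(args.code, args.param[0], args.param[1],
 *                args.param[2], args.param[3]);
 */
```

Keeping the argument packing separate from the call makes it possible to check in ordinary unit tests that the identifier really lands in parameter 1, without crashing a machine.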

With few exceptions (0x108 being one of them), I can’t see a lot of reasons to have a bugcheck code that is specific to your product or driver. A possible exception is that you have some new class of driver that does something with devices that no one has ever done before, and which crashes for reasons no one has ever thought of. In that case, petition Microsoft for that new bug code - the space is large and very lightly filled, and they are fairly reasonable guys.

(Just for reference, the mechanism I used to get the 0x108 code and the PolyServe specific pointer was to go to the file system filter fest and simply ask the Microsoft guys there for one. I had a reasoned and reasonable argument ready, and presented it clearly to them. After that it only took until the next major release (a year or so) until the bugcheck showed up.)

…dave

OSR_Community_User · January 13, 2011, 4:10am

> that time, which was 0x59, PINBALL_FILE_SYSTEM (which should cause the old-timers on this list a hearty LOL).

Is it OS/2’s HPFS? And why the name “pinball”?

–
Maxim S. Shatskih
Windows DDK MVP
xxxxx@storagecraft.com
http://www.storagecraft.com

OSR_Community_User · January 13, 2011, 7:04pm

I don’t remember all the details, but pinball was the code name for HPFS (which was shipped on NT until 4.0). There were actually two versions, one in assembler and one in C; the C version was called pinball. I kinda vaguely recall that pinball may have been a product of the LanManager group for OS/2.

I don’t know the origin of the code name; you’d need to get some real old timers to tell you that (I only go back to 1990 with NT file systems…)

…dave
