Bukhara Institute of High Technologies, Bukhara, Uzbekistan, 2013
Custom kernels
Finding out what really happened
In general, you start analyzing a panic dump in the stack frame that called panic, but in the case of the fatal trap that we have here, the most important stack frame is the one below trap, in this case frame 5. That's where things went wrong. Select it with the frame command, abbreviated to f, and list the code with list (or l):
(kgdb) f 5
#5  0xc01c434b in malloc (size=1024, type=0xc03c3c60, flags=0) at ../../kern/kern_malloc.c:233
233             va = kbp->kb_next;
(kgdb) l
228                     }
229                     freep->next = savedlist;
230                     if(kbp->kb_last == NULL)
231                             kbp->kb_last = (caddr_t)freep;
232             }
233             va = kbp->kb_next;
234             kbp->kb_next = ((struct freelist *)va)->next;
235     #ifdef INVARIANTS
236             freep = (struct freelist *)va;
237             savedtype = (const char *) freep->type->ks_shortdesc;
(kgdb)
You might want to look at the local (automatic) variables. Use info local, which you can abbreviate to i loc:
(kgdb) i loc
type = (struct malloc_type *) 0xc03c3c60
kbp = (struct kmembuckets *) 0xc03ebc68
kup = (struct kmemusage *) 0x0
freep = (struct freelist *) 0x0
indx = 10
npg = -1071714292
allocsize = -1069794208
s = 6864992
va = 0xffffffff <Address 0xffffffff out of bounds>
cp = 0x0
savedlist = 0x0
ksp = (struct malloc_type *) 0xffffffff
(kgdb)
The line where the problem occurs is 233:
233             va = kbp->kb_next;
The only pointer this line uses is kbp, so print the structure it points to:
(kgdb) p *kbp
$2 = {
  kb_next = 0xffffffff <Address 0xffffffff out of bounds>,
  kb_last = 0xc1a31000 "",
  kb_calls = 83299,
  kb_total = 1164,
  kb_elmpercl = 4,
  kb_totalfree = 178,
  kb_highwat = 20,
  kb_couldfree = 3812
}
The problem here is that the pointer kb_next is set to 0xffffffff. It should contain a valid address, but as gdb observes, this one is out of bounds.
So far we have found that the crash is in malloc, and that it's caused by an invalid pointer in one of malloc's internal data structures. malloc is called many times a second on every running system, so it's unlikely that the bug is in malloc itself. The most likely cause is that some other function has written past the end of memory it obtained from malloc and overwritten malloc's data structures.
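To make that failure mode concrete, here is a minimal user-space sketch of a free-list allocator. It is not the kernel's malloc: the names (toy_init, toy_alloc, toy_link_ok, toy_head, toy_demo), the block size and the arena layout are all hypothetical, chosen only for illustration. It assumes, as in the listing above, that each free block stores the link to the next free block in its first bytes. A caller that writes a few bytes past the end of its block overwrites the link in the neighbouring free block, and the damage only shows up in a later allocation, when that link becomes the head of the list.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BLOCK_SIZE 32u  /* hypothetical element size of one bucket */
#define NBLOCKS    4u   /* hypothetical number of elements in the arena */

/* One contiguous arena. A free block keeps the link to the next free
 * block in its first bytes, so the links live right next to user data. */
static unsigned char arena[BLOCK_SIZE * NBLOCKS];
static uintptr_t free_head;  /* plays the role of kbp->kb_next; 0 = empty */

static void store_link(unsigned char *block, uintptr_t next)
{
    memcpy(block, &next, sizeof next);
}

static uintptr_t load_link(const unsigned char *block)
{
    uintptr_t next;
    memcpy(&next, block, sizeof next);
    return next;
}

/* A link is plausible if it is 0 or the start address of an arena block. */
int toy_link_ok(uintptr_t link)
{
    uintptr_t base = (uintptr_t)(void *)arena;

    if (link == 0)
        return 1;
    return link >= base && link - base < sizeof arena &&
        (link - base) % BLOCK_SIZE == 0;
}

/* Chain the blocks into a free list: 0 -> 1 -> 2 -> 3 -> end. */
void toy_init(void)
{
    assert(sizeof(uintptr_t) <= BLOCK_SIZE);
    for (size_t i = 0; i < NBLOCKS; i++) {
        uintptr_t next = (i + 1 < NBLOCKS)
            ? (uintptr_t)(void *)(arena + (i + 1) * BLOCK_SIZE) : 0;
        store_link(arena + i * BLOCK_SIZE, next);
    }
    free_head = (uintptr_t)(void *)arena;
}

uintptr_t toy_head(void)
{
    return free_head;
}

/* Pop the first free block, like lines 233 and 234 of the listing.
 * Unlike the kernel code, refuse to follow an implausible head. */
void *toy_alloc(void)
{
    uintptr_t va = free_head;

    if (va == 0 || !toy_link_ok(va))
        return NULL;
    free_head = load_link((const unsigned char *)(void *)va);
    return (void *)va;
}

/* Returns 1 if a caller's overrun ends up as a garbage list head. */
int toy_demo(void)
{
    unsigned char *p;

    toy_init();
    p = toy_alloc();                    /* the caller gets block 0 */
    if (p == NULL)
        return 0;
    /* The bug: the caller writes past the end of its block and
     * clobbers the link stored at the start of free block 1. */
    memset(p, 0xff, BLOCK_SIZE + sizeof(uintptr_t));
    if (toy_alloc() == NULL)            /* hands out block 1 ... */
        return 0;
    /* ... and the clobbered link is now the head of the free list. */
    return free_head == UINTPTR_MAX && !toy_link_ok(free_head);
}
```

Calling toy_demo() from a small main program shows the pattern: the overrun itself goes unnoticed, the allocation immediately after it still succeeds, and only then does the list head hold an all-ones value comparable to the kb_next seen in the dump. The function that fails is the allocator, not the function that did the damage.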
What do we do now? To quote fortune:
The seven eyes of Ningauble the Wizard floated back to his hood as he reported to Fafhrd: "I have seen much, yet cannot explain all. The Gray Mouser is exactly twenty-five feet below the deepest cellar in the palace of Gilpkerio Kistomerces. Even though twenty-four parts in twenty-five of him are dead, he is alive.
"Now about Lankhmar. She's been invaded, her walls breached everywhere and desperate fighting is going on in the streets, by a fierce host which out-numbers Lankhmar's inhabitants by fifty to one -and equipped with all modern weapons. Yet you can save the city."
"How?" demanded Fafhrd.
Ningauble shrugged. "You're a hero. You should know."
-- Fritz Leiber, from "The Swords of Lankhmar"
From here on, you're on your own. If you get this far, the FreeBSD-hackers mailing list may be interested in giving suggestions.