
Ntoskrnl.exe: Difference between revisions

Article snapshot taken from[REDACTED] with creative commons attribution-sharealike license.
Revision as of 15:37, 29 April 2014 by Jeh. Edit summary: rm sentence that is not about the subject of this article. WP articles do not attempt to justify their existence or encourage readership
Latest revision as of 20:09, 25 December 2024 by J.c.whynot. No edit summary. Tag: Manual revert
(141 intermediate revisions by 73 users not shown)
Line 1: Line 1:
{{Short description|Windows NT kernel image}}
{{Under construction}}
{{Lowercase title}}
{{About|the Windows NT kernel image|the Windows NT kernel|Windows NT kernel|}}
{{About|a computer file that contains a part of the Windows NT kernel|the Windows NT kernel itself|Architecture of Windows NT}}
{{lowercase|ntoskrnl.exe}}
{{More citations needed|date=April 2014}}

'''ntoskrnl.exe''' (Short for ] ] ], and '''ntkrnlpa.exe''' on systems with ] support) is the ] image for the family of ] operating systems. It provides the kernel and executive layers of the Windows NT kernel space, and is responsible for various system services such as hardware virtualization, process and memory management, etc., thus making it a fundamental part of the system. It contains the cache manager, the executive, the kernel, the security reference monitor, the memory manager, and the scheduler, among other things.<ref name="sysinternals">Russinovich, M: , ''SysInternals Information''</ref> <code>'''ntoskrnl.exe'''</code> (short for Windows NT operating system kernel executable), also known as the '''kernel image''', contains the kernel and executive layers of the Microsoft Windows NT kernel, and is responsible for hardware abstraction, process handling, and memory management. In addition to the kernel and executive layers, it contains the cache manager, security reference monitor, memory manager, scheduler (Dispatcher), and blue screen of death (the prose and portions of the code).<ref name="sysinternals">Russinovich, M: , ''SysInternals Information''</ref>


== Overview == == Overview ==
This system binary is not a native application (in that it is not linked against <code>ntdll.dll</code>), instead containing a standard 'start' entry point, a function that calls the arch-independent kernel initialization function. x86 versions of <code>ntoskrnl.exe</code> depend on <code>bootvid.dll</code>, <code>hal.dll</code> and <code>kdcom.dll</code> (x64 variants of <code>ntoskrnl.exe</code> have these DLLs embedded in the kernel to improve performance). However, it is not a native application thus it is not linked against <code>ntdll.dll</code>. Instead, <code>ntoskrnl.exe</code> has its own entry point "'''KiSystemStartup'''" that calls the architecture-independent kernel initialization function. Because it requires a static copy of the C Runtime objects, the executable is usually about 10 MB in size.


In Windows XP and earlier, the Windows installation source ships four kernel image files to support uniprocessor systems, symmetric multiprocessor (SMP) systems, CPUs with PAE, and CPUs without PAE. Windows setup decides whether the system is uniprocessor or multiprocessor, then, installs both the PAE and non-PAE variants of the kernel image for the decided kind. On a multiprocessor system, Setup installs <code>ntkrnlmp.exe</code> and <code>ntkrpamp.exe</code> but renames them to <code>ntoskrnl.exe</code> and <code>ntkrnlpa.exe</code> respectively.
<source lang="c">
/*
 * NTOS kernel entry point
 */
void __attribute__((stdcall)) KiSystemStartup(IN PLOADER_PARAMETER_BLOCK LoaderBlock)
{
    /* perform arch independent initialization */
    KiInitializeKernel();
}
</source>


Starting with Windows Vista, Microsoft began unifying the kernel images as multi-core CPUs took to the market and PAE became mandatory.
While ntoskrnl.exe is not linked against ntdll.dll, it is linked against bootvid.dll, hal.dll and kdcom.dll. Because it requires a static copy of C Runtime objects it depends on, the executable is usually about 2MB in size.


{| class="wikitable sortable" style="text-align: center; margin: 0px auto;"
=== Kernel image filenames ===
] or ] files are selected at install time, and PAE or non-PAE files are selected by boot.ini or BCD option.
The kernel image is chosen according to the ].

{| class="wikitable sortable" style="text-align: center; font-size: smaller; table-layout: fixed;"
|+ Kernel image filenames |+ Kernel image filenames
|-
| colspan="3" |32-bit Windows
|- |-
! Filename ! Filename
! SMP ! Supports<br />SMP
! PAE ! Supports<br />PAE
|-
| colspan="4" | 32-bit kernel
|- |-
| <code>ntoskrnl.exe</code>
| ''NTOSKRNL.EXE''
| {{No}} | {{No}}
| {{No}} | {{No}}
|- |-
| <code>ntkrnlmp.exe</code>
| ''NTKRNLMP.EXE''
| {{Yes}} | {{Yes}}
| {{No}} | {{No}}
|- |-
| <code>ntkrnlpa.exe</code>
| ''NTKRNLPA.EXE''
| {{No}} | {{No}}
| {{Yes}} | {{Yes}}
|- |-
| <code>ntkrpamp.exe</code>
| ''NTKRPAMP.EXE''
| {{Yes}}
| {{Yes}}
|-
| colspan="3" | 64-bit kernel (x64 editions)
|-
! Filename
! Supports<br />SMP
! Supports<br />57 bit VA
|-
| <code>ntkrnlmp.exe</code>
| {{Yes}}
| {{No}}
|-
| <code>ntkrla57.exe</code>
| {{Yes}} | {{Yes}}
| {{Yes}} | {{Yes}}
|} |}


=== Coding style ===
Typical Windows coding uses ] for variable names. The kernel API also uses prefixes for functions.


Windows kernel's architecture is structured so that everything is easy to understand{{huh|date=December 2024}}. Functions and global variables use the, so called Pascal Case formatting with special (additional) prefixes in their names to differentiate parts of the kernel.
The table below is not an exhaustive listing of all prefixes. There are other undocumented prefixes.


An example is '''IoCreateDevice''' and '''ObReferenceObjectByHandle'''. Both functions have different prefix names to differentiate critical managers within the kernel code: '''Io''' being used for I/O Manager functions and '''Ob''' for Object Manager functions.
{| class="wikitable sortable collapsible"

|+ NT function prefixes<ref name="oldnewthing">{{cite web | author=] | year=2009 | url=http://blogs.msdn.com/oldnewthing/archive/2009/06/03/9687937.aspx | title=The Old New Thing : What does the "Zw" prefix mean? | publisher=] | accessdate=2009-06-13 }}</ref><ref>{{cite web | author=] | year=2009 | url=http://msdn.microsoft.com/en-us/library/ff551797.aspx | title=I/O Manager Routines | publisher=] | accessdate=2009-06-13 }}</ref><ref>{{cite web | author=] | year=2009 | url=http://msdn.microsoft.com/en-us/library/ff539010.aspx | title=Cache Manager Routines | publisher=] | accessdate=2009-06-13 }}</ref><ref>{{cite web | author=] | year=2009 | url=http://msdn.microsoft.com/en-us/library/ff559835.aspx | title=Power Manager Routines | publisher=] | accessdate=2009-06-13 }}</ref><ref>{{cite web | author=] | year=2009 | url=http://msdn.microsoft.com/en-us/library/ff542078.aspx | title=Core Kernel Library Support Routines | publisher=] | accessdate=2009-06-13 }}</ref><ref>{{cite web | author=] | year=2009 | url=http://msdn.microsoft.com/en-us/library/ff540426.aspx | title=File System Runtime Library Routines | publisher=] | accessdate=2009-06-13 }}</ref>
Variations of these prefixes exist for internal functions that are not being exported by the kernel, such as adding an '''i''' after the first letter (e.g., <code>Ki</code> for “Kernel Internal”) or appending '''p''' to the full prefix (e.g., <code>Psp</code> for “Process Support Internal”).


The following table lists all prefixes.

{| class="wikitable sortable"
|+ <u>NT favorable prefixes</u>
|- |-
! Prefix ! Export<br />Prefix
!Internal Prefix
! Meaning ! Meaning
|- |-
| <code>Cc</code>
| Cc || cache controller
| Ccp || File system cache<ref>{{cite web | author=Microsoft Corporation | author-link=Microsoft Corporation | year=2009 | url=http://msdn.microsoft.com/en-us/library/ff539010.aspx | title=Cache Manager Routines | publisher=] | access-date=2009-06-13 }}</ref>
|- |-
| <code>Cm</code>
| Csr || Csr are client-server functions that are used to communicate with the Win32 subsystem process, csrss.exe (csrss stands for client/server runtime sub-system).
| Cmp || Configuration Manager, the kernel mode side of Windows Registry
|- |-
| <code>Dbg</code>
| Dbg || Dbg are debugging aid functions such as a software break point.
| Dbg || Debugging aid functions, such as a software break point
|- |-
| <code>Dbgk</code>
| Ex || Windows Executive
| Dbgk
| A set of debugging functions that are being exposed to user mode through ntdll.dll
|- |-
| <code>Ex</code>
| FsRtl|| file system runtime
| Exp || Windows executive, an "outer layer" of <code>ntoskrnl.exe</code>
|- |-
| <code>FsRtl</code>
| Io || I/O manager
| FsRtlp || File system runtime library<ref>{{cite web | author=Microsoft Corporation | author-link=Microsoft Corporation | year=2009 | url=http://msdn.microsoft.com/en-us/library/ff540426.aspx | title=File System Runtime Library Routines | publisher=] | access-date=2009-06-13 }}</ref>
|- |-
| <code>Io</code>
| Ke || core kernel routines
| Iop || I/O manager<ref>{{cite web | author=Microsoft Corporation | author-link=Microsoft Corporation | year=2009 | url=http://msdn.microsoft.com/en-us/library/ff551797.aspx | title=I/O Manager Routines | publisher=] | access-date=2009-06-13 }}</ref>
|- |-
| <code>Ke</code>
| Ki || Ki are upcalls from kernel-mode for things like APC dispatching.
| Ki || Core kernel routines<ref>{{cite web | author=Microsoft Corporation | author-link=Microsoft Corporation | year=2009 | url=http://msdn.microsoft.com/en-us/library/ff542078.aspx | title=Core Kernel Library Support Routines | publisher=] | access-date=2009-06-13 }}</ref>
|- |-
|
| Ks || kernel streaming
| Kx || Interrupt handling, semaphores, spinlocks, multithreading and context switching related functions
|- |-
|
| Ldr || Ldr are loader functions for PE file handling and starting of new processes.
| Ks || Kernel streaming
|- |-
| <code>Ldr</code>
| Lpc || Local Procedure Call
| Ldrp || NT's PE Executables loader
|- |-
| <code>Lpc</code>
| Lsa || Local Security Authority
| Lpcp || Local Procedure Call, an internal, undocumented, interprocess or user/kernel message passing mechanism
|- |-
| <code>Lsa</code>
| Mm || memory management
| Lsap || Local Security Authority
|- |-
| <code>Mm</code>
| Mi || Memory management
|-
| <code>Nls</code>
| Nls || Nls for Native Language Support (similar to code pages). | Nls || Nls for Native Language Support (similar to code pages).
|- |-
| <code>Ob</code>
| Ob || Object Manager
| Obp || Object Manager
|- |-
| <code>Po</code>
| Pfx || Pfx for prefix handling.
| Pop || Plug-and-play and power management<ref>{{cite web | author=Microsoft Corporation | author-link=Microsoft Corporation | year=2009 | url=http://msdn.microsoft.com/en-us/library/ff559835.aspx | title=Power Manager Routines | publisher=] | access-date=2009-06-13 }}</ref>
|- |-
| <code>Ps</code>
| Po || power management
| Psp || Process and thread management (task management)
|- |-
| <code>Rtl</code>
| Ps || Process management
| Rtlp || Runtime library, i.e., many utility functions that can be used by native applications, yet don't directly involve kernel support
|- |-
| <code>Se</code>
| Rtl || Rtl is the second largest group of ntdll calls. These comprise the (extended) C Run-Time Library, which includes many utility functions that can be used by native applications, yet don't directly involve kernel support.
| Sep || Security Manager, access token for the Win32 API
|- |-
| <code>Vf</code>
| Se || security
| Vi || Driver Verifier
|- |-
| <code>Zw/Nt</code>
| Zw || Nt or Zw are system calls declared in ntdll.dll and ntoskrnl.exe. When called from ntdll.dll in user mode, these groups are almost exactly the same; they trap into kernel mode and call the equivalent function in ntoskrnl.exe via the SSDT. When calling the functions directly in ntoskrnl.exe (only possible in kernel mode), the Zw variants ensure kernel mode, whereas the Nt variants do not.<ref>{{cite journal | author=The NT Insider | work=OSR Online | volume=10 | issue=4 | date=August 27, 2003 | url=http://www.osronline.com/article.cfm?article=257 | title=Nt vs. Zw - Clearing Confusion On The Native API | publisher=OSR Open Systems Resources | accessdate=2013-09-16 }}</ref> The Zw prefix does not stand for anything.<ref name="oldnewthing">{{cite web | author=] | year=2009 | url=http://blogs.msdn.com/oldnewthing/archive/2009/06/03/9687937.aspx | title=The Old New Thing : What does the "Zw" prefix mean? | publisher=] | accessdate=2009-06-13 }}</ref>
| || <code>Nt</code> or <code>Zw</code> are system calls declared in <code>ntdll.dll</code> and <code>ntoskrnl.exe</code>. When called from <code>ntdll.dll</code> in user mode, these groups are almost exactly the same; they trap into kernel mode and call the equivalent function in <code>ntoskrnl.exe</code> via the SSDT. When calling the functions directly in <code>ntoskrnl.exe</code> (only possible in kernel mode), the <code>Zw</code> variants ensure kernel mode, whereas the <code>Nt</code> variants do not.<ref>{{cite journal | author=The NT Insider | journal=OSR Online | volume=10 | issue=4 | date=August 27, 2003 | url=http://www.osronline.com/article.cfm?article=257 | title=Nt vs. Zw - Clearing Confusion On The Native API | publisher=OSR Open Systems Resources | access-date=2013-09-16 }}</ref>
|} |}


== Initialization == == Initialization ==
When the kernel receives control, it gets a pointer to a structure as parameter.<ref>windows internals</ref> This structure is passed by the boot loader and contain information about the hardware, the path to the registry file, kernel parameters containing boot preferences or options that change the behaviour of the kernel, path of the files loaded by the bootloader (<code>SYSTEM</code> registry hive, nls for character encoding conversion and vga font).<ref>http://www.nirsoft.net/kernel_struct/vista/LOADER_PARAMETER_BLOCK.html</ref> The definition of this structure can be retrieved by using the kernel debugger or downloading it from the Microsoft symbol database.<ref name="practical rev eng" /> When the kernel receives control, it gets a struct-type pointer from bootloader. The pointer's destination contains information about the hardware, the path to the Windows Registry file, kernel parameters containing boot preferences or options that change the behavior of the kernel, path of the files loaded by the bootloader (<code>SYSTEM</code> Registry hive, <code>nls</code> for character encoding conversion, and <code>vga</code> font).<ref>{{cite web|url=http://www.nirsoft.net/kernel_struct/vista/LOADER_PARAMETER_BLOCK.html|title=struct LOADER_PARAMETER_BLOCK|website=www.nirsoft.net}}</ref> The definition of this structure can be retrieved by using the kernel debugger or downloading it from the Microsoft symbol database.<ref name="practical rev eng">{{cite book|title=Practical Reverse Engineering Using X86, X64, Arm, Windows Kernel, and Reversing Tools.|date=2014|publisher=John Wiley & Sons Inc|isbn=978-1118787311}}</ref>{{Page needed|date=October 2014}}


In the x86 architecture, the kernel receives the system already in protected mode, with the GDT, IDT and TSS ready. But since it does not know the address of each one, it has to load them one by one to fill the PCR structure. In the x86 architecture, the kernel receives the system already in protected mode, with the GDT, IDT and TSS ready.{{elucidate|date=October 2014}} But since it does not know the address of each one, it has to load them one by one to fill the PCR structure.{{technical statement|date=October 2014}}


The main entry point of ntoskrnl.exe performs some system dependent initialization then calls a system independent initialization then enters an idle loop. The main entry point of <code>ntoskrnl.exe</code> performs some system dependent initialization then calls a system independent initialization then enters an idle loop.{{Contradict-inline|reason=It does? From the code sample above it looks like it calls KiInitializeKernel and then returns to caller.|date=October 2014}}


== Interrupt Handling == == Interrupt handling ==
{{About|NT implementation of interrupt handlers||Interrupt handling}} {{About|NT implementation of interrupt handlers||Interrupt handling}}

Modern operating systems use interrupts instead of I/O port polling to wait for information from devices. Modern operating systems use interrupts instead of I/O port polling to wait for information from devices.


In the x86 architecture, interrupts are handled by using the Interrupt Vector Table (IVT). When a device triggers an interrupt, or interrupt request (IRQ), the interrupt flag (IF) in the flags register is set and the processor's hardware looks for a interrupt handler on the table. Interrupt handlers usually save the state of all or some registers before handling it and restore the registers when done. In the x86 architecture, interrupts are handled through the Interrupt Dispatch Table (IDT). When a device triggers an interrupt ''and'' the interrupt flag (IF) in the FLAGS register is set, the processor's hardware looks for an interrupt handler in the table entry corresponding to the interrupt number to which in turn has been translated from IRQ by PIC chips, or in more modern hardwares, APIC. Interrupt handlers usually save some subset of the state of registers before handling it and restore them back to their original values when done.

The interrupt table contains handlers for hardware interrupts, software interrupts, and exceptions. For some IA-32 versions of the kernel, one example of such a software interrupt handler (of which there are many) is in its IDT table entry 2E<sub>16</sub> (hexadecimal; 46 in decimal), used in assembly language as <code>INT 2EH</code> for system calls. In the real implementation the entry points to an internal subroutine named (as per symbol information published by Microsoft) <code>KiSystemService</code>. For newer versions, different mechanisms making use of <code>SYSENTER</code> instruction and in x86-64 <code>SYSCALL</code> instruction are used instead.


One notable feature of NT's interrupt handling is that interrupts are usually conditionally masked based on their priority (called "IRQL"), instead of disabling all IRQs via the interrupt flag. This permits various kernel components to carry on critical operations without necessarily blocking services of peripherals and other devices.<ref name="kirql_technet">{{cite web | author=CC Hameed | date=January 22, 2008 | url=https://blogs.technet.microsoft.com/askperf/2008/01/22/what-is-irql-and-why-is-it-important/ | title=What is IRQL and why is it important? {{!}} Ask the Performance Team Blog | publisher=] | access-date=2018-11-11 }}</ref>
The interrupt table contains handlers both for IRQs and soft interrupts. The location of the soft interrupt handler is 0x2e.
It points to the <code>KiSystemService</code>.


== Memory manager == == Memory manager ==
{{About|NT implementation of a memory manager||memory management}} {{About|NT implementation of a memory manager||memory management}}
The entire physical memory (RAM) address range is broken into many small blocks also called pages, 4KB in size each, and mapped to virtual addresses. A few of the properties of each block are stored in structures called page table entries, which are managed by the OS and accessed by the processor's hardware. Page tables are organized into a tree structure, and the physical page number of the top-level table is stored in control register 3 (CR3).


The kernel of Microsoft Windows divides the memory in two segments using paging. The lower part, starting at zero, is used by user-land programs and the upper part is used by the kernel. Microsoft Windows divides virtual address space into two regions. The lower part, starting at zero, is instantiated separately for each process and is accessible from both user and kernel mode.<!-- not that we've described these yet... --> Application programs run in processes and supply code that runs in user mode.
The upper part is accessible only from kernel mode, and with some exceptions, is instantiated just once, system-wide. <code>ntoskrnl.exe</code> is mapped into this region, as are several other kernel mode components. This region also contains data used by kernel mode code, such as the kernel mode heaps and the file system cache.


{| class="wikitable" {| class="wikitable"
|+ Virtual Address Space Layouts<ref name="practical rev eng" />
|-
|+ Start and end of segments by access privilege<ref name="practical rev eng" />
|- |-
! Arch ! Arch
Line 133: Line 170:
! MmSystemRangeStart ! MmSystemRangeStart
|- |-
| x86 || 0x7fffffff || 0x80000000 | x86{{efn|Tunable via <code>/userva</code> or <code>/3gb</code> switch.}} || rowspan=2 | <code>0x7fffffff</code> || rowspan=2 | <code>0x80000000</code>
|- |-
| ARM
| ARM || 0x7fffffff || 0x80000000
|- |-
| x86-64 || <code>0x000007ff'ffffffff</code>(until Windows 8.1 Update 2)<br /><code>0x00007fff'ffffffff</code>(from Windows 8.1 Update 3) || <code>0xffff8000'00000000</code>
| x86-64 || 0x000007ff‘ffffffff || 0xffff0800‘00000000
|} |}
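
The split between the two regions can be illustrated with a small range check. This is a minimal sketch that uses only the default 32-bit x86 values from the table above (<code>0x7fffffff</code> and <code>0x80000000</code>); the constant and function names are hypothetical, and the sketch ignores the <code>/userva</code> and <code>/3gb</code> tuning mentioned in the table note.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Default 32-bit x86 boundaries from the table above.
 * The macro names are illustrative, not kernel symbols. */
#define X86_USER_RANGE_END     0x7fffffffu
#define X86_SYSTEM_RANGE_START 0x80000000u

/* True if the address lies in the upper, kernel-only region. */
bool is_system_address_x86(uint32_t address)
{
    /* The two ranges are adjacent, so every address falls in exactly one. */
    assert(X86_USER_RANGE_END + 1u == X86_SYSTEM_RANGE_START);
    return address >= X86_SYSTEM_RANGE_START;
}

/* True if the address lies in the lower, per-process region. */
bool is_user_address_x86(uint32_t address)
{
    return address <= X86_USER_RANGE_END;
}
```

For example, <code>is_user_address_x86(0x00400000)</code> is true, while <code>0x80000000</code> is the first address classified as belonging to the system region.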


== Registry ==
In the x86 architecture, the memory manager (mm) of an operating system separates each address space using ].
{{Details|Windows Registry}}
The entire physical memory (RAM) address range is broken into many small (usually 4 KiB) blocks. The properties of each block are stored in the page table that is managed directly by the processor's hardware. The base address of this table is stored in the control register #3 (CR3) and paging.
Windows Registry is a repository for configuration and settings information for the operating system and for other software, such as applications. It can be thought of as a filesystem optimized for small files.<ref>{{cite book|last=Tanenbaum|first=Andrew S.|title=Modern operating systems|date=2008|publisher=Pearson Prentice Hall|location=Upper Saddle River, N.J.|isbn=978-0136006633|pages=829|edition=3rd}}</ref> However, it is not accessed through file system-like semantics, but rather through a specialized set of APIs, implemented in kernel mode and exposed to user mode.


The registry is stored on disk as several different files called "hives." One, the System hive, is loaded early in the boot sequence and provides configuration information required at that time. Additional registry hives, providing software-specific and user-specific data, are loaded during later phases of system initialization and during user login, respectively.
== Object Manager ==
{{main|Object Manager (Windows)}}

The Object Manager (OM) is the ] (VFS) used by the NTOS kernel that is similar to the <code>]</code> and <code>]</code> filesystems. It is exposed to the user-land programs but it is not used by the default shell directly neither is shown to the user, except when using special tools such as the ] by Sysinternals.

The root of OM is not a disk filesystem nor a ramdrive. It contains nodes that correspond to the mounted filesystems. The names of these nodes are of the form PartitionXX. It stores volatile data. Data that should be nonvolatile is stored in the registry.

Explorer does not use them directly but uses some symlinks that point to the actual NT nodes. The name of each of these nodes is a DOS partition letter. They were created for compatibility with old MS-DOS-based Windows releases.

These nodes are not created by the kernel but by the filesystem driver. The filesystem driver must comply with the ] API. It uses the IoCreateDevice<ref>{{cite web|title=IoCreateDevice routine|url=http://msdn.microsoft.com/en-us/library/windows/hardware/ff548397%28v=vs.85%29.aspx|work=MSDN|publisher=Microsoft Corporation|accessdate=28 April 2014}}</ref> and IoCreateSymbolicLink<ref>{{cite web|title=IoCreateSymbolicLink routine|url=http://msdn.microsoft.com/en-us/library/windows/hardware/ff549043%28v=vs.85%29.aspx|work=MSDN|publisher=Microsoft Corporation|accessdate=28 April 2014}}</ref> api calls to do it.

== Registry hives ==
{{Main|Windows Registry}}
A '''registry hive''', or simply a '''registry''', is a filesystem optimized for small files.<ref>{{cite book|last=Tanenbaum|first=Andrew S.|title=Modern operating systems|date=2008|publisher=Pearson Prentice Hall|location=Upper Saddle River, N.J.|isbn=978-0136006633|pages=829|edition=3rd ed.}}</ref> Although sometimes referred to as a database, it does not include features characteristic of databases, such as indexing. It is loaded as early in the bootloader. Additional registry hives are mounted during system boot and login.


== Drivers == == Drivers ==
{{About|NT specific drivers||device driver}} {{further|Device driver}}
The list of drives to be loaded from the disk are retrieved from the <code>Services</code> key in the <code>SYSTEM</code> registry hive. That key stores device drivers, kernel processes and user processes. They are all collectively called "services" and are all stored mixed on the same place. The list of drivers to be loaded from the disk are retrieved from the <code>Services</code> key of the current control set's key in the <code>SYSTEM</code> registry hive. That key stores device drivers, kernel processes and user processes. They are all collectively called "services" and are all stored mixed on the same place.


During initialization or upon driver load request, the kernel transverses that tree looking for services tagged as kernel services. During initialization or upon driver load request, the kernel traverses that tree looking for services tagged as kernel services.
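
The filtering step described above can be sketched as a simple scan. This is a toy model under stated assumptions: the <code>service_entry</code> structure, the <code>service_kind</code> tag and the function name are all hypothetical stand-ins for whatever the registry actually stores under the <code>Services</code> key, and no registry access is performed.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical tag distinguishing kernel services from user services. */
enum service_kind { SERVICE_KIND_KERNEL, SERVICE_KIND_USER };

/* Hypothetical in-memory view of one entry under the Services key. */
struct service_entry {
    const char *name;
    enum service_kind kind;
};

/* Copy the names of entries tagged as kernel services into out[],
 * up to out_capacity entries, and return how many were stored. */
size_t collect_kernel_services(const struct service_entry *services, size_t count,
                               const char **out, size_t out_capacity)
{
    assert(services != NULL || count == 0);
    size_t found = 0;
    for (size_t i = 0; i < count; i++) {
        if (services[i].kind == SERVICE_KIND_KERNEL && found < out_capacity)
            out[found++] = services[i].name;
    }
    return found;
}
```

Keeping all services in one list and filtering by tag mirrors the article's point that drivers and processes are stored mixed in the same place.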


== See also == == See also ==
Line 168: Line 193:
* ] * ]


==References== == Notes ==
{{notelist}}<small>As mentioned in , the boot-time option <code>increaseuserva</code> and corresponding header in executable image is required for this feature.</small>
{{reflist}}

== References ==
{{Reflist}}


== Further reading == == Further reading ==
* Tanenbaum, Andrew. Modern Operating Systems (3rd Edition). 978-0136006633 * {{Cite book|last=Tanenbaum|first=Andrew S.|title=Modern Operating Systems|date=2008|publisher=]|location=Upper Saddle River, N.J.|isbn=978-0136006633|pages=829|edition=3rd}}
* {{Cite book|author1=Bruce Dang|author2=Alexandre Gazet|author3=Elias Bachaalany|title=Practical Reverse Engineering: x86, x64, ARM, Windows Kernel, Reversing Tools, and Obfuscation|date=2014|publisher=]|isbn=978-1118787311|pages=384}}
* Practical reverse engineering. 978-1118787311


==External links== == External links ==


Line 183: Line 211:



{{Windows-stub}}

Latest revision as of 20:09, 25 December 2024

Windows NT kernel image This article is about a computer file that contains a part of the Windows NT kernel. For the Windows NT kernel itself, see Architecture of Windows NT.

ntoskrnl.exe (short for Windows NT operating system kernel executable), also known as the kernel image, contains the kernel and executive layers of the Microsoft Windows NT kernel, and is responsible for hardware abstraction, process handling, and memory management. In addition to the kernel and executive layers, it contains the cache manager, security reference monitor, memory manager, scheduler (Dispatcher), and blue screen of death (the prose and portions of the code).

Overview

x86 versions of ntoskrnl.exe depend on bootvid.dll, hal.dll and kdcom.dll (x64 variants of ntoskrnl.exe have these DLLs embedded in the kernel to improve performance). However, it is not a native application, so it is not linked against ntdll.dll. Instead, ntoskrnl.exe has its own entry point, "KiSystemStartup", which calls the architecture-independent kernel initialization function. Because it requires a static copy of the C Runtime objects, the executable is usually about 10 MB in size.

In Windows XP and earlier, the Windows installation source ships four kernel image files to support uniprocessor systems, symmetric multiprocessor (SMP) systems, CPUs with PAE, and CPUs without PAE. Windows Setup determines whether the system is uniprocessor or multiprocessor, then installs both the PAE and non-PAE variants of the kernel image for that kind of system. On a multiprocessor system, Setup installs ntkrnlmp.exe and ntkrpamp.exe but renames them to ntoskrnl.exe and ntkrnlpa.exe respectively.

Starting with Windows Vista, Microsoft began unifying the kernel images as multi-core CPUs took to the market and PAE became mandatory.

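{| class="wikitable sortable"
|+ Kernel image filenames
|-
| colspan="3" | 32-bit Windows
|-
! Filename !! Supports SMP !! Supports PAE
|-
| colspan="3" | 32-bit kernel
|-
| ntoskrnl.exe || No || No
|-
| ntkrnlmp.exe || Yes || No
|-
| ntkrnlpa.exe || No || Yes
|-
| ntkrpamp.exe || Yes || Yes
|-
| colspan="3" | 64-bit kernel (x64 editions)
|-
! Filename !! Supports SMP !! Supports 57 bit VA
|-
| ntkrnlmp.exe || Yes || No
|-
| ntkrla57.exe || Yes || Yes
|}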
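
The 32-bit part of the table above, together with the renaming rule described for multiprocessor systems on Windows XP and earlier, can be expressed as a lookup. This is a minimal sketch: the structure and the function names <code>select_kernel_image</code> and <code>installed_name</code> are hypothetical and only model the table, not Windows Setup itself.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* One row of the 32-bit part of the table above. */
struct kernel_image {
    const char *filename;
    bool smp;  /* supports symmetric multiprocessing */
    bool pae;  /* supports PAE */
};

static const struct kernel_image kernel_images_32bit[] = {
    { "ntoskrnl.exe", false, false },
    { "ntkrnlmp.exe", true,  false },
    { "ntkrnlpa.exe", false, true  },
    { "ntkrpamp.exe", true,  true  },
};

/* Return the installation-source filename with the requested capabilities. */
const char *select_kernel_image(bool smp, bool pae)
{
    size_t count = sizeof kernel_images_32bit / sizeof kernel_images_32bit[0];
    for (size_t i = 0; i < count; i++) {
        if (kernel_images_32bit[i].smp == smp && kernel_images_32bit[i].pae == pae)
            return kernel_images_32bit[i].filename;
    }
    return NULL; /* not reached: the table covers all four combinations */
}

/* Model the rename on multiprocessor systems: the SMP images are installed
 * under the uniprocessor names; other files keep their names. */
const char *installed_name(const char *source_name)
{
    assert(source_name != NULL);
    if (strcmp(source_name, "ntkrnlmp.exe") == 0) return "ntoskrnl.exe";
    if (strcmp(source_name, "ntkrpamp.exe") == 0) return "ntkrnlpa.exe";
    return source_name;
}
```

Under this model, a multiprocessor PAE system selects ntkrpamp.exe from the source, and the file ends up on disk as ntkrnlpa.exe.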


Functions and global variables in the Windows kernel use so-called Pascal case, with additional prefixes in their names that distinguish the parts of the kernel.

Examples are IoCreateDevice and ObReferenceObjectByHandle. The two functions carry different prefixes that identify the kernel component they belong to: Io is used for I/O Manager functions and Ob for Object Manager functions.

Variations of these prefixes exist for internal functions that are not exported by the kernel, such as adding an i after the first letter (e.g., Ki for “Kernel Internal”) or appending p to the full prefix (e.g., Psp for “Process Support Internal”).
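
The naming convention lends itself to a simple prefix lookup. The sketch below is illustrative only: it hard-codes a handful of export prefixes and meanings from this article's prefix table, picks the longest matching prefix, and uses hypothetical names (<code>nt_prefixes</code>, <code>nt_component_for</code>) that are not part of any Windows API.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct nt_prefix {
    const char *prefix;
    const char *component;
};

/* A few export prefixes and their meanings, taken from this article. */
static const struct nt_prefix nt_prefixes[] = {
    { "Cc",    "File system cache" },
    { "Ex",    "Windows executive" },
    { "FsRtl", "File system runtime library" },
    { "Io",    "I/O manager" },
    { "Ke",    "Core kernel routines" },
    { "Mm",    "Memory management" },
    { "Ob",    "Object Manager" },
    { "Ps",    "Process and thread management" },
    { "Rtl",   "Runtime library" },
    { "Se",    "Security Manager" },
};

/* Return the component for the longest prefix that matches the start of
 * function_name (case-sensitive), or NULL if no listed prefix matches. */
const char *nt_component_for(const char *function_name)
{
    assert(function_name != NULL);
    const char *best = NULL;
    size_t best_len = 0;
    size_t count = sizeof nt_prefixes / sizeof nt_prefixes[0];
    for (size_t i = 0; i < count; i++) {
        size_t len = strlen(nt_prefixes[i].prefix);
        if (len > best_len && strncmp(function_name, nt_prefixes[i].prefix, len) == 0) {
            best = nt_prefixes[i].component;
            best_len = len;
        }
    }
    return best;
}
```

With this table, <code>nt_component_for("IoCreateDevice")</code> yields "I/O manager" and <code>nt_component_for("ObReferenceObjectByHandle")</code> yields "Object Manager". A purely textual match like this cannot tell an exported routine from an unrelated name that happens to start with the same letters.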


The following table lists common prefixes.

{| class="wikitable sortable"
|+ NT function prefixes
|-
! Export prefix !! Internal prefix !! Meaning
|-
| Cc || Ccp || File system cache
|-
| Cm || Cmp || Configuration Manager, the kernel mode side of Windows Registry
|-
| Dbg || Dbg || Debugging aid functions, such as a software break point
|-
| Dbgk || Dbgk || A set of debugging functions that are being exposed to user mode through ntdll.dll
|-
| Ex || Exp || Windows executive, an "outer layer" of ntoskrnl.exe
|-
| FsRtl || FsRtlp || File system runtime library
|-
| Io || Iop || I/O manager
|-
| Ke || Ki || Core kernel routines
|-
| || Kx || Interrupt handling, semaphores, spinlocks, multithreading and context switching related functions
|-
| || Ks || Kernel streaming
|-
| Ldr || Ldrp || NT's PE Executables loader
|-
| Lpc || Lpcp || Local Procedure Call, an internal, undocumented, interprocess or user/kernel message passing mechanism
|-
| Lsa || Lsap || Local Security Authority
|-
| Mm || Mi || Memory management
|-
| Nls || Nls || Nls for Native Language Support (similar to code pages).
|-
| Ob || Obp || Object Manager
|-
| Po || Pop || Plug-and-play and power management
|-
| Ps || Psp || Process and thread management (task management)
|-
| Rtl || Rtlp || Runtime library, i.e., many utility functions that can be used by native applications, yet don't directly involve kernel support
|-
| Se || Sep || Security Manager, access token for the Win32 API
|-
| Vf || Vi || Driver Verifier
|-
| Zw/Nt || || Nt or Zw are system calls declared in ntdll.dll and ntoskrnl.exe. When called from ntdll.dll in user mode, these groups are almost exactly the same; they trap into kernel mode and call the equivalent function in ntoskrnl.exe via the SSDT. When calling the functions directly in ntoskrnl.exe (only possible in kernel mode), the Zw variants ensure kernel mode, whereas the Nt variants do not.
|}

Initialization

When the kernel receives control, it receives a pointer to a structure from the bootloader. The structure contains information about the hardware, the path to the Windows Registry file, kernel parameters containing boot preferences or options that change the behavior of the kernel, and the paths of the files loaded by the bootloader (SYSTEM Registry hive, nls for character encoding conversion, and vga font). The definition of this structure can be retrieved by using the kernel debugger or downloading it from the Microsoft symbol database.

In the x86 architecture, the kernel receives the system already in protected mode, with the GDT, IDT and TSS ready. But since it does not know the address of each one, it has to load them one by one to fill the PCR structure.

The main entry point of ntoskrnl.exe performs some system-dependent initialization, then calls a system-independent initialization routine, and then enters an idle loop.

Interrupt handling

This article is about NT implementation of interrupt handlers. For other uses, see Interrupt handling.

Modern operating systems use interrupts instead of I/O port polling to wait for information from devices.

In the x86 architecture, interrupts are handled through the Interrupt Dispatch Table (IDT). When a device triggers an interrupt and the interrupt flag (IF) in the FLAGS register is set, the processor's hardware looks for an interrupt handler in the table entry corresponding to the interrupt number, which in turn has been translated from the IRQ by the PIC chips or, on more modern hardware, by the APIC. Interrupt handlers usually save some subset of the register state before handling the interrupt and restore the registers to their original values when done.
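
The lookup described above can be modelled in a few lines. The following Python sketch is a simplified illustration, not NT code: the class Cpu, its methods, and the two-register state are hypothetical, and the save and restore step is performed by the dispatcher here, whereas real handlers do it themselves.

```python
IDT_SIZE = 256  # x86 defines 256 interrupt vectors


class Cpu:
    def __init__(self):
        self.interrupt_flag = True           # IF bit of the FLAGS register
        self.registers = {"eax": 0, "ebx": 0}
        self.idt = [None] * IDT_SIZE         # one handler slot per vector

    def install(self, vector, handler):
        """Place a handler in the table entry for the given vector."""
        self.idt[vector] = handler

    def deliver(self, vector):
        """Dispatch a maskable interrupt; return False if IF is clear."""
        if not self.interrupt_flag:
            return False                     # interrupt is not taken now
        handler = self.idt[vector]
        if handler is None:
            raise LookupError(f"no handler installed for vector {vector:#x}")
        saved = dict(self.registers)         # save register state
        handler(self)
        self.registers = saved               # restore original values
        return True
```

A handler installed with install() may freely change the registers while it runs; after deliver() returns, the interrupted code sees the values it had before.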

The interrupt table contains handlers for hardware interrupts, software interrupts, and exceptions. For some IA-32 versions of the kernel, one example of such a software interrupt handler (of which there are many) is in IDT entry 2E (hexadecimal; 46 in decimal), used in assembly language as INT 2EH for system calls. In the real implementation, the entry points to an internal subroutine named (as per symbol information published by Microsoft) KiSystemService. Newer versions instead use different mechanisms based on the SYSENTER instruction and, on x86-64, the SYSCALL instruction.

One notable feature of NT's interrupt handling is that interrupts are usually conditionally masked based on their priority (called "IRQL"), instead of disabling all IRQs via the interrupt flag. This permits various kernel components to carry on critical operations without necessarily blocking services of peripherals and other devices.
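
The effect of priority-based masking can be illustrated with a small model. This Python sketch is an assumption-laden simplification: the class Processor, its method names, and the numeric levels used in the example are hypothetical and do not correspond to real IRQL values. An interrupt at or below the current level is held pending, while a higher one is serviced immediately.

```python
class Processor:
    def __init__(self):
        self.irql = 0        # lowest level: nothing is masked
        self.pending = []    # (level, name) pairs held back by masking
        self.serviced = []   # names in the order they were serviced

    def raise_irql(self, new_irql):
        """Raise the current level and return the previous one."""
        if new_irql < self.irql:
            raise ValueError("raise_irql cannot lower the level")
        old, self.irql = self.irql, new_irql
        return old

    def lower_irql(self, new_irql):
        """Lower the level, then deliver interrupts that are no longer masked."""
        if new_irql > self.irql:
            raise ValueError("lower_irql cannot raise the level")
        self.irql = new_irql
        self._deliver_pending()

    def interrupt(self, level, name):
        if level <= self.irql:
            self.pending.append((level, name))   # masked: keep for later
        else:
            self._service(level, name)

    def _service(self, level, name):
        old = self.raise_irql(level)             # handler runs at its own level
        self.serviced.append(name)
        self.lower_irql(old)

    def _deliver_pending(self):
        while True:
            ready = [p for p in self.pending if p[0] > self.irql]
            if not ready:
                return
            best = max(ready, key=lambda p: p[0])  # highest priority first
            self.pending.remove(best)
            self._service(*best)
```

If code raises the level to 5, a level-3 interrupt arriving in the meantime waits, while a level-8 interrupt is serviced at once; the level-3 interrupt runs only after the level is lowered again.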

Memory manager

This section is about the NT implementation of a memory manager. For other uses, see memory management.

The entire physical memory (RAM) address range is broken into many small blocks called pages, each 4 KB in size, which are mapped to virtual addresses. A few of the properties of each page are stored in structures called page table entries, which are managed by the OS and accessed by the processor's hardware. Page tables are organized into a tree structure, and the physical page number of the top-level table is stored in control register 3 (CR3).
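
To make the tree structure concrete, the following Python sketch splits a virtual address into the per-level table indices and the offset within a 4 KB page. It assumes the four-level x86-64 layout (9 index bits per level, 12 offset bits) and uses the Intel level names; the function name split_virtual_address is hypothetical, and other architectures use different widths and depths.

```python
PAGE_SHIFT = 12                          # 4 KB pages: low 12 bits are the offset
INDEX_BITS = 9                           # 512 entries per table on x86-64
LEVELS = ("PML4", "PDPT", "PD", "PT")    # top-level table first (Intel names)


def split_virtual_address(va):
    """Return ({level: index}, page_offset) for a 48-bit virtual address."""
    offset = va & ((1 << PAGE_SHIFT) - 1)
    indices = {}
    shift = PAGE_SHIFT + INDEX_BITS * (len(LEVELS) - 1)   # 39 for the top level
    for level in LEVELS:
        indices[level] = (va >> shift) & ((1 << INDEX_BITS) - 1)
        shift -= INDEX_BITS
    return indices, offset
```

The hardware walk starts at the table whose physical page number is in CR3, uses the first index to find the next-level table, and repeats until the final entry gives the physical page.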

Microsoft Windows divides virtual address space into two regions. The lower part, starting at zero, is instantiated separately for each process and is accessible from both user and kernel mode. Application programs run in processes and supply code that runs in user mode. The upper part is accessible only from kernel mode, and with some exceptions, is instantiated just once, system-wide. ntoskrnl.exe is mapped into this region, as are several other kernel mode components. This region also contains data used by kernel mode code, such as the kernel mode heaps and the file system cache.

Virtual Address Space Layouts
Arch MmHighestUserAddress MmSystemRangeStart
x86 0x7fffffff 0x80000000
ARM 0x7fffffff 0x80000000
x86-64 0x000007ff'ffffffff (until Windows 8.1 Update 2), 0x00007fff'ffffffff (from Windows 8.1 Update 3) 0xffff8000'00000000
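
The two boundaries in the table above are enough to decide which region an address belongs to. A minimal Python sketch, using the x86 values and the newer x86-64 values from the table; the dictionary LAYOUTS and the function classify_address are hypothetical names, and the default x86 split is assumed (it can be tuned, as the Notes describe).

```python
# (MmHighestUserAddress, MmSystemRangeStart) per architecture, from the table.
LAYOUTS = {
    "x86": (0x7FFFFFFF, 0x80000000),
    "x86-64": (0x00007FFFFFFFFFFF, 0xFFFF800000000000),
}


def classify_address(arch, va):
    """Return 'user', 'kernel', or 'neither' for a virtual address."""
    highest_user, system_start = LAYOUTS[arch]
    if va <= highest_user:
        return "user"
    if va >= system_start:
        return "kernel"
    # On x86-64 there is a gap between the two regions.
    return "neither"
```

On x86 the two regions are adjacent, so every 32-bit address is either user or kernel; on x86-64 an address such as 0x0000800000000000 falls in neither region.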

Registry

Further information: Windows Registry

Windows Registry is a repository for configuration and settings information for the operating system and for other software, such as applications. It can be thought of as a filesystem optimized for small files. However, it is not accessed through file system-like semantics, but rather through a specialized set of APIs, implemented in kernel mode and exposed to user mode.

The registry is stored on disk as several different files called "hives." One, the System hive, is loaded early in the boot sequence and provides configuration information required at that time. Additional registry hives, providing software-specific and user-specific data, are loaded during later phases of system initialization and during user login, respectively.

Drivers

Further information: Device driver

The list of drivers to be loaded from the disk is retrieved from the Services key of the current control set's key in the SYSTEM registry hive. That key stores device drivers, kernel processes and user processes. They are all collectively called "services" and are all stored together in the same place.

During initialization or upon driver load request, the kernel traverses that tree looking for services tagged as kernel services.
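
The selection step can be sketched as a filter over the entries of the Services key. The Python example below assumes that each entry carries Type and Start values and that kernel services are identified by the documented service type constants SERVICE_KERNEL_DRIVER (0x1) and SERVICE_FILE_SYSTEM_DRIVER (0x2); the function name kernel_services and the sample entries are hypothetical, and the data is an in-memory dictionary rather than a real registry hive.

```python
SERVICE_KERNEL_DRIVER = 0x1
SERVICE_FILE_SYSTEM_DRIVER = 0x2
SERVICE_WIN32_OWN_PROCESS = 0x10   # a user-mode service, shown for contrast


def kernel_services(services):
    """Return names of kernel-mode services, ordered by their Start value."""
    mask = SERVICE_KERNEL_DRIVER | SERVICE_FILE_SYSTEM_DRIVER
    selected = [
        (entry["Start"], name)
        for name, entry in services.items()
        if entry["Type"] & mask
    ]
    return [name for _, name in sorted(selected)]


# Hypothetical entries standing in for subkeys of the Services key.
SAMPLE_SERVICES = {
    "ExampleDisk": {"Type": SERVICE_KERNEL_DRIVER, "Start": 0},
    "ExampleFs": {"Type": SERVICE_FILE_SYSTEM_DRIVER, "Start": 1},
    "ExampleSvc": {"Type": SERVICE_WIN32_OWN_PROCESS, "Start": 2},
    "ExampleUsb": {"Type": SERVICE_KERNEL_DRIVER, "Start": 3},
}
```

Applied to the sample data, the filter keeps the three driver entries and skips the user-mode service, since all kinds of services are stored together under the same key.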

Notes

  1. Tunable via the /userva or /3gb switch.

As mentioned in Windows Internals, 7th edition, the boot-time option increaseuserva and a corresponding header in the executable image are required for this feature.

References

  1. Russinovich, M: Systems Internals Tips and Trivia, SysInternals Information
  2. Microsoft Corporation (2009). "Cache Manager Routines". Microsoft Corporation. Retrieved 2009-06-13.
  3. Microsoft Corporation (2009). "File System Runtime Library Routines". Microsoft Corporation. Retrieved 2009-06-13.
  4. Microsoft Corporation (2009). "I/O Manager Routines". Microsoft Corporation. Retrieved 2009-06-13.
  5. Microsoft Corporation (2009). "Core Kernel Library Support Routines". Microsoft Corporation. Retrieved 2009-06-13.
  6. Microsoft Corporation (2009). "Power Manager Routines". Microsoft Corporation. Retrieved 2009-06-13.
  7. The NT Insider (August 27, 2003). "Nt vs. Zw - Clearing Confusion On The Native API". OSR Online. 10 (4). OSR Open Systems Resources. Retrieved 2013-09-16.
  8. "struct LOADER_PARAMETER_BLOCK". www.nirsoft.net.
  9. Bruce Dang; Alexandre Gazet; Elias Bachaalany (2014). Practical Reverse Engineering: x86, x64, ARM, Windows Kernel, Reversing Tools, and Obfuscation. Wiley. ISBN 978-1118787311.
  10. CC Hameed (January 22, 2008). "What is IRQL and why is it important? | Ask the Performance Team Blog". Microsoft Corporation. Retrieved 2018-11-11.
  11. Tanenbaum, Andrew S. (2008). Modern operating systems (3rd ed.). Upper Saddle River, N.J.: Pearson Prentice Hall. p. 829. ISBN 978-0136006633.

Further reading

  • Tanenbaum, Andrew S. (2008). Modern Operating Systems (3rd ed.). Upper Saddle River, N.J.: Pearson Prentice Hall. p. 829. ISBN 978-0136006633.
  • Bruce Dang; Alexandre Gazet; Elias Bachaalany (2014). Practical Reverse Engineering: x86, x64, ARM, Windows Kernel, Reversing Tools, and Obfuscation. Wiley. p. 384. ISBN 978-1118787311.
