Cover Feature

COMPUTER 0018-9162/98/$10.00 © 1998 IEEE
Vol. 31, No. 10: OCTOBER 1998, pp. 48-54


Contact the authors at Microsoft Corp., One Microsoft Way, Redmond, WA 98052; cabrera@microsoft.com.

Advances in Windows NT Storage Management

Luis Felipe Cabrera, Microsoft Corporation

Brian Andrew, Microsoft Corporation

Kyle Peltonen, Microsoft Corporation

Norbert Kusters, Microsoft Corporation


Since the design of Windows NT 4.0 stabilized in 1995, the needs of users have evolved, as have the characteristics of storage devices and storage media. Microsoft is introducing fundamental changes to storage management in Windows NT 5.0.


Since the design of Microsoft's Windows NT 4.0 operating system stabilized in 1995, the needs of our users have evolved, as have the characteristics of storage devices and storage media. This reality motivated us to introduce fundamental changes to storage management in Windows NT 5.0. The new infrastructure in NT 5.0 can adapt to future changes in storage devices and media, brings a more powerful set of management mechanisms, and supports more base services within the system. Most of the changes are transparent to legacy applications and to users, but software producers who want to deploy advanced functionality should realize increased benefits from the underlying infrastructure.

NT 5.0 now supports the dynamic plug-and-play of storage devices and the growth and mounting of storage volumes. It also frees the Windows NT-based system from drive letter limitations and incorporates several new file system features to aid the development of enterprise-class storage management solutions.


STORAGE DEVICE MANAGEMENT

Windows NT 5.0 incorporates significant changes to its storage management facilities for detecting devices and for administering devices that use removable storage [1].

The manner in which devices are detected and incorporated into the system has changed from a rather static one to a dynamic, plug-and-play procedure. NT 5.0 creates a corresponding device object whenever it detects new hardware, so there is no need to restart the system. In an analogous manner, a device can be disconnected from the system at any time.

In earlier versions of NT, each device that handled removable storage (such as CD players, magnetic tape drives, and libraries of optical media or magnetic tapes) was administered in complete isolation. Therefore, two applications or system services that wanted to share such a device had no system infrastructure to help them. NT 5.0 will provide a new interface for administering removable storage as well as a new miniclass device driver for libraries. You can now write device-independent applications that share these resources in a uniform manner.

Plug-and-play storage

The plug-and-play facilities of Windows NT required a new system infrastructure. We defined a new family of I/O Request Packets (IRPs) to communicate all new operations and events associated with storage within the NT kernel.

We also wrote a kernel subsystem to perform bus enumeration and dynamic device detection. The dynamic detection of storage hardware varies, however, depending on the kind of bus to which the device is connected. SCSI devices, for example, require that the bus be refreshed to identify newly connected hardware. Newer bus architectures, such as USB, include an automated notification mechanism that signals to the operating system when a device has been connected.

We developed two additional kernel subsystems, Partition Manager and Mount Manager, to complement device detection. Both subsystems use software abstractions to identify storage volumes and volume mount points, and they adapt their state to the dynamic behavior of the devices present. Thus, a file system built on a storage volume that resides in a Jaz drive, for example, can begin to operate when media is inserted into the system and cease to operate when the media is ejected from the system.

Removable storage management

The central storage abstraction presented by the Removable Storage Manager (RSM) is that of a media pool. A media pool is a named collection of homogeneous media elements. Media pools have attributes, much like files, including an owner and an access control list. Thus, media pools can coexist in a system and be accessed only by those explicitly authorized to do so.

Applications that use removable storage define media pools for their purposes. In addition, the system administers two generic pools, the free pool and the unrecognized pool. The free pool is defined for media that have been explicitly deallocated from a media pool by an application. The unrecognized pool is defined for media that have been detected in the system and not yet recognized by any application.

In addition to storage abstractions like these, the RSM implements interfaces that enable the administration of devices that operate on removable storage media. As Figure 1 shows, the interface includes calls to mount and dismount media, to inventory the media in a library, to administer sessions, to control the libraries, and to support operator calls.


Figure 1. Block diagram of selected I/O components in Windows NT.

To administer this kind of storage, the RSM provides a unique identifier for each medium identified in the system. (For devices that support bar codes, the library manager uses them as part of its identification process.) The service maintains an internal database of all information needed to administer removable storage. All updates to this information are done in an atomic, all-or-nothing manner [2].
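
For illustration only, here is a sketch of creating an application media pool through the RSM interface. OpenNtmsSession and CreateNtmsMediaPool are the entry-point names this interface took in the released SDK; the exact parameter order shown, the application name, and the pool name are assumptions made for the example.

   #include <windows.h>
   #include <ntmsapi.h>  /* Removable Storage Manager (RSM) interface */

   /* Sketch: open an RSM session and create (or open) a named media
      pool owned by this application. */
   int CreateApplicationPool(void)
   {
       NTMS_GUID poolId;
       HANDLE hSession;

       /* NULL server means the local computer; the second argument
          names the calling application. */
       hSession = OpenNtmsSession(NULL, TEXT("BackupApp"), 0);
       if (hSession == INVALID_HANDLE_VALUE)
           return 1;

       /* Create the pool if absent, open it otherwise; parameter
          order here is an assumption. */
       if (CreateNtmsMediaPool(hSession, TEXT("BackupApp\\FullDumps"),
                               NULL, NTMS_OPEN_ALWAYS, NULL,
                               &poolId) != ERROR_SUCCESS) {
           CloseNtmsSession(hSession);
           return 1;
       }

       CloseNtmsSession(hSession);
       return 0;
   }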


MANAGEMENT OF STORAGE VOLUMES

A storage volume is a self-contained unit of storage that appears to a file system as one contiguous range of disk sectors. These sectors, in turn, may originate from underlying partitions of one or more disk devices. The volume is the storage abstraction needed by file systems to operate.

The Partition Manager (PM) is a new kernel subsystem that inspects the disk devices detected by the plug-and-play subsystem, captures the partitions present on those devices, and exports the partitions to the volume managers.

In Windows NT 5.0, several volume managers may operate concurrently. The volume managers, in turn, export volumes to the Mount Manager (MM), which assigns different kinds of names to them so that the NT file systems, the NT object manager, and user-level applications can access the underlying storage. In NT, a storage volume must be mounted before any operation can be issued on it.

Dynamic volume growth and contraction

A new capability in Windows NT 5.0 is that volume managers can grow and contract storage volumes online, with no need to restart the computer. This dynamic behavior is obtained by manipulating the system information that describes the state of a volume, which can be modified in an all-or-nothing manner even in the presence of system failures.

An additional characteristic of this volume system data is that it is contained within the volume itself, thus enabling the transport of a volume between computers. This is especially useful for NT Clusters, in which multiple paths to storage devices are used to achieve the ability to "fail over" access to a volume from one computer to another.

Volume mount points

In NT 4.0, each storage volume had an associated drive letter. This policy unnecessarily restricted the number of volumes that could be accessed at a given time. In NT 5.0, this restriction will be lifted: A volume can be given an arbitrary directory name in a containing volume; a drive letter is no longer necessary.

The two restrictions for mount points are that the containing volume be administered by NTFS and that the directory on which the mount point is established be empty.

The MM uses a GUID (globally unique identifier) to denote each volume present in the system. The GUID is used as part of an internal system name given to the volume and used to identify it for the NT object manager. The volume mount point establishes a final name relationship between a directory name and the internal system name. Because each volume in a system has a persistent GUID associated with it, NT 5.0 ensures that a volume mount point can never refer to an incorrect underlying volume.

Volume mount points can be created, deleted, and renamed online, with no need to restart the system. Furthermore, each volume contains state that tracks the volume mount points present in it. This enables the kernel to validate its internal state upon each restart of the system and lets the user interface efficiently display all the volume mount points in the system.
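
As an illustration of this naming scheme, here is a minimal sketch using the Win32 calls that surface volume mount points in the released SDK, SetVolumeMountPoint and GetVolumeNameForVolumeMountPoint. The drive letter and directory are examples.

   #include <windows.h>
   #include <stdio.h>

   /* Sketch: graft the volume currently visible as X:\ onto the empty
      NTFS directory C:\mnt\data\. The trailing backslashes are
      required by these calls. */
   int main(void)
   {
       WCHAR volName[MAX_PATH];  /* receives \\?\Volume{GUID}\ */

       /* Resolve the drive letter to the volume's persistent GUID name. */
       if (!GetVolumeNameForVolumeMountPointW(L"X:\\", volName, MAX_PATH))
           return 1;

       /* The target directory must be empty and must reside on NTFS. */
       if (!SetVolumeMountPointW(L"C:\\mnt\\data\\", volName))
           return 1;

       wprintf(L"%s is now mounted at C:\\mnt\\data\\\n", volName);
       return 0;
   }

Because the mount point records the volume's GUID name rather than a drive letter, the association survives reboots and device re-enumeration.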

Dynamic and forced volume dismount

A volume dismount operation is the mechanism to terminate access to a storage volume in a normal, orderly manner. A dismount operation can be issued on a volume at any time. Thereafter, I/O requests directed to a volume that has been dismounted are rejected by the I/O subsystem.

In NT 4.0, a dismount operation on a volume would not succeed if a handle was outstanding on objects belonging to that volume. In NT 5.0, the new infrastructure can force the dismount of a volume. Therefore, removable media that contains a file system can be ejected from a computer at any time.

A new notification system, based on new IRPs, alerts handle owners that a dismount has been forced. It lets file systems and applications with open handles close them and perform appropriate cleanup operations. All file systems in NT 5.0 have been enhanced to flush their cached state when they receive such an IRP. This guarantees that system state is saved in a self-consistent manner.
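
For illustration, a user-mode sketch of a forced dismount using the file system control codes from the released Win32 SDK follows; the volume path is an example.

   #include <windows.h>
   #include <winioctl.h>

   /* Sketch: force the dismount of a volume, e.g. L"\\\\.\\E:". If
      other handles are open, FSCTL_LOCK_VOLUME fails, but
      FSCTL_DISMOUNT_VOLUME proceeds anyway, triggering the
      notification described above. */
   BOOL ForceDismount(LPCWSTR volumePath)
   {
       DWORD cb;
       BOOL ok;
       HANDLE h = CreateFileW(volumePath, GENERIC_READ | GENERIC_WRITE,
                              FILE_SHARE_READ | FILE_SHARE_WRITE,
                              NULL, OPEN_EXISTING, 0, NULL);
       if (h == INVALID_HANDLE_VALUE)
           return FALSE;

       /* Try a polite, exclusive lock first; ignore failure. */
       DeviceIoControl(h, FSCTL_LOCK_VOLUME, NULL, 0, NULL, 0, &cb, NULL);

       ok = DeviceIoControl(h, FSCTL_DISMOUNT_VOLUME,
                            NULL, 0, NULL, 0, &cb, NULL);
       CloseHandle(h);
       return ok;
   }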


NEW FEATURES IN NTFS

Windows NT has several file systems (NTFS, FAT, FAT 32, CDFS, and UDF), and all but NTFS are compatibility file systems [3]. Compatibility file systems provide a means to interchange files among different kinds of media. FAT (File Allocation Table), for example, is used on 3.5-inch diskettes; FAT 32 is used in Windows 98 volumes; CDFS (CD File System) enables access to data stored on CDs; and UDF (Universal Disk Format) is for data stored using the DVD data format.

Calls in the Win32 file and directory interfaces operate correctly irrespective of the file system type in which the underlying files and directories are located. In NT, the system administrator chooses a file system type when formatting a volume. In NT 5.0, the default file system when formatting a volume is NTFS.

NTFS 5.0 contains all the file system innovation. Because of the new features, the underlying on-disk data format has changed, and back-level versions of NTFS cannot access version 5 volumes. (To enable interoperability between computers using NT 4.0 or NT 5.0, and to enable mixed versions in NT Clusters, the NTFS shipped in Service Pack 4 of NT 4.0 can access version 5 volumes.) The set of new NTFS features can be divided into two categories: volume oriented and file or directory oriented.
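
Since feature availability depends on the file system administering a volume, an application can probe for the new capabilities at run time. Here is a minimal sketch using the Win32 GetVolumeInformation call; the FILE_SUPPORTS_* flag names are those the capability bits took in the released SDK.

   #include <windows.h>
   #include <stdio.h>

   /* Sketch: report which NTFS 5.0 capabilities the C: volume offers. */
   int main(void)
   {
       WCHAR fsName[MAX_PATH];
       DWORD serial, maxComp, flags;

       if (!GetVolumeInformationW(L"C:\\", NULL, 0, &serial, &maxComp,
                                  &flags, fsName, MAX_PATH))
           return 1;

       wprintf(L"%s: sparse=%d reparse=%d objectIDs=%d\n",
               fsName,
               (flags & FILE_SUPPORTS_SPARSE_FILES) != 0,
               (flags & FILE_SUPPORTS_REPARSE_POINTS) != 0,
               (flags & FILE_SUPPORTS_OBJECT_IDS) != 0);
       return 0;
   }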


VOLUME-ORIENTED FEATURES

NTFS version 5 has a control operation that expands its capacity to accommodate a larger underlying storage volume without requiring a system restart or a quiescent period. All the underlying state needed to describe the additional storage available (extending the allocation map and the master file table, for example) is built while the system continues to operate. Upon completion of this control operation, NTFS can use the new storage to place files and directories.

To enable the dynamic dismount of NTFS, we had to evolve all aspects of the system to accommodate the fact that the underlying storage could cease to be accessible at any time. Support for dynamic dismount required first defining an internal resource to represent the existence of the underlying volume and then restructuring all internal operations so they could commence only when this resource was available. Error conditions could then be handled in a graceful manner.

The typical change involved adding instructions to acquire the volume resource in a shared manner every time that storage is to be accessed and then verifying that it is in a mounted state. If the resource is found in a dismounted state, the operation returns an appropriate error condition and the resource is released.

Reliable change journal

A cornerstone of administering large numbers of files and directories within a volume is the ability to track succinctly all changes to them. The change journal is a new, optional NTFS service for this purpose. A record is written to the journal for each new change reason that occurs during an open/close session. Each record is given a 64-bit identity called the update sequence number (USN). USN identities are assigned in a monotonically increasing manner.

There are 21 different change reasons that produce records in the journal. The reasons include creating and deleting directories or files, modifying or appending data to the unnamed and named data streams of a file, and changing the attributes of a file or directory (including renaming). In addition, when the last handle on a modified file or directory is closed, NTFS writes a close-reason record with a summary of all changes. A caller may request the USN of a specific close record to determine if the caller provoked a change. If all operations performed on a file during an open/close session are read-only, no records appear in the change journal for the session.

Associated with each open file or directory, NTFS has an internal structure to track the changes that have occurred up to a given point in time. If there are multiple handles on a given file, the activity through all of them is reflected in the same underlying structure. The change journal records are forced to disk only when the corresponding changes are made durable. In the event of a crash, NTFS reconstructs the state of each file and, when appropriate, adds close records in the change journal to reflect the changes that were applied but whose change records got lost. Records produced by the change journal typically are smaller than 128 bytes.

Figure 2 illustrates a C definition of a valid change record. The fundamental advantage of the change journal is that the information produced is proportional to the number of changes, not the number of entities in a volume. Thus, if only one file is modified in a volume with 200 million files, the only records produced are the few that pertain to that file. A backup or file replication service, for example, need only pay attention to the one file that changed. Eliminating the need to periodically inspect all files in the name space helps storage management applications scale up to administer more entities.

  typedef struct {
   ULONG RecordLength;                  /* total record size, in bytes */
   USHORT MajorVersion;                 /* record format version */
   USHORT MinorVersion;
   ULONGLONG FileReferenceNumber;       /* MFT identity of the file */
   ULONGLONG ParentFileReferenceNumber; /* MFT identity of its directory */
   USN Usn;                             /* 64-bit update sequence number */
   LARGE_INTEGER TimeStamp;             /* time the record was written */
   ULONG Reason;                        /* bit mask of change reasons */
   ULONG SourceInfo;                    /* kind of event (see below) */
   ULONG SecurityId;                    /* volume-internal security ID */
   ULONG FileAttributes;                /* attributes at time of change */
   USHORT FileNameLength;               /* length of FileName, in bytes */
   USHORT FileNameOffset;               /* offset of FileName in record */
   WCHAR FileName[1];                   /* variable-length file name */
  } USN_RECORD, *PUSN_RECORD;

Figure 2. C definition of a valid change record.

Creation of a change journal allows the specification of a maximum desired size for the journal within the volume. NTFS preserves the 64-bit identity of the last USN journal created and requires new identities to differ from previous ones. This enables callers to determine the specific instance of the change journal producing the records. The deletion operation traverses all the entries in the master file table (MFT) and sets the USN to zero. If a crash happens while this operation is in progress, NTFS recovers its state correctly and restarts the operation using information preserved in persistent storage.

Within NTFS, the change journal is stored as a sparse data stream in which only the appropriate range of the file uses disk allocation. The USN of a record, in effect, denotes the byte offset in this stream. As the active range moves forward through the stream, NTFS frees the storage used to hold earlier records in 4-Kbyte units.

A journal can be started and stopped via commands. By querying the first valid USN, a caller determines whether the journal has been stopped since the last time the caller inquired. Each time a journal is started, the first valid USN is set to that of the first record that is written. To help applications find all the files with changes between a given pair of USN values, there is an MFT enumeration call to return all entries whose USN is between the specified values. This enumeration call synchronizes its read-only traversal with concurrent update activity using a locking scheme that does not guarantee repeatable reads [2].
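
For illustration, here is a sketch of a consumer that drains records from the journal. The control codes and structures (FSCTL_QUERY_USN_JOURNAL, FSCTL_READ_USN_JOURNAL, USN_JOURNAL_DATA, READ_USN_JOURNAL_DATA) are the names these interfaces took in the released Win32 SDK; opening \\.\C: requires administrative rights, and a journal must already exist on the volume.

   #include <windows.h>
   #include <winioctl.h>
   #include <stdio.h>

   /* Sketch: read and print change records from the journal on C:. */
   int main(void)
   {
       USN_JOURNAL_DATA jd;
       READ_USN_JOURNAL_DATA rd;
       BYTE buf[4096];
       DWORD cb;
       PUSN_RECORD rec;
       HANDLE hVol;

       hVol = CreateFileW(L"\\\\.\\C:", GENERIC_READ,
                          FILE_SHARE_READ | FILE_SHARE_WRITE,
                          NULL, OPEN_EXISTING, 0, NULL);
       if (hVol == INVALID_HANDLE_VALUE)
           return 1;

       if (!DeviceIoControl(hVol, FSCTL_QUERY_USN_JOURNAL, NULL, 0,
                            &jd, sizeof jd, &cb, NULL))
           return 1;

       ZeroMemory(&rd, sizeof rd);
       rd.StartUsn = jd.FirstUsn;         /* earliest record still on disk */
       rd.ReasonMask = 0xFFFFFFFF;        /* all change reasons */
       rd.UsnJournalID = jd.UsnJournalID; /* detect journal re-creation */

       while (DeviceIoControl(hVol, FSCTL_READ_USN_JOURNAL, &rd, sizeof rd,
                              buf, sizeof buf, &cb, NULL) && cb > sizeof(USN)) {
           /* The buffer begins with the USN to resume from. */
           rec = (PUSN_RECORD)(buf + sizeof(USN));
           while ((BYTE *)rec < buf + cb) {
               wprintf(L"USN %I64d reason %08lx %.*s\n",
                       rec->Usn, rec->Reason,
                       (int)(rec->FileNameLength / sizeof(WCHAR)),
                       (WCHAR *)((BYTE *)rec + rec->FileNameOffset));
               rec = (PUSN_RECORD)((BYTE *)rec + rec->RecordLength);
           }
           rd.StartUsn = *(USN *)buf;     /* continue where this read ended */
       }
       CloseHandle(hVol);
       return 0;
   }

The MFT enumeration call mentioned above follows the same pattern with a different control code and an input structure carrying the pair of USN bounds.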

In addition to the change reasons mentioned, the records also include a field, called SourceInfo, that identifies what kind of event produced the change. Three kinds are differentiated in this first release: normal changes produced by applications or services; changes produced by storage management applications, such as file replication or remote storage; and changes produced by applications that build auxiliary or derived data from the primary data present in the file. An example of this last kind is an application that produces thumbnail sketches of complete images to be used as icons for the file. Thus, a content indexing service can ignore changes that do not alter the contents of a file but only affect where its data is stored, such as those produced by the thumbnail application or by a remote storage service.

Current users of the change journal include the file content indexing service, the file replication service, and the remote storage service.

ACL check accelerator

In NT 4.0, all NTFS access control lists (ACLs) were stored individually. Version 5 introduces an optimization that stores each distinct ACL present in the volume only once. Through the use of appropriate hash functions, a common storage area is administered for all ACLs. This has two benefits: reduced storage for ACLs, and reduced time to determine whether an access control check succeeds or fails. While the interface to the operations that administer ACLs remains the same, the internal implementation reduces the overhead per file. File descriptor entries now hold only a reference to their ACL instead of the ACL itself.

Unique object identities

Structured storage has been present in the Windows platform for a long time, and embedding one document within another is an operation supported by all modern productivity applications. NTFS now supports assigning a 128-bit GUID to a file. This allows applications that want to embed a file within another file to do so and remain immune to inessential changes, such as the renaming of the embedded file. In addition, this mechanism allows appropriate services to track the relocation of a file across volumes in a computer and across computers within a domain.
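
The released SDK exposes this facility through a file system control call; here is a minimal sketch, with the control-code and structure names as they appear in the released Win32 SDK.

   #include <windows.h>
   #include <winioctl.h>

   /* Sketch: obtain (or assign) the 128-bit object ID of an open file.
      FSCTL_CREATE_OR_GET_OBJECT_ID returns the existing ID or mints a
      new one. */
   BOOL GetFileObjectId(HANDLE hFile, FILE_OBJECTID_BUFFER *oid)
   {
       DWORD cb;
       return DeviceIoControl(hFile, FSCTL_CREATE_OR_GET_OBJECT_ID,
                              NULL, 0, oid, sizeof *oid, &cb, NULL);
   }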


FILE AND DIRECTORY-ORIENTED FEATURES

Windows NT 5.0 contains a number of new file and directory-oriented features. The two principal underlying motivations are to provide a file management infrastructure that will scale better and to enable third parties to build additional services.

Defrag of directories

Whereas NTFS 4.0 provided primitives for file-oriented defrag operations, NT 5.0 includes new directory-oriented features. NTFS 5.0 externalizes the underlying allocation of the B-tree indices used to represent directories in NTFS, enabling directories to be relocated to contiguous ranges of storage. Any third party may use these new control functions to build a defrag utility.

Sparse files

Since the inception of NTFS, the offsets into a file and the length of a file have been 64-bit quantities. In version 5 of NTFS, a file may have regions of its data represented implicitly by the system, with no explicit storage allocated to them. The data in such a region is defined to be zero, so all reads directed to it return zeroes. This enables better support for scientific applications that operate on sparse vectors and sparse matrices. In addition, if a region of a file is never to be used again (such as a set of obsolete records in an append-only log), NTFS can zero out the region and free the underlying storage.
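
As a sketch of how an application exercises this feature through the Win32 interface (control-code and structure names as in the released SDK), the following marks a file sparse and then deallocates a region of it.

   #include <windows.h>
   #include <winioctl.h>

   /* Sketch: mark an open file sparse, then return a region of it to
      the allocator; subsequent reads of that region return zeroes. */
   BOOL ZeroOutRegion(HANDLE hFile, LONGLONG from, LONGLONG to)
   {
       DWORD cb;
       FILE_ZERO_DATA_INFORMATION zd;

       /* Ask NTFS to represent unallocated ranges implicitly. */
       if (!DeviceIoControl(hFile, FSCTL_SET_SPARSE, NULL, 0,
                            NULL, 0, &cb, NULL))
           return FALSE;

       /* Deallocate [from, to); the range reads back as zeroes. */
       zd.FileOffset.QuadPart = from;
       zd.BeyondFinalZero.QuadPart = to;
       return DeviceIoControl(hFile, FSCTL_SET_ZERO_DATA, &zd, sizeof zd,
                              NULL, 0, &cb, NULL);
   }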

Reparse points

Reparse points are a simple and elegant NT I/O solution that enables layered file system filters to add arbitrary behavior to a file or directory. The underlying mechanism modifies the normal file-open process, forcing its name parsing to restart with a new, user-controlled context. When a reparse point is encountered, the user-controlled, private reparse data is returned in an appropriate buffer and made available to all file system filters in the system [4,5]. The NTFS mechanism to support reparse points is a new system-controlled attribute called $REPARSE. This attribute can be set in any file or empty directory, and the user-controlled data stored in it can be up to 16 Kbytes in length.

Reparse points are created using a set operation and destroyed using a delete operation. The contents of a reparse point are retrieved using a get operation. Each of these operations requires a handle on the reparse point obtained using the open (or create) operation, possibly in an appropriate new mode.
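
A hedged sketch of the set operation through the Win32 interface follows. The control-code and structure names are those of the released SDK; the tag, GUID, and function name are placeholders for this example.

   #include <windows.h>
   #include <winioctl.h>
   #include <string.h>

   /* Sketch: attach private data to a file as a reparse point. Real
      tags are assigned by Microsoft, and third-party tags carry a
      zero M bit (see Figure 3). For a directory, add
      FILE_FLAG_BACKUP_SEMANTICS to the open. */
   BOOL SetMyReparsePoint(LPCWSTR path, ULONG tag, const GUID *guid,
                          const BYTE *data, USHORT cbData)
   {
       BYTE raw[REPARSE_GUID_DATA_BUFFER_HEADER_SIZE + 16 * 1024];
       REPARSE_GUID_DATA_BUFFER *rb = (REPARSE_GUID_DATA_BUFFER *)raw;
       DWORD cb;
       BOOL ok;
       HANDLE h;

       if (cbData > 16 * 1024)  /* $REPARSE data is limited to 16 Kbytes */
           return FALSE;

       /* Open without traversing any reparse point already on the file. */
       h = CreateFileW(path, GENERIC_WRITE, 0, NULL, OPEN_EXISTING,
                       FILE_FLAG_OPEN_REPARSE_POINT, NULL);
       if (h == INVALID_HANDLE_VALUE)
           return FALSE;

       rb->ReparseTag = tag;
       rb->ReparseDataLength = cbData;
       rb->Reserved = 0;
       rb->ReparseGuid = *guid;
       memcpy(rb->GenericReparseBuffer.DataBuffer, data, cbData);

       ok = DeviceIoControl(h, FSCTL_SET_REPARSE_POINT, rb,
                            REPARSE_GUID_DATA_BUFFER_HEADER_SIZE + cbData,
                            NULL, 0, &cb, NULL);
       CloseHandle(h);
       return ok;
   }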

The data stored in $REPARSE must include a 32-bit tag. NT I/O uses these tag values to differentiate among the various uses of reparse points. NT I/O has a set of predefined tags and a range reserved for Microsoft use. Third parties requiring tags obtain them through a facility in Redmond. The GUID value required in the reparse buffer is always associated with the same corresponding tag. This level of redundancy facilitates debugging and provides a more efficient identification scheme.

The NT I/O reparse tags have a minimal amount of structure built into them. Of the 32 bits, the low 16 bits determine the kind of reparse point. Of the high 16 bits, 13 are reserved for future use and three denote specific attributes of the tags and of the entities represented by the reparse point.

If the highest order bit is a one, this indicates a Microsoft tag; all ISVs use tags with a zero high-order bit. The second bit is the high-latency bit; when set to one, the UI will display this entity with an icon overlay to denote it as such. The third bit is the name surrogate indicator; when set to one, the reparse point represents another named entity in the system. This structure enables applications to recognize generic characteristics for different kinds of reparse points and also enables user interfaces to depict appropriate information. In the shell for NT 5.0, for example, files with high-latency reparse point tags are presented with a small hourglass icon overlay. Figure 3 shows a tabular summary of the bit structure of NT I/O reparse tags.

      3 3 2 2 2 2 2 2 2 2 2 2 1 1 1 1 1 1 1 1 1 1
      1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0
     +-+-+-+-------------------------+-------------------------------+
     |M|L|N|      Reserved bits      |           Tag value           |
     +-+-+-+-------------------------+-------------------------------+

Figure 3. Tabular summary of the bit structure of NT I/O reparse tags. The reparse tags are 32-bit ULONGs. M is the Microsoft bit; a 1 denotes a tag owned by Microsoft. L is the high-latency bit; a file with this tag is expected to have a long latency to retrieve the first byte of data. N is the name-surrogate bit; a file with this tag represents another named entity in the system.
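
Following the layout in Figure 3, a filter or application can classify a tag with simple mask tests. These macros are illustrative, written directly from the figure; the released SDK provides equivalent predefined tests.

   /* Illustrative classification macros derived from Figure 3. */
   #define TAG_IS_MICROSOFT(t)      (((t) & 0x80000000) != 0) /* M bit */
   #define TAG_IS_HIGH_LATENCY(t)   (((t) & 0x40000000) != 0) /* L bit */
   #define TAG_IS_NAME_SURROGATE(t) (((t) & 0x20000000) != 0) /* N bit */
   #define TAG_KIND(t)              ((t) & 0x0000FFFF) /* low 16 bits */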

The NT I/O subsystem is extensible by installing kernel-level file system filters that enrich NTFS base operations. These filters are installed in a system-controlled stack, and each layer has two opportunities to inspect I/O requests: when the request descends through the stack and passes the corresponding layer and, using IoCompletion routines, when the request returns from lower layers ascending the stack. NT also lets you control the relative order of these filters in the stack, using information stored in the registry. In Figure 4, for example, NTFS is placed in this stack below the file system filters that require NTFS as a service and above the device drivers that NTFS uses.


Figure 4. Block diagram of one file system filter layered over the NTFS file system in the NT I/O stack.

When a reparse point is encountered during open, the file system returns to NT I/O the user-controlled contents of $REPARSE and uses the return code STATUS_REPARSE to signal this result. It also leaves unaltered the file name that was being parsed, and returns the length of the file name that remained to be parsed. A layered filter in the NT I/O stack can then identify and process this information. At the top of the I/O subsystem stack, NT I/O can identify the case of name grafting operations and begin the parsing afresh.

FS filters operate efficiently because they need only process open IRPs returning from NTFS. They do so via appropriate completion routines invoked by NT I/O. Once they determine what is to be done for a given request, they can establish appropriate state to track future IRPs directed to the underlying NTFS file object.

The reparse point mechanism encourages third-party FS filters to add advanced storage management features. Current uses of reparse points within Microsoft include the single-instance storage service, the remote storage service, the native structured storage service, and the name redirection service used in volume mount points.

When a reparse point is used by a hierarchical storage manager service to denote the part of a file's state that is in remote storage, the reparse point data becomes an extension of the data needed in NTFS to retrieve a file. In effect, the reparse point implemented in NT 5.0 has all the allocation information necessary to identify and retrieve the state of a file stored remotely.


NEW FILE-BASED SERVICES

All the new in-the-box storage management services described in this section help administer large collections of files. Some save storage, complementing the role of data compression; some help users find stored files whose names have long been forgotten; and some manage proximity to data, reducing access latency.

File replication

The administration of large collections of machines that share files is simplified by an appropriate file replication service. By establishing that a given directory and its subdirectories are to be replicated on a set of computers, users can access these common files without a latency penalty. If the collection being replicated is used mostly for reads with little update activity, the complete collection can be kept perfectly replicated with minimal overhead. The replication scheme is multimaster and does not require clock synchronization to operate. Thus, updates may originate from any member of a so-called replica set. Updates are then propagated to the complete replica set, taking into account the underlying topology and overall network traffic.

The file replication service uses the change journal to detect updates of interest. This reduces the update overhead to a minimum, as only the files that change produce actions on the system. After the initial setup, files that do not change are never accessed by the replication service.

Volume-wide services to optimize storage use

Windows NT 5.0 will provide two services for the efficient and economical management of storage in both server-based and remote-storage environments.

Single-instance storage. In server-based environments, it is common to find duplicate files in a volume. Most often this is due to system binaries, application files (like fonts), or shared libraries stored by multiple users. The single-instance storage service identifies data streams that are duplicated in a volume and replaces them with appropriate references to a common copy. If the common data is modified through any reference, the system uses a copy-on-write mechanism to produce a new version of the data that is accessed instead of the common copy.

The advantages of reducing the number of copies stored in a volume include space savings and simplified administration. In the case of system binaries, for example, the deployment of a new version of a program is facilitated if the administrator needs to update only one file.

Remote storage. One truism about data storage is that the set of actively accessed files changes over time, leaving behind a number of files that are seldom accessed. Managing the placement of files in different storage pools in a hierarchical manner captures this characteristic of the storage workload. However, hierarchical storage management also introduces the crucial problem of access latency.

With the remote storage services present in NT 5.0, we have begun to support high-latency data in several layers of the system. These services implement a policy whereby data in remote storage is transparently recalled to local storage when it needs updating. Moreover, operations that update file attributes do not produce a recall. For example, if a user sets the compression attribute on a high-latency file, NT will not simply recall the data and do the compression, but will defer this action until the data is actually recalled.

By providing a reference implementation that preserves the correct operation of legacy applications without adding unnecessary "no space" error conditions, we can extend storage capacity via remote storage while minimizing unexpected behavior.

Content indexing

The ability to have an index of the content of all files in a system enables a completely new class of queries that help find long-forgotten files. Given that even low-end PCs now have four- and six-gigabyte disks, the chances of not finding a stored file are high. The content index service shown in Figure 5 builds appropriate indices that help identify files by their content. With this service, users can search using any set of words and use operators like "next to," in addition to the traditional Boolean "or" and "and."


Figure 5. Block diagram of the content indexing service. Content filters use the IFilter interface.

The indexing technology is extensible. Any third party can install recognizer software that produces the appropriate content for the indexing engine. There is a public interface, IFilter, which abstracts all the operations needed to index content. In addition, the service recognizes the short-lived nature of temporary files and of files being actively modified, and it inhibits their indexing until the updates have finished.
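
As an illustration, here is a sketch in C of the consumption loop an indexing client runs against an IFilter instance. Obtaining the instance is outside the sketch, and the header and constant names are those the interface took in the released Platform SDK.

   #include <windows.h>
   #include <filter.h>   /* IFilter, STAT_CHUNK, CHUNK_TEXT */
   #include <filterr.h>  /* FILTER_E_* result codes */
   #include <stdio.h>

   /* Sketch: drain the plain text a content filter exposes. */
   void DumpFilterText(IFilter *flt)
   {
       ULONG flags;
       STAT_CHUNK chunk;
       WCHAR buf[256];
       HRESULT hr;
       ULONG cwc;

       if (FAILED(flt->lpVtbl->Init(flt, IFILTER_INIT_CANON_PARAGRAPHS,
                                    0, NULL, &flags)))
           return;

       /* Walk the document chunk by chunk; each text chunk may yield
          several buffers of text. */
       while (flt->lpVtbl->GetChunk(flt, &chunk) == S_OK) {
           if (!(chunk.flags & CHUNK_TEXT))
               continue;  /* property-value chunks are read via GetValue */
           for (;;) {
               cwc = 255;
               hr = flt->lpVtbl->GetText(flt, &cwc, buf);
               if (hr == FILTER_E_NO_MORE_TEXT || FAILED(hr))
                   break;
               buf[cwc] = L'\0';
               wprintf(L"%s", buf);
           }
       }
   }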


CONCLUSION

This article describes two new mechanisms present in Windows NT 5.0: the reliable change journal and reparse points. It also presents storage management infrastructure subsystems and services that help administer storage. We believe that this version of Windows NT meets the underlying goal for storage management better than ever: to facilitate the robust and efficient administration of an increasing volume of data whose lifetime grows over time.


Acknowledgments

The work presented in this article is the effort of several Microsoft and external teams. We thank Lou Perazzoli, Frank Artale, Keith Kaplan, Mark Zbikowski, Bob Rinne, Tom Miller, Sunil Pai, Bill Bolosky, David Orbits, Glenn Thompson, Kevin Phaup, Nikhil Joshi, and the cast of hundreds that made Windows NT 5.0 a reality.

The information contained in this article represents the current view of Microsoft Corp. as of the date of publication. Due to ongoing development work and changing market conditions, it should not be interpreted as a commitment on the part of Microsoft, and Microsoft cannot guarantee the accuracy of any information presented after the date of publication.

References


1. D.A. Solomon, Inside Windows NT, 2nd ed., Microsoft Press, Redmond, Wash., 1998.
2. L.F. Cabrera et al., "Implementing Atomicity in Two Systems: Techniques, Trade-offs, and Experience," IEEE Trans. Software Eng., Oct. 1993, pp. 950-961.
3. H. Custer, Windows NT File System, Microsoft Press, Redmond, Wash., 1994.
4. J. Richter, Advanced Windows, 3rd ed., Microsoft Press, Redmond, Wash., 1997.
5. R. Nagar, Windows NT File System Internals: A Developer's Guide, O'Reilly, Sebastopol, Calif., 1997.

Luis Felipe Cabrera is an architect in the NT Base Development group. His areas of responsibility include remote storage, backup and restore, removable storage, and automated system recovery. For NT 5.0 he participated in the deployment of reparse points and volume mount points. He joined Microsoft in 1996. Cabrera holds a PhD from the University of California at Berkeley.

Brian Andrew has been a software developer in the Windows NT group since 1990. He has worked on NTFS since its original implementation. For NT 5.0 he participated in the deployment of the change journal, sparse files, and several other features.

Kyle Peltonen has been a software developer at Microsoft since 1989. He is the lead developer for Microsoft Index Server and has been working on full-text search and retrieval since 1991. Before joining Microsoft, he earned a BS at the Massachusetts Institute of Technology.

Norbert Kusters has been a software developer at Microsoft in the Windows NT group since 1990. For NT 5.0 he participated in the deployment of volume management, the Partition Manager, and the Mount Manager.