About Duffbert...

Duffbert's Random Musings is a blog where I talk about whatever happens to be running through my head at any given moment... I'm Thomas Duff, and you can find out more about me here...



The Indexer and Its Functionality

Category Software Development

Good synopsis article from the KnowledgeBase...

The Indexer and Its Functionality

Document Number:  7003075


Introduction

Within a production environment, it is extremely important to keep all information within a database current.  Notes accomplishes this task through a function called the Indexer.  This technical paper explains what the Indexer is and how it functions within Notes.

What is the Indexer?


The Indexer is composed of three components:

1.  The Update Task
2.  The Updall Task
3.  The Notes Indexing Facility (NIF) Subsystem

The Indexer as a whole is responsible for keeping the active views within databases current.  It is the Indexer that processes any requests for changes that have been made to documents within a database, so that the active view collections and the full text indexes display the most recent modifications and current information.

Update

Update is a Domino server task that should be running at all times.  It can be loaded either manually at the server console or specified in the ServerTasks= line in the NOTES.INI file.  When it is specified in the ServerTasks line, Update starts automatically at server startup.

Manually from the server console:  load update

In the NOTES.INI:  ServerTasks=update

The Update task works continuously from a queue called the $UpdateQueue.  When a change such as a deletion, addition, or edit is made in a database, a corresponding request is entered into the Update queue.  Update checks the queue every 5 seconds for new requests and takes them from the queue on a first-come, first-served basis.

The $UpdateQueue is a hard-coded queue that has a maximum capacity of 500 requests.  These requests include updates to view indices as well as full text indexes.  When a request is placed into the queue, the actual names of the databases and paths to the databases are stored in the queue, not the view names themselves.
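
As a rough illustration of the queue described above, here is a minimal Python sketch of a bounded first-come, first-served queue that stores database paths rather than view names.  The class name, the method names, and the sample paths are hypothetical, and rejecting a request when the queue is full is an assumption, since the paper does not say what happens at the 500-request limit.

```python
from collections import deque

MAX_REQUESTS = 500  # maximum capacity stated in the paper


class UpdateQueue:
    """Toy FIFO queue that holds database paths, not view names."""

    def __init__(self, capacity=MAX_REQUESTS):
        self.capacity = capacity
        self._requests = deque()

    def enqueue(self, db_path):
        # Assumption: a request is simply rejected once the queue is full.
        if len(self._requests) >= self.capacity:
            return False
        self._requests.append(db_path)
        return True

    def dequeue(self):
        # First-come, first-served: the oldest request leaves first.
        return self._requests.popleft() if self._requests else None

    def __len__(self):
        return len(self._requests)


queue = UpdateQueue()
queue.enqueue("mail/jsmith.nsf")   # hypothetical database paths
queue.enqueue("names.nsf")
first = queue.dequeue()            # the request that arrived first
```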

Suppression Time

Although Update checks the queue every 5 seconds, it does not refresh views at the same interval.  Instead, it uses what is known as the Update Suppression Time.  With Suppression Time, Update waits for multiple, similar requests to be deposited in the queue and then batches them.  In this way, Update processes all changes to a database at the same time.  Since the Indexer is the most CPU-intensive Domino server task, batching requests reduces the performance impact on a server significantly.  It is only after the Suppression Time has passed that Update forces the update of the view collection and the full text indexes as requested.  By default, the Suppression Time is 5 minutes; however, this can be overridden with the following NOTES.INI parameter:

Update_Suppression_Time=minutes
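
The batching idea can be sketched in Python as follows.  This is only a model under stated assumptions: the class and method names are hypothetical, times are plain seconds, and the suppression window is assumed to start at the first pending request for a database, which the paper does not specify.

```python
SUPPRESSION_SECONDS = 5 * 60  # default Suppression Time of 5 minutes


class SuppressionBatcher:
    """Collects requests per database and releases them after the window."""

    def __init__(self, suppression_seconds=SUPPRESSION_SECONDS):
        self.suppression_seconds = suppression_seconds
        self._first_seen = {}   # db_path -> time of first pending request
        self._pending = {}      # db_path -> number of batched requests

    def add_request(self, db_path, now):
        self._first_seen.setdefault(db_path, now)
        self._pending[db_path] = self._pending.get(db_path, 0) + 1

    def due(self, now):
        """Return {db_path: request_count} for databases whose window passed."""
        ready = {
            db: self._pending[db]
            for db, seen in self._first_seen.items()
            if now - seen >= self.suppression_seconds
        }
        for db in ready:
            del self._first_seen[db]
            del self._pending[db]
        return ready


batcher = SuppressionBatcher()
batcher.add_request("names.nsf", now=0)
batcher.add_request("names.nsf", now=20)
batcher.add_request("mail/jsmith.nsf", now=200)
early = batcher.due(now=120)   # nothing has waited 5 minutes yet
later = batcher.due(now=300)   # both names.nsf requests are handled together
```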

Which Views to Update within a Database?

How does the Indexer know which views to update, if just the database name is placed into the $UpdateQueue?  The Indexer does this by means of the following logic:

Each database has a database header.  The database header contains a field called DATAMODIFIEDTIME.  DATAMODIFIEDTIME is the last time that data was modified within a database.  

Each view collection also has a header.  This header contains a field called MODIFIEDTIME.  The MODIFIEDTIME is the last time the update process started on a particular view.  

When a request has been entered into the queue, the Update task compares the DataModifiedTime in the database header to the ModifiedTime of EVERY view in the database that has been placed in the queue.  If the ModifiedTime of a view is older than the DataModifiedTime in the database, Update will refresh the view.  Likewise, if the ModifiedTime of a view is more recent than the DataModifiedTime, the Update task will skip the view and move on to the next one in the database.
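
The comparison above amounts to a simple filter.  The following Python sketch assumes the two header fields are available as ordinary datetime values; the function name and the sample timestamps are made up for illustration.

```python
from datetime import datetime


def views_needing_refresh(data_modified_time, view_modified_times):
    """Return the views whose ModifiedTime is older than DataModifiedTime."""
    return [
        name
        for name, modified in view_modified_times.items()
        if modified < data_modified_time
    ]


# Illustrative values only.
db_data_modified = datetime(2004, 3, 1, 10, 30)
views = {
    "($Users)": datetime(2004, 3, 1, 10, 0),    # older than the data: refresh
    "($Groups)": datetime(2004, 3, 1, 10, 45),  # newer than the data: skip
}
stale = views_needing_refresh(db_data_modified, views)
```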


Multiple Update Tasks

Beginning in R4, it is possible to have multiple Update tasks running on a server.  To enable multiple Update tasks, one of the following steps must be performed:

1.        The following parameter must be added to the NOTES.INI:

Updaters=number of Update tasks to run on the server

2.        Update must be loaded manually multiple times at the server console, using the same load update command shown earlier, once for each additional task.

As with running one Update task, the Indexer reads a request from the queue, removes it from the queue, and then performs the indexing functions on the view.  A single Update task works on a single database that is pulled from the queue.  If a second request comes into the queue, the next Updater then removes the request from the queue and starts working on it.  If both of these requests are for the same database, then the two tasks will work in tandem on the same database.  It is much more likely, however, that the two tasks will work on different databases.

Multiple Update tasks can update different view indexes within the same database at the same time.  However, multiple tasks cannot update the same view at the same time.  Nor can multiple Update tasks run on the same full text index simultaneously, since it is one index.

If one Update process is ended at the Domino server console (tell update quit), then ALL other Update tasks will terminate as well.

The original theory behind loading multiple Updaters was to improve performance on the server and to expedite the indexing process.  However, having multiple Updaters does not guarantee improved performance.  In fact, in most cases there is a slight performance degradation.  This is because the Update tasks may contend for the same database semaphore.

For example, if the Indexer sees a view that is already being rebuilt/refreshed by another Update task, it will skip the view and begin to refresh another one.  It does this check conservatively, however, since it would not be beneficial to skip a view and not index it at all.  As a result, there are some rare cases when a view is just starting to be updated, that two different Update tasks might think they are the first one to open a view and start indexing it.  To prevent two Update tasks from trying to update the same view at the same time, the first Update task locks a semaphore so that the other Update task cannot access the view.  The other Update task then waits until the semaphore is unlocked.

While this is occurring, the Log shows that both Updaters are indexing the same view at the same time, when in fact this is not possible.  One Update task is just waiting for the other to complete.

Indexer              Updating names.nsf view '($Locations)'      ---->  Updater #1
Indexer              Updating names.nsf view '($Locations)'      ----->  Updater #2
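
The waiting behavior can be modeled with ordinary thread locks.  In this Python sketch a threading.Lock stands in for the view semaphore and two threads stand in for the two Update tasks; all names are hypothetical and the code does not reflect how Domino implements its semaphores.

```python
import threading

view_locks = {}                    # view name -> lock (stand-in for the semaphore)
registry_lock = threading.Lock()   # protects the view_locks dictionary itself
refresh_log = []                   # (updater, view) pairs in completion order


def lock_for(view_name):
    with registry_lock:
        return view_locks.setdefault(view_name, threading.Lock())


def refresh_view(updater_id, view_name):
    # Only one updater can hold a given view's lock; the other waits here,
    # even though both would log that they are "updating" the view.
    with lock_for(view_name):
        refresh_log.append((updater_id, view_name))


threads = [
    threading.Thread(target=refresh_view, args=(n, "($Locations)"))
    for n in (1, 2)
]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Both updaters finish, but never inside the locked section at the same time, which matches the paper's point that one task is simply waiting for the other.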

Updall

Updall is a single instance of Update.  It does not operate off a queue like Update but accomplishes the same tasks in a different way.  Updall runs once until it has processed every Notes database, refreshing their views and/or updating their full text indexes.  It then terminates and does not run again, unless loaded manually or until the next day.  By default, the Updall task is run at 2:00 AM.  It is specified in the ServerTasksAt2 line in the NOTES.INI as follows:

ServerTasksAt2=Updall

Updall does, however, perform other functions that Update does not.  In the design of a view, it is possible to specify the frequency with which a view index is discarded (that is, Never, After each use, If inactive for XX days).  It is the responsibility of the Updall task to remove the view index if a discard option is specified.  Regardless of the discard option selected, the index is not actually discarded immediately.  Rather, the index is removed the next time the Updall task runs.  For example, suppose a view has a discard option of "After each use."  If you exit from the view at 1:00 PM, the view index will not be removed until Updall is run (usually at 2:00 AM).
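
The discard decision made at Updall time can be sketched as below.  The option strings, the function name, and the dates are hypothetical; the sketch only mirrors the three choices named above and assumes the last time the view was used is known.

```python
from datetime import datetime, timedelta


def should_discard(option, last_used, now, inactive_days=None):
    """Decide, when Updall runs, whether a view index should be removed.

    option is one of "never", "after_each_use", "if_inactive".
    """
    if option == "never":
        return False
    if option == "after_each_use":
        # Removed at the next Updall run once the view has been used.
        return last_used is not None
    if option == "if_inactive":
        return (
            last_used is not None
            and now - last_used >= timedelta(days=inactive_days)
        )
    raise ValueError(f"unknown discard option: {option}")


updall_run = datetime(2004, 3, 2, 2, 0)    # the 2:00 AM Updall run
closed_at = datetime(2004, 3, 1, 13, 0)    # view exited at 1:00 PM the day before
discard_now = should_discard("after_each_use", closed_at, updall_run)
```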

Switches

Updall can also be run with a variety of switches.  Running this task without a switch simply refreshes outdated views and full text indexes and discards view indexes.  The most commonly used switches and their functions are as follows:

-R  discards and rebuilds the view indexes
-X  discards and rebuilds the full text indexes
-C  builds the view indexes on a database that currently does not have any views built

When Updall -r is used to rebuild the views of a database, Updall will discard and then rebuild each view within the database one at a time.  It does not rebuild all views at once.  Updall -r continues until all views in the database have been rebuilt and then terminates.

NIF Subsystem

The Notes Indexing Facility (NIF) is made up of a multitude of functions within Notes that allow a Domino server to keep data notes ordered and current within a view.  Specifically, it does the following:

- Updates Indexes
- Opens and closes view collections
- Locates Index entries

The majority of these requests are made by the server when users open and close databases.  For example, it is the NIF subsystem, not the Update process itself, that forces the update of a view collection when a user switches between views.  If a user modifies one or more documents within a view, switches quickly to another view, and then switches back to the original view, the changes should be visible to the user almost immediately.  The reason is that the NIF subsystem is actually refreshing the view, not the Update task.

When a user or server opens a view, a call is made to NIFOpenCollection().  NIFOpenCollection(), depending on a flag passed into this function, checks to see whether the view is up to date.  If it is not, NIFOpenCollection() will call NIFUpdateCollection().  NIFUpdateCollection() forces the update of the view collection.  No request is put into the Update queue.  If some other process happened to put a request into the Update queue for this database, the Update task will not refresh the view a second time, because it only updates views that are out of date.  If a view has already been brought up to date by NIF, the Update task will skip it and move on to the next view in the database.
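
The control flow described above can be modeled in a few lines of Python.  This is not the Notes C API: open_collection and update_collection are hypothetical stand-ins for NIFOpenCollection() and NIFUpdateCollection(), the ensure_current argument models the flag, and timestamps are plain integers.

```python
class Collection:
    """Toy view collection that remembers when it was last brought up to date."""

    def __init__(self, modified_time):
        self.modified_time = modified_time
        self.refresh_count = 0


def update_collection(collection, data_modified_time):
    # Stand-in for NIFUpdateCollection(): bring the view index up to date.
    collection.modified_time = data_modified_time
    collection.refresh_count += 1


def open_collection(collection, data_modified_time, ensure_current=True):
    # Stand-in for NIFOpenCollection(); ensure_current models the flag.
    if ensure_current and collection.modified_time < data_modified_time:
        update_collection(collection, data_modified_time)
    return collection


view = Collection(modified_time=100)
open_collection(view, data_modified_time=150)                        # refreshed
open_collection(view, data_modified_time=200, ensure_current=False)  # left stale
```

The second call corresponds to the case where no flag is passed: the view opens, but the collection is not refreshed.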

NIFOpenCollection() is called in many places in the code.  In most, a flag is passed to make sure the collection is up to date before opening the view.  In a few cases, however, a flag is never passed to ensure that the view is current.  For example, when View Refresh frequency is set to "Manual", a flag is never passed when the Notes client calls NIFOpenCollection(), and consequently the view collection is not automatically refreshed.


PAGES

The NIF Subsystem uses storage containers for caching views.  These containers are made up of units called PAGES.  Pages are used to store a portion of the index that is being viewed in memory.  They are faulted into memory on demand and, once in memory, are discardable.  It is the NSF_BUFFER_POOL that holds the pages of memory that contain the information pertaining to open views.

For example, if a user scrolls once or twice, the response is quick.  A few more scrolls, however, cause a delay.  This is because the user has gone past the page that is currently cached, and a new page must be retrieved.  Pages are allotted in 4K blocks.  If every user is accessing the same view in a database, then one page is being used.  If five users are accessing five different views, then five pages of memory are being used.
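
A small Python sketch of on-demand paging follows.  The class name and byte offsets are hypothetical, and discarding the least recently used page is an assumption; the paper says only that pages are discardable once in memory.

```python
from collections import OrderedDict

PAGE_SIZE = 4 * 1024  # pages are allotted in 4K blocks


class PageCache:
    """Toy cache of view index pages (max_pages must be at least 1)."""

    def __init__(self, max_pages):
        self.max_pages = max_pages
        self._pages = OrderedDict()   # (view, page_number) -> page contents
        self.faults = 0

    def read(self, view, offset):
        key = (view, offset // PAGE_SIZE)
        page = self._pages.get(key)
        if page is None:
            self.faults += 1                      # page fault: load on demand
            page = f"page {key[1]} of {view}"
            self._pages[key] = page
            if len(self._pages) > self.max_pages:
                self._pages.popitem(last=False)   # assumption: drop oldest page
        else:
            self._pages.move_to_end(key)          # mark as most recently used
        return page


cache = PageCache(max_pages=2)
cache.read("($Users)", 100)     # first access faults page 0 into memory
cache.read("($Users)", 3000)    # still page 0, so the response is quick
cache.read("($Users)", 5000)    # past the cached page: page 1 must be loaded
```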

The NIF Pool

The NIF Pool is a pool in Notes that controls the maximum amount of memory that is allotted for views and users of those views.  This pool consists of 384 subpools and currently has a maximum ceiling of 25MB in Notes 4.5x and 48 MB in Notes 4.6.  This is a hard-coded limit that cannot be adjusted in Notes R4.  Views take whatever memory is needed.  This pool resets upon reboot of the server.

Prior to Notes Release 4.5.3:  Notes used what is called the Optimize for Speed mode when searching for space in the NIF pool.  This means that Notes would search for a contiguous block of memory that it could use.  For example, if it needed 4MB of space, it would search and use an entire 4MB block.  Over time, however, this sometimes led to fragmentation of the NIF pool, causing it to reach the 24MB ceiling faster.

Notes 4.5.3 and later:  Each subpool within the NIF starts in Optimize for Speed mode.  Once a subpool reaches 75% of its value, it changes to Optimize for Space mode.  At this point it begins a cleanup of the NIF.
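
The mode switch can be expressed as a threshold check.  In this Python sketch the function name and the byte counts are hypothetical, and "75% of its value" is interpreted as 75% of the subpool's capacity, which is an assumption.

```python
SWITCH_THRESHOLD = 0.75  # the 75% mark described for Notes 4.5.3 and later


def allocation_mode(used_bytes, capacity_bytes):
    """Return the search mode a subpool would be in at this fill level."""
    if used_bytes / capacity_bytes >= SWITCH_THRESHOLD:
        return "optimize_for_space"
    return "optimize_for_speed"


mode_at_half = allocation_mode(50, 100)
mode_at_three_quarters = allocation_mode(75, 100)
```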

Composition of a View

Every view is made up of a collection.  A collection is a container for a view that holds three different indexes, which are as follows:

1.  An index sorted by UNID
2.  An index of parent-child documents
3.  A collation of the index  (This is how the view is sorted.)

These indexes are stored in what is called a B-Tree Structure, which is simply a means by which information can be stored and accessed quickly.

Every view in a database must be built before it can be accessed.  This is done through a VIEW NOTE.  Every view contains a view note, which stores all the information necessary to build the view.  This information is as follows:

$Title - Contains the title of the view.  For example, ($People)

$Formula - Contains the selection formula of the view.  For example:

@If(Owner = "" ; "Calendar Profile" ; "Calendar Profile for " + @Name([Abbreviate];Owner))

$Formula Class - Contains the selected notes class type.  For example, View Note.

$Collation - Contains information on how documents are sorted within the view.  This can be specified in the Column Properties.

$View Format - Contains all format and formula information.

$Collection - Contains a description of how the column looks.  For example, Font type.

$Collection - Created on the fly, if it does not already exist, when the view is opened.




When a view is built, refreshed, or rebuilt, all three indexes within the collection are built/refreshed/rebuilt.


What Triggers the Indexer?

A view will be refreshed when any of the following occur:

1.        Router - When the Router deposits a message in a mail database, it also places a corresponding request into the Update queue.

2.        Replication - When databases replicate, the replicator places a request into the Update queue.

3.        Server - When a user logs a session or closes a session and modifications have been made to a database, a request is added to the Update Queue.

The Indexer plucks requests from the queue on a first-come first-served basis.  No request takes precedence over another.

A view will be rebuilt when any of the following occur:

1.        Updall -r is run manually on a view.

2.        The design of a view has been changed (the column or selection formulas).  The view needs to be recompiled.

3.        The Collation Table has changed.  The collation table specifies how the view is sorted.  If this changes, every index within the view collection must be rebuilt.

4.        SHIFT+F9 rebuilds the view you are currently in.


SERVER_NAME_LOOKUP_NOUPDATE=1

Under normal circumstances, the Indexer prevents a user or server from accessing a view when it is updating it.  The indexer locks a semaphore so that other tasks cannot use the view index until it is refreshed.  As a result, during the time that the Indexer is updating a view, several NAMELookup requests from users and servers queue up behind the Indexer, waiting for the view to be brought up to date.  The view is inaccessible until the Indexer has completed and the semaphore has been unlocked.

This poses a significant problem when users or server tasks such as the mail router attempt to access the Public Name and Address Book (NAB) while a view is being updated.  For this reason, the NOTES.INI parameter SERVER_NAME_LOOKUP_NOUPDATE=1 was introduced in Notes 4.13 and 4.51.  With this parameter set, a name lookup no longer triggers a view update through the NIF subsystem when there have been changes since the last time the view was accessed; the lookup reads the existing index instead.  The parameter applies only to name lookups in views in the NAB and does not work with other databases.  This is extremely helpful for mail routing and authentication purposes because processes are no longer denied access to views.

Although access to the views is granted, the view may not be current.  This is because it is being accessed before the Indexer has refreshed the view collection.  Since the view is not updated when a name lookup is done to the NAB, an entry is deposited into the $UpdateQueue instead.  The next time that Update runs, it will process the request.
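
The difference between the default behavior and the no-update behavior can be sketched as follows.  The function name, the dictionary layout, and the sample data are hypothetical, and timestamps are plain integers; the sketch models only the decision described above, not the actual lookup code.

```python
def name_lookup(view, data_modified_time, update_queue, no_update=False):
    """view is a dict with 'db', 'modified_time' and 'entries' keys."""
    stale = view["modified_time"] < data_modified_time
    if stale and no_update:
        # Serve the possibly outdated index and leave the refresh to Update.
        update_queue.append(view["db"])
    elif stale:
        # Default behavior: the lookup waits while the view is refreshed.
        view["modified_time"] = data_modified_time
    return view["entries"]


pending = []
users_view = {"db": "names.nsf", "modified_time": 100, "entries": ["jsmith"]}
result = name_lookup(users_view, 150, pending, no_update=True)
```

With no_update set, the lookup returns immediately from the existing index and the database name is queued for the Update task.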

Server_Name_Lookup_Noupdate=1 does not work for complete rebuilds because when a view is marked as needing a complete rebuild, it cannot be read until it is rebuilt.  For example, if an Updall -r is issued on the NAB, this parameter will not prevent users from being locked out of the views within the NAB when attempting to access the server.

Name lookup views include the following:

($ServerAccess)
($Users)
($Groups)
($NameFieldLookup)


Summary

Notes depends heavily on the three processes that comprise the Indexer.  It is because of the Indexer that users are able to view the most current information that is available within Notes databases in a timely manner.  Because of the necessity to provide a production environment with current information as quickly as possible, the Indexer has become one of the most relied-upon functions in Notes.
