Network Working Group                                       L. Steinberg
Request for Comments: 1224                               IBM Corporation
                                                                May 1991

        Techniques for Managing Asynchronously Generated Alerts

Status of this Memo

   This memo defines common mechanisms for managing asynchronously
   produced alerts in a manner consistent with current network
   management protocols.

   This memo specifies an Experimental Protocol for the Internet
   community.  Discussion and suggestions for improvement are
   requested.  Please refer to the current edition of the "IAB Official
   Protocol Standards" for the standardization state and status of this
   protocol.  Distribution of this memo is unlimited.

Abstract

   This RFC explores mechanisms to prevent a remotely managed entity
   from burdening a manager or network with an unexpected amount of
   network management information, and to ensure delivery of
   "important" information.  The focus is on controlling the flow of
   asynchronously generated information, and not how the information is
   generated.

Table of Contents

   1. Introduction...................................................  2
   2. Problem Definition.............................................  3
   2.1 Polling Advantages............................................  3
       (a) Reliable detection of failures............................  3
       (b) Reduced protocol complexity on managed entity.............  3
       (c) Reduced performance impact on managed entity..............  3
       (d) Reduced configuration requirements to manage remote entity  4
   2.2 Polling Disadvantages.........................................  4
       (a) Response time for problem detection.......................  4
       (b) Volume of network management traffic generated............  4
   2.3 Alert Advantages..............................................  5
       (a) Real-time knowledge of problems...........................  5
       (b) Minimal amount of network management traffic..............  5
   2.4 Alert Disadvantages...........................................  5
       (a) Potential loss of critical information....................  5
       (b) Potential to over-inform a manager........................  5
   3. Specific Goals of this Memo....................................  6
   4. Compatibility with Existing Network Management Protocols.......  6

Steinberg                                                       [Page 1]

RFC 1224        Managing Asynchronously Generated Alerts        May 1991

   5. Closed Loop "Feedback" Alert Reporting with a "Pin" Sliding
      Window Limit...................................................  6
   5.1 Use of Feedback...............................................  7
   5.1.1 Example.....................................................  8
   5.2 Notes on Feedback/Pin usage...................................  8
   6. Polled, Logged Alerts..........................................  9
   6.1 Use of Polled, Logged Alerts.................................. 10
   6.1.1 Example..................................................... 12
   6.2 Notes on Polled, Logged Alerts................................ 12
   7. Compatibility with SNMP and CMOT............................... 14
   7.1 Closed Loop Feedback Alert Reporting.......................... 14
   7.1.1 Use of Feedback with SNMP................................... 14
   7.1.2 Use of Feedback with CMOT................................... 14
   7.2 Polled, Logged Alerts......................................... 14
   7.2.1 Use of Polled, Logged Alerts with SNMP...................... 14
   7.2.2 Use of Polled, Logged Alerts with CMOT...................... 15
   8. Notes on Multiple Manager Environments......................... 15
   9. Summary........................................................ 16
   10. References.................................................... 16
   11. Acknowledgements.............................................. 17
   Appendix A. Example of polling costs.............................. 17
   Appendix B. MIB object definitions................................ 19
   Security Considerations........................................... 22
   Author's Address.................................................. 22

1. Introduction

   This memo defines mechanisms to prevent a remotely managed entity
   from burdening a manager or network with an unexpected amount of
   network management information, and to ensure delivery of
   "important" information.  The focus is on controlling the flow of
   asynchronously generated information, and not how the information is
   generated.  Mechanisms for generating and controlling the generation
   of asynchronous information may involve protocol specific issues.

   There are two understood mechanisms for transferring network
   management information from a managed entity to a manager: request-
   response driven polling, and the unsolicited sending of "alerts".
   Alerts are defined as any management information delivered to a
   manager that is not the result of a specific query.  Advantages and
   disadvantages exist within each method.  They are detailed in
   section 2 below.

   Alerts in a failing system can be generated so rapidly that they
   adversely impact functioning resources.  They may also fail to be
   delivered, and critical information may be lost.  Methods are needed
   both to limit the volume of alert transmission and to assist in
   delivering a minimum amount of information to a manager.

   It is our belief that managed agents capable of asynchronously
   generating alerts should attempt to adopt mechanisms that fill both
   of these needs.  For reasons shown in section 2.4, it is necessary
   to fulfill both alert-management requirements.
   A complete alert-driven system must ensure that alerts are delivered
   or their loss detected, with a means to recreate the lost
   information, AND it must not allow itself to overburden its manager
   with an unreasonable amount of information.

2. Problem Definition

   The following discusses the relative advantages and disadvantages of
   polled vs. alert driven management.

2.1 Polling Advantages

   (a) Reliable detection of failures

       A manager that polls for all of its information can more readily
       determine machine and network failures; a lack of a response to
       a query indicates problems with the machine or network.  A
       manager relying on notification of problems might assume that a
       faulty system is good, should the alert be unable to reach its
       destination, or the managed system be unable to correctly
       generate the alert.  Examples of this include network failures
       (in which an isolated network cannot deliver the alert), and
       power failures (in which a failing machine cannot generate an
       alert).  More subtle forms of failure in the managed entity
       might produce an incorrectly generated alert, or no alert at
       all.

   (b) Reduced protocol complexity on managed entity

       A request-response based system makes only conservative
       assumptions about the underlying transport protocol.  Timeouts
       and retransmits (re-requests) can be built into the manager.  In
       addition, this allows the manager to directly control the amount
       of network management information flowing across the network.

   (c) Reduced performance impact on managed entity

       In a purely polled system, there is no danger of having to test
       frequently for an alert condition.  This testing takes CPU
       cycles away from the real mission of the managed entity.
       Clearly, testing a threshold on each packet received could have
       unwanted performance effects on machines such as gateways.
       Those who wish to use thresholds and alerts must choose the
       parameters to be tested with great care, and should be strongly
       discouraged from updating statistics and checking values
       frequently.

   (d) Reduced configuration requirements to manage remote entity

       Remote, managed entities need not be configured with one or more
       destinations for reporting information.  Instead, the entity
       merely responds to whoever makes a specific request.  When
       changing the network configuration, there is never a need to
       reconfigure all remote manageable systems.  In addition, any
       number of "authorized" managers (i.e., those passing any
       authentication tests imposed by the network management protocol)
       may obtain information from any managed entity.  This occurs
       without reconfiguring the entity and without reaching an
       entity-imposed limit on the maximum number of potential
       managers.

2.2 Polling Disadvantages

   (a) Response time for problem detection

       Having to poll many MIB [2] variables per machine on a large
       number of machines is itself a real problem.  The ability of a
       manager to monitor such a system is limited; should a system
       fail shortly after being polled, there may be a significant
       delay before it is polled again.  During this time, the manager
       must assume that a failing system is acceptable.  See Appendix A
       for a hypothetical example of such a system.

       It is worthwhile to note that while improving the mean time to
       detect failures might not greatly improve the time to correct
       the failure, the problem will generally not be repaired until it
       is detected.  In addition, most network managers would prefer to
       at least detect faults before network users start phoning in.
   (b) Volume of network management traffic

       Polling many objects (MIB variables) on many machines greatly
       increases the amount of network management traffic flowing
       across the network (see Appendix A).  While it is possible to
       minimize this through the use of hierarchies (polling a machine
       for a general status of all the machines it polls), this
       aggravates the response time problem previously discussed.

2.3 Alert Advantages

   (a) Real-time knowledge of problems

       Allowing the manager to be notified of problems eliminates the
       delay imposed by polling many objects/systems in a loop.

   (b) Minimal amount of network management traffic

       Alerts are transmitted only due to detected errors.  By removing
       the need to transfer large amounts of status information that
       merely demonstrates a healthy system, network and system
       (machine processor) resources may be freed to accomplish their
       primary mission.

2.4 Alert Disadvantages

   (a) Potential loss of critical information

       Alerts are most likely not to be delivered precisely when the
       managed entity fails (a power supply dies) or the network
       experiences problems (saturated or isolated).  It is important
       to remember that failing machines and networks cannot be trusted
       to inform a manager that they are failing.

   (b) Potential to over-inform the manager

       An "open loop" system in which the flow of alerts to a manager
       is fully asynchronous can result in an excess of alerts being
       delivered (e.g., link up/down messages when lines vacillate).
       This information places an extra burden on a strained network,
       and could prevent the manager from disabling the mechanism
       generating the alerts; all available network bandwidth into the
       manager could be saturated with incoming alerts.

   Most major network management systems strive to use an optimal
   combination of alerts and polling.  Doing so preserves the
   advantages of each while eliminating the disadvantages of pure
   polling.

3. Specific Goals of this Memo

   This memo suggests mechanisms to minimize the disadvantages of alert
   usage.  An optimal system recognizes the potential problems both of
   sending too many alerts, in which case a manager becomes ineffective
   at managing, and of not adequately using alerts, especially given
   the volume of data that must otherwise be actively monitored, which
   scales poorly.  It is the author's belief that this is best done by
   allowing alert mechanisms that "close down" automatically when over-
   delivering asynchronous (unexpected) alerts, and that also allow a
   flow of synchronous alert information through a polled log.  The use
   of "feedback" (with a sliding window "pin") discussed in section 5
   addresses the former need, while the "polled, logged alerts" of
   section 6 address the latter.

   This memo does not attempt to define mechanisms for controlling the
   asynchronous generation of alerts, as such matters deal with
   specifics of the management protocol.  In addition, no attempt is
   made to define what the content of an alert should be.  The feedback
   mechanism does require the addition of a single alert type, but this
   is not meant to impact or influence the techniques for generating
   any other alert (and can itself be generated from a MIB object or
   the management protocol).  To make any effective use of the alert
   mechanisms described in this memo, implementation of several MIB
   objects is required in the relevant managed systems.  The location
   of these objects in the MIB is under an experimental subtree
   delegated to the Alert-Man working group of the Internet Engineering
   Task Force (IETF) and published in the "Assigned Numbers" RFC [5].
   Currently, this subtree is defined as

      alertMan ::= { experimental 24 }.

4. Compatibility with Existing Network Management Protocols

   It is the intent of this document to suggest mechanisms that violate
   neither the letter nor the spirit of the protocols expressed in CMOT
   [3] and SNMP [4].  To achieve this goal, each mechanism described
   will give an example of its conformant use with both SNMP and CMOT.

5. Closed Loop "Feedback" Alert Reporting with a "Pin" Sliding Window
   Limit

   One technique for preventing an excess of alerts from being
   delivered involves required feedback to the managed agent.  The name
   "feedback" describes a required positive response from a potentially
   "over-reported" manager before a remote agent may continue
   transmitting alerts at a high rate.  A sliding window "pin"
   threshold (so named for the metal pin on the end of a meter) is
   established as a part of a user-defined SNMP trap, or as a managed
   CMOT event.  This threshold defines the maximum allowable number of
   alerts ("maxAlertsPerTime") that may be transmitted by the agent,
   and the "windowTime" in seconds that alerts are tested against.
   Note that "maxAlertsPerTime" represents the sum total of all alerts
   generated by the agent, and is not duplicated for each type of alert
   that an agent might generate.  Both "maxAlertsPerTime" and
   "windowTime" are required MIB objects of SMI [1] type INTEGER; they
   must be readable, and may be writable should the implementation
   permit it.

   Two other items are required for the feedback technique.  The first
   is a Boolean MIB object (of SMI type INTEGER, treated as a Boolean
   in which the value zero represents "FALSE") named "alertsEnabled",
   which must have read and write access.  The second is a user defined
   alert named "alertsDisabled".  Please see Appendix B for their
   complete definitions.
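   In Python terms, the per-agent state these objects amount to can be
   pictured as follows.  This is only an illustrative sketch (the
   dictionary, its names, and its default values are invented here);
   the authoritative definitions are in Appendix B.

```python
# Hypothetical in-memory view of the MIB objects required by feedback.
# All are SMI type INTEGER; "alertsEnabled" is treated as a Boolean.
feedback_state = {
    "maxAlertsPerTime": 10,  # readable; writable if the implementation permits
    "windowTime": 3,         # window length in seconds; same access rules
    "alertsEnabled": 1,      # read-write gate: zero is "FALSE"
}

# One user-defined alert type accompanies these objects:
# "alertsDisabled", sent once when the agent turns alert reporting off.
```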
5.1 Use of Feedback

   When an excess of alerts is being generated, as determined by the
   total number of alerts exceeding "maxAlertsPerTime" within
   "windowTime" seconds, the agent sets the Boolean value of
   "alertsEnabled" to "FALSE" and sends a single alert of type
   "alertsDisabled".

   Again, the pin mechanism operates on the sum total of all alerts
   generated by the remote system.  Feedback is implemented once per
   agent and not separately for each type of alert in each agent.
   While it is also possible to implement the Feedback/Pin technique on
   a per alert-type basis, such a discussion belongs in a document
   dealing with controlling the generation of individual alerts.

   The typical use of feedback is detailed in the following steps:

   (a) Upon initialization of the agent, the value of "alertsEnabled"
       is set to "TRUE".

   (b) Each time an alert is generated, the value of "alertsEnabled" is
       tested.  Should the value be "FALSE", no alert is sent.  If the
       value is "TRUE", the alert is sent and the current time is
       stored locally.

   (c) If at least "maxAlertsPerTime" alerts have been generated, the
       agent calculates the difference between the time stored for the
       new alert and the time of the alert generated "maxAlertsPerTime"
       alerts previously.  Should this amount be less than
       "windowTime", a single alert of the type "alertsDisabled" is
       sent to the manager, and the value of "alertsEnabled" is then
       set to "FALSE".

   (d) When a manager receives an alert of the type "alertsDisabled",
       it is expected to set "alertsEnabled" back to "TRUE" to continue
       to receive alert reports.

5.1.1 Example

   In a sample system, the maximum number of alerts any single managed
   entity may send the manager is 10 in any 3 second interval.  A
   circular buffer with a maximum depth of 10 time-of-day elements is
   defined to accommodate statistics keeping.
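   The steps above, together with the circular buffer just described,
   can be sketched as follows.  This is an illustrative sketch only:
   the class name and the "send_alert" callback are invented for the
   example, and the defaults mirror this sample system (10 alerts per 3
   second window).

```python
from collections import deque
import time


class FeedbackPin:
    """Sliding-window "pin": disable alert sending once maxAlertsPerTime
    alerts have been sent within windowTime seconds (section 5.1)."""

    def __init__(self, max_alerts_per_time=10, window_time=3):
        self.max_alerts = max_alerts_per_time
        self.window = window_time
        self.alerts_enabled = True                      # step (a)
        # Circular buffer holding the times of the last max_alerts alerts.
        self.times = deque(maxlen=max_alerts_per_time)

    def report(self, send_alert, alert):
        """Returns True if the alert was sent, False if suppressed."""
        if not self.alerts_enabled:                     # step (b): suppress
            return False
        send_alert(alert)                               # step (b): send
        self.times.append(time.time())                  # and record the time
        # Step (c): once the buffer is full, compare the newest time
        # against that of the alert sent max_alerts alerts earlier.
        if (len(self.times) == self.max_alerts
                and self.times[-1] - self.times[0] < self.window):
            send_alert("alertsDisabled")                # one final alert
            self.alerts_enabled = False                 # manager must reset (d)
        return True
```

   Note that two back-to-back alerts never trip the pin: the test fires
   only when the buffer already holds a full window's worth of alerts.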
   After the first 10 alerts have been sent, the managed entity tests
   the time difference between its oldest and newest alerts.  By
   testing the time for a fixed number of alerts, the system will never
   disable itself merely because a few alerts were transmitted back to
   back.

   The mechanism will disable reporting only after at least 10 alerts
   have been sent, and then only if the last 10 all occurred within a 3
   second interval.  As alerts are sent over time, the list maintains
   data on the last 10 alerts only.

5.2 Notes on Feedback/Pin Usage

   A manager may periodically poll "alertsEnabled" in case an
   "alertsDisabled" alert is not delivered by the network.  Some
   implementers may also choose to add COUNTER MIB objects to show the
   total number of alerts transmitted and the number dropped while
   "alertsEnabled" was FALSE.  While these may yield some indication of
   the number of lost alerts, the use of "Polled, Logged Alerts" offers
   a superset of this function.

   Testing the alert frequency need not begin until a minimum number of
   alerts has been sent (the circular buffer is full).  Even then, the
   actual test is the elapsed time to get a fixed number of alerts and
   not the number of alerts in a given time period.  This eliminates
   the need for complex averaging schemes (keeping current alerts per
   second as a frequency and redetermining the current value based on
   the previous value and the time of a new alert).  Also eliminated is
   the problem of two back to back alerts; they may indeed appear to be
   a large number of alerts per second, but the fact remains that there
   are only two alerts.  This situation is unlikely to cause a problem
   for any manager, and should not trigger the mechanism.

   Since alerts are supposed to be generated infrequently, maintaining
   the pin and testing the threshold should not impact normal
   performance of the agent (managed entity).
   While repeated testing may affect performance when an excess of
   alerts is being transmitted, this effect would be minor compared to
   the cost of generating and sending so many alerts.  Long before the
   cost of testing (in CPU cycles) becomes relatively high, the
   feedback mechanism should disable alert sending and effect savings
   both in alert sending and in its own testing (note that the list
   maintenance and testing mechanisms disable themselves when they
   disable alert reporting).  In addition, testing the value of
   "alertsEnabled" can limit the CPU burden of building alerts that do
   not need to be sent.

   It is advised that the implementer consider allowing write access to
   both the window size and the number of alerts allowed in a window's
   time.  In doing so, a management station has the option of varying
   these parameters remotely before setting "alertsEnabled" to "TRUE".
   Should either of these objects be set to 0, a conformant system will
   disable the pin and feedback mechanisms and allow the agent to send
   all of the alerts it generates.

   While the feedback mechanism is not high in CPU utilization costs,
   those implementing alerts of any kind are again cautioned to
   exercise care that the alerts tested do not occur so frequently as
   to impact the performance of the agent's primary function.

   The user may prefer to send alerts via TCP, if available, to help
   ensure delivery of the "alertsDisabled" message.

   The feedback technique is effective for preventing the over-
   reporting of alerts to a manager.  It does not assist with the
   problem of "under-reporting" (see "polled, logged alerts" for this).

   It is possible to lose alerts while "alertsEnabled" is "FALSE".
   Ideally, the threshold of "maxAlertsPerTime" should be set
   sufficiently high that "alertsEnabled" is only set to "FALSE" during
   "over-reporting" situations.  To help prevent alerts from being lost
   when the threshold is exceeded, this method can be combined with
   "polled, logged alerts" (see below).

6. Polled, Logged Alerts

   A simple system that combines the request-response advantages of
   polling while minimizing its disadvantages is "Polled, Logged
   Alerts".  Through the addition of several MIB objects, one gains a
   system that minimizes network management traffic, lends itself to
   scaling, eliminates the reliance on alert delivery, and imposes none
   of the over-reporting problems inherent in pure alert driven
   architectures.  Minimizing network management traffic is effected by
   reducing multiple requests to a single request.  This technique does
   not eliminate the need for polling, but it reduces the amount of
   data transferred and ensures the manager either alert delivery or
   notification of an unreachable node.  Note again, the goal is to
   address the needs of information (alert) flow and not to control the
   local generation of alerts.

6.1 Use of Polled, Logged Alerts

   As alerts are generated by a remote managed entity, they are logged
   locally in a table.  The manager may then poll a single MIB object
   to determine if any number of alerts have been generated.  Each poll
   request returns a copy of an "unacknowledged" alert from the alert
   log, or an indication that the table is empty.  Upon receipt, the
   manager might "acknowledge" any alert to remove it from the log.
   Entries in the table must be readable, and can optionally allow the
   manager to remove them by writing to or deleting them.

   This technique requires several additional MIB objects.  The
   alert_log is a SEQUENCE OF logTable entries that must be readable,
   and can optionally have a mechanism to remove entries (e.g., SNMP
   set or CMOT delete).
   An optional read-only MIB object of type INTEGER,
   "maxLogTableEntries", gives the maximum number of log entries the
   system will support.  Please see Appendix B for their complete
   definitions.

   The typical use of Polled, Logged Alerts is detailed below.

   (a) Upon initialization, the agent builds a pointer to a log table.
       The table is empty (a sequence of zero entries).

   (b) Each time a local alert is generated, a logTable entry is built
       with the following information:

          SEQUENCE {
              alertId    INTEGER,
              alertData  OPAQUE
          }

       (1) an alertId number of type INTEGER, set to 1 greater than the
           previously generated alertId.  If this is the first alert
           generated, the value is initialized to 1.  This value should
           wrap (reset) to 1 when it reaches 2**32.  Note that the
           maximum log depth cannot exceed (2**32)-1 entries.

       (2) a copy of the alert encapsulated in an OPAQUE.

   (c) The new log element is added to the table.  Should addition of
       the element exceed the defined maximum log table size, the
       oldest element in the table (having the lowest alertId) is
       replaced by the new element.

   (d) A manager may poll the managed agent for either the next alert
       in the alert_table or for a copy of the alert associated with a
       specific alertId.  A poll request must indicate a specific
       alertId.  The mechanism for obtaining this information from a
       table is protocol specific, and might use an SNMP GET or GET
       NEXT (with a GET NEXT following an instance of zero returning
       the first table entry's alert), or CMOT's GET with scoping and
       filtering to get alertData entries associated with alertIds
       greater or less than a given instance.

   (e) An alertData GET request from a manager must always be responded
       to with a reply of the entire OPAQUE alert (SNMP TRAP, CMOT
       EVENT, etc.) or a protocol specific reply indicating that the
       GET request failed.

       Note that the actual contents of the alert string, and the
       format of those contents, are protocol specific.

   (f) Once an alert is logged in the local log, it is up to the
       individual architecture and implementation whether or not to
       also send a copy asynchronously to the manager.  Doing so could
       be used to redirect the focus of the polling (rather than
       waiting an average of 1/2 the poll cycle to learn of a problem),
       but does not result in significant problems should the alert
       fail to be delivered.

   (g) Should a manager request an alert with an alertId of 0, the
       reply shall be the appropriate protocol specific error response.

   (h) If a manager requests the alert immediately following the alert
       with alertId equal to 0, the reply will be the first alert (or
       alerts, depending on the protocol used) in the alert log.

   (i) A manager may remove a specific alert from the alert log by
       naming the alertId of that alert and issuing a protocol specific
       command (SET or DELETE).  If no such alert exists, the operation
       is said to have failed, and such failure is reported to the
       manager in a protocol specific manner.

6.1.1 Example

   In a sample system (based on the example in Appendix A), a manager
   must monitor 40 remote agents, each having between 2 and 15
   parameters that indicate the relative health of the agent and the
   network.  During normal monitoring, the manager is concerned only
   with fault detection.  With an average poll request-response time of
   5 seconds, the manager polls one MIB variable on each node.  This
   involves one request and one reply packet of the format specified in
   the XYZ network management protocol.  Each packet requires 120 bytes
   "on the wire" (requesting a single object, ASN.1 encoded, IP and UDP
   enveloped, and placed in an ethernet packet).
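   The per-cycle arithmetic that follows from these hypothetical
   figures can be checked with a short sketch (the variable names are
   invented for the example):

```python
# Serial-polling cost model for the hypothetical 40-agent system above.
NODES = 40          # remote agents polled in a loop
POLL_TIME = 5       # seconds per request-response pair
PACKET_BYTES = 120  # one request or one reply "on the wire"

cycle = NODES * POLL_TIME           # full serial poll cycle, in seconds
mean_detect = cycle / 2             # a fault occurs, on average, mid-cycle
traffic = NODES * 2 * PACKET_BYTES  # bytes per cycle: request + reply per node

print(cycle, mean_detect, traffic)  # prints: 200 100.0 9600
```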
   This results in a serial poll cycle time of 3.3 minutes (40 nodes at
   5 seconds each is 200 seconds), and a mean time to detect an alert
   of slightly over 1.5 minutes.  The total amount of data transferred
   during a 3.3 minute poll cycle is 9600 bytes (a 120 byte request and
   a 120 byte reply for each of 40 nodes).  With such a small amount of
   network management traffic per minute, the poll rate might
   reasonably be doubled (assuming the network performance permits it).
   The result is 19200 bytes transferred per cycle, and a mean time to
   detect failure of under 1 minute.  Parallel polling obviously yields
   similar improvements.

   Should an alert be returned by a remote agent's log, the manager
   notifies the operator and removes the element from the alert log by
   setting it with SNMP or deleting it with CMOT.  Normal alert
   detection procedures are then followed.  Those SNMP implementers who
   prefer not to use SNMP SET for table entry deletes may always define
   their log as "read only".  The fact that the manager made a single
   query (to the log) and was able to determine which, if any, objects
   merited special attention essentially means that the status of all
   alert capable objects was monitored with a single request.

   Continuing the above example, should a remote entity fail to respond
   to two successive poll attempts, the operator is notified that the
   agent is not reachable.  The operator may then choose (if so
   equipped) to contact the agent through an alternate path (such as
   serial line IP over a dial up modem).  Upon establishing such a
   connection, the manager may then retrieve the contents of the alert
   log for a chronological map of the failure's alerts.  Alerts
   undelivered because of conditions that may no longer be present are
   still available for analysis.

6.2 Notes on Polled, Logged Alerts

   Polled, logged alert techniques allow the tracking of many alerts
   while actually monitoring only a single MIB object.  This
   dramatically decreases the amount of network management data that
   must flow across the network to determine the status.  By reducing
   the number of requests needed to track multiple objects (to one),
   the poll cycle time is greatly improved.  This allows a faster poll
   cycle (mean time to detect an alert) with less overhead than would
   be caused by pure polling.

   In addition, this technique scales well to large networks, as the
   concept of polling a single object to learn the status of many lends
   itself well to hierarchies.  A proxy manager may be polled to learn
   if it has found any alerts in the logs of the agents it polls.  Of
   course, this scaling does not save on the mean time to learn of an
   alert (the cycle times of the manager and the proxy manager must be
   considered), but the amount of network management polling traffic is
   concentrated at lower levels.  Only a small amount of such traffic
   need be passed over the network's "backbone"; that is, the traffic
   generated by the request-response between the manager and the proxy
   managers.

   Note that it is best to return the oldest logged alert as the first
   table entry.  This is the object most likely to be overwritten, and
   every attempt should be made to ensure that the manager has seen it.
   In a system where log entries may be removed by the manager, the
   manager will probably wish to attempt to keep all remote alert logs
   empty to reduce the number of alerts dropped or overwritten.  In any
   case, the order in which table entries are returned is a function of
   the table mechanism, and is implementation and/or protocol specific.
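   The agent-side log behavior of section 6.1 can be sketched as
   follows, returning the oldest entry first as recommended above.  The
   class and method names are invented for illustration; a real agent
   exposes this behavior through SNMP or CMOT table operations rather
   than direct calls.

```python
class AlertLog:
    """Bounded alert log (section 6.1): entries carry a wrapping
    alertId, and the oldest entry is overwritten when the log is full."""

    WRAP = 2 ** 32  # alertId resets to 1 on reaching 2**32 (step b.1)

    def __init__(self, max_log_table_entries=100):
        self.max_entries = max_log_table_entries
        self.next_id = 1
        self.table = []  # (alertId, alertData) pairs, oldest first

    def log(self, alert_data):
        """Steps (b) and (c): append an entry, overwriting the oldest."""
        self.table.append((self.next_id, alert_data))
        self.next_id = self.next_id % (self.WRAP - 1) + 1  # 1..2**32-1
        if len(self.table) > self.max_entries:
            self.table.pop(0)  # drop the oldest (lowest alertId) entry

    def get_next(self, after_id):
        """Steps (d) and (h): first entry following after_id; after_id=0
        yields the oldest entry.  (A real agent would also handle the
        alertId wrap here.)"""
        for alert_id, data in self.table:
            if alert_id > after_id:
                return alert_id, data
        return None  # table empty or exhausted

    def acknowledge(self, alert_id):
        """Step (i): manager-initiated removal; False if no such entry."""
        before = len(self.table)
        self.table = [e for e in self.table if e[0] != alert_id]
        return len(self.table) < before
```

   A manager polling this log sees the oldest unacknowledged alert
   first, and can drain the log by alternating get_next and acknowledge.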
+ + "Polled, logged alerts" offers all of the advantages inherent in + polling (reliable detection of failures, reduced agent complexity + with UDP, etc.), while minimizing the typical polling problems + (potentially shorter poll cycle time and reduced network management + traffic). + + Finally, alerts are not lost when an agent is isolated from its + manager. When a connection is reestablished, a history of conditions + that may no longer be in effect is available to the manager. While + not a part of this document, it is worthwhile to note that this same + log architecture can be employed to archive alert and other + information on remote hosts. However, such non-local storage is not + sufficient to meet the reliability requirements of "polled, logged + alerts". + + + + + + + + + + + +Steinberg [Page 13] + +RFC 1224 Managing Asynchronously Generated Alerts May 1991 + + +7. Compatibility with SNMP [4] and CMOT [3] + +7.1 Closed Loop (Feedback) Alert Reporting + +7.1.1 Use of Feedback with SNMP + + At configuration time, an SNMP agent supporting Feedback/Pin is + loaded with default values of "windowTime" and "maxAlerts-PerTime", + and "alertsEnabled" is set to TRUE. The manager issues an SNMP GET + to determine "maxAlertsPerTime" and "windowTime", and to verify the + state of "alertsEnabled". Should the agent support setting Pin + objects, the manager may choose to alter these values (via an SNMP + SET). The new values are calculated based upon known network + resource limitations (e.g., the amount of packets the manager's + gateway can support) and the number of agents potentially reporting + to this manager. + + Upon receipt of an "alertsDisabled" trap, a manager whose state and + network are not overutilized immediately issues an SNMP SET to make + "alertsEnabled" TRUE. Should an excessive number of "alertsDisabled" + traps regularly occur, the manager might revisit the values chosen + for implementing the Pin mechanism. 
Note that an overutilized system
   expects its manager to delay the resetting of "alertsEnabled".

   As a part of each regular polling cycle, the manager includes a GET
   REQUEST for the value of "alertsEnabled".  If this value is FALSE,
   it is SET to TRUE, and the potential loss of traps (while it was
   FALSE) is noted.

7.1.2 Use of Feedback with CMOT

   The use of CMOT in implementing Feedback/Pin is essentially
   identical to the use of SNMP.  CMOT GET, SET, and EVENT replace
   their SNMP counterparts.

7.2 Polled, Logged Alerts

7.2.1 Use of Polled, Logged Alerts with SNMP

   As a part of regular polling, an SNMP manager using polled, logged
   alerts may issue a GET NEXT request naming
   { alertLog logTableEntry(1) alertId(1) 0 }.  Returned is either the
   alertId of the first table entry or, if the table is empty, an SNMP
   reply whose object is the "lexicographical successor" to the alert
   log.

   Should an "alertId" be returned, the manager issues an SNMP GET
   naming { alertLog logTableEntry(1) alertData(2) value } where "value"



Steinberg                                                     [Page 14]

RFC 1224       Managing Asynchronously Generated Alerts        May 1991


   is the alertId integer obtained from the previously described GET
   NEXT.  This returns the SNMP TRAP encapsulated within an OPAQUE.

   If the agent supports the deletion of table entries through SNMP
   SETs, the manager may then issue a SET of { alertLog logTableEntry(1)
   alertId(1) value } to remove the entry from the log.  Otherwise, the
   next GET NEXT poll of this agent should request the first "alertId"
   following the instance of "value" rather than an instance of "0".

7.2.2 Use of Polled, Logged Alerts with CMOT

   Using polled, logged alerts with CMOT is similar to using them with
   SNMP.  In order to test for table entries, one uses a CMOT GET and
   specifies scoping to the alertLog.
The request is for all table + entries that have an alertId value greater than the last known + alertId, or greater than zero if the table is normally kept empty by + the manager. Should the agent support it, entries are removed with a + CMOT DELETE, an object of alertLog.entry, and a distinguishing + attribute of the alertId to remove. + +8. Multiple Manager Environments + + The conflicts between multiple managers with overlapping + administrative domains (generally found in larger networks) tend to + be resolved in protocol specific manners. This document has not + addressed them. However, real world demands require alert management + techniques to function in such environments. + + Complex agents can clearly respond to different managers (or managers + in different "communities") with different reply values. This allows + feedback and polled, logged alerts to appear completely independent + to differing autonomous regions (each region sees its own value). + Differing feedback thresholds might exist, and feedback can be + actively blocking alerts to one manager even after another manager + has reenabled its own alert reporting. All of this is transparent to + an SNMP user if based on communities, or each manager can work with a + different copy of the relevant MIB objects. Those implementing CMOT + might view these as multiple instances of the same feedback objects + (and allow one manager to query the state of another's feedback + mechanism). + + The same holds true for polled, logged alerts. One manager (or + manager in a single community/region) can delete an alert from its + view without affecting the view of another region's managers. + + Those preferring less complex agents will recognize the opportunity + to instrument proxy management. 
Alerts might be distributed from a
   manager-based alert exploder which effectively implements feedback



Steinberg                                                     [Page 15]

RFC 1224       Managing Asynchronously Generated Alerts        May 1991


   and polled, logged alerts for its subscribers.  Feedback parameters
   are set on each agent to the highest rate of any subscriber, and
   limited by the distributor.  Logged alerts are deleted from the view
   at the proxy manager; they are truly deleted at the agent only when
   all subscribers have so requested, or else they are immediately
   deleted at the agent upon the first proxy request and maintained as
   virtual entries by the proxy manager for the benefit of the other
   subscribers.

9. Summary

   While "polled, logged alerts" may be useful, they still have a
   limitation: the mean time to detect failures and alerts increases
   linearly as networks grow in size (hierarchies shorten individual
   poll cycle times, but the mean detection time is the sum of 1/2 of
   each cycle time).  For this reason, it may be necessary to
   supplement asynchronous generation of alerts (and "polled, logged
   alerts") with unrequested transmission of the alerts on very large
   networks.

   Whenever systems generate and asynchronously transmit alerts, the
   potential to overburden (over-inform) a management station exists.
   Mechanisms to protect a manager, such as the "Feedback/Pin"
   technique, risk losing potentially important information.  Failure
   to implement asynchronous alerts increases the time for the manager
   to detect and react to a problem.  Over-reporting may appear to be a
   less critical (and less likely) problem than under-informing, but
   the potential for harm exists with unbounded alert generation.

   An ideal management system will generate alerts to notify its
   management station (or stations) of error conditions.  However,
   these alerts must be self-limiting with required positive feedback.
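   As a purely illustrative sketch of such self-limiting alert
   generation (the class and method names here are hypothetical, and
   only the quoted object names mirror this memo), an agent might
   transmit at most "maxAlertsPerTime" alerts per "windowTime" seconds,
   send one final "alertsDisabled" trap when the limit is reached, and
   then stay silent until the manager sets "alertsEnabled" back to
   TRUE:

```python
# Hypothetical sketch of a "Feedback/Pin" rate limit.  The quoted MIB
# object names follow this memo; everything else (class name, the
# reset-on-reenable behavior) is an assumption of this example.
import time

class PinnedAlerter:
    def __init__(self, max_alerts_per_time=5, window_time=60.0):
        self.max_alerts_per_time = max_alerts_per_time  # "maxAlertsPerTime"
        self.window_time = window_time                  # "windowTime", seconds
        self.alerts_enabled = True                      # "alertsEnabled"
        self._window_start = time.monotonic()
        self._sent_in_window = 0

    def send(self, alert, transmit):
        """Pass alert to transmit() unless the pin has tripped."""
        now = time.monotonic()
        if now - self._window_start >= self.window_time:
            self._window_start = now    # a new window begins, but a
            self._sent_in_window = 0    # tripped pin stays tripped
        if not self.alerts_enabled:
            return False                # pinned: the alert is dropped
        if self._sent_in_window >= self.max_alerts_per_time:
            self.alerts_enabled = False     # trip the pin and emit
            transmit("alertsDisabled")      # one final trap
            return False
        self._sent_in_window += 1
        transmit(alert)
        return True

    def reenable(self):
        """The manager's positive feedback: SET alertsEnabled to TRUE.
        Resetting the window counter here is an assumption of this
        sketch, not a requirement of the memo."""
        self.alerts_enabled = True
        self._sent_in_window = 0
        self._window_start = time.monotonic()
```

   With a limit of two alerts per window, a burst of five produces two
   alerts followed by one "alertsDisabled" trap; the remaining alerts
   are dropped until the manager re-enables reporting.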
In + addition, the manager should periodically poll to ensure connectivity + to remote stations, and to retrieve copies of any alerts that were + not delivered by the network. + +10. References + + [1] Rose, M., and K. McCloghrie, "Structure and Identification of + Management Information for TCP/IP-based Internets", RFC 1155, + Performance Systems International and Hughes LAN Systems, May + 1990. + + [2] McCloghrie, K., and M. Rose, "Management Information Base for + Network Management of TCP/IP-based internets", RFC 1213, Hughes + LAN Systems, Inc., Performance Systems International, March 1991. + + [3] Warrier, U., Besaw, L., LaBarre, L., and B. Handspicker, "Common + Management Information Services and Protocols for the Internet + + + +Steinberg [Page 16] + +RFC 1224 Managing Asynchronously Generated Alerts May 1991 + + + (CMOT) and (CMIP)", RFC 1189, Netlabs, Hewlett-Packard, The Mitre + Corporation, Digital Equipment Corporation, October 1990. + + [4] Case, J., Fedor, M., Schoffstall, M., and C. Davin, "Simple + Network Management Protocol" RFC 1157, SNMP Research, Performance + Systems International, Performance Systems International, MIT + Laboratory for Computer Science, May 1990. + + [5] Reynolds, J., and J. Postel, "Assigned Numbers", RFC 1060, + USC/Information Sciences Institute, March 1990. + +11. Acknowledgements + + This memo is the product of work by the members of the IETF Alert-Man + Working Group and other interested parties, whose efforts are + gratefully acknowledged here: + + Amatzia Ben-Artzi Synoptics Communications + Neal Bierbaum Vitalink Corp. + Jeff Case University of Tennessee at Knoxville + John Cook Chipcom Corp. + James Davin MIT + Mark Fedor Performance Systems International, Inc. + Steven Hunter Lawrence Livermore National Labs + Frank Kastenholz Clearpoint Research + Lee LaBarre Mitre Corp. + Bruce Laird BBN, Inc + Gary Malkin FTP Software, Inc. 
+ Keith McCloghrie Hughes Lan Systems + David Niemi Contel Federal Systems + Lee Oattes University of Toronto + Joel Replogle NCSA + Jim Sheridan IBM Corp. + Steve Waldbusser Carnegie-Mellon University + Dan Wintringham Ohio Supercomputer Center + Rich Woundy IBM Corp. + +Appendix A + + Example of polling costs + + The following example is completely hypothetical, and arbitrary. + It assumes that a network manager has made decisions as to which + systems, and which objects on each system, must be continuously + monitored to determine the operational state of a network. It + does not attempt to discuss how such decisions are made, and + assumes that they were arrived at with the full understanding that + the costs of polling many objects must be weighed against the + + + +Steinberg [Page 17] + +RFC 1224 Managing Asynchronously Generated Alerts May 1991 + + + level of information required. + + Consider a manager that must monitor 40 gateways and hosts on a + single network. Further assume that the average managed entity + has 10 MIB objects that must be watched to determine the device's + and network's overall "health". Under the XYZ network management + protocol, the manager may get the values of up to 4 MIB objects + with a single request (so that 3 requests must be made to + determine the status of a single entity). An average response + time of 5 seconds is assumed, and a lack of response within 30 + seconds is considered no reply. Two such "no replies" are needed + to declare the managed entity "unreachable", as a single packet + may occasionally be dropped in a UDP system (those preferring to + use TCP for automated retransmits should assume a longer timeout + value before declaring the entity "unreachable" which we will + define as 60 seconds). + + We begin with the case of "sequential polling". This is defined + as awaiting a response to an outstanding request before issuing + any further requests. 
In this example, the average XYZ network + management protocol packet size is 300 bytes "on the wire" + (requesting multiple objects, ASN.1 encoded, IP and UDP enveloped, + and placed in an ethernet packet). 120 request packets are sent + each cycle (3 for each of 40 nodes), and 120 response packets are + expected. 72000 bytes (240 packets at 300 bytes each) must be + transferred during each poll cycle, merely to determine that the + network is fine. + + At five seconds per transaction, it could take up to 10 minutes to + determine the state of a failing machine (40 systems x 3 requests + each x 5 seconds per request). The mean time to detect a system + with errors is 1/2 of the poll cycle time, or 5 minutes. In a + failing network, dropped packets (that must be timed out and + resent) greatly increase the mean and worst case times to detect + problems. + + Note that the traffic costs could be substantially reduced by + combining each set of three request/response packets in a single + request/response transaction (see section 6.1.1 "Example"). + + While the bandwidth use is spread over 10 minutes (giving a usage + of 120 bytes/second), this rapidly deteriorates should the manager + decrease his poll cycle time to accommodate more machines or + improve his mean time to fault detection. Conversely, increasing + his delay between polls reduces traffic flow, but does so at the + expense of time to detect problems. + + Many network managers allow multiple poll requests to be "pending" + + + +Steinberg [Page 18] + +RFC 1224 Managing Asynchronously Generated Alerts May 1991 + + + at any given time. It is assumed that such managers would not + normally poll every machine without any delays. 
Allowing
   "parallel polling" and initiating a new request immediately
   following any response would tend to generate larger amounts of
   traffic; "parallel polling" here produces 40 times the amount of
   network traffic generated in the simplistic case of "sequential
   polling" (40 packets are sent and 40 replies received every 5
   seconds, giving 80 packets x 300 bytes each per 5 seconds, or 4800
   bytes/second).  Mean time to detect errors drops, but at the cost
   of increased bandwidth.  This does not improve on the timeout of
   over 2 minutes needed to detect that a node is not responding.

   Even with parallel polling, increasing the device count (systems
   to manage) not only results in more traffic, but can degrade
   performance.  On large networks the manager becomes bounded by the
   number of queries it can build and track, and the number of
   responses it can parse and react to, per second.  The continuous
   volume requires the timeout value to be increased to accommodate
   responses that are still in transit or have been received and are
   queued awaiting processing.  The only alternative is to lengthen
   the poll cycle.  Either of these actions increases both the mean
   time to detect failure and the worst case time to detect problems.

   If alerts are sent in place of polling, mean time to fault
   detection drops from over a minute to as little as 2.5 seconds
   (1/2 the time for a single request-response transaction).  This
   time may be increased slightly, depending on the nature of the
   problem.  Typical network utilization is zero (assuming a
   "typical" case of a non-failing system).

Appendix B

   All defined MIB objects used in this document reside
   under the mib subtree:

       alertMan ::= { iso(1) org(3) dod(6) internet(1)
                      experimental(3) alertMan(24) ver1(1) }

   as defined in the Internet SMI [1] and the latest "Assigned
   Numbers" RFC [5].
Objects under this branch are assigned + as follows: + + RFC 1224-MIB DEFINITIONS ::= BEGIN + + alertMan OBJECT IDENTIFIER ::= { experimental 24 } + + ver1 OBJECT IDENTIFIER ::= { alertMan 1 } + + + + +Steinberg [Page 19] + +RFC 1224 Managing Asynchronously Generated Alerts May 1991 + + + feedback OBJECT IDENTIFIER ::= { ver1 1 } + polledLogged OBJECT IDENTIFIER ::= { ver1 2 } + + END + + + 1) Feedback Objects + + OBJECT: + ------ + + maxAlertsPerTime { feedback 1 } + + Syntax: + Integer + + Access: + read-write + + Status: + mandatory + + OBJECT: + ------ + + windowTime { feedback 2 } + + Syntax: + Integer + + Access: + read-write + + Status: + mandatory + + OBJECT: + ------ + + alertsEnabled { feedback 3 } + + Syntax: + Integer + + Access: + read-write + + Status: + + + +Steinberg [Page 20] + +RFC 1224 Managing Asynchronously Generated Alerts May 1991 + + + mandatory + + + 2) Polled, Logged Objects + + OBJECT: + ------ + + alertLog { polledLogged 1 } + + Syntax: + SEQUENCE OF logTableEntry + + Access: + read-write + + Status: + mandatory + + OBJECT: + ------ + + logTableEntry { alertLog 1 } + + Syntax: + + logTableEntry ::= SEQUENCE { + + alertId + INTEGER, + alertData + OPAQUE + } + + Access: + read-write + + Status: + mandatory + + OBJECT: + ------ + + alertId { logTableEntry 1 } + + Syntax: + Integer + + + + +Steinberg [Page 21] + +RFC 1224 Managing Asynchronously Generated Alerts May 1991 + + + Access: + read-write + + Status: + mandatory + + OBJECT: + ------ + + alertData { logTableEntry 2 } + + Syntax: + Opaque + + Access: + read-only + + Status: + mandatory + + OBJECT: + ------ + + maxLogTableEntries { polledLogged 2 } + + Syntax: + Integer + + Access: + read-only + + Status: + optional + +Security Considerations + + Security issues are not discussed in this memo. + +Author's Address + + Lou Steinberg + IBM NSFNET Software Development + 472 Wheelers Farms Rd, m/s 91 + Milford, Ct. 
06460 + + Phone: 203-783-7175 + EMail: LOUISS@IBM.COM + + + + +Steinberg [Page 22] +