Minutes From The 12/18/03 Bimonthly Meeting
Agenda
- Preliminary Architecture Feedback Discussion
- Project Scope Discussion
Participants
- Chas DiFatta - CMU (chair)
- Russ Hobby - Internet2
- Nate Klingenstein - Internet2 (scribe)
- Steve Olshansky - Internet2 (flywheel)
- Mark Poepping - Internet2
- Von Welch - Internet2
Discussion
Russ and Matt Zekauskas of Internet2 invited the CMU contingent to a
working session at SDSC. There was a great deal of interest in, and
feedback on, measurement, the common event record, the architecture, etc.
The call was devoted to developing the flows between collection
modules and processing modules.
Event Record Overhead
Von suggested renaming the field that indicates where the event record
came into existence from "location" to "creator". This would cover the
scenario where a host is connected via a socket to a router using
NetFlow, so that the IP addresses of the point of observation and the
host differ. The observer creates the raw event, while the collection
module creates the event record.
Von expanded this to cellular phones, examining how to identify an
individual using a cell phone, which might be done using the cell phone
number. The question of whether to restrict the scope of these tools
to IP-based networks was left open. Even so, there may still be a need
to correlate IP addresses with DHCP logs to identify boxes. The host
name of the box was thought to be more temporally stable than its IP
address. There was also a suggestion that it might be best to record
an IPv4, IPv6, or MAC address instead. In the case where events would
be collected from a SONET ring, that could prove useful. Abstraction
of the creator field to allow for any of these contents seemed useful
to Russ, who noted that someday wavelengths might have to be considered.
Internationalization was also brought up, with a suggestion that there
be a field to encode the language of the event record.
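As a rough illustration of the abstracted creator field discussed above, the following Python sketch classifies a creator string as an IPv4 address, IPv6 address, MAC address, or host name, and stores it alongside a language field. The names EventRecordHeader, classify_creator, and make_header, the field layout, and the default language code are all assumptions made for illustration; the group did not settle on a format.

```python
import ipaddress
import re
from dataclasses import dataclass

# Matches MAC addresses written as six hex pairs separated by ':' or '-'.
MAC_RE = re.compile(r"^([0-9A-Fa-f]{2}[:-]){5}[0-9A-Fa-f]{2}$")


def classify_creator(value: str) -> str:
    """Guess which kind of identifier a creator string holds."""
    try:
        addr = ipaddress.ip_address(value)
        return "ipv4" if addr.version == 4 else "ipv6"
    except ValueError:
        pass  # not an IP address; try the other forms
    if MAC_RE.match(value):
        return "mac"
    # Anything else is treated as a host name (an assumption of this sketch).
    return "hostname"


@dataclass(frozen=True)
class EventRecordHeader:
    """Hypothetical header; field names are illustrative only."""
    creator: str        # the collection module that created the event record
    creator_kind: str   # "ipv4", "ipv6", "mac", or "hostname"
    observer: str       # the point of observation that produced the raw event
    language: str       # language of the event record, e.g. "en"


def make_header(creator: str, observer: str, language: str = "en") -> EventRecordHeader:
    return EventRecordHeader(creator, classify_creator(creator), observer, language)
```

Keeping the observer separate from the creator mirrors the NetFlow scenario, where the router observes the traffic but a different host creates the record.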
Event Tags
There was debate over whether events should be tagged using pointers
or values. Chas proposed two subfields within an event tag: an event
descriptor and a user tag, which might take the form of expansible XML.
It is unclear whether the entire field should be expressed as XML;
given the huge number of records collected, concerns about the amount
of space each record would then consume prevailed.
A simple pointer to a specific field in raw event data might be
preferred for maximum economy of space; however, when large amounts of
raw data need to be parsed or searched, the pointer itself may not
contain sufficient information describing the event.
Pointers also would have to be meaningful across application versions
to maintain independence. Von noted some overlap with a project within
the GGF, and Chas noted that the Grid is part of the application survey
to capture these synergies.
To expand on the pointer idea, Von suggested a name registration or
something else that could map a pointer to a schema used to interpret
it. The pointer would then be a URI or URL pointing to the schema for
the raw event data, allowing a parsing application to find the user ID,
for example, without familiarity with the application being monitored.
He also noted that the decision whether to use values or pointers may
depend on the use of this data; his "gut feeling [is] that 99 percent
of the time these will never be read."
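Von's registration idea can be sketched as a lookup from a schema URI to a parser for the raw event data. Everything named here is hypothetical: the registry, the example.org URI, the "key=value" raw format, and the function names were not specified in the discussion.

```python
from typing import Callable, Dict, Optional

# A parser turns raw event data into named fields.
Extractor = Callable[[str], Dict[str, str]]

# Hypothetical registry mapping a schema URI to the parser that interprets it.
SCHEMA_REGISTRY: Dict[str, Extractor] = {}


def register_schema(uri: str, extractor: Extractor) -> None:
    """Record which parser interprets raw data tagged with this URI."""
    SCHEMA_REGISTRY[uri] = extractor


def parse_key_value(raw: str) -> Dict[str, str]:
    """Assumed raw format: whitespace-separated key=value pairs."""
    return dict(pair.split("=", 1) for pair in raw.split())


# Illustrative URI only; no such schema is known to exist.
register_schema("http://example.org/schemas/demo-app/v1", parse_key_value)


def lookup_field(schema_uri: str, raw: str, field: str) -> Optional[str]:
    """Find a field in raw event data without knowing the application."""
    extractor = SCHEMA_REGISTRY.get(schema_uri)
    if extractor is None:
        return None  # unknown schema: the raw data cannot be interpreted
    return extractor(raw).get(field)
```

For example, lookup_field("http://example.org/schemas/demo-app/v1", "user=alice action=login", "user") finds the user ID through the registry alone. Putting a version in the URI is one possible way to keep pointers meaningful across application versions.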
Event Identifiers
The group observed that if part of the unique identifier were formed
using the location, then there would be no need to coordinate
identifiers across hosts. However, if the timestamp were fine-grained
enough, the need for a unique identifier field might disappear
altogether; the combination of the location and timestamp may be a
sufficient guarantee of a uniquely identified event.
However, if an event were somehow forked and came back into the
system, Chas worried, the event might be recorded twice. It seemed
important to him to be able to determine that these two copies referred
to the same event.
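A minimal sketch of the identifier discussion, assuming that the pair of creating location and timestamp is unique per event: a forked event that re-enters the system carries the same pair and can be recognized as a copy. The dictionary keys "creator" and "timestamp_ns" and the nanosecond granularity are assumptions of this sketch, not decisions of the group.

```python
from typing import Dict, Iterable, List, Set, Tuple

# Identity of an event: where it was created and when.
EventKey = Tuple[str, int]


def event_key(creator: str, timestamp_ns: int) -> EventKey:
    """Build an identifier that needs no coordination across hosts."""
    return (creator, timestamp_ns)


def deduplicate(events: Iterable[Dict]) -> List[Dict]:
    """Keep the first copy of each event, dropping forked duplicates."""
    seen: Set[EventKey] = set()
    unique: List[Dict] = []
    for ev in events:
        key = event_key(ev["creator"], ev["timestamp_ns"])
        if key in seen:
            continue  # same creator and timestamp: treated as the same event
        seen.add(key)
        unique.append(ev)
    return unique
```

This only works if the timestamp is fine-grained enough that one creator never emits two distinct events with the same value, which is exactly the condition the group raised.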