MW-E2ED call
5-April-06
*Participants*
Chas DiFatta, CMU (chair)
Paul Hill, MIT
Mark Poepping, CMU
Michael Gettes, Duke
Kevin Miller, Duke
Steve Olshansky, Internet2 (scribe)
*Action Items*
New
[AI] {Chas} will follow up with the Novell developer working on Kerberos auditing.
[AI] {Mark} will look into Walter attending the I2MM.
*Discussion*
There has been a lot of traffic on the Kerberos list about auditing features: adding hashes in the KDC logs so that a TGT can be correlated to the service tickets which were issued to the user at the machine. A developer at Novell is particularly interested in this; Chas will follow up with him.
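As a rough illustration of the correlation idea, a minimal Python sketch follows. The log layout, the field names (`tgt_hash`, `service`, `principal`, `host`), and the use of a truncated SHA-256 digest are all assumptions made for the example; they are not the format under discussion on the Kerberos list.

```python
import hashlib
from collections import defaultdict


def ticket_hash(ticket_bytes: bytes) -> str:
    """Return a short, loggable digest of a ticket (assumed: truncated SHA-256)."""
    return hashlib.sha256(ticket_bytes).hexdigest()[:16]


# Hypothetical tickets and KDC log entries; real KDC logs look different.
alice_tgt = b"opaque-tgt-for-alice"
bob_tgt = b"opaque-tgt-for-bob"

kdc_log = [
    {"event": "AS_REQ", "principal": "alice", "host": "ws1", "tgt_hash": ticket_hash(alice_tgt)},
    {"event": "AS_REQ", "principal": "bob", "host": "ws2", "tgt_hash": ticket_hash(bob_tgt)},
    {"event": "TGS_REQ", "service": "imap/mail.example.edu", "tgt_hash": ticket_hash(alice_tgt)},
    {"event": "TGS_REQ", "service": "host/files.example.edu", "tgt_hash": ticket_hash(alice_tgt)},
    {"event": "TGS_REQ", "service": "imap/mail.example.edu", "tgt_hash": ticket_hash(bob_tgt)},
]


def service_tickets_by_tgt(entries):
    """Group service-ticket issuances under the hash of the TGT that requested them."""
    issued = defaultdict(list)
    for entry in entries:
        if entry["event"] == "TGS_REQ":
            issued[entry["tgt_hash"]].append(entry["service"])
    return dict(issued)


by_tgt = service_tickets_by_tgt(kdc_log)
for entry in kdc_log:
    if entry["event"] == "AS_REQ":
        services = by_tgt.get(entry["tgt_hash"], [])
        print(entry["principal"], "at", entry["host"], "->", services)
```

Logging only a digest keeps the ticket itself out of the log while still giving an auditor a join key between the initial authentication and later service-ticket requests.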
An EDDY developer met with Kevin and Michael at Duke about e-mail diagnostics functionality. Notes from the meeting were sent out to the list by Chas 4-April-06, and served as the basis for discussion (and these minutes):
* It was emphasized not to optimize for performance at the expense of functionality.
* argus/network flow data is likely more intensive (in event rate) than other data (e.g. e-mail). Mail flows, however, will likely involve more discrete things to examine, and more distinct systems, than netflows. If trying to determine why a message wasn't delivered, spam/virus logs will also need to be reviewed.
Netflows are usually viewed from a single PoV, but EDDY expands this to encompass more collection points (routers etc.). In a mail example, the sequence can be anticipated, and if a message doesn't appear when/where it is expected this can be a useful data point.
* Duke perspective
- SNIFF project -- ability to provide network and system awareness to end-users; Michael's analogy was that when one drives down the road, there are multiple sources of information to give you an idea of what is going on -- e.g. it is easy to tell if things are slow because there is a problem with your car vs. congestion on the road.
- EDDY is viewed as the underlying infrastructure that can be used to build SNIFF.
- e-mail component is viewed as not only being useful but also as a proof of concept for convincing the SNIFF team members to use the infrastructure rather than re-inventing the wheel.
- SNIFF is focused on delivering info to the end-user, and the infrastructure required to enable that would also be useful to the admins. EDDY could be a useful tool to collect this data...
- SNIFF = "Systems Networking Instrumentation Forensics and Forecasting." This is not a system per se, but a goal/framework for something akin to a dashboard, providing direct feedback to a user, including performance relative to other users.
- e.g. a user could receive feedback about incoming spam percentages - allowing an informed opt in/out or threshold-setting, or about percentages of incoming mail above a certain size...
* We talked about syslog integration
- Michael cautioned against requiring too much for the sysadmin to install.
- Kevin suggested just having a loghost for syslog gathering and converting wouldn't be bad -- the main issue though is that syslog is not very detailed so by not having something at the source, you lose information.
So one could either have something that talks back to an agent on the host, or have a syslog interceptor on the host.
The end result was that it would be something good to do and that if one didn't run something locally, then there would be a loss of functionality. However, with proof of concept to show value, there should be enough incentive to overcome some hurdles.
- This leads to the question: what are you missing in syslog, if anything? E.g. host performance metrics? This is out of scope for initial e-mail integration, but a possible future enhancement.
- Perhaps making user tags more flexible or more structured, enabling the DB to adapt to whatever it is receiving, might be useful as a first step. There may be some loss of efficiency in doing this...
- example record types in this context... What would the CER for each of these types look like?
1. message received
2. intermediate processing (e.g. name expansion)
3. delivery attempt
4. error
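To make the four record types concrete, here is a minimal Python sketch of how they might be represented. The class names (`MailEventType`, `MailEvent`) and every field are hypothetical; the actual CER layout was left open on the call. The free-form `tags` dictionary reflects the suggestion above to make user tags more flexible.

```python
from dataclasses import dataclass, field
from enum import Enum


class MailEventType(Enum):
    """The four example record types listed in the notes."""
    MESSAGE_RECEIVED = "message_received"
    INTERMEDIATE_PROCESSING = "intermediate_processing"
    DELIVERY_ATTEMPT = "delivery_attempt"
    ERROR = "error"


@dataclass
class MailEvent:
    """Hypothetical event record; field names are illustrative only."""
    event_type: MailEventType
    host: str
    timestamp: float
    message_id: str
    # Free-form key/value tags so the store can adapt to what it receives.
    tags: dict = field(default_factory=dict)


event = MailEvent(
    event_type=MailEventType.INTERMEDIATE_PROCESSING,
    host="mx1.example.edu",
    timestamp=1144252800.0,
    message_id="<1@example.edu>",
    tags={"step": "name expansion"},
)
print(event.event_type.value, event.tags)
```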
* A generalizable interface to a database would be good to have. Kevin has some ideas. He agrees that using Hibernate as a database abstraction is likely a good idea, especially since this allows the use of Oracle, MySQL, or whatever (and should make it easier to move between the databases).
There has been previous discussion about the potential need for an archive agent vs. a DB agent. An archive agent could be a simple DB that stores CERs (or some subset) and ages them out quickly (days); a DB agent would be extremely focused, e.g. for mail.
For now a DB agent is likely enough. Performance for many potential EDDY applications (beyond flow data) is not likely to be a concern...
- Early wins to demonstrate the value of EDDY in other contexts would be a good goal - i.e. crawl before walking - then go on to solve other problems.
E.g. track a message by messageid, or show me all mail for a particular user. Being able to answer these questions effectively would be a great first step.
If everything is dumped into one big DB, what does that imply for managing access to the data? Break mail data into a separate mail DB...
Raw performance (events/second) is of much lower priority than functionality when looking at EDDY in other applications (beyond netflows) at this early stage of development.
It was also noted that Sendmail is obviously not the only MTA in wide use, thus it is important to normalize the event data - the value of CER. It would be useful e.g. to include a school using Postfix.
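The two "early win" questions above (track a message by message-id, and show all mail for a particular user) can be sketched against a small table of normalized events. This is only an illustration: it uses Python's built-in `sqlite3` in place of the Hibernate/Oracle/MySQL options mentioned, and the `mail_event` table, its columns, and the sample rows are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE mail_event (
           message_id TEXT, host TEXT, event_type TEXT,
           sender TEXT, recipient TEXT, ts REAL)"""
)

# Hypothetical, already-normalized events (the MTA-specific parsing is omitted).
rows = [
    ("<1@example.edu>", "mx1", "message_received", "a@example.edu", "b@example.edu", 100.0),
    ("<1@example.edu>", "scan1", "intermediate_processing", "a@example.edu", "b@example.edu", 101.5),
    ("<1@example.edu>", "store1", "delivery_attempt", "a@example.edu", "b@example.edu", 103.0),
    ("<2@example.edu>", "mx1", "message_received", "c@example.edu", "a@example.edu", 200.0),
    ("<3@example.edu>", "mx1", "message_received", "c@example.edu", "d@example.edu", 300.0),
]
conn.executemany("INSERT INTO mail_event VALUES (?, ?, ?, ?, ?, ?)", rows)


def track_message(conn, message_id):
    """Return (host, event_type, ts) for one message, in time order."""
    cur = conn.execute(
        "SELECT host, event_type, ts FROM mail_event "
        "WHERE message_id = ? ORDER BY ts",
        (message_id,),
    )
    return cur.fetchall()


def mail_for_user(conn, address):
    """Return the message-ids of all mail sent or received by an address."""
    cur = conn.execute(
        "SELECT DISTINCT message_id FROM mail_event "
        "WHERE sender = ? OR recipient = ? ORDER BY message_id",
        (address, address),
    )
    return [row[0] for row in cur.fetchall()]


print(track_message(conn, "<1@example.edu>"))
print(mail_for_user(conn, "a@example.edu"))
```

Because the queries touch only normalized columns, the same two functions would apply whether the events originated from Sendmail or Postfix.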
* We discussed the structure of email flows.
- We decided that it would be a multi-part object that would need to be able to compose data from machines.
- We discussed whether to pre-combine the parts or to combine them at query time. The issue was whether the insert of the data would be expensive enough to hurt the insert rate, and whether people would actually be looking at the data. Clearly if it is expensive to insert and the data is rarely looked at, then it is not worth doing the work. On the other hand, if there are tons of reads, then pre-computing may make more sense.
At the end, we decided it wasn't worth optimizing at this time and we'd optimize when we saw a problem and understood usage.
- I suggested trying some quick email exchanges to iterate over some basic design ideas. Here is the first message.
- We are both using sendmail, which is likely both good and bad. It is good in that we'll be producing something that is useful to us. It is bad in that we may do something sendmail-specific and miss fields in other MTAs.
- What CERs need to be generated at the source? What do we need to abstract out of the DB? What intermediate CERs would be useful? E.g. delivery attempt? Other intermediate steps in the life of a message on a particular machine?
* Initial thoughts
- Let's say the basic structure is:
* Machine record
. Different machine records may be
= incoming MX
= client SMTP submission
= spam/virus host
= (maybe) directory/lookup host
= (maybe) queuing host
. Envelope information is usually not fully captured so it may be good to capture it here.
* Email flow record
. This would consist of multiple machine records.
. There would likely be some common data fields of: message-id, message-from, message-subject, message-date.
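The structure outlined above might be sketched in Python as follows. All class and field names are hypothetical, and the flow record is assembled at read time by grouping machine records on message-id, in line with the decision not to pre-combine for now.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, List, Optional


@dataclass
class MachineRecord:
    """What one machine saw of a message (field names are assumptions)."""
    host: str
    role: str  # e.g. "incoming MX", "client SMTP submission", "spam/virus host"
    message_id: str
    timestamp: float
    envelope_from: Optional[str] = None  # envelope data captured per machine
    envelope_to: List[str] = field(default_factory=list)


@dataclass
class EmailFlowRecord:
    """One message's path, composed from multiple machine records."""
    message_id: str
    message_from: Optional[str] = None
    message_subject: Optional[str] = None
    message_date: Optional[str] = None
    machine_records: List[MachineRecord] = field(default_factory=list)


def compose_flows(records: Iterable[MachineRecord]) -> Dict[str, EmailFlowRecord]:
    """Group machine records by message-id, each flow ordered by timestamp."""
    flows: Dict[str, EmailFlowRecord] = {}
    for rec in sorted(records, key=lambda r: r.timestamp):
        if rec.message_id not in flows:
            flows[rec.message_id] = EmailFlowRecord(message_id=rec.message_id)
        flows[rec.message_id].machine_records.append(rec)
    return flows


records = [
    MachineRecord("store1", "queuing host", "<1@example.edu>", 103.0),
    MachineRecord("mx1", "incoming MX", "<1@example.edu>", 100.0,
                  envelope_from="a@example.edu", envelope_to=["b@example.edu"]),
    MachineRecord("scan1", "spam/virus host", "<1@example.edu>", 101.5),
    MachineRecord("mx1", "incoming MX", "<2@example.edu>", 200.0),
]
flows = compose_flows(records)
print([r.host for r in flows["<1@example.edu>"].machine_records])
```

If reads later turn out to dominate, the same grouping could be done once at insert time and stored, without changing the record shapes.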
* Questions
- What about gray listed messages?
- Would an MD5 hash of the body be useful? Are there other hash algorithms that indicate similarity rather than highlighting differences?
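To illustrate the distinction behind the hash question, the sketch below contrasts an MD5 digest, which changes completely when a single word differs, with a simple word-set (Jaccard) similarity score. The Jaccard score is not itself a hash; it stands in here for similarity-preserving schemes such as MinHash, which approximate this kind of set overlap. The function names and sample bodies are made up for the example.

```python
import hashlib


def body_md5(body: str) -> str:
    """Exact-match fingerprint: any change to the body yields a different digest."""
    return hashlib.md5(body.encode("utf-8")).hexdigest()


def jaccard_similarity(a: str, b: str) -> float:
    """Share of distinct words two bodies have in common (0.0 to 1.0)."""
    words_a, words_b = set(a.lower().split()), set(b.lower().split())
    if not words_a and not words_b:
        return 1.0
    return len(words_a & words_b) / len(words_a | words_b)


march = "Your account statement for March is now available online"
april = "Your account statement for April is now available online"
other = "Meeting moved to Thursday at noon in room 12"

# The digests only say "same" or "different".
print(body_md5(march) == body_md5(april))
# The similarity score says how close the bodies are.
print(jaccard_similarity(march, april), jaccard_similarity(march, other))
```

An exact digest would be enough to recognize identical bodies seen on several machines, while a similarity measure would be needed to group near-duplicates such as templated spam.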
Walter will remain point on working on the e-mail issues with the folks at Duke. [AI] Mark will look into Walter attending the I2MM.
We will work on fleshing out what these structures look like on the next call, which is set for Wed 19-April 5:00 EDT