Internet2


Minutes From The 6/3/04 Bimonthly Meeting

 


Agenda

  • Survey - comments from group
    • Russ Hobby - network area comments
    • George Brett - comments on other tool pointer efforts
  • Scenarios from network area and Shibboleth status
    • Steven Carmody - report from last Shib meeting
    • Russ Hobby - finding a candidate for the network scenario
  • Report from "diagnostics big and small" session
  • Pilot Progress

Participants

  • George Brett - Internet2 (scribe)
  • Steven Carmody - Brown
  • Chas DiFatta - CMU (chair)
  • Russ Hobby - Internet2
  • Steve Olshansky - Internet2 (flywheel)
  • Mark Poepping - CMU

Discussion

Chas began the call by describing a conversation he had earlier with Eric Boyd of the E2E Performance Initiative (E2Epi) about how to map end-to-end performance into the events arena. It was a good session, and it helped each side better understand the other's architecture. E2E basically has two camps: active measurements and passive measurements. Eric's work is more in the area of active measurement. The performance work is challenge-response (stimulus-response) and will generate lots of interesting data. To get the data into the e2ed domain, a process must be designed to analyze the performance data, make some decisions, and then inject an event into the MW-Diagnostic Backplane. Russ agreed with this observation. Chas pointed out that there also needs to be input from passive monitoring tools such as NetFlow. Russ said another reason for passive monitoring is to keep the network from being consumed by active tests.
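As a rough illustration of the analyze, decide, and inject step described above, the following Python sketch turns a batch of active round-trip measurements into a single diagnostic event. Everything in it is hypothetical: the `Event` fields, the `analyze` and `inject` functions, and the 100 ms threshold are placeholders for discussion, not part of E2Epi or the MW-Diagnostic Backplane.

```python
from dataclasses import dataclass
from statistics import median
from typing import List, Optional


@dataclass
class Event:
    """Hypothetical event record; the field names are assumptions."""
    source: str
    kind: str
    detail: str


def analyze(path: str, rtts_ms: List[float],
            threshold_ms: float = 100.0) -> Optional[Event]:
    """Return an event if the median round-trip time exceeds the threshold."""
    if not rtts_ms:
        return None
    m = median(rtts_ms)
    if m > threshold_ms:
        return Event(
            source="active-measurement",
            kind="high-latency",
            detail=f"{path}: median RTT {m:.1f} ms > {threshold_ms:.1f} ms",
        )
    return None


def inject(backplane: List[Event], event: Optional[Event]) -> None:
    """Stand-in for handing an event to the backplane (here, just a list)."""
    if event is not None:
        backplane.append(event)


# Two measured paths; only the first crosses the assumed threshold.
backplane: List[Event] = []
inject(backplane, analyze("campus-a -> campus-b", [80.0, 120.0, 150.0]))
inject(backplane, analyze("campus-a -> campus-c", [20.0, 25.0, 30.0]))
```

The point of the sketch is that the raw measurements stay in the performance domain; only the decision (one event per problem) crosses into the diagnostics domain.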

Survey - comments from the group

  • Russ Hobby -- network area comments
  • Russ said he had looked over the network area of the survey and had some difficulty understanding how the tools listed there differ from the tools listed on the E2Epi page. Chas replied that they had started out looking at tools specifically for middleware and found that many middleware tools branch out and include network functions. So they added those to the list, and ended up including tools that were more network specific. Russ suggested having a pointer on the e2ed page to the E2Epi tools.

    Chas said that they had discovered in doing the survey that many network tools are now included in application tools as well, even low-level ones like DNS. Microsoft has released MOM (Microsoft Operations Manager), and over time it will integrate with network-based events. CMU is considering deploying MOM on all its desktops. There is a new movement from Microsoft to include processes that collect SNMP-based events and data from other analysis protocols like Cisco NetFlow.

  • George Brett - comments on other tool pointer efforts
  • George described recent activity with a group of Internet2 staff on identifying troubleshooting documents (e.g., FAQs, how-tos, troubleshooting guides, etc.). This activity is one that should work with Middleware e2e Diagnostics. He pointed out that a wiki is being used to collect and discuss this information. It is a living document; wikis are designed to be edited by multiple people. There was brief discussion about figuring out what people will use as a resource.

    George described the issue of how to maintain such a resource once it has started. Chas agreed that such an activity takes resources and human effort. He suggested that Internet2 have an area listing the Big Tools that are supported by Internet2 and other groups, and also a more public area where people can freely add information about new tools that might later be added to the Big Tools list. A further suggestion was to incorporate a ratings process similar to Amazon or CNet to help identify the better tools. This led to a discussion about the value, or lack of value, of anonymous ratings; most agreed that only people committed to the tools would be likely to rate or comment on them. Chas said he could see Internet2 becoming the UL (Underwriters Laboratories) of application and network based tools. It's possible that doing such reviews would add stature to the reviewers.

    George asked for feedback and suggestions from the group.

    In closing it was suggested that web stats from the troubleshooting wiki would be helpful. George will see how to get this information.

    Scenarios from network area and Shibboleth status

  • Steven Carmody - report from last Shib meeting
  • Steve reported that Shibboleth has just released a new version, and the team is now talking about its features and functionality. Shib is at the point of starting to see production-level deployments on campuses and at vendor sites. The Shib team has enough experience with big, complex apps to know they'll need to provide tools to the campuses and vendors to support the applications, both at the helpdesk and in the back room. OCLC is already saying they'll need tools before moving past level 1. It's time to start looking at how to modify or enhance aspects of the Shib code (logging) to record useful information for the people doing diagnosis. Shib is set up to log up to nine levels deep. It can be set to record lots of information, but the problem right now is that the information being logged was chosen to help developers. Now we need to figure out what help desks and 2nd and 3rd level support people need in order to diagnose issues. We don't have much experience with distributed systems like this. There will also be issues about access to certain materials in the logs.

    We have a lot of experience helping people install Shib, but little experience in production environments, of which there are not many. The people running them learned enough about Shib from doing the install that they don't have support needs in production. But once there is widespread deployment, new problems will no doubt emerge.

    At this week's Shib call, Chas and the MW E2ED team at CMU began to explore with the Shib development team what can be done over the next 6 months to reach a stage 1. Action items include:

    • Shib folks developing Shib focused MW-E2ED based scenarios
    • Talk with people at sites running Shib production networks for a perspective from the operation of the service
    • Incorporate new, more helpful error messages into the logs

    He went on to say that one problem is that (like other projects) Shib depends on other people's code. The downside is that there are log files all over the place: files from Apache, ModSSL, and other libraries. Each of these components writes to its own log file, with logging decisions based on how the library was expected to be used. ModSSL, for example, logs based on its original use case: people with browsers accessing a web server over SSL.

    Shib uses ModSSL very differently, and therefore its error messages are not helpful. One suggestion was that someone go and fix the error messages in ModSSL. He said that one Shib programmer is going to spend significant time improving the information in the logs, including ways to thread the log messages to better identify a Shib transaction.
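    To make the idea of threading log messages concrete, here is a minimal Python sketch that groups lines from several component logs by a shared transaction tag. The `txn=<id>` tag is purely an assumed convention for illustration; Apache, ModSSL, and Shib do not necessarily emit such a tag, and the sample lines are invented.

```python
import re
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

# Assumed convention: each component tags its log lines with "txn=<id>".
TXN_RE = re.compile(r"\btxn=([A-Za-z0-9]+)")


def thread_by_transaction(
    logs: Dict[str, Iterable[str]]
) -> Dict[str, List[Tuple[str, str]]]:
    """Group lines from several component logs by their transaction tag."""
    threads: Dict[str, List[Tuple[str, str]]] = defaultdict(list)
    for component, lines in logs.items():
        for line in lines:
            match = TXN_RE.search(line)
            if match:
                # Remember which component wrote the line.
                threads[match.group(1)].append((component, line))
    return dict(threads)


# Invented sample lines standing in for three separate log files.
sample_logs = {
    "apache": ["GET /secure txn=a1 status=500", "GET /index txn=b2 status=200"],
    "mod_ssl": ["handshake failed txn=a1"],
    "shib": ["assertion rejected txn=a1", "session created txn=b2"],
}
threads = thread_by_transaction(sample_logs)
```

    With a shared identifier, a help desk could pull every line related to one failed transaction from all components at once instead of reading each log file separately.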

    That's where Shib stands: the effort is just being kicked off, and he'll be reporting back on a regular basis.

    Chas commented on the actions that came out of the conversation. He said that Steve will get a developer to write a scenario. One very good result: the discussion confirmed that the four levels of users we talk about (Developer, Operator, Help Desk, User) apply to Shib as well. Renee Shuey of Penn State Univ., which has one of the first production deployments, has already identified people for Chas to talk to in order to develop a scenario for this part of the community and to help raise some of the questions they'll need answered.

  • Russ Hobby - finding a candidate for the network scenario
  • Russ had talked with Brent Sweeny at the Abilene NOC, who felt this was a good idea. He volunteered either himself or one of his staff to write a scenario. They have good operational experience, so they should produce a good document. Russ noted that they may need some persuasion to complete it. The next step is getting it pulled out of them.

    Chas commented that we're seeing two dimensions in the network space, Active vs Passive and Long-haul WAN vs Short-haul LAN, and that it will be helpful to have people write scenarios that fit each of the four resulting camps. He pointed out that the Abilene folks have a very different view than people at local campuses. We need information from the Passive measurement (NetFlow) side as well as the Active measurement side (E2Epi). Russ agreed and asked if there was a template available to give to people to fill in with their scenarios. Chas said he has one and will send it to Russ.

    [AI] Chas and Russ will email each other about how to 1) follow up with the folks identified and 2) engage new folks in other camps.

    Report from "diagnostics big and small" session

    Chas said it's been really hard to get people to participate; Matt, Russ, and Eric have been really busy. But we need to get back to Cheryl & Ken very soon with a road map to see if diagnostics are outside middleware. If so, the question is how to coordinate with specific groups to keep from reinventing the wheel and to leverage as much value as possible.

    In discussion with Eric today - 3 points to make with respect to E2Epi:

    • Every 3-6 months, hold a meeting of the two groups to discuss current activities and how they might fit together.
    • Agreed that the efforts are loosely coupled, but there are touch points.
    • There is a whole other camp in networking diagnostics: passive. Since Eric and E2Epi are in the Active measurement camp, the question is how to engage people on the passive side as well.

    George mentioned that E2Epi has worked with the NLANR Measurement and Network Analysis Group, and they might be a good contact.

    Pilot Progress

    Chas updated the group about progress on the pilot. The work study developer is now full time, and the team now includes a seasoned developer. They had a meeting last week in Pittsburgh and are now studying two tools to use as a foundation for development, so they can concentrate on the goals of the pilot and not reinvent the wheel if others have already done the work. The candidates are NetLogger from LBL (Brian Tierney) and AirCERT from CERT; both are very interesting. AirCERT log files have provisions for NetFlow (Cisco's) and are tied to a database. A decision should come by next week on whether to go with one of them without much modification, augment it, or take only small high-value pieces and mostly roll our own.

    He said they are coming to grips with defining the event record further. It is still the same concept of a metatag that holds the correlation data, but the raw event on the other end will have 5 schemas -

    • Applications (Shibboleth, DNS)
    • Network based flow events or passive measurement (NetFlow)
    • System oriented events (re-boot, memory error, userD messages)
    • Security (intrusion detection systems, needed access)

    In order to kick off the pilot, scaling issues will not be addressed at this point. To keep things simple, the event information from log files will be kept in XML form so it can be operated upon easily and quickly. Once within the backplane, one requirement is that the event information can be de-compiled from XML and returned to its raw form. Scott Cantor said the same thing on the Shib call this week, i.e. "Have to keep raw data the same as it came in." The next checkpoint is in a couple of weeks. Chas will update us on development milestones at that time.
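    One way to satisfy the "return it to raw form" requirement is sketched below in Python. It is an illustration under stated assumptions, not the pilot's design: the element names (`event`, `correlation`, `raw`), the `schema` attribute, and the choice of base64 are all hypothetical. Base64 is used here because raw log bytes may contain characters that XML cannot carry directly.

```python
import base64
import xml.etree.ElementTree as ET


def wrap_raw_event(raw: bytes, schema: str, correlation_id: str) -> str:
    """Wrap a raw log record in an XML envelope (element names are assumptions)."""
    event = ET.Element("event", {"schema": schema})
    ET.SubElement(event, "correlation").text = correlation_id
    # Base64 keeps every byte intact, including bytes XML text cannot hold.
    payload = ET.SubElement(event, "raw", {"encoding": "base64"})
    payload.text = base64.b64encode(raw).decode("ascii")
    return ET.tostring(event, encoding="unicode")


def unwrap_raw_event(xml_text: str) -> bytes:
    """Recover the original bytes from the XML envelope."""
    event = ET.fromstring(xml_text)
    payload = event.find("raw")
    if payload is None:
        raise ValueError("no raw payload in event")
    # An empty payload parses back as text=None, so treat it as empty input.
    return base64.b64decode(payload.text or "")


# Round trip with an invented log line containing XML-hostile characters.
original = b"[error] <mod_ssl> handshake & failure\x00\n"
envelope = wrap_raw_event(original, "application", "txn-1")
recovered = unwrap_raw_event(envelope)
```

    The trade-off is that a base64 payload is not searchable as text; a real design might also carry parsed fields alongside the untouched raw record.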

    There was a brief discussion about the details of the schema and how the backplane would log system configuration changes, such as on a router. Chas said that would be a system event, where a router can be looked at as a host with a specialized application running on it. Mark said we'd have events separate from errors. Chas said that this is a first exercise: run log files through to see how it works, and then search against them to find what was difficult.

    In other business, Russ talked about a proposed session for the Fall Internet2 Member Meeting. There are aspects that pertain to the MW e2e Diagnostics group. He said that he had no one specifically in mind yet, but that this came out of the Applications Strategy Council. The topic is what characteristics are important and how to design for them. He will be sending the proposal to the MW E2E Diagnostics list for discussion.

    [AI] Chas and George will work out the list of action items.

     

    © 1996 - 2008 Internet2 - All rights reserved