Prabhu Saiprabhu
- Dec 14, 2022
- 9 min read

AI Drives Collaboration among Data Science Models

An earlier article discussed the consistency framework, which is the foundation we need for this work.

Arrivaattral inspiration

Arrivaattral can be thought of as a state. In a simplified view of this state, with respect to our focus area of multiple data science models, we seek features that enable evolving cognitive capabilities. These focus primarily on how knowledge is represented and organized, on frameworks that allow reasoning with knowledge using predicate and first-order logic, and on nondeterministic orchestrations that adapt with experience, including refinements based on reviews of explanations. In a typical data science ecosystem, models evolve, get phased out, are joined by newer ones, and rely on additional mechanisms to combine their knowledge. The Arrivaattral state requires that we consider frameworks such as symbolic processing to handle all of this. Aided by this wide inspiration, let's look at our IROPS (irregular operations) example through the Arrivaattral lens.

DATASETS (from previous article):

We will pick a day’s worth of on-time-performance (OTP) data from the Bureau of Transportation Statistics, https://www.transtats.bts.gov/DL_SelectFields.aspx?gnoyr_VQ=FGJ&QO_fu146_anzr=b0-gvzr. From this data we will construct flight schedules and use them as operational facts.

We will generate synthetic booking data for some flights spanning multiple cities and set arbitrary limits on total seats, used seats, and available seats.

We will also generate synthetic customer importance/loyalty/expectation measures.

Let’s begin with synthetic customer data based on the format shown below. We have defined three tier levels of customer importance. In an enterprise, we will have a good number of measures attributed to customers, and many will be outcomes and/or drivers of data science models. Using this format, we have defined 7 customers in each tier with varied measures. The customers are given names such as Tier1_1, Tier3_5, and Tier2_7 so that they are easily recognizable in visuals without additional details.
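As a rough illustration of the customer data described above, the sketch below generates 7 customers in each of three tiers using the same naming pattern. The `loyalty_score` field and its random values are placeholders of our own choosing, not the measures used in the demo.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Customer:
    name: str             # e.g. "Tier1_1"
    tier: int             # 1, 2 or 3
    loyalty_score: float  # hypothetical measure, for illustration only


def make_customers(tiers=3, per_tier=7, seed=0):
    """Build synthetic customers named Tier<tier>_<index>."""
    rng = random.Random(seed)  # seeded so the synthetic data is repeatable
    customers = []
    for tier in range(1, tiers + 1):
        for index in range(1, per_tier + 1):
            score = round(rng.uniform(0.0, 1.0), 2)
            customers.append(Customer(f"Tier{tier}_{index}", tier, score))
    return customers


customers = make_customers()
```

In a real ecosystem, each measure would come from an upstream data science model rather than a random number generator.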

Next, we pulled a day’s worth of OTP data, selected the following attributes, and treated that dataset as both the scheduled and the operating flights. Airlines begin with such schedules and, as the day progresses, address various events, IROPS included.

With flight schedule and customer data taken care of, let's book all customers on the same itinerary to simplify our conversation while still providing comprehensive coverage. All of them leave LAX at 5:43 on flight 194, transfer at ATL to flight 172, which departs at 13:22, and eventually arrive at BOS at 15:53. We have used the final arrival time as a cost measure, although cost is often driven by other factors. Such cost values come in handy when reasoning about impacts. A sample screenshot of the booking data is shown below using tuple structures.
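A minimal sketch of how such booking tuples might look in Python follows. The field names are our own assumptions, and the arrival time of flight 194 at ATL is left as `None` because it is not stated here. The cost is simply the final arrival time expressed as minutes after midnight.

```python
from typing import NamedTuple, Optional, Tuple


class Leg(NamedTuple):
    flight: int
    origin: str
    dest: str
    dep: str            # local departure time, "H:MM"
    arr: Optional[str]  # local arrival time, None when not known


class Booking(NamedTuple):
    customer: str
    legs: Tuple[Leg, ...]
    cost: int  # final arrival time in minutes after midnight


def minutes(hhmm):
    """Convert an "H:MM" string to minutes after midnight."""
    hours, mins = hhmm.split(":")
    return int(hours) * 60 + int(mins)


ITINERARY = (
    Leg(194, "LAX", "ATL", "5:43", None),
    Leg(172, "ATL", "BOS", "13:22", "15:53"),
)


def book(customer):
    """Book a customer on the shared itinerary; cost is the final arrival."""
    return Booking(customer, ITINERARY, minutes(ITINERARY[-1].arr))


bookings = [book(f"Tier{t}_{i}") for t in (1, 2, 3) for i in range(1, 8)]
```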

Since we are using knowledge representation frameworks and logic-based processing, we are able to use the same core Arrivaattral capabilities to generate synthetic customer booking data. We did this by eliminating all constraints other than flight operating schedules. So, if an optimal flight route is available, it will be used. Since we have declared sufficient open seats on all flights, all of our customers have been booked on the same flights. To this environment, we will later add, remove, and evolve data science model behavior impacts as constraints.

The logical components and platforms needed for the Arrivaattral state are depicted at the right. As we reviewed earlier, Kafka plays a central role in managing real-time data exchanges. To this environment we will now add the Arrivaattral capabilities. We rely heavily on knowledge representation frameworks such as First Order Logic, Frames & Scripts, and Belief Networks. The knowledge representation frameworks are chosen based on the inference platforms and programming methodologies that support them. We then use a graph database platform, such as Neo4J, for downstream analytics. In this demo we will use Neo4J for reviewing the results from our IROPS use case.

Calm before storm view

Let’s first see the current state, where our 21 passengers, 7 in each tier, are booked on flights to travel from LAX to BOS. The itineraries are loaded from a Neo4J graph database, as shown with annotations.

It is worthwhile to stress the importance of modeling for graph databases, which differs from other paradigms. As shown on the right, we will use a simple structure with customers and airports as nodes, and then associate customers with airports based on actions such as booked or re-accommodated. Each node and relationship will have a number of useful items associated with it. For example, a BOOKED relationship between a customer node and an airport node could have items such as the flight number and the arrival time, the departure time, or both when customers connect through an airport. In practical applications, key information such as wheelchair assistance or unaccompanied minors can also be very valuable context for data science models or operations staff looking to re-accommodate impacted customers. Likewise, customer and airport nodes will have associated items that are not tied to a trip, and we can query them merrily.

When we discuss the transparency framework in the next paper, we will introduce other nodes and relationships, such as crew, aircraft, and cargo. Rest assured that for this collaborative framework discussion, they are actively contributing while remaining in the background.

The graph is not intended to visualize the flow of flights, customers and such. Neo4J mainly serves analytics and several downstream processes. Still, we can query or select nodes and relationships to get an idea of the itinerary, as shown by the Cypher query in the pictures. For example, when we select the Tier1_1 customer node by clicking on it in the Neo4J browser, we can see useful details about that customer. Similarly, when the BOOKED relationship between the Tier1_1 and LAX nodes is selected, one can see that customer Tier1_1 will depart from LAX on flight 194 at 5:43. The arrival item being ‘no’ indicates that they are only departing. So, from the perspective of both the LAX airport and the Tier1_1 customer, the relationship items provide good value. Obviously, the LAX airport and the Tier1_1 customer will have many other relationships with other nodes. Platforms like Neo4J are quite powerful in this regard. For example, if Tier1_1 drives to LAX to catch this flight, you could create a new relationship between them, perhaps calling it a DRIVE relationship, associate arrival items with it, and even further distinguish those items as recommended, no-later-than, and so on. Now you can see what such technologies can do.
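To make the customer-to-airport modeling concrete, here is a small sketch that builds a parameterized Cypher statement for one BOOKED relationship. The node labels (`Customer`, `Airport`) and the property names (`name`, `code`, `flight`, `departure`, `arrival`) are assumptions for illustration; the demo's actual schema may differ.

```python
def booked_statement(customer, airport, flight, departure, arrival):
    """Return a Cypher statement and parameters for one BOOKED relationship."""
    query = (
        "MERGE (c:Customer {name: $customer}) "
        "MERGE (a:Airport {code: $airport}) "
        "MERGE (c)-[r:BOOKED {flight: $flight}]->(a) "
        "SET r.departure = $departure, r.arrival = $arrival"
    )
    params = {
        "customer": customer,
        "airport": airport,
        "flight": flight,
        "departure": departure,  # departure time, or "no" if not departing
        "arrival": arrival,      # arrival time, or "no" if not arriving
    }
    return query, params


# Tier1_1 departs LAX on flight 194 at 5:43 and does not arrive there.
query, params = booked_statement("Tier1_1", "LAX", 194, "5:43", "no")
```

Keeping the values in parameters rather than in the query text means the same statement can be reused for every customer and airport pair.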

IROPS view

So, for our IROPS example, let’s create a weather situation at ATL airport. In our platform, this will be a message sent out through Kafka, which the handlers we reviewed in the consistency framework will be on the lookout for. Handlers have access to a full set of knowledge about customers’ actions, such as being booked to connect through ATL, along with other customer-experience-related items that have been initialized and updated from both historical and current decisions. Each data science model then goes to work in a mostly non-deterministic manner and provides its recommendations related to customers, planes, crew and cargo, which also appear as Kafka messages.
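The sketch below shows, under simplifying assumptions, what a handler might do when the weather message arrives. No Kafka client is used here: the message is a plain JSON string, and its field names (`type`, `airport`, `status`) are hypothetical, as is the customer Other_1 routed through DFW.

```python
import json

# Hypothetical payload of the weather event published for ATL.
weather_message = json.dumps(
    {"type": "WEATHER", "airport": "ATL", "status": "OUT_OF_SERVICE"}
)

# Handler knowledge: airports each customer is booked through.
booked_airports = {
    "Tier1_1": ["LAX", "ATL", "BOS"],
    "Tier2_3": ["LAX", "ATL", "BOS"],
    "Other_1": ["LAX", "DFW", "BOS"],  # hypothetical, not impacted
}


def impacted_customers(message, bookings):
    """Return customers whose itinerary touches the affected airport."""
    event = json.loads(message)
    if event["type"] != "WEATHER":
        return []
    return sorted(
        name for name, airports in bookings.items()
        if event["airport"] in airports
    )


impacted = impacted_customers(weather_message, booked_airports)
```

In the platform described here, the list of impacted customers would be handed on to the relevant data science models, whose recommendations come back as further messages.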

Arrivaattral components view such recommendations through multiple lenses, such as options and constraints. While constraints are valid until revised, options have time elements associated with them. For example, a weather-related out-of-service status at ATL airport is a constraint until it is resolved. On the other hand, customer or cargo re-routes may be provided as options to customers so that they may choose the one that suits them. Such options are often associated with prioritization criteria that are determined by historical values, such as tier levels, or by the current situation, such as an IROPS-impact-score-now attribute. With widespread use of digital tools, customers and employees could share the best information available to them in a collaborative and satisfying manner.
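One way to picture the difference between constraints and options is sketched below. The class and field names are our own, the route labels are placeholders, and the sketch assumes that Tier 1 is the most important tier and that a higher impact score means a more urgent case.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Constraint:
    description: str
    active: bool = True  # stays valid until explicitly revised


@dataclass
class Option:
    customer: str
    tier: int            # assumption: 1 is the most important tier
    impact_score: float  # stand-in for an IROPS-impact-score-now attribute
    route: str
    expires_at: datetime  # options carry a time element

    def is_open(self, now):
        return now < self.expires_at


def prioritize(options, now):
    """Drop expired options, then order by tier and by impact score."""
    live = [o for o in options if o.is_open(now)]
    return sorted(live, key=lambda o: (o.tier, -o.impact_score))


now = datetime(2022, 12, 14, 12, 0)
atl_closed = Constraint("ATL out of service due to weather")
options = [
    Option("Tier2_1", 2, 0.9, "route A", now + timedelta(hours=1)),
    Option("Tier1_1", 1, 0.4, "route B", now + timedelta(hours=1)),
    Option("Tier1_2", 1, 0.8, "route A", now - timedelta(minutes=1)),
]
ranked = prioritize(options, now)
```

The expired option for Tier1_2 is dropped, while the constraint on ATL stays active until something revises it.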

Fuzzy orchestration

The schematic on the right illustrates a potential sequence of actions resulting from our IROPS event. Though the impact begins with ATL being out of reach, the Ground Ops and Crew data science models went to work and came back recommending against diverting to additional airports, in order to meet compliance and capacity needs.

Arrivaattral’s behavior is patterned after how humans tend to solve problems: a fuzzy set of rules, employed by the handlers, triggers the necessary models as events arrive. While the consistency manager ensures that all models receive the best information, the Arrivaattral state ensures an orchestration that is not always deterministic but can reasonably solve the problem.

PAX re-accommodations are generally provided as options, along with context such as the facilities at the new connecting airports, thereby allowing customers to work through them interactively. It is conceivable to extend the solution landscape so that customers’ selections are routed through other customer experience models to reflect and standardize their current IROPS impacts. With that, subsequent interactions they may have with employees during the current journey will draw on the best information. The Arrivaattral state should have facilities to accommodate such an operation or evolution. Complex systems built with traditional technologies could address some of the needs defined above. However, Artificial Intelligence frameworks and technologies enable building them reliably at lower cost and in shorter time frames.

Let's now take a look at the outcomes from the re-accommodation decision. As you can see, customers are accommodated in multiple ways based on their value and system capacity, etc. You may recall that we chose to use final arrival time as our cost of impact and so critical customers will have minimal impacts. The final arrival times are marked in yellow.

Keep in mind that prioritization is often flexible in states like Arrivaattral. Knowledge representation structures play a key role here. For example, a frame-based knowledge fragment could have slots that hold lists of thresholds and logical operators to be used against the corresponding current values. Such lists enable the flexibility. However, some complex extensions will require updates to frame slots, changes to logic processors, and so on. There are no shortcuts.
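A frame with threshold slots could be sketched roughly as follows. The frame name, the attribute names, and the threshold values are hypothetical; the point is only that changing the list in the slot changes the prioritization without touching the evaluation logic.

```python
import operator

# Logical operators a slot entry may refer to.
OPERATORS = {
    ">=": operator.ge,
    "<=": operator.le,
    ">": operator.gt,
    "<": operator.lt,
    "==": operator.eq,
}

# A frame whose slot holds (attribute, operator, threshold) entries.
priority_frame = {
    "name": "priority_reaccommodation",  # hypothetical frame name
    "slots": [
        ("tier", "<=", 1),
        ("impact_score", ">=", 0.5),
    ],
}


def frame_matches(frame, current_values):
    """True when every slot entry holds against the current values."""
    return all(
        OPERATORS[op](current_values[attribute], threshold)
        for attribute, op, threshold in frame["slots"]
    )


top_tier_hit = frame_matches(priority_frame, {"tier": 1, "impact_score": 0.7})
lower_tier_hit = frame_matches(priority_frame, {"tier": 2, "impact_score": 0.7})
```

Adding a new kind of comparison, by contrast, means extending `OPERATORS`, which is the sort of logic processor change mentioned above.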

As the examples show, a customer at a higher tier level is able to arrive at the destination earlier than customers at lower tier levels. This is the behavior desired by default. Keep in mind that higher tier customers may be presented with options and may choose a later arrival due to other considerations. So, as the IROPS guidebook indicates, relevant information is valuable when it is communicated in a timely manner.

In addition to arrival times, we can also notice that the number of stops/hops may play a role when capacity, such as available seats, needs to be taken into consideration. In this example, we have forced all flights to have 5 open seats at the onset of the IROPS event so that the 21 passengers are accommodated in a manner that demonstrates the behavior. In practice, this capacity is routinely computed and made available via events from operational applications.
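The effect of the 5-seat limit can be sketched with a simple greedy assignment. This is not the logic-based processing used in the demo, only an illustration of the outcome: the route identifiers and arrival times are made up, each alternative route is treated as a single unit with 5 seats, and Tier 1 is assumed to be the most important tier.

```python
def reaccommodate(passengers, routes, seats_per_route=5):
    """Assign passengers, best tier first, to the earliest route with seats.

    passengers: list of (name, tier) pairs
    routes: list of (route_id, arrival_minutes) pairs
    """
    remaining = {route_id: seats_per_route for route_id, _ in routes}
    by_arrival = sorted(routes, key=lambda route: route[1])
    assignment = {}
    # sorted() is stable, so order within a tier is preserved.
    for name, _tier in sorted(passengers, key=lambda p: p[1]):
        for route_id, arrival in by_arrival:
            if remaining[route_id] > 0:
                remaining[route_id] -= 1
                assignment[name] = (route_id, arrival)
                break
    return assignment


passengers = [(f"Tier{t}_{i}", t) for t in (1, 2, 3) for i in range(1, 8)]
# Hypothetical alternatives; arrival is in minutes after midnight.
routes = [("R1", 1000), ("R2", 1060), ("R3", 1120), ("R4", 1180), ("R5", 1240)]
assignment = reaccommodate(passengers, routes)
```

With 5 seats per route, the 21 passengers spill across five alternatives, so capacity and tier together determine who arrives when.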

It is worthwhile to note that the same Arrivaattral state handled both the 'calm-before-storm' and the IROPS situations, differentiated simply by what was in the context state and by designing handlers that work with it. This is a very desirable situation, where field operators can use a consistent interface during normal and abnormal situations.

Why do data science models need the state of Arrivaattral?

The following table summarizes how the state of Arrivaattral assists a data science model ecosystem. Without formal consideration of such a state, enterprises tend to solve for the requirements of the Arrivaattral state piecemeal, or even per project or per model. Eventually, such a piecemeal strategy creates a resistive load on further innovation. This is the essence of what is famously termed 'tech-debt'.

Challenges associated with adopting Arrivaattral state

There are a few challenges in aligning with the mechanisms needed for the Arrivaattral state in large enterprises. In our experience, most enterprises agree with, or even desire to benefit from, the concepts demonstrated in this discussion, but many face challenges in building and adopting a roadmap. We see a few blocker themes surface often and have grouped them into six categories, ranging from philosophical to architecture and delivery. Based on this categorization, we will outline considerations for defining roadmaps for adoption. While the overall path of a roadmap may be shared, the milestones in it are often specific to customers.

What would a roadmap to Arrivaattral state systems look like?

Obviously, the current state plays a role in determining a starting point for any roadmap. The following table provides some considerations that may serve as a basis for how to think about a roadmap. A few items associated with the roadmap build approach need to be defined.

Roadmap complexity is classified into three simple categories. Dwelling too much on detailed estimates may, in the end, result in variations that map to the same categorization.

When we say data science model ecosystem, we think of capabilities that multiple data science models share in their life-cycle. The capabilities include development processes, technologies, MLOps, governed and approved data sets in use for model training, model accuracy monitoring, and obviously the contexts that we demonstrated in this paper, etc.

In a platform-centric approach, we see organizations carve out a general set of tool-sets, frameworks, and vendor-centric or vendor-agnostic services, and organize around delivery teams that are focused on enabling shareable assets and teams that consume them with zeal. This is, in fact, a harder but more beneficial approach.

We distinguish AI core principles from technologies so that the roadmap build can be effective. Roadmaps in some cases are simply not useful when a wider set of criteria is used for estimates, dependencies for sequencing in the roadmap, training, and so on.

What next?

In the next paper we will use specific AI techniques to implement the transparency behavior. The state of Arrivaattral requires capabilities to improve or select appropriate decision frameworks, and so explanations are essential in providing the means to accomplish this. Handling explanations to refine decision frameworks is not straightforward and is an evolving area.

There are no readily available frameworks to assist with our need for explanations. We use areas of machine learning such as Explanation Based Learning and Reinforcement Learning to make a difference.

Let the fun times continue!