Tutorial: Generating Fact Scenarios For Testing and Search

To follow along with this Tutorial, you will want to have gone through the Advanced Tutorial: Encoding LSAT Questions.

In that tutorial, we created an ontology for the LSAT practice exam questions 6 through 10, and showed you how to encode question 6 to find out if specified schedules are valid.

Questions 7 through 10 are most easily solved if the reasoner already knows what all the valid schedules are and can search through them. We have already told it what makes a schedule valid, so what we need to do now is generate a set of possible schedules that includes all the valid schedules, and let our existing rules find the valid schedules among them.

Generating that list of possible schedules, which must include at least every valid schedule, is what this tutorial covers. In the real world, you might use this technique to test legislation, contracts, or regulations (or an encoding of them) against a variety of fact scenarios, either to ensure that they behave the way you expect or to find desired or undesired outcomes.

In this example, we’re trying to cover all possible scenarios, because we are interested in making sure we know everything that can happen. In other, more complicated situations, that may not be possible, and it will be beneficial to use other quality assurance techniques, like randomly generating scenarios and measuring the results.
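To give a feel for the random approach, here is a minimal Python sketch (not Blawx code). It samples random single-day line-ups of the three films used in this tutorial and measures how often a line-up repeats a film. The sampling scheme, the function names, and the property being measured are all our own assumptions for illustration.

```python
import random

FILMS = ["Greed", "Harvest", "Limelight"]  # the three films used in this tutorial

def random_day(rng):
    """Pick a random day: one to three screenings, each film chosen independently."""
    length = rng.randint(1, 3)
    return tuple(rng.choice(FILMS) for _ in range(length))

def has_repeat(day):
    """True when the same film appears more than once on the day."""
    return len(set(day)) < len(day)

rng = random.Random(0)  # fixed seed so the run is repeatable
sample = [random_day(rng) for _ in range(1000)]

# "Measuring the results": what share of the sampled days breaks the no-repeat rule?
repeat_rate = sum(has_repeat(day) for day in sample) / len(sample)
print(f"{repeat_rate:.1%} of sampled days repeat a film")
```

With a real rule set you would swap has_repeat for whatever outcome you care about and watch how the measured rate changes as the rules change.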

The process for generating scenarios is the same as for encoding other things in Blawx: First, explain the “things” you are talking about. Then, explain what the rules are, and go back and forth as necessary when you realize things are missing. Lastly, ask your question. What’s different in scenario building is that the rules create facts.

Finding a Balance

What we want to do is generate an object for every possibly valid schedule. Now, we could generate millions of invalid ones too, but searching through them would slow Blawx down when it tries to answer questions.

We could also generate only valid ones, but that would require a really high level of effort on our part. Way past the point of diminishing returns. And we’re lazy.

So what we’re trying to do here is come up with a set of scenarios that is a) relatively easy to generate, and b) doesn’t include too many invalid options.

We could approach this either from the top down or the bottom up. Let’s try bottom up.

First, we will create a list of possible schedules for a single day of the festival. Second, we will figure out which of those days are valid for each of Thursday, Friday, and Saturday. Third, we will create a possible festival schedule for every combination of potentially valid Thursdays, Fridays, and Saturdays.

For each step in this process, we will go through the first two phases of encoding things in Blawx: describing the categories we’re talking about, and then explaining the rules.

Possible Daily Schedules

Step 1: What are we talking about?

The first Category we are going to create is a “PossibleDay”. A possible day is a list of movies in a certain order. We can create that by giving a possible day a “screenings” parameter. Because a PossibleDay might be used as the Thursday, Friday, or Saturday of a Possible Schedule, we’re going to leave the “day” section of the Screening object blank.

Here’s what the PossibleDay looks like. We call the attribute “pd_screenings” in order to make it distinct from the attribute “screenings” that we gave to the Schedule category.

Now, we know that one of the rules is that there must be a movie every day, and another of the rules is that no movie can be shown more than once on a given day, and that there are only three movies. From that we know that there are three types of PossibleDays: Days with one movie, days with two, and days with three.

So, we will create three rules that will create PossibleDay objects of those kinds.

Step 2: What are the rules?

The first rule is simple: If there is a Film, there is a PossibleDay where that film is shown first, and nothing else is shown.

This rule needs to be broken down into parts that we can represent with Objects that fit into our Categories and Attributes. Here’s what it looks like.

Usually, rules have more in the conditions on the bottom, and less in the conclusions on the top. But when you are generating scenarios, the opposite is true. The only condition in this rule is that there is an object in the Category Film. The conditions will be true three times, once for each of the three films.

Each time the conditions are true, the conclusion will also be true. As you can see, the conclusion creates a PossibleDay object, categorizes it, creates a Screening object, categorizes it, sets the Film and the Slot for the screening, and adds the screening to the PossibleDay. Usually we do those things in a Fact block, but you can do them in the conclusions of a rule, too.

What’s with ‘pd(?A)’?

This is a little less user-friendly than we would like, so we’re going to call it an “advanced Blawx technique.” That sounds better.

When you use the variable block “A”:

In the background the reasoner is using the text ‘?A’ to represent that variable. As it turns out, if we include the text ‘?A’ in the name of an object, the reasoner will say “Hey, that looks like a variable!” It will then take the part of the object name that looks like a variable, and replace it with the value of that variable in the instance of the rule that made the conclusion true.

This is also the reason that you shouldn’t use question marks in the name of an object unless you are explicitly trying to get this effect.

So if we say “if an object “A” is a film, there is an object pd(?A)” Blawx will go through all the objects it knows about to find any that are in the Category “Film”. Each time it finds a valid object, the conclusion of the rule will be triggered, and three objects will be created named pd(Greed), pd(Limelight), and pd(Harvest).
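If it helps to see the idea outside of Blawx, here is a rough Python analogue of that substitution. This is only a sketch of the naming trick as described above, not how the reasoner is implemented; the instantiate function is a hypothetical helper of our own.

```python
FILMS = ["Greed", "Limelight", "Harvest"]

def instantiate(template, bindings):
    """Replace each '?X' placeholder in the template with the value bound to X."""
    # Longest variable names first, so a name like '?AB' is not clobbered by '?A'.
    for var in sorted(bindings, key=len, reverse=True):
        template = template.replace("?" + var, bindings[var])
    return template

# "If an object A is a film, there is an object pd(?A)."
names = [instantiate("pd(?A)", {"A": film}) for film in FILMS]
print(names)  # ['pd(Greed)', 'pd(Limelight)', 'pd(Harvest)']
```

Each film produces a different name, which is what keeps the generated objects distinct from one another.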

Now that we have done it for a possible day with only one screening, we’re going to do the same thing for a possible day with two, and a possible day with three.

This rule is exactly the same as the one above, except it does the same thing twice, for two screenings and two films. Because we are not including a condition that A and B are not the same film, we are creating possible days where the same film is shown twice. Those days aren’t valid, and if we build a schedule out of those days, it won’t be valid, either. But that’s OK, because Blawx will be able to tell the difference, and it gives us the opportunity to show how many incorrect schedules Blawx is able to figure out to ignore.

But if this were something being used with a lot of data, in production, you would want to add an inequality block (one that requires A and B to be different films) to the conditions.

The same rule for the three-film possible days looks like this:

You can see that the possible days that we are generating have gone from pd(?A) to pd(?A,?B,?C). The one-film rule was generating objects that will be named things like “pd(Greed)”. The three-film rule generates objects that will be named things like “pd(Greed,Greed,Greed)”.
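Taken together, the three rules produce every ordered line-up of one, two, or three films, with repeats allowed. As a sanity check on how many objects that is, here is a small Python sketch (again, an analogue rather than Blawx code; the helper names are our own).

```python
from itertools import product

FILMS = ["Greed", "Harvest", "Limelight"]

def possible_days(films):
    """Yield every ordered line-up of one, two, or three films, repeats allowed."""
    for length in (1, 2, 3):
        for lineup in product(films, repeat=length):
            yield lineup

def pd_name(lineup):
    """Build a name in the same style as the generated objects."""
    return "pd(" + ",".join(lineup) + ")"

days = list(possible_days(FILMS))
print(len(days))          # 39  (3 + 9 + 27)
print(pd_name(days[0]))   # pd(Greed)
print(pd_name(days[-1]))  # pd(Limelight,Limelight,Limelight)
```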

Generating Possible Festival Schedules

Now, we could just generate a possible festival schedule for every combination of possible days. But there are 39 possible days, and a possible festival needs three, so that would mean generating 39³ = 59,319 possible festival schedules, which is nearly 60,000. That’s a lot of data to create, only to have to eventually ignore most of it, so the balance between simplicity and speed favours trying to cut that down to a better size.

So what we will do before we generate possible festival schedules is we will create a Category for a Possible Day that would be valid if it occurred on a Thursday, Friday, or Saturday. We don’t need to store any new information about this, it’s just a new category that a Possible Day may or may not fit into.

Again, as always, we start with describing what we are talking about. So the declaration looks like this:

Now we need to create the rules that will figure out whether or not a Possible Day would be valid for each of the three days, and put it into that category if it would.

We already have rules that figure out whether or not the Thursday, Friday, and Saturday of a Schedule are compliant with the rules. So let’s take a look at our Thursday Rule, and see if we can just use that.

OK, so the conclusion will have to be that the object A belongs in the category “ValidThursday”, we need to look at “PossibleDay” objects instead of “Schedule” objects, and we need to use the “pd_screenings” attribute instead of the “screenings” attribute. All of those changes are easy to make.

The real problem is “last_thursday”. We have calculated what the last film of the day is per Schedule, not per day. A schedule has a “last_thursday”, “last_friday”, and “last_saturday”, but a possible day can’t use any of those because its screenings don’t have any “day” attributes defined.

So we’re going to have to go back to Step 1, and create an attribute of a possible day that is that possible day’s last film.

Here’s what the redefined Category looks like.

Now, we need to define the pd_last attribute for every possible day we create. The easy way to do that is to add code to the three rules we created above for generating possible days. Here’s what the extra block looks like in the three-film rule:

OK, now we have a “pd_last” attribute that we can use to test whether or not a possible day would be valid for a Thursday. Here’s what our new rule looks like:

We can do the same process of looking at our Friday and Saturday rules and create new ones that will work for testing Possible Days. Here’s what they look like:

So now we have a Category of Possible Days that are valid for Thursday, another for Friday, and another for Saturday, and each of these will have far fewer than 39 possible days in them. Thursday has 13, and Friday and Saturday have 14 each. So we have gone from nearly 60,000 possible schedules to around 2500. By writing these rules, it’s safe to say we have made the software around 20 times faster.
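Those counts can be reproduced with a short Python sketch. Be warned that the three day rules below are our paraphrase of the conditions encoded in the earlier tutorial, and they are assumptions here: Thursday must end with Harvest, Friday must end with exactly one of Greed or Limelight (and not show the other), and Saturday must end with exactly one of Greed or Harvest (and not show the other). The function names are also our own.

```python
from itertools import product

FILMS = ["Greed", "Harvest", "Limelight"]
# All 39 possible days: one to three films, repeats allowed.
DAYS = [day for n in (1, 2, 3) for day in product(FILMS, repeat=n)]

def ends_with_exactly_one(day, pair):
    """The last film is in `pair`, and the other film of the pair is not shown."""
    # day[-1] plays the role of the pd_last attribute.
    return day[-1] in pair and len(set(day) & set(pair)) == 1

# ASSUMED paraphrases of the day rules from the earlier tutorial.
def valid_thursday(day):
    return day[-1] == "Harvest"

def valid_friday(day):
    return ends_with_exactly_one(day, ("Greed", "Limelight"))

def valid_saturday(day):
    return ends_with_exactly_one(day, ("Greed", "Harvest"))

thursdays = [d for d in DAYS if valid_thursday(d)]
fridays = [d for d in DAYS if valid_friday(d)]
saturdays = [d for d in DAYS if valid_saturday(d)]

print(len(thursdays), len(fridays), len(saturdays))    # 13 14 14
print(len(thursdays) * len(fridays) * len(saturdays))  # 2548
```

Under those assumed rules the sketch lands on the same 13, 14, and 14, and multiplying them gives the "around 2500" combinations mentioned above.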

So now we can create our possible schedule objects.

Again, step 1 is describing what we’re talking about. That’s easy. A Possible Schedule is just a kind of Schedule.

Now, for step 2, we make the rules that create possible schedules. We’re going to do this in two phases. First, we will say “for every combination of a valid Thursday possible day, a valid Friday possible day, and a valid Saturday possible day, there is a possible schedule object.”

You can see that we are using our “advanced Blawx technique” of including the variables in the condition in the name of the object that is created in the conclusion. Because “A” is going to be referring to a valid Thursday schedule, it might refer to the object named pd(Greed,Greed,Harvest). Which means that the actual name of the object created by this rule might be something like ps(pd(Greed,Greed,Harvest),pd(Limelight),pd(Limelight, Harvest)).

Lucky for us, we don’t need to know the name that Blawx is generating in the background. We just need to know that it’s unique.
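If you are curious what those nested names look like, this Python sketch builds them for a few hand-picked days. The three lists are hypothetical, shortened stand-ins for the real categories, and the code is an analogue of the rule, not Blawx itself.

```python
from itertools import product

# Hypothetical, shortened lists; the real categories hold more days than this.
valid_thursdays = ["pd(Harvest)", "pd(Greed,Greed,Harvest)"]
valid_fridays = ["pd(Limelight)"]
valid_saturdays = ["pd(Greed)", "pd(Limelight,Harvest)"]

# One possible schedule per combination of Thursday, Friday, and Saturday.
possible_schedules = [
    f"ps({thu},{fri},{sat})"
    for thu, fri, sat in product(valid_thursdays, valid_fridays, valid_saturdays)
]

print(len(possible_schedules))  # 4
print(possible_schedules[-1])   # ps(pd(Greed,Greed,Harvest),pd(Limelight),pd(Limelight,Harvest))
```

Because each name is built from the names of its three days, two different combinations can never end up with the same name.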

Now, that first step doesn’t completely solve the problem, because our PossibleSchedule object still doesn’t have any “screening” attributes with “day” attributes on them. We need that before the rest of the rules will work properly. So we need to create rules that convert the information in a set of possible days to regular Screening objects.

We will create three such rules, one for each day of the festival.

The rule for Thursday is going to take whatever PossibleDay object was put in ?A for the PossibleSchedule named ps(?A,?B,?C), and add its contents to the “screenings” attribute of the PossibleSchedule.

Warning… this is going to get messy.

Here we have two more advanced Blawx techniques. You can see we are using the same trick we described above to create an object called screening_thursday_ps(?A,?B,?C,?D,?E). But now the values of the variables A, B, and C are being found in the conditions with our object declaration block.

So advanced Blawx technique #2 is this: if you use an object declaration block in the conditions of a Rule, its meaning changes. It is no longer a block that creates the object you are naming. It is now a block that checks to see whether or not that object exists.

And advanced Blawx technique #3 is this: if you use an object declaration block in the conditions of a Rule, and it has variables in its name, Blawx will find all the objects with names that match the variable pattern, and put those values into those variables.

So you can see that the conditions of this rule can be read as “find all the objects that have a name that matches the pattern ps(?A,?B,?C), find all the pd_screenings of the object whose name is in the variable ?A, and get the film and the slot value of that pd_screening.”
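Techniques #2 and #3 amount to pattern matching on object names. Here is a rough Python sketch of that idea, assuming names are plain strings shaped like functor(arg1,arg2,...). Both helper functions are hypothetical and exist only for illustration; Blawx does this for you behind the scenes.

```python
def split_top_level(args):
    """Split a comma-separated argument string, ignoring commas inside parentheses."""
    parts, depth, current = [], 0, ""
    for ch in args:
        if ch == "," and depth == 0:
            parts.append(current)
            current = ""
            continue
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
        current += ch
    parts.append(current)
    return parts

def match(name, functor, variables):
    """Return {variable: value} if name looks like functor(v1,...,vn), else None."""
    prefix = functor + "("
    if not (name.startswith(prefix) and name.endswith(")")):
        return None
    values = split_top_level(name[len(prefix):-1])
    if len(values) != len(variables):
        return None
    return dict(zip(variables, values))

objects = ["pd(Harvest)", "ps(pd(Harvest),pd(Greed),pd(Limelight,Greed))"]
for obj in objects:
    print(obj, "->", match(obj, "ps", ["A", "B", "C"]))
# pd(Harvest) -> None
# ps(pd(Harvest),pd(Greed),pd(Limelight,Greed)) -> {'A': 'pd(Harvest)', 'B': 'pd(Greed)', 'C': 'pd(Limelight,Greed)'}
```

The first object does not fit the pattern, so it is skipped. The second one fits, and ?A, ?B, and ?C pick up the names of its three possible days.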

Then, the conclusion says “create an object with a name that is unique to this possible schedule (?A,?B,?C) and this specific screening (“thursday”, ?D, ?E). Make it a Screening object. Set its day, film, and slot attributes properly, and then attach it to the screening attribute of the possible schedule object.”
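As a rough Python analogue of that conclusion, the sketch below turns one possible day into Screening records for a given schedule. The Screening class, the simplified naming scheme, and the use of integers for slots are our assumptions for illustration; they are not how Blawx stores these objects.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Screening:
    name: str
    day: str
    film: str
    slot: int  # ASSUMPTION: slots are numbered 1, 2, 3 in showing order

def day_screenings(schedule_name, day_label, films):
    """Turn one possible day's ordered films into Screening records for a schedule."""
    return [
        Screening(
            # Simplified name: unique per schedule, day, film, and slot.
            name=f"screening_{day_label}_{schedule_name}_{film}_{slot}",
            day=day_label,
            film=film,
            slot=slot,
        )
        for slot, film in enumerate(films, start=1)
    ]

schedule = "ps(pd(Greed,Harvest),pd(Limelight),pd(Greed))"
thursday = day_screenings(schedule, "thursday", ["Greed", "Harvest"])
for screening in thursday:
    print(screening.day, screening.slot, screening.film)
# thursday 1 Greed
# thursday 2 Harvest
```

The same function would serve for the other two days by passing "friday" or "saturday" and that day's films.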

Now we create two more rules for Friday and Saturday that do the same thing:

Step 3: Ask your question.

We’re really generating this data to be able to answer questions 7 through 10 of the LSAT, so we don’t have a specific question yet. But just to see what we’ve ended up with, why don’t we ask for a list of all the possible schedules that are also valid?

Here’s what Blawx tells us:

X = ps(pd(Harvest),pd(Greed),pd(Limelight,Greed))
X = ps(pd(Harvest),pd(Greed),pd(Limelight,Harvest))
X = ps(pd(Harvest),pd(Limelight),pd(Greed))
X = ps(pd(Harvest),pd(Limelight),pd(Limelight,Greed))
X = ps(pd(Harvest),pd(Harvest,Greed),pd(Limelight,Greed))
X = ps(pd(Harvest),pd(Harvest,Greed),pd(Limelight,Harvest))
X = ps(pd(Harvest),pd(Harvest,Limelight),pd(Greed))
X = ps(pd(Harvest),pd(Harvest,Limelight),pd(Limelight,Greed))
X = ps(pd(Greed,Harvest),pd(Greed),pd(Limelight,Greed))
X = ps(pd(Greed,Harvest),pd(Greed),pd(Limelight,Harvest))
X = ps(pd(Greed,Harvest),pd(Limelight),pd(Greed))
X = ps(pd(Greed,Harvest),pd(Limelight),pd(Harvest))
X = ps(pd(Greed,Harvest),pd(Limelight),pd(Limelight,Greed))
X = ps(pd(Greed,Harvest),pd(Limelight),pd(Limelight,Harvest))
X = ps(pd(Greed,Harvest),pd(Harvest,Greed),pd(Limelight,Greed))
X = ps(pd(Greed,Harvest),pd(Harvest,Greed),pd(Limelight,Harvest))
X = ps(pd(Greed,Harvest),pd(Harvest,Limelight),pd(Greed))
X = ps(pd(Greed,Harvest),pd(Harvest,Limelight),pd(Harvest))
X = ps(pd(Greed,Harvest),pd(Harvest,Limelight),pd(Limelight,Greed))
X = ps(pd(Greed,Harvest),pd(Harvest,Limelight),pd(Limelight,Harvest))
X = ps(pd(Limelight,Harvest),pd(Greed),pd(Greed))
X = ps(pd(Limelight,Harvest),pd(Greed),pd(Harvest))
X = ps(pd(Limelight,Harvest),pd(Greed),pd(Limelight,Greed))
X = ps(pd(Limelight,Harvest),pd(Greed),pd(Limelight,Harvest))
X = ps(pd(Limelight,Harvest),pd(Limelight),pd(Greed))
X = ps(pd(Limelight,Harvest),pd(Limelight),pd(Limelight,Greed))
X = ps(pd(Limelight,Harvest),pd(Harvest,Greed),pd(Greed))
X = ps(pd(Limelight,Harvest),pd(Harvest,Greed),pd(Harvest))
X = ps(pd(Limelight,Harvest),pd(Harvest,Greed),pd(Limelight,Greed))
X = ps(pd(Limelight,Harvest),pd(Harvest,Greed),pd(Limelight,Harvest))
X = ps(pd(Limelight,Harvest),pd(Harvest,Limelight),pd(Greed))
X = ps(pd(Limelight,Harvest),pd(Harvest,Limelight),pd(Limelight,Greed))

The Importance of Optimization

Our Blawx code was able to generate 39 possible single-day schedules, reduce those to 13 possible Thursdays, 14 possible Fridays, and 14 possible Saturdays, then combine those to create 13 × 14 × 14 = 2548 possible three-day schedules, and search all of those to find the 32 specific schedules that are valid.

It did that in about 30 seconds of processing time. Not too bad.

Remember we said we were leaving out the inequality operators in order to illustrate the effect on optimizing the code? Let’s do that experiment now.

If we put them into the two-film and three-film rules to prevent the same Film from being shown twice on the same day in a possible one-day schedule, the number of valid possible Thursdays goes down to 5, Fridays to 4, and Saturdays to 4. That’s only 80 possible schedules total, a reduction of roughly 97%. The additional processing time required to get rid of the invalid days means that the processing time only goes down by about 90%, from about 30 seconds to about 2.5. But that’s still an increase in speed of a factor of more than 10, by adding 5 inequality blocks to the code.
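Here is a Python sketch of the same experiment. Using permutations instead of a plain product plays the part of the inequality blocks, because a permutation never repeats a film. As before, the day rules are our assumed paraphrase of the conditions from the earlier tutorial, and the helper names are our own.

```python
from itertools import permutations

FILMS = ["Greed", "Harvest", "Limelight"]
# One to three films per day, with no film repeated on the same day.
DAYS = [day for n in (1, 2, 3) for day in permutations(FILMS, n)]

def ends_with_exactly_one(day, pair):
    """The last film is in `pair`, and the other film of the pair is not shown."""
    return day[-1] in pair and len(set(day) & set(pair)) == 1

# ASSUMED paraphrases of the day rules from the earlier tutorial.
RULES = {
    "thursday": lambda day: day[-1] == "Harvest",
    "friday": lambda day: ends_with_exactly_one(day, ("Greed", "Limelight")),
    "saturday": lambda day: ends_with_exactly_one(day, ("Greed", "Harvest")),
}

counts = {name: sum(rule(day) for day in DAYS) for name, rule in RULES.items()}
total = counts["thursday"] * counts["friday"] * counts["saturday"]
print(len(DAYS), counts, total)
# 15 {'thursday': 5, 'friday': 4, 'saturday': 4} 80
```

Ruling out repeats early shrinks the pool from 39 possible days to 15, and the day rules then leave the 5, 4, and 4 mentioned above.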

If we had not gotten rid of the days that were not valid under the Thursday, Friday, and Saturday rules, the software would have had to churn through 60,000 possible schedules, and it would probably have taken upwards of 10 minutes.

If you are only ever going to need the answer to the question once, letting it take 10 minutes might be the right solution, because it may take longer than that to do the optimizations. But if you’re writing code that’s going to be used even a handful of times, going from 10 minutes to 2.5 seconds is well worth the effort.

That goes to show how important it can be to find the right balance between something that will work, and something that will work quickly.

Good luck, and happy Blawxing!
