Bob Marinier: Controlling access to shared resources

Introduction

Sometimes a system needs to control access to shared resources. For example, Soar's episodic and semantic memories can each accept only one query at a time: if two different goals put queries on the same semantic memory link at the same time, the result will be a bad command. Alternatively, there may be a data structure in working memory that must not change until some goal has finished with it. In either case, the agent needs a way to ensure that its processing will not be interrupted by something else.

This issue is most common in NGS systems, which have many active goals at once, but it can also arise in Michigan-style systems (although probably very rarely, since only one goal stack is active at a time). In NGS systems, the goals often operate independently of each other and thus don't coordinate over who gets access to resources. This can lead to collisions, which probably aren't even detected and simply result in bad behavior or a stuck agent. In Michigan-style approaches, a goal that is using a shared resource (most likely a special data structure) can be interrupted by something else that wants to take over that resource.

There are at least two approaches:

  • Mutexes (to mark a structure as in-use)
  • Substates (to prevent any other operators from executing)

Approach 1: Mutexes

Soar doesn't have a built-in mutex, but you can create your own. The key caveat is that nothing other than convention enforces the mutex: if someone writes code that accesses the shared structure without first getting ownership of the mutex, things will still break.

Here's an example of creating and using a mutex to control access to a data structure (the mutex is implemented here as a structure with an owner):

sp "my-goal*propose*obtain-access
   [NGS_match-active-goal my-goal <goal> <s>]
   (state <s> ^restricted-data-mutex <m>)
  -(<m> ^owner)
-->
   [NGS_create-operator obtain-access <goal> <o>]
   (<o> ^restricted-data-mutex <m>)
"

sp "my-goal*apply*obtain-access
   [NGS_match-active-goal my-goal <goal> <s>]
   [NGS_match-operator obtain-access <o> <goal>]
   (<o> ^restricted-data-mutex <m>)
-->
   (<m> ^owner <goal>)
"

sp "my-goal*propose*use-restricted-data
   [NGS_match-active-goal my-goal <goal> <s>]
   (state <s> ^restricted-data-mutex <m>)
   (<m> ^owner <goal>)
-->
   # do something
"

sp "my-goal*propose*release-access
   [NGS_match-active-goal my-goal <goal> <s>]
   (state <s> ^restricted-data-mutex <m>)
   (<m> ^owner <goal>)
   # other conditions met
-->
   [NGS_create-operator release-access <goal> <o>]
   (<o> ^restricted-data-mutex <m>)
"

sp "my-goal*apply*release-access
   [NGS_match-active-goal my-goal <goal> <s>]
   [NGS_match-operator release-access <o> <goal>]
   (<o> ^restricted-data-mutex <m>)
   (<m> ^owner <goal>)
-->
   (<m> ^owner <goal> -)
"

sp "assert*restricted-data-mutex*multiple-owners
   (state <s> ^restricted-data-mutex.owner <o1> {<o2> <> <o1>})
-->
   (write (crlf) |ERROR: restricted-data-mutex has multiple owners: | <o1> | and | <o2>)
   (interrupt)
"

These rules check whether the mutex is available and grab it if so. Because operators do the grabbing and Soar selects only one operator at a time, only one goal will succeed even if many try at once: as soon as the winner sets the owner, the other proposals no longer match and retract. Note that it's up to the programmer to remember to get the mutex and to check that the goal owns it before doing anything with the restricted data. I've added an "assert" rule to at least catch cases where the mutex ends up with multiple owners, but this hardly catches everything. This pattern could probably be wrapped up in a set of more general Tcl macros if desired.
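As a rough illustration of the Tcl-macro idea, the obtain and release rules could be generated by a single procedure. This is only a sketch: MUTEX_create-rules, its arguments, and the ^mutex operator attribute are hypothetical names, not part of NGS. It assumes the NGS macros behave exactly as in the rules above.

```tcl
# Hypothetical macro (not part of NGS): generates the obtain/release rules for
# one goal type and one mutex.
#   goal_name          - NGS goal name, e.g. my-goal
#   mutex_attr         - top-state attribute that holds the mutex
#   release_conditions - extra Soar conditions that must hold before releasing
proc MUTEX_create-rules { goal_name mutex_attr release_conditions } {

   # Propose grabbing the mutex only while it has no owner
   sp "$goal_name*propose*obtain-$mutex_attr
      [NGS_match-active-goal $goal_name <goal> <s>]
      (state <s> ^$mutex_attr <m>)
     -(<m> ^owner)
   -->
      [NGS_create-operator obtain-$mutex_attr <goal> <o>]
      (<o> ^mutex <m>)
   "

   # Record the goal as the owner
   sp "$goal_name*apply*obtain-$mutex_attr
      [NGS_match-active-goal $goal_name <goal> <s>]
      [NGS_match-operator obtain-$mutex_attr <o> <goal>]
      (<o> ^mutex <m>)
   -->
      (<m> ^owner <goal>)
   "

   # Propose releasing once the caller-supplied conditions are met
   sp "$goal_name*propose*release-$mutex_attr
      [NGS_match-active-goal $goal_name <goal> <s>]
      (state <s> ^$mutex_attr <m>)
      (<m> ^owner <goal>)
      $release_conditions
   -->
      [NGS_create-operator release-$mutex_attr <goal> <o>]
      (<o> ^mutex <m>)
   "

   # Remove the owner so other goals can grab the mutex
   sp "$goal_name*apply*release-$mutex_attr
      [NGS_match-active-goal $goal_name <goal> <s>]
      [NGS_match-operator release-$mutex_attr <o> <goal>]
      (<o> ^mutex <m>)
      (<m> ^owner <goal>)
   -->
      (<m> ^owner <goal> -)
   "
}
```

A call such as MUTEX_create-rules my-goal restricted-data-mutex "(<goal> ^finished-with-data true)" would then stand in for the four hand-written obtain/release rules, where ^finished-with-data is again a made-up attribute. The multiple-owners assert rule belongs to the mutex rather than to any one goal, so it would still be written once by hand.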

Approach 2: Use architectural subgoals

This issue is rare in Michigan-style systems because typically only one goal is active at once. That is, Michigan style always uses architectural goals. But there's no reason why this can't be done in an NGS system as well. Indeed, the effect of an architectural subgoal in an NGS system is to force the agent to stop interleaving operators at the top level, which in essence stops processing on all other goals. This makes it safe for the agent to do whatever protected processing it needs in the architectural subgoal without worrying about some other goal jumping in and interfering. Even if many goals simultaneously propose operators to access the restricted data, only one will win, and when it impasses, the others will be forced to wait.

Here's an example of using an architectural subgoal to access restricted data (this assumes that the rules from General guidance that copy operator arguments down to subgoals are loaded):

sp "my-goal*propose*obtain-access
   [NGS_match-active-goal my-goal <goal> <s>]
   (state <s> ^restricted-data <data>)
   # other conditions
-->
   [NGS_create-operator obtain-access <goal> <o>]
   (<o> ^restricted-data <data>)
"

# no apply rule, so the selected operator will impasse and create a substate

sp "my-goal*propose*use-restricted-data
   (state <s> ^name obtain-access
              ^args.restricted-data <data>)
-->
   # do something
"

In this example, we are assuming that the processing in the architectural subgoal will eventually trigger a change that causes the subgoal to retract. It is also likely that you will want to pass the goal structure down to the subgoal, as it may have data attached to it that you need (especially if this is a generic subgoal).
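One way these two points might look in practice is sketched below. It is a variant of the proposal above, not a definitive implementation. The attribute names ^owning-goal and ^restricted-data-processed and the finish operator are hypothetical. It assumes that the argument-copying rules expose every operator attribute under ^args in the substate, as they do for ^restricted-data.

```tcl
# Sketch only: ^owning-goal, ^restricted-data-processed, and the finish
# operator are hypothetical names.

# Top level: pass the goal down as an operator argument, and stop proposing
# once the goal has been marked as finished with the data.
sp "my-goal*propose*obtain-access
   [NGS_match-active-goal my-goal <goal> <s>]
   (state <s> ^restricted-data <data>)
  -(<goal> ^restricted-data-processed true)
-->
   [NGS_create-operator obtain-access <goal> <o>]
   (<o> ^restricted-data <data>
        ^owning-goal <goal>)
"

# Substate: propose finishing once the protected processing is complete.
sp "obtain-access*propose*finish
   (state <s> ^name obtain-access
              ^args.owning-goal <goal>)
   # conditions indicating that the protected processing is complete
-->
   (<s> ^operator <o> + =)
   (<o> ^name finish
        ^owning-goal <goal>)
"

# Substate: mark the goal. The mark is a result in the superstate, so the
# top-level proposal no longer matches and the subgoal goes away.
sp "obtain-access*apply*finish
   (state <s> ^name obtain-access
              ^operator <o>)
   (<o> ^name finish
        ^owning-goal <goal>)
-->
   (<goal> ^restricted-data-processed true)
"
```

If the same goal may need the restricted data again later, something would also have to remove the ^restricted-data-processed mark before the next access.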

The potential downside of this approach is that it locks out ALL other goals, not just the ones that might interfere with the processing. If the processing is short, this is probably not a big deal. But if it is long, the agent could fall behind on other important goals or become unresponsive (e.g., not replying to messages). On the flip side, since the agent is fully focused on this one goal for the duration of the impasse, this technique could also be used for performance-sensitive processing, where the agent needs to finish something as quickly as possible.