Actor reentrancy in Swift explained – Donny Wals


If you start learning about actors in Swift, you'll find that explanations will always include something along the lines of "Actors protect shared mutable state by making sure the actor only does one thing at a time". As a single sentence summary of actors, this is great, but it misses an important nuance. While it's true that actors do only one thing at a time, they don't always execute function calls atomically.

In this post, we'll explore the following:

  • Exploring what actor reentrancy is
  • Understanding why async functions in actors can be problematic

Generally speaking, you'll use actors for objects that need to hold mutable state while also being safe to pass around in tasks. In other words, objects that hold mutable state, are passed by reference, and need to be Sendable are great candidates for being actors.

Implementing a simple actor

A very simple example of an actor is an object that caches data. Here's how that might look:

actor DataCache {
  var cache: [UUID: Data] = [:]
}

We can directly access the cache property on this actor without worrying about introducing data races. We know that the actor will make sure that we won't run into data races when we get and set values in our cache from multiple tasks in parallel.

If needed, we can make the cache private and write separate read and write methods for our cache:

actor DataCache {
  private var cache: [UUID: Data] = [:]

  func read(_ key: UUID) -> Data? {
    return cache[key]
  }

  func write(_ key: UUID, data: Data) {
    cache[key] = data
  }
}

Everything still works perfectly fine in the code above. We've managed to limit access to our caching dictionary, and users of this actor can interact with the cache through a dedicated read and write method.

Now let's make things a little more complicated.

Adding a remote cache feature to our actor

Let's imagine that our cached values can either exist in the cache dictionary or remotely on a server. If we can't find a specific key locally, our plan is to send a request to a server to see whether the server has data for the cache key that we're looking for. When we get data back we cache it locally, and if we don't we return nil from our read function.

Let's update the actor to have a read function that's async and attempts to read data from a server:

actor DataCache {
  private var cache: [UUID: Data] = [:]

  func read(_ key: UUID) async -> Data? {
    print("cache read called for \(key)")
    defer {
      print("cache read finished for \(key)")
    }

    if let data = cache[key] {
      return data
    }

    do {
      print("attempt to read remote cache for \(key)")
      let url = URL(string: "http://localhost:8080/\(key)")!
      let (data, response) = try await URLSession.shared.data(from: url)

      guard let httpResponse = response as? HTTPURLResponse,
              httpResponse.statusCode == 200 else {
        print("remote cache MISS for \(key)")
        return nil
      }

      cache[key] = data
      print("remote cache HIT for \(key)")
      return data
    } catch {
      print("remote cache MISS for \(key)")
      return nil
    }
  }

  func write(_ key: UUID, data: Data) {
    cache[key] = data
  }
}

Our function is a lot longer now, but it does exactly what we set out to do: check whether the data exists locally, attempt to read it from the server if needed, and cache the result.

If you run and test this code it will most likely work exactly like you intended, well done!

However, once you introduce concurrent calls to your read and write methods, you'll notice that results can get a little strange…
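To make this concrete, here's a minimal sketch of the kind of concurrent access that triggers the behavior below. The task group, the number of reads, and the hard-coded key are my own assumptions for illustration, not the exact test harness used for this post:

import Foundation

// A sketch of issuing several concurrent reads for the same key against
// the DataCache actor defined above. The UUID and the read count are
// assumptions chosen to mirror the console output that follows.
func performConcurrentReads() async {
  let cache = DataCache()
  let key = UUID(uuidString: "DDFA2377-C10F-4324-BBA3-68126B49EB00")!

  await withTaskGroup(of: Data?.self) { group in
    for _ in 0..<5 {
      group.addTask {
        // every child task calls into the actor at roughly the same time
        await cache.read(key)
      }
    }
  }
}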

For this post, I'm running a very simple webserver that I've pre-warmed with a couple of values. When I make a handful of concurrent requests to read a value that's cached remotely but not locally, here's what I see in the console:

cache read called for DDFA2377-C10F-4324-BBA3-68126B49EB00
attempt to read remote cache for DDFA2377-C10F-4324-BBA3-68126B49EB00
cache read called for DDFA2377-C10F-4324-BBA3-68126B49EB00
attempt to read remote cache for DDFA2377-C10F-4324-BBA3-68126B49EB00
cache read called for DDFA2377-C10F-4324-BBA3-68126B49EB00
attempt to read remote cache for DDFA2377-C10F-4324-BBA3-68126B49EB00
cache read called for DDFA2377-C10F-4324-BBA3-68126B49EB00
attempt to read remote cache for DDFA2377-C10F-4324-BBA3-68126B49EB00
cache read called for DDFA2377-C10F-4324-BBA3-68126B49EB00
attempt to read remote cache for DDFA2377-C10F-4324-BBA3-68126B49EB00
remote cache HIT for DDFA2377-C10F-4324-BBA3-68126B49EB00
cache read finished for DDFA2377-C10F-4324-BBA3-68126B49EB00
remote cache HIT for DDFA2377-C10F-4324-BBA3-68126B49EB00
cache read finished for DDFA2377-C10F-4324-BBA3-68126B49EB00
remote cache HIT for DDFA2377-C10F-4324-BBA3-68126B49EB00
cache read finished for DDFA2377-C10F-4324-BBA3-68126B49EB00
remote cache HIT for DDFA2377-C10F-4324-BBA3-68126B49EB00
cache read finished for DDFA2377-C10F-4324-BBA3-68126B49EB00
remote cache HIT for DDFA2377-C10F-4324-BBA3-68126B49EB00
cache read finished for DDFA2377-C10F-4324-BBA3-68126B49EB00

As you can see, executing multiple read operations results in lots of requests to the server, even when the data exists and you expected it to be cached after your first call.

Our code is written in a way that ensures we always write a new value to our local cache after we grab it from the remote, so we really shouldn't expect to be going to the server this often.

Furthermore, we've made our cache an actor, so why is it running multiple calls to our read function concurrently? Aren't actors supposed to only do one thing at a time?

The problem with awaiting inside of an actor

The code that we're using to grab information from a remote data source actually forces us into a situation where actor reentrancy bites us.

Actors only do one thing at a time, that's a fact, and we can trust that actors protect our mutable state by never allowing concurrent read and write access to the mutable state that they own.

That said, actors don't like to sit around and do nothing. When we call a synchronous function on an actor, that function will run from start to finish without interruptions; the actor only does one thing at a time.

However, when we introduce an async function that has a suspension point, the actor will not sit around and wait for the suspension point to resume. Instead, the actor will grab the next message from its "mailbox" and start making progress on that instead. When the thing we were awaiting returns, the actor will continue working on our original function.

Actors don't like to sit around and do nothing when they have messages in their mailbox. They'll pick up the next task to perform whenever an active task is suspended.

The fact that actors can do this is called actor reentrancy, and it can cause interesting bugs and challenges for us.
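To make the interleaving visible outside of our cache, here's a small self-contained sketch of my own (not from the original post) where every call to an actor prints before and after its suspension point:

// Worker is a made-up actor used only to demonstrate reentrancy.
actor Worker {
  func work(_ id: Int) async {
    print("task \(id) started")
    // Suspension point: while this call waits, the actor is free to
    // start handling the next call to work(_:).
    try? await Task.sleep(nanoseconds: 100_000_000)
    print("task \(id) resumed")
  }
}

func demo() async {
  let worker = Worker()
  await withTaskGroup(of: Void.self) { group in
    for id in 1...3 {
      group.addTask { await worker.work(id) }
    }
  }
}

Running demo() will typically print all three "started" lines before any "resumed" line. Each individual line is still printed by the actor one at a time, but whole calls to work(_:) are no longer executed back to back.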

Solving actor reentrancy can be a tricky problem. In our case, we can solve the reentrancy issue by creating and retaining tasks for each network call that we're about to make. That way, reentrant calls to read can see that we already have an in-progress task that we're awaiting, and those calls will await the same task's result. This ensures we only make a single network call. The code below shows the entire DataCache implementation. Notice how we've changed the cache dictionary so that it can hold either a fetch task or our Data object:

actor DataCache {
  enum LoadingTask {
    case inProgress(Task<Data?, Error>)
    case loaded(Data)
  }

  private var cache: [UUID: LoadingTask] = [:]
  private let remoteCache: RemoteCache

  init(remoteCache: RemoteCache) {
    self.remoteCache = remoteCache
  }

  func read(_ key: UUID) async -> Data? {
    print("cache read called for \(key)")
    defer {
      print("cache read finished for \(key)")
    }

    // we have the data, no need to go to the network
    if case let .loaded(data) = cache[key] {
      return data
    }

    // a previous call started loading the data
    if case let .inProgress(task) = cache[key] {
      return try? await task.value
    }

    // we don't have the data and we're not already loading it
    do {
      let task: Task<Data?, Error> = Task {
        guard let data = try await remoteCache.read(key) else {
          return nil
        }

        return data
      }

      cache[key] = .inProgress(task)
      if let data = try await task.value {
        cache[key] = .loaded(data)
        return data
      } else {
        cache[key] = nil
        return nil
      }
    } catch {
      return nil
    }
  }

  func write(_ key: UUID, data: Data) async {
    print("cache write called for \(key)")
    defer {
      print("cache write finished for \(key)")
    }

    do {
      try await remoteCache.write(key, data: data)
    } catch {
      // failed to store the data on the remote cache
    }
    cache[key] = .loaded(data)
  }
}

I explain this approach more deeply in my post on building a token refresh flow with actors as well as my post on building a custom async image loader, so I won't go into too much detail here.

When we run the same test that we ran before, the output looks like this:

cache read called for DDFA2377-C10F-4324-BBA3-68126B49EB00
cache read called for DDFA2377-C10F-4324-BBA3-68126B49EB00
cache read called for DDFA2377-C10F-4324-BBA3-68126B49EB00
cache read called for DDFA2377-C10F-4324-BBA3-68126B49EB00
cache read called for DDFA2377-C10F-4324-BBA3-68126B49EB00
attempt to read remote cache for DDFA2377-C10F-4324-BBA3-68126B49EB00
remote cache HIT for DDFA2377-C10F-4324-BBA3-68126B49EB00
cache read finished for DDFA2377-C10F-4324-BBA3-68126B49EB00
cache read finished for DDFA2377-C10F-4324-BBA3-68126B49EB00
cache read finished for DDFA2377-C10F-4324-BBA3-68126B49EB00
cache read finished for DDFA2377-C10F-4324-BBA3-68126B49EB00
cache read finished for DDFA2377-C10F-4324-BBA3-68126B49EB00

We start multiple cache reads; this is actor reentrancy in action. But because we've retained the loading task so it can be reused, we only make a single network call. Once that call completes, all of our reentrant cache reads will receive the same output from the task that we created in the first call.

The point is that we can rely on the actor doing one thing at a time to update some mutable state before we hit our await. This state will then tell reentrant calls that we're already working on a given task and that we don't need to make another (in this case) network call.

Things become trickier when you try to turn your actor into a serial queue that runs async tasks. In a future post I'd like to dig into why that's so tricky and explore possible solutions.

In Summary

Actor reentrancy is a feature of actors that can lead to subtle bugs and unexpected results. Due to actor reentrancy we need to be very careful when we're adding async methods to an actor, and we need to make sure that we think about what can and should happen when we have multiple, reentrant, calls to a specific function on an actor.

Sometimes this is completely fine, other times it's wasteful but won't cause problems. Other times, you'll run into problems that arise because certain state on your actor was changed while your function was suspended. Every time you await something inside of an actor, it's important that you ask yourself whether you've made any state-related assumptions before your await that you need to re-verify after your await.
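As a rough sketch of what that re-verification can look like (this example is mine, not from the post), the same guard runs on both sides of the suspension point:

// A made-up actor illustrating the "check, await, re-check" pattern.
actor TicketCounter {
  private var ticketsLeft = 1

  func claimTicket() async -> Bool {
    // assumption made before the await
    guard ticketsLeft > 0 else { return false }

    // stand-in for real async work; reentrant calls can run while we wait
    try? await Task.sleep(nanoseconds: 10_000_000)

    // re-verify the assumption after the await; another call may have
    // claimed the last ticket while this one was suspended
    guard ticketsLeft > 0 else { return false }

    ticketsLeft -= 1
    return true
  }
}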

Step one to avoiding reentrancy-related issues is to know what reentrancy is, and to have a sense of how you can solve problems when they arise. Unfortunately, there's no single solution that fixes every reentrancy-related issue. In this post you saw that holding on to a task that encapsulates work can prevent multiple network calls from being made.

Have you ever run into a reentrancy-related problem yourself? And if so, did you manage to solve it? I'd love to hear from you on Twitter or Mastodon!


