Bolt Download and Offline Behaviour
Pugpig Bolt aims to strike a balance between ensuring content is available whenever a user wants to read it, and respecting the limitations of data usage and storage space. This document details:
- What content a user should have available offline
- When this content is fetched
- How you can manually make your users’ apps fetch content
What content a user will have available
This differs according to whether the content is an edition or timeline and is detailed below. Any content added to the saved timeline is also proactively downloaded and preserved.
Editions
Editions are downloaded or updated asynchronously when they are opened or when the download icon on the storefront is tapped. They are stored indefinitely, as long as the edition remains available to the user and the user does not manually delete it.
We don’t impose a limit on how many editions can be stored, and the user is able to trigger multiple edition downloads at once. The user can select a time period after which editions are automatically deleted (auto-archiving), and can also delete editions manually.
These downloads include:
- Article text
- PDF pages
- Styles
- Scripts
Timelines
Timeline content is saved to the user’s device when they navigate to a timeline or pull to refresh on a timeline. Any articles the user has read are also saved. This content is cached up to 200 MB and remains available for some time, depending on usage. We can’t specify exactly how long because the cache uses the Least Recently Used (LRU) algorithm: an item stays until enough more recently used content has arrived to push it out. See this article from Medium for more info about the LRU algorithm. Additionally, mobile operating systems retain the ability to reduce or remove an app’s stored data when device storage runs low.
This cached data includes:
- Article text
- Styles
- Scripts
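To illustrate why retention time depends on usage, here is a minimal sketch of a size-bounded LRU cache. It is not Bolt's implementation: the class name, method names and the exact byte value of the limit are assumptions made for illustration, and it evicts immediately rather than when the app is foregrounded.

```python
from collections import OrderedDict

# Assumption: "200 MB" is taken as 200 * 1024 * 1024 bytes for this sketch.
CACHE_LIMIT_BYTES = 200 * 1024 * 1024


class LruCache:
    """Hypothetical size-bounded LRU cache that tracks item sizes only."""

    def __init__(self, limit_bytes=CACHE_LIMIT_BYTES):
        self.limit_bytes = limit_bytes
        self.used_bytes = 0
        self._items = OrderedDict()  # key -> size in bytes, least recently used first

    def touch(self, key, size_bytes):
        """Record a fetch or read of `key`, then evict old items if over the limit."""
        if key in self._items:
            self.used_bytes -= self._items.pop(key)
        # Re-inserting places the key at the "most recently used" end.
        self._items[key] = size_bytes
        self.used_bytes += size_bytes

        evicted = []
        # Always keep the newest item, even if it alone exceeds the limit.
        while self.used_bytes > self.limit_bytes and len(self._items) > 1:
            old_key, old_size = self._items.popitem(last=False)
            self.used_bytes -= old_size
            evicted.append(old_key)
        return evicted

    def keys(self):
        """Return cached keys, least recently used first."""
        return list(self._items)
```

In this sketch, an article that is re-read moves to the back of the eviction queue, so how long any one item survives depends entirely on what else is read or fetched after it.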
Offline article images
In discussion with you, we can alter the prefetch for both collection types to increase or decrease the amount that is fetched. For example, you may want the background downloads to include the article images, because you have a lot of image-centric content. This will substantially increase the data usage for both you and your users, but may offer an improved reading experience.
Dynamic Timelines
Dynamic timelines use the app search index to populate content, meaning that they'll behave a little differently offline compared to editions and regular timelines. The user needs to be online with the timeline open for around 10 seconds in order for the content to be available when offline. If the user is on a fresh install, they will need to open at least one article in order to fetch and cache the article CSS for future offline browsing.
Cached data includes:
- Article text
- Article images
- Styles
- Scripts
Background fetching of content
In addition to the user-initiated methods of downloading content mentioned above, Bolt uses background fetch in order to maximise the availability of fresh content. This means that the app periodically wakes up when backgrounded and fetches content from the server. What content we fetch is based on a heuristic which prioritises content from timelines the user frequently visits. This means that some timelines might not be refreshed in the background, particularly in the case of apps with a large number of timelines. This is what will be actively fetched:
- The first timeline in the first timelinegroup tab
- The saved timeline
- "Interacted" timelines, but only if you've interacted with them within the last 3 days. You're considered to have interacted with a timeline if:
  - You scroll a short distance down that timeline
  - You've read 3 articles from that timeline in the last 72 hours
We also default to only prefetching the first 50 articles of a timeline. This should be reduced in the case of apps with a large number of timelines.
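The selection rules above can be sketched as follows. This is an illustrative approximation only: the `Timeline` record, its field names and both function names are hypothetical, and the real heuristic may weigh timelines differently.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional

INTERACTION_WINDOW = timedelta(days=3)   # "within the last 3 days"
MAX_ARTICLES_PER_TIMELINE = 50           # default prefetch depth


@dataclass
class Timeline:
    # All field names are assumptions made for this sketch.
    id: str
    is_first_in_first_group: bool = False
    is_saved: bool = False
    last_interaction: Optional[datetime] = None
    article_ids: tuple = ()


def timelines_to_prefetch(timelines, now):
    """Return the timelines a background fetch would refresh, in input order."""
    selected = []
    for timeline in timelines:
        recently_used = (
            timeline.last_interaction is not None
            and now - timeline.last_interaction <= INTERACTION_WINDOW
        )
        if timeline.is_first_in_first_group or timeline.is_saved or recently_used:
            selected.append(timeline)
    return selected


def articles_to_prefetch(timeline, limit=MAX_ARTICLES_PER_TIMELINE):
    """Return only the first `limit` articles of a timeline."""
    return list(timeline.article_ids[:limit])
```

Lowering `limit` is the knob that corresponds to reducing the prefetch depth for apps with many timelines.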
The frequency and timing of this content refresh is dictated by the operating system, and is likely to happen more regularly the more a user engages with the app. The volume of content fetched is also controlled by the operating system, as the app is effectively given a certain amount of resources (time, storage and bandwidth) and we’ll do the best we can within them. Furthermore, the operating system chooses the order in which the queued files are downloaded, so articles are not necessarily fetched in the same order as they’re listed on the timeline.
Background push
It’s also possible to proactively trigger a background content fetch via a silent push notification. This has the same effect as described above, just at a time of your choosing. This is particularly useful in the case of apps with clear usage patterns (e.g. a morning edition) or shortly after any other notable content drop.
It is possible to trigger a specific edition download using background push. These silent notifications can include one or multiple collection IDs in the payload, and those editions will then be downloaded in full.
For a push notification to trigger background push, it must include the IDs of the collection(s) to download. See example format below:
--data '{
  "audience": "all",
  "notification": {
    "android": {
      "extra": {
        "download_ids": "one,two,three"
      }
    },
    "ios": {
      "extra": {
        "download_ids": "one,two,three"
      },
      "content_available": true
    }
  },
  "device_types": ["android", "ios"]
}'
Note that content_available (the content-available flag in Apple's push payload) is an iOS-specific attribute.
A content-available push without any collection IDs won't download any specific collection(s) (but will re-check the top-level feed).
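If you generate these payloads from a script, a small helper along these lines may be convenient. It is a sketch only: the function name is hypothetical, it simply mirrors the example format above, and it leaves out the no-ID case because that payload shape is not shown in this document. Sending the payload to your push provider is not covered here.

```python
import json


def build_silent_push(collection_ids, audience="all"):
    """Build a payload dict in the example format, for one or more collection IDs.

    Hypothetical helper: it only assembles the structure shown above.
    """
    ids = list(collection_ids)
    if not ids:
        raise ValueError("at least one collection ID is required")
    if any("," in cid for cid in ids):
        # IDs are comma-joined, so a comma inside an ID would be ambiguous.
        raise ValueError("collection IDs must not contain commas")

    download_ids = ",".join(ids)
    return {
        "audience": audience,
        "notification": {
            "android": {"extra": {"download_ids": download_ids}},
            "ios": {
                "extra": {"download_ids": download_ids},
                "content_available": True,  # iOS-specific attribute
            },
        },
        "device_types": ["android", "ios"],
    }


if __name__ == "__main__":
    # Print the JSON body for three example collection IDs.
    print(json.dumps(build_silent_push(["one", "two", "three"]), indent=2))
```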
We're planning to allow configuration of this from distribution (for Airship, Firebase and OneSignal).
We're also planning to expand our automated notification capabilities to optionally send such a notification whenever a new edition is published.
Deletion
Downloaded content is removed in a couple of ways, depending on the type of content. Timeline content is cached up to 200 MB. Once that limit is reached, content is marked for deletion when the app is foregrounded. The Least Recently Used (LRU) algorithm decides what is marked: it picks content that has not been interacted with for some time, rather than freshly downloaded content.
Downloaded editions are not subject to this cache limit and remain downloaded until they are manually marked for deletion or auto-archived. Once marked for deletion, they form part of the 200 MB cache and are removed accordingly. This does mean that in the uncommon case of the overall app cache being less than 200 MB, an edition marked for deletion may not be removed immediately. This will be reflected on the storefront.
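The interaction between kept editions and the cache limit can be sketched like this. It is an interpretation of the behaviour described above, not Bolt's code: the `StoredItem` record, its fields and the function name are hypothetical, and it assumes that kept editions do not count towards the limit.

```python
from dataclasses import dataclass

# Assumption: "200 MB" is taken as 200 * 1024 * 1024 bytes for this sketch.
CACHE_LIMIT_BYTES = 200 * 1024 * 1024


@dataclass
class StoredItem:
    # Hypothetical record; field names are assumptions for illustration.
    key: str
    size_bytes: int
    last_used: float            # timestamp of the last interaction
    kept_edition: bool = False  # downloaded edition not yet marked for deletion


def items_to_delete(items, limit_bytes=CACHE_LIMIT_BYTES):
    """Return keys to remove on foreground, least recently used first."""
    # Kept editions sit outside the cache; everything else competes for space.
    evictable = sorted(
        (item for item in items if not item.kept_edition),
        key=lambda item: item.last_used,
    )
    total = sum(item.size_bytes for item in evictable)
    doomed = []
    for item in evictable:
        if total <= limit_bytes:
            break
        doomed.append(item.key)
        total -= item.size_bytes
    return doomed
```

Under these assumptions, an edition marked for deletion is only removed once the evictable content exceeds the limit, which matches the case where it lingers while the cache is under 200 MB.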
Finally, both iOS and Android themselves retain the ability to delete any data should the device be close to running out of storage space. The mechanism by which this is decided is opaque, but generally apps and files that have not recently been interacted with are more likely to be deleted.