-
Notifications
You must be signed in to change notification settings - Fork 37
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Different feed update frequencies #332
Comments
Logic is easy enough (turning update_after etc. to/from seconds not shown): import random
from dataclasses import dataclass
@dataclass
class Feed:
url: str
update_period: int
update_after: int = 0
# last_updated is only set when the feed is actually updated
# (not when it's not modified, not when there was an exception)
# https://github.com/lemon24/reader/blob/3.12/src/reader/_update.py#L276
# https://github.com/lemon24/reader/blob/3.12/src/reader/_update.py#L445
last_retrieved: int = 0
def get_feeds_for_update(feeds, now):
return [f for f in feeds if f.update_after <= now]
def next_period(feed, now, jitter_ratio=0):
jitter = random.random() * jitter_ratio
current_period_no = now // feed.update_period
return (current_period_no + 1 + jitter) * feed.update_period
def update_feeds(feeds, now, get_update_after=next_period):
to_update = get_feeds_for_update(feeds, now)
for feed in to_update:
feed.last_retrieved = now
feed.update_after = get_update_after(feed, now)
return to_update
def set_update_period(feed, update_period):
feed.update_period = update_period
feed.update_after = next_period(feed, feed.last_retrieved) Tests:from collections import Counter
from functools import partial
import pytest
@pytest.mark.parametrize('old_after, new_after, now', [
(0, 10, 0),
(0, 10, 1),
(0, 10, 9.999),
(0, 20, 10),
(5, 10, 5),
(10, 20, 10),
(105, 110, 109),
(105, 120, 110),
(105, 200, 199.999),
(105, 210, 200),
])
def test_update(old_after, new_after, now):
feeds = [Feed('one', 10, old_after)]
assert len(update_feeds(feeds, now)) == 1
assert feeds == [Feed('one', 10, new_after, now)]
@pytest.mark.parametrize('old_after, now', [
(5, 4),
(10, 9.999),
(20, 19),
])
def test_no_update(old_after, now):
feeds = [Feed('one', 10, old_after)]
assert len(update_feeds(feeds, now)) == 0
assert feeds == [Feed('one', 10, old_after)]
@pytest.mark.parametrize('get_update_after', [
next_period,
# jitter ratio less than 10-1, to account for time step
partial(next_period, jitter_ratio=.9),
])
def test_sweep(get_update_after):
feeds = [Feed('one', 10), Feed('two', 20), Feed('three', 100)]
counts = Counter()
for now in range(100):
for feed in update_feeds(feeds, now, get_update_after):
counts[feed.url] += 1
assert counts == {'one': 10, 'two': 5, 'three': 1}
def test_set_period_up():
feeds = [Feed('one', 10)]
update_feeds(feeds, 5)
set_update_period(feeds[0], 20)
# no update needed, already updated in this period
assert len(update_feeds(feeds, 15)) == 0
def test_set_period_down():
feeds = [Feed('one', 20)]
update_feeds(feeds, 5)
set_update_period(feeds[0], 10)
# update needed, if taking new period into account
assert len(update_feeds(feeds, 15)) == 1
On to the API! Update: We can get rid of last_retrieve and rely on the current time in set_update_period(); all tests pass with minimal changes. |
API Add
In reader 4.0 (#291), we can make To configure the update interval, we can use a
Using tags is good because it allows configuring stuff without additional UI. Possible cons (WIP):
In the (low level) storage API:
|
To do (minimal):
Later:
|
update_after and last_retrieved should go on FeedUpdateIntent, and in turn FeedUpdateIntent should have union of FeedData-with-extra-stuff or exception or None, but I'm having trouble finding a good name for FeedData-with-extra-stuff. For reference, here's all the feed-related data classes and how they're used:
|
Also, use last_retrieved instead of last_updated for the "new" feed filter.
(Pipeline will need the name scheme and get_tag() too.)
Also, use last_retrieved instead of last_updated for the "new" feed filter.
(Pipeline will need the name scheme and get_tag() too.)
#307 needs a way of changing the feed update frequency. Just like #96 (comment), this should be a configurable global/per-feed strategy.
Use cases:
update_feeds(new=True)
update_feeds(updates_enabled=True)
A unified way of updating feeds would be nice as well, e.g. "just call this every minute". Currently, I'm running the following in a cron:
update
update --new-only; search update
The text was updated successfully, but these errors were encountered: