1,035
edits
No edit summary |
|||
(4 intermediate revisions by the same user not shown) | |||
Line 5: | Line 5: | ||
|Contributors=Jonny | |Contributors=Jonny | ||
|Has Git Repository=https://github.com/NeuromatchAcademy/mastodon | |Has Git Repository=https://github.com/NeuromatchAcademy/mastodon | ||
|Completion Status= | |Completion Status=Completed | ||
|Active Status=Inactive | |Active Status=Inactive | ||
}} | }} | ||
* Pull Request: [[Has Pull Request::https://github.com/NeuromatchAcademy/mastodon/pull/36]] | * Pull Request: [[Has Pull Request::https://github.com/NeuromatchAcademy/mastodon/pull/36]] | ||
* Bugfix - respect local vs remote scope: [[Has Pull Request::https://github.com/NeuromatchAcademy/mastodon/pull/38]] | |||
== Problem == | == Problem == | ||
Line 86: | Line 87: | ||
== Implementation == | == Implementation == | ||
=== Simpler Implementation (good) === | |||
Instead of doing all that garbage below, we can just restate the problem as "we want posts that aren't boosts, or posts that are boosts with the maximum ID for all boosts of a given post" | |||
Like this: | |||
<syntaxhighlight lang="ruby"> | |||
def max_boost_id_scope | |||
Status.where(<<~SQL.squish) | |||
"statuses"."id" = ( | |||
SELECT MAX(id) | |||
FROM "statuses" "s2" | |||
WHERE "s2"."reblog_of_id" = "statuses"."reblog_of_id" | |||
#{'AND ("s2"."local" = true OR "s2"."uri" IS NULL)' if local_only?} | |||
#{'AND "s2"."local" = false AND "s2"."uri" IS NOT NULL' if remote_only?} | |||
) | |||
SQL | |||
end | |||
def without_duplicate_reblogs | |||
Status.where(statuses: { reblog_of_id: nil }) | |||
.or(max_boost_id_scope) | |||
end | |||
</syntaxhighlight> | |||
==== Caveats ==== | |||
Note that we are using string interpolations of the <code>Status.local</code> and <code>Status.remote</code> scopes because there isn't a way, as far as [[Jonny]] can tell, to change the table alias of a scope. This shouldn't be that big of an issue (if, eg. the definition of those scopes changes) because the tests should catch most cases. | |||
==== Example Query ==== | |||
<syntaxhighlight lang="sql"> | |||
SELECT "statuses"."id", | |||
"statuses"."updated_at" | |||
FROM "statuses" | |||
INNER JOIN "accounts" ON "accounts"."id" = "statuses"."account_id" | |||
WHERE "statuses"."visibility" = $1 | |||
AND "accounts"."suspended_at" IS NULL | |||
AND "accounts"."silenced_at" IS NULL | |||
AND (statuses.reply = FALSE | |||
OR statuses.in_reply_to_account_id = statuses.account_id) | |||
AND ("statuses"."reblog_of_id" IS NULL | |||
OR "statuses"."id" = | |||
(SELECT MAX(id) | |||
FROM statuses s2 | |||
WHERE s2.reblog_of_id = statuses.reblog_of_id | |||
AND ("s2"."local" = true OR "s2"."uri" IS NULL) | |||
)) | |||
AND ("statuses"."local" = $2 | |||
OR "statuses"."uri" IS NULL) | |||
AND "statuses"."deleted_at" IS NULL | |||
AND 1=1 | |||
AND "statuses"."id" < 111813463418866657 | |||
ORDER BY "statuses"."id" DESC LIMIT $3 [["visibility", 0], ["local", true], ["LIMIT", 20]] | |||
</syntaxhighlight> | |||
=== Original Implementation (bad) === | |||
Actually pretty damn simple. Add an additional scope in <code>public_feed.rb</code> | Actually pretty damn simple. Add an additional scope in <code>public_feed.rb</code> | ||
Line 156: | Line 215: | ||
.or(Status.where(id: inner_query)) | .or(Status.where(id: inner_query)) | ||
end | end | ||
</syntaxhighlight> | |||
But one MORE problem - since we're just considering one page at a time, we will still get duplicated boosts across pages. So... | |||
* When no minimum ID is provided, we use the "multiply the limit" strategy to avoid querying all statuses from all time | |||
* When a minimum ID is provided, we | |||
** Use no limit | |||
** If a maximum ID is also provided, we add 1 day worth of time to the ID so we also don't query arbitrarily into the future when fetching past pages | |||
<syntaxhighlight lang="ruby"> | |||
def without_duplicate_reblogs(limit, max_id, since_id, min_id) | |||
candidate_statuses = Status.select(:id).reorder(id: :desc) | |||
if min_id.present? | |||
candidate_statuses = candidate_statuses.where(Status.arel_table[:id].gt(min_id)) | |||
elsif since_id.present? | |||
candidate_statuses = candidate_statuses.where(Status.arel_table[:id].gt(since_id)) | |||
elsif limit.present? | |||
limit *= 5 | |||
candidate_statuses = candidate_statuses.limit(limit) | |||
end | |||
if max_id.present? | |||
max_time = Mastodon::Snowflake.to_time(id) | |||
max_time += 1.day | |||
max_id = Mastodon::Snowflake.id_at(max_time) | |||
candidate_statuses = candidate_statuses.where(Status.arel_table[:id].lt(max_id)) | |||
end | |||
inner_query = Status | |||
.where(id: candidate_statuses) | |||
.select('DISTINCT ON (reblog_of_id) statuses.id') | |||
.reorder(reblog_of_id: :desc, id: :desc) | |||
Status.where(statuses: { reblog_of_id: nil }) | |||
.or(Status.where(id: inner_query)) | |||
end | |||
</syntaxhighlight> | |||
==== Example Query ==== | |||
<syntaxhighlight lang="sql"> | |||
SELECT "statuses"."id", "statuses"."updated_at" | |||
FROM "statuses" | |||
INNER JOIN "accounts" ON "accounts"."id" = "statuses"."account_id" | |||
WHERE "statuses"."visibility" = $1 | |||
AND "accounts"."suspended_at" IS NULL | |||
AND "accounts"."silenced_at" IS NULL | |||
AND ( | |||
statuses.reply = FALSE | |||
OR statuses.in_reply_to_account_id = statuses.account_id | |||
) | |||
AND ( | |||
"statuses"."reblog_of_id" IS NULL | |||
OR "statuses"."id" IN ( | |||
SELECT DISTINCT ON (reblog_of_id) statuses.id | |||
FROM "statuses" | |||
WHERE "statuses"."deleted_at" IS NULL | |||
AND "statuses"."id" IN ( | |||
SELECT "statuses"."id" FROM "statuses" | |||
WHERE "statuses"."deleted_at" IS NULL | |||
AND "statuses"."id" < 111819125737828654 | |||
ORDER BY "statuses"."id" DESC | |||
LIMIT $2 | |||
) | |||
ORDER BY "statuses"."reblog_of_id" DESC, "statuses"."id" DESC | |||
) | |||
) | |||
AND ( | |||
"statuses"."local" = $3 | |||
OR "statuses"."uri" IS NULL | |||
) | |||
AND "statuses"."deleted_at" IS NULL | |||
AND 1=1 | |||
AND "statuses"."id" < 111813463418866657 | |||
ORDER BY "statuses"."id" DESC LIMIT $4 | |||
[["visibility", 0], ["LIMIT", 100], ["local", true], ["LIMIT", 20]] | |||
</syntaxhighlight> | </syntaxhighlight> | ||