Filter Duplicate Boosts: Difference between revisions

<syntaxhighlight lang="ruby">
      limit *= 5
      candidate_statuses = candidate_statuses.limit(limit)
    end
    inner_query = Status
                  .where(id: candidate_statuses)
                  .select('DISTINCT ON (reblog_of_id) statuses.id')
                  .reorder(reblog_of_id: :desc, id: :desc)
    Status.where(statuses: { reblog_of_id: nil })
          .or(Status.where(id: inner_query))
  end
</syntaxhighlight>
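To make the intent of that query easier to see, here is a minimal in-memory sketch of the same idea in plain Ruby: keep every original status, and keep only the newest boost of each original among the candidates. <code>Row</code> and <code>dedupe_boosts</code> are hypothetical names used only for illustration; they are not part of Mastodon, and the real work is done in SQL by <code>DISTINCT ON</code>.

```ruby
# Hypothetical in-memory stand-in for a row of the statuses table.
Row = Struct.new(:id, :reblog_of_id)

# Keep all originals, plus only the newest boost of each boosted status
# among the candidate rows (roughly what DISTINCT ON (reblog_of_id)
# ordered by id descending selects).
def dedupe_boosts(candidates)
  originals = candidates.select { |row| row.reblog_of_id.nil? }
  newest_boosts = candidates
                  .reject { |row| row.reblog_of_id.nil? }
                  .group_by(&:reblog_of_id)
                  .map { |_original_id, boosts| boosts.max_by(&:id) }
  (originals + newest_boosts).sort_by { |row| -row.id }
end

page = [
  Row.new(105, 7),   # newer boost of status 7, kept
  Row.new(104, nil), # original post, kept
  Row.new(103, 7),   # older boost of status 7, dropped
  Row.new(102, 9)    # only boost of status 9, kept
]

p dedupe_boosts(page).map(&:id) # => [105, 104, 102]
```

Note that the sketch only ever sees the rows it is handed, which is the same limitation the candidate window imposes on the real query.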
But there is one more problem: since we only consider one page at a time, we will still get duplicate boosts across pages. So:
* When no minimum ID is provided, we use the "multiply the limit" strategy to avoid querying all statuses from all time
* When a minimum ID is provided, we
** Use no limit
** If a maximum ID is also provided, add one day's worth of time to that ID, so that we also don't query arbitrarily far into the future when fetching past pages
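The "add one day to the ID" step works because the IDs encode a timestamp. Below is a rough standalone sketch of that arithmetic, assuming the ID layout is a millisecond Unix timestamp shifted left by 16 bits with the low 16 bits used for a sequence or random value. The helper names (<code>snowflake_to_time</code>, <code>snowflake_at</code>, <code>pad_max_id</code>) are hypothetical and are not Mastodon's own API.

```ruby
# ASSUMPTION: id = (milliseconds since Unix epoch << 16) | low 16 bits.
SEQUENCE_BITS = 16
ONE_DAY_SECONDS = 24 * 60 * 60

# Recover the timestamp encoded in an ID (the low bits are discarded).
def snowflake_to_time(id)
  ms = id >> SEQUENCE_BITS
  Time.at(ms / 1000, ms % 1000, :millisecond).utc
end

# Smallest ID that could have been generated at the given time.
def snowflake_at(time)
  ms = (time.to_i * 1000) + (time.nsec / 1_000_000)
  ms << SEQUENCE_BITS
end

# Push an upper-bound ID forward by a fixed amount of time (one day by default).
def pad_max_id(max_id, seconds = ONE_DAY_SECONDS)
  snowflake_at(snowflake_to_time(max_id) + seconds)
end

max_id = snowflake_at(Time.utc(2024, 1, 1))
puts snowflake_to_time(pad_max_id(max_id)) # one day after 2024-01-01 00:00:00 UTC
```

With the padded ID as the upper bound, boosts made up to a day after the page boundary are still candidates, while the query stays bounded.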
<syntaxhighlight lang="ruby">
  def without_duplicate_reblogs(limit, max_id, since_id, min_id)
    candidate_statuses = Status.select(:id).reorder(id: :desc)
    if min_id.present?
      candidate_statuses = candidate_statuses.where(Status.arel_table[:id].gt(min_id))
    elsif since_id.present?
      candidate_statuses = candidate_statuses.where(Status.arel_table[:id].gt(since_id))
    elsif limit.present?
      limit *= 5
      candidate_statuses = candidate_statuses.limit(limit)
    end
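    if max_id.present?
      # Extend the upper bound to one day past max_id, so that we don't
      # query arbitrarily far into the future when fetching past pages.
      max_time = Mastodon::Snowflake.to_time(max_id)
      max_time += 1.day
      max_id = Mastodon::Snowflake.id_at(max_time)
      candidate_statuses = candidate_statuses.where(Status.arel_table[:id].lt(max_id))
    end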