Search
Search | |
---|---|
Description | Full-text search on neuromatch.social |
Part Of | Mastodon/Hacking |
Contributors | Jonny Saunders |
Completion Status | Stub |
Active Status | Active |
Approval Status | Draft |
(You may also be looking for ElasticSearch, where config and maintenance info is kept. This page is about the search features in masto generally.)
Mastodon ElasticSearch
The basic masto full-text search uses ElasticSearch via Chewy.
Indexing
Summary
The Chewy indexes are defined in the app/chewy directory - https://github.com/NeuromatchAcademy/mastodon/tree/main/app/chewy
The actual work of creating the index seems to happen in app/lib/importer - https://github.com/NeuromatchAcademy/mastodon/tree/main/app/lib/importer and app/workers/scheduler - https://github.com/NeuromatchAcademy/mastodon/blob/main/app/workers/scheduler/indexing_scheduler.rb
The Importers define how the individual objects to index are loaded and added to the search index, and the Scheduler runs the importers periodically. The Indexes themselves describe how Chewy handles each object.
When a relevant object is updated, it is added to the indexing queue using an update_index method (eg. Status:update_index). The Scheduler then feeds each queued object to the importer.
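The flow described above (update hook, queue, scheduler, importer) can be sketched in plain Ruby. This is a minimal illustration only, not the Mastodon or Chewy code: IndexingQueue, SketchImporter, and SketchScheduler are hypothetical names, and plain hashes stand in for the database and the ElasticSearch index.

```ruby
require 'set'

# Hypothetical stand-in for the indexing queue that an update_index hook feeds.
class IndexingQueue
  def initialize
    @pending = Hash.new { |hash, key| hash[key] = Set.new }
  end

  # Record that an object of the given type needs (re)indexing.
  def push(type, id)
    @pending[type] << id
  end

  # Remove and return all pending ids for a type.
  def drain(type)
    ids = @pending[type].to_a
    @pending[type].clear
    ids
  end
end

# Hypothetical importer: loads records by id and writes them to an index.
class SketchImporter
  attr_reader :index

  def initialize(records)
    @records = records # id => record hash, standing in for the database
    @index = {}        # stands in for the search index
  end

  def import(ids)
    ids.each do |id|
      record = @records[id]
      @index[id] = record[:text] if record
    end
  end
end

# Hypothetical scheduler: on each run, hands queued ids to the matching importer.
class SketchScheduler
  def initialize(queue, importers)
    @queue = queue
    @importers = importers
  end

  def perform
    @importers.each do |type, importer|
      importer.import(@queue.drain(type))
    end
  end
end

records = { 1 => { text: 'hello world' }, 2 => { text: 'second post' } }
queue = IndexingQueue.new
importer = SketchImporter.new(records)
scheduler = SketchScheduler.new(queue, { statuses: importer })

queue.push(:statuses, 1) # what an update hook would do when status 1 changes
queue.push(:statuses, 1) # repeated updates collapse into a single queue entry
scheduler.perform
```

Only status 1 ends up in the sketch index, because status 2 was never queued. The point of the queue is that saving an object stays cheap and the indexing work is batched into the periodic scheduler run.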
Filters
Each importer has a set of rules that determine whether something should be added to the index. When indexing, the importers also check whether an object is searchable_by any accounts, as described in the Querying section.
Note that because of these filters, far fewer posts are indexed than the server contains. Also important: by default, indexing does not respect noindex or nobot tags in profiles.
In summary, the following are indexed:
- Statuses that Mention a local account
- Statuses that have been Favorited by a local account
- Statuses that include a Poll that has been voted on by a local account
- Statuses that were Boosted by a local account
- Statuses that were Created by a local account
- Accounts that are searchable, ie. that are not unapproved, suspended, or moved
- All hashtags! (except those that are in unlisted posts? -- Manisha)
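The status and account rules in this list can be read as two simple predicates. The sketch below assumes a status is indexed when any one of the status rules holds; SketchStatus, SketchAccount, and their fields are hypothetical names for illustration, not the real models or importer scopes. The hashtag rule is left out because its exception is still an open question.

```ruby
# Hypothetical, simplified status record; each flag mirrors one rule in the list.
SketchStatus = Struct.new(
  :author_local, :mentions_local, :favourited_by_local,
  :boosted_by_local, :poll_voted_by_local,
  keyword_init: true
)

# A status is indexed if at least one local account is connected to it.
def indexable_status?(status)
  [
    status.author_local,
    status.mentions_local,
    status.favourited_by_local,
    status.boosted_by_local,
    status.poll_voted_by_local
  ].any?
end

# Hypothetical, simplified account record.
SketchAccount = Struct.new(:approved, :suspended, :moved, keyword_init: true)

# An account is searchable if it is approved, not suspended, and not moved.
def searchable_account?(account)
  account.approved && !account.suspended && !account.moved
end

own_post = SketchStatus.new(author_local: true)
favourited_remote = SketchStatus.new(favourited_by_local: true)
unrelated_remote = SketchStatus.new # no local account has touched it

active_account = SketchAccount.new(approved: true, suspended: false, moved: false)
moved_account = SketchAccount.new(approved: true, suspended: false, moved: true)
```

Here own_post and favourited_remote pass the status check, while unrelated_remote does not, which is why most remote posts the server has seen never reach the index.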
Querying
The search service is located in app/services/search_service.rb. It queries indexed objects and, for full-text search, also applies the searchable_by filter defined in the Status model.
The searchable_by method returns a list of (local) account IDs that can be shown a given status in search results. Those accounts include:
- The local account that Created the status.
- Local accounts that are Mentioned in the status
- Local accounts that have Favorited the status
- Local accounts that have Boosted the status
- Local accounts that have Bookmarked the status
- Local accounts that have Voted on a Poll in the status.
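A rough sketch of how such a list could be assembled and then used at query time follows. It is not the Status model's actual method: SketchStatus, its fields, and the visible_hits helper are hypothetical, and a Set of ids stands in for the lookup of which accounts are local.

```ruby
require 'set'

# Hypothetical status record; the field names are illustrative only.
SketchStatus = Struct.new(
  :author_id, :mentioned_ids, :favourited_by_ids,
  :boosted_by_ids, :bookmarked_by_ids, :poll_voter_ids,
  keyword_init: true
)

# Collect every account id with one of the listed relationships to the
# status, then keep only the ids that belong to local accounts.
def searchable_by(status, local_account_ids)
  candidates = [status.author_id] +
               status.mentioned_ids +
               status.favourited_by_ids +
               status.boosted_by_ids +
               status.bookmarked_by_ids +
               status.poll_voter_ids
  candidates.uniq.select { |id| local_account_ids.include?(id) }
end

# Query-side filter: a hit is only returned to accounts in its searchable_by list.
def visible_hits(hits, viewer_id, local_account_ids)
  hits.select { |status| searchable_by(status, local_account_ids).include?(viewer_id) }
end

local_ids = Set[1, 2, 3]
status = SketchStatus.new(
  author_id: 10,              # remote author, so not part of the result
  mentioned_ids: [1],
  favourited_by_ids: [2, 11], # 11 is remote and gets dropped
  boosted_by_ids: [],
  bookmarked_by_ids: [2],
  poll_voter_ids: []
)

allowed = searchable_by(status, local_ids)
```

In this example account 3 is local but has never interacted with the status, so a search by account 3 would not return it even though it is in the index.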
After searchable_by is calculated for a given status, the status can still be excluded from search results if it fails the StatusFilter, which hides it from:
- Accounts that are blocked, muted, or on a domain blocked by the post creator
- Accounts that have been silenced by the instance and are not following the creator of the status.
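Read literally, these two rules amount to a small predicate over the relationship between the searching account and the status creator. The sketch below follows the list as written and is not the StatusFilter class itself; SketchRelation and its flags are hypothetical names.

```ruby
# Hypothetical flags describing how a searching account relates to the
# creator of a status; the names are illustrative only.
SketchRelation = Struct.new(
  :blocked, :muted, :domain_blocked, :silenced, :following_creator,
  keyword_init: true
)

# Returns true when the status should be dropped from this account's results.
def filtered_out?(relation)
  return true if relation.blocked || relation.muted || relation.domain_blocked
  return true if relation.silenced && !relation.following_creator

  false
end

stranger = SketchRelation.new
muted = SketchRelation.new(muted: true)
silenced_follower = SketchRelation.new(silenced: true, following_creator: true)
silenced_stranger = SketchRelation.new(silenced: true, following_creator: false)
```

With these examples, muted and silenced_stranger are filtered out, while stranger and silenced_follower still see the status.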
StatusFilter also removes statuses that fail the StatusPolicy:show check, which excludes:
- Remote accounts if the post is local only
- Accounts that are suspended
- Accounts that are not mentioned if the visibility of the post is "direct" or "limited"
- Accounts that are not mentioned and do not follow the posting account if the visibility is set to "private" (followers only)
- Accounts that are blocked, or on a domain that is blocked by the creator of the status.
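The visibility rules above can be sketched as a single check. This is an illustration of the list as written, not the StatusPolicy source: SketchViewer, SketchPost, and their fields are hypothetical, and the sketch assumes each rule describes the account doing the search.

```ruby
# Hypothetical viewer and post records; all field names are illustrative.
SketchViewer = Struct.new(:id, :local, :suspended, :domain, keyword_init: true)
SketchPost = Struct.new(
  :visibility, :local_only, :mentioned_ids, :follower_ids,
  :blocked_ids, :blocked_domains,
  keyword_init: true
)

# Returns true when the viewer passes every rule in the list.
def can_show?(post, viewer)
  return false if post.local_only && !viewer.local
  return false if viewer.suspended
  return false if post.blocked_ids.include?(viewer.id)
  return false if post.blocked_domains.include?(viewer.domain)

  mentioned = post.mentioned_ids.include?(viewer.id)
  case post.visibility
  when 'direct', 'limited'
    mentioned # only mentioned accounts may see the post
  when 'private'
    mentioned || post.follower_ids.include?(viewer.id) # followers-only
  else
    true # public and unlisted posts need no further relationship
  end
end

followers_only = SketchPost.new(
  visibility: 'private', local_only: false,
  mentioned_ids: [1], follower_ids: [2],
  blocked_ids: [4], blocked_domains: ['blocked.example']
)

mentioned_viewer = SketchViewer.new(id: 1, local: true, suspended: false)
follower_viewer = SketchViewer.new(id: 2, local: true, suspended: false)
stranger_viewer = SketchViewer.new(id: 3, local: true, suspended: false)
blocked_viewer = SketchViewer.new(id: 4, local: true, suspended: false)
```

For the followers-only post, mentioned_viewer and follower_viewer pass the check, while stranger_viewer fails on visibility and blocked_viewer fails on the block rule before visibility is even considered.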
Other Implementations
- VyrCossont has a parametric search that builds on top of the base masto search: https://github.com/VyrCossont/mastodon/pull/9
See Also
References
- ElasticSearch docs https://docs.joinmastodon.org/admin/optional/elasticsearch/
- Chewy - https://github.com/toptal/chewy
Prior Conversations
- Search thread on Tech WG hacks channel - https://discord.com/channels/1049136631065628772/1094738707086581790/1094738707086581790