Full-text search on neuromatch.social
(You may also be looking for ElasticSearch, where config and maintenance info is kept. This page is about the search features in masto generally.)
The Chewy indexes are defined in the
app/chewy directory - https://github.com/NeuromatchAcademy/mastodon/tree/main/app/chewy
The actual work of creating the index seems to happen in
app/lib/importer - https://github.com/NeuromatchAcademy/mastodon/tree/main/app/lib/importer and the Scheduler.
The Importers define how the individual objects to index are loaded and added to the search index, and the Scheduler runs the importers periodically. The Indexes themselves describe how Chewy handles each object.
When a relevant object is updated, it is added to the indexing queue using an
update_index method (e.g. Status:update_index). The Scheduler then feeds each queued object to the importer.
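The queue-then-import flow described above can be sketched roughly as follows. This is a toy model in Python, not the Mastodon code: `IndexQueue`, `run_scheduler`, and the lambda importers are hypothetical names made up for illustration, and the real queue storage and scheduling interval are not modelled.

```python
from collections import defaultdict


class IndexQueue:
    """Hypothetical stand-in for the indexing queue: pending IDs per object type."""

    def __init__(self):
        self.pending = defaultdict(set)

    def update_index(self, object_type, object_id):
        # Called whenever an object changes. Using a set means that
        # repeated updates to the same object collapse into one entry.
        self.pending[object_type].add(object_id)

    def drain(self, object_type):
        # Remove and return everything queued for this type.
        return sorted(self.pending.pop(object_type, set()))


def run_scheduler(queue, importers):
    """Feed every queued ID to the importer registered for its type."""
    imported = {}
    for object_type, importer in importers.items():
        imported[object_type] = [importer(i) for i in queue.drain(object_type)]
    return imported


queue = IndexQueue()
queue.update_index("Status", 42)
queue.update_index("Status", 42)  # duplicate update, queued only once
queue.update_index("Account", 7)

result = run_scheduler(
    queue,
    {
        "Status": lambda i: f"status:{i}",    # placeholder importers
        "Account": lambda i: f"account:{i}",
    },
)
```

The point of the sketch is only that updates are cheap (they just record an ID) and the expensive indexing work happens later, in batches, when the scheduler runs.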
Each importer has a set of rules that determine if something should be added to the index. When indexing, the importers also check to see if an object is
searchable_by any accounts - described in the next section.
Note that because of these filters, far fewer posts are indexed than the server contains. Also important: by default, indexing does not respect
nobot tags in profiles.
Those rules, summarized:
- Statuses that Mention a local account
- Statuses that have been Favorited by a local account
- Statuses that include a Poll that has been voted on by a local account
- Statuses that were Boosted by a local account.
- Statuses that were Created by a local account.
- Accounts that are searchable - i.e. that are not unapproved, suspended, or moved.
- All hashtags! (except those that are in unlisted posts? -- Manisha)
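As a rough illustration of the rules summarized above, here is a small Python model of the "should this be indexed?" decision. It is a sketch under assumptions: the `Status` and `Account` classes and their boolean fields are invented for this example and do not correspond to the actual importer code, which works on database scopes rather than flags.

```python
from dataclasses import dataclass


@dataclass
class Status:
    # Hypothetical flags, one per rule in the list above.
    created_by_local: bool = False
    mentions_local: bool = False
    favourited_by_local: bool = False
    boosted_by_local: bool = False
    poll_voted_by_local: bool = False


@dataclass
class Account:
    approved: bool = True
    suspended: bool = False
    moved: bool = False


def should_index_status(status):
    # A status is indexed if at least one local account has interacted with it.
    return any(
        [
            status.created_by_local,
            status.mentions_local,
            status.favourited_by_local,
            status.boosted_by_local,
            status.poll_voted_by_local,
        ]
    )


def should_index_account(account):
    # "Searchable" accounts: not unapproved, not suspended, not moved.
    return account.approved and not account.suspended and not account.moved


remote_untouched = Status()                      # no local interaction at all
remote_favourited = Status(favourited_by_local=True)
```

In this model `remote_untouched` is skipped and `remote_favourited` is indexed, which is why most remote posts the server has seen never reach the search index.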
The search service is located in
It queries indexed objects and, for full-text search, also applies the
searchable_by filter in the
The searchable_by method returns a list of (local) account IDs that are able to find a given status in a search. Those accounts include:
- The local account that Created the status.
- Local accounts that are Mentioned in the status
- Local accounts that have Favorited the status
- Local accounts that have Boosted the status
- Local accounts that have Bookmarked the status
- Local accounts that have Voted on a Poll in the status.
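The list above amounts to a set union followed by a "local accounts only" filter. A minimal Python sketch, assuming a status is represented as a plain dict with hypothetical keys (`author_id`, `mentioned`, `favourited_by`, and so on) that are not the real model attributes:

```python
def searchable_by(status, local_account_ids):
    """Return the sorted local account IDs that may find this status in search."""
    # Start with the author, then add everyone who interacted with the status.
    interacting = {status["author_id"]}
    for key in ("mentioned", "favourited_by", "boosted_by", "bookmarked_by", "poll_voters"):
        interacting.update(status.get(key, ()))
    # Only local accounts count; remote IDs are dropped here.
    return sorted(interacting & set(local_account_ids))


local_ids = {1, 2, 3, 4}
status = {
    "author_id": 1,
    "mentioned": [2, 900],     # 900 is a remote account in this example
    "favourited_by": [3],
    "bookmarked_by": [2],
    "poll_voters": [],
}
ids = searchable_by(status, local_ids)
```

Here `ids` contains accounts 1, 2, and 3. Account 4 is local but never interacted with the status, so a full-text search by account 4 would not return it.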
searchable_by is calculated per status. A status can also be excluded from search results if it fails the
StatusFilter, which removes
- Accounts that are blocked, muted, or on a domain blocked by the post creator
- Accounts that have been silenced by the instance and are not following the creator of the status.
StatusFilter also removes statuses that fail the
StatusPolicy:show check, which rejects
- Remote accounts if the post is local only
- Accounts that are suspended
- Accounts that are not mentioned if the visibility of the post is "direct" or "limited"
- Accounts that are not mentioned and do not follow the posting account if the visibility is set to "private" (followers only)
- Accounts that are blocked, or on a domain that is blocked by the creator of the status.
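The StatusPolicy:show rules listed above can be read as a chain of early exits. The following Python sketch models only that list. The `Viewer` and `Post` classes and the `can_show` function are hypothetical, and letting the author always see their own post is an assumption added here that the list does not state.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Viewer:
    id: int
    domain: Optional[str] = None  # None means a local account
    suspended: bool = False


@dataclass
class Post:
    author_id: int
    visibility: str = "public"
    local_only: bool = False
    mentioned_ids: set = field(default_factory=set)
    follower_ids: set = field(default_factory=set)     # followers of the author
    blocked_ids: set = field(default_factory=set)      # accounts the author blocks
    blocked_domains: set = field(default_factory=set)  # domains the author blocks


def can_show(viewer, post):
    if post.local_only and viewer.domain is not None:
        return False  # remote account, local-only post
    if viewer.suspended:
        return False
    if viewer.id in post.blocked_ids or viewer.domain in post.blocked_domains:
        return False
    if viewer.id == post.author_id:
        return True  # assumption: authors can always see their own posts
    mentioned = viewer.id in post.mentioned_ids
    if post.visibility in ("direct", "limited"):
        return mentioned
    if post.visibility == "private":
        return mentioned or viewer.id in post.follower_ids
    return True


follower = Viewer(id=2)
stranger = Viewer(id=3)
followers_only = Post(author_id=1, visibility="private", follower_ids={2})
```

With these values, `can_show(follower, followers_only)` passes and `can_show(stranger, followers_only)` does not, so the stranger would not get this post in search results even if it were indexed.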
- VyrCossont has a parametric search that builds on top of the base masto search: https://github.com/VyrCossont/mastodon/pull/9
- ElasticSearch docs https://docs.joinmastodon.org/admin/optional/elasticsearch/
- Chewy - https://github.com/toptal/chewy
- Search thread on Tech WG hacks channel - https://discord.com/channels/1049136631065628772/1094738707086581790/1094738707086581790