diff options
author | Haelwenn <contact+git.pleroma.social@hacktivis.me> | 2024-01-21 12:03:29 +0000 |
---|---|---|
committer | Haelwenn <contact+git.pleroma.social@hacktivis.me> | 2024-01-21 12:03:29 +0000 |
commit | ab3f03a04a34f2e267bef649ea7d055df583e43e (patch) | |
tree | 41c555d32f0805a45f553ae348b9bf42158e9426 /docs/configuration/search.md | |
parent | a5f64ffd0cba5e183744cf46506656ba61d36eda (diff) | |
parent | 548434f85ccaa4a77c460fb5bda37153ea6e9c39 (diff) | |
download | pleroma-ab3f03a04a34f2e267bef649ea7d055df583e43e.tar.gz pleroma-ab3f03a04a34f2e267bef649ea7d055df583e43e.zip |
Merge branch 'develop' into 'public-polls'
# Conflicts:
# docs/development/API/differences_in_mastoapi_responses.md
Diffstat (limited to 'docs/configuration/search.md')
-rw-r--r-- | docs/configuration/search.md | 123 |
1 files changed, 123 insertions, 0 deletions
diff --git a/docs/configuration/search.md b/docs/configuration/search.md new file mode 100644 index 000000000..0316c9bf4 --- /dev/null +++ b/docs/configuration/search.md @@ -0,0 +1,123 @@ +# Configuring search + +{! backend/administration/CLI_tasks/general_cli_task_info.include !} + +## Built-in search + +To use built-in search that has no external dependencies, set the search module to `Pleroma.Activity`: + +> config :pleroma, Pleroma.Search, module: Pleroma.Search.DatabaseSearch + +While it has no external dependencies, it has problems with performance and relevancy. + +## Meilisearch + +Note that it's quite a bit more memory hungry than PostgreSQL (around 4-5G for ~1.2 million +posts while idle and up to 7G while indexing initially). The disk usage for this additional index is also +around 4 gigabytes. Like [RUM](./cheatsheet.md#rum-indexing-for-full-text-search) indexes, it offers considerably +higher performance and ordering by timestamp in a reasonable amount of time. +Additionally, the search results seem to be more accurate. + +Due to high memory usage, it may be best to set it up on a different machine, if running pleroma on a low-resource +computer, and use private key authentication to secure the remote search instance. + +To use [meilisearch](https://www.meilisearch.com/), set the search module to `Pleroma.Search.Meilisearch`: + +> config :pleroma, Pleroma.Search, module: Pleroma.Search.Meilisearch + +You then need to set the address of the meilisearch instance, and optionally the private key for authentication. You might +also want to change the `initial_indexing_chunk_size` to be smaller if you're server is not very powerful, but not higher than `100_000`, +because meilisearch will refuse to process it if it's too big. However, in general you want this to be as big as possible, because meilisearch +indexes faster when it can process many posts in a single batch. + +> config :pleroma, Pleroma.Search.Meilisearch, +> url: "http://127.0.0.1:7700/", +> private_key: "private key", +> initial_indexing_chunk_size: 100_000 + +Information about setting up meilisearch can be found in the +[official documentation](https://docs.meilisearch.com/learn/getting_started/installation.html). +You probably want to start it with `MEILI_NO_ANALYTICS=true` environment variable to disable analytics. +At least version 0.25.0 is required, but you are strongly advised to use at least 0.26.0, as it introduces +the `--enable-auto-batching` option which drastically improves performance. Without this option, the search +is hardly usable on a somewhat big instance. + +### Private key authentication (optional) + +To set the private key, use the `MEILI_MASTER_KEY` environment variable when starting. After setting the _master key_, +you have to get the _private key_, which is actually used for authentication. + +=== "OTP" + ```sh + ./bin/pleroma_ctl search.meilisearch show-keys <your master key here> + ``` + +=== "From Source" + ```sh + mix pleroma.search.meilisearch show-keys <your master key here> + ``` + +You will see a "Default Admin API Key", this is the key you actually put into your configuration file. + +### Initial indexing + +After setting up the configuration, you'll want to index all of your already existing posts. Only public posts are indexed. You'll only +have to do it one time, but it might take a while, depending on the amount of posts your instance has seen. This is also a fairly RAM +consuming process for `meilisearch`, and it will take a lot of RAM when running if you have a lot of posts (seems to be around 5G for ~1.2 +million posts while idle and up to 7G while indexing initially, but your experience may be different). + +The sequence of actions is as follows: + +1. First, change the configuration to use `Pleroma.Search.Meilisearch` as the search backend +2. Restart your instance, at this point it can be used while the search indexing is running, though search won't return anything +3. Start the initial indexing process (as described below with `index`), + and wait until the task says it sent everything from the database to index +4. Wait until everything is actually indexed (by checking with `stats` as described below), + at this point you don't have to do anything, just wait a while. + +To start the initial indexing, run the `index` command: + +=== "OTP" + ```sh + ./bin/pleroma_ctl search.meilisearch index + ``` + +=== "From Source" + ```sh + mix pleroma.search.meilisearch index + ``` + +This will show you the total amount of posts to index, and then show you the amount of posts indexed currently, until the numbers eventually +become the same. The posts are indexed in big batches and meilisearch will take some time to actually index them, even after you have +inserted all the posts into it. Depending on the amount of posts, this may be as long as several hours. To get information about the status +of indexing and how many posts have actually been indexed, use the `stats` command: + +=== "OTP" + ```sh + ./bin/pleroma_ctl search.meilisearch stats + ``` + +=== "From Source" + ```sh + mix pleroma.search.meilisearch stats + ``` + +### Clearing the index + +In case you need to clear the index (for example, to re-index from scratch, if that needs to happen for some reason), you can +use the `clear` command: + +=== "OTP" + ```sh + ./bin/pleroma_ctl search.meilisearch clear + ``` + +=== "From Source" + ```sh + mix pleroma.search.meilisearch clear + ``` + +This will clear **all** the posts from the search index. Note, that deleted posts are also removed from index by the instance itself, so +there is no need to actually clear the whole index, unless you want **all** of it gone. That said, the index does not hold any information +that cannot be re-created from the database, it should also generally be a lot smaller than the size of your database. Still, the size +depends on the amount of text in posts. |