From 07ccab9766a6289326676a4814537564f25f35fa Mon Sep 17 00:00:00 2001 From: Ekaterina Vaartis Date: Mon, 23 Aug 2021 20:27:16 +0300 Subject: Add search/meilisearch documentation --- docs/configuration/search.md | 99 ++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 99 insertions(+) create mode 100644 docs/configuration/search.md (limited to 'docs/configuration') diff --git a/docs/configuration/search.md b/docs/configuration/search.md new file mode 100644 index 000000000..14ec2bc63 --- /dev/null +++ b/docs/configuration/search.md @@ -0,0 +1,99 @@ +# Configuring search + +{! backend/administration/CLI_tasks/general_cli_task_info.include !} + +## Built-in search + +To use built-in search that has no external dependencies, set the search module to `Pleroma.Activity`: + +> config :pleroma, Pleroma.Search, module: Pleroma.Activity + +While it has no external dependencies, it has problems with performance and relevancy. + +## Meilisearch + +To use [meilisearch](https://www.meilisearch.com/), set the search module to `Pleroma.Search.Meilisearch`: + +> config :pleroma, Pleroma.Search, module: Pleroma.Search.Meilisearch + +You then need to set the address of the meilisearch instance, and optionally the private key for authentication. + +> config :pleroma, Pleroma.Search.Meilisearch, +> url: "http://127.0.0.1:7700/", +> private_key: "private key" + +Information about setting up meilisearch can be found in the +[official documentation](https://docs.meilisearch.com/learn/getting_started/installation.html). +You probably want to start it with `MEILI_NO_ANALYTICS=true` and `MEILI_NO_CENTRY=true` environment variables, +to disable analytics. + +### Private key authentication (optional) + +To set the private key, use the `MEILI_MASTER_KEY` environment variable when starting. After setting the _master key_, +you have to get the _private key_, which is actually used for authentication. + +=== "OTP" + ```sh + ./bin/pleroma_ctl search.meilisearch show-private-key + ``` + +=== "From Source" + ```sh + mix pleroma.search.meilisearch show-private-key + ``` + +This is the key you actually put into your configuration file. + +### Initial indexing + +After setting up the configuration, you'll want to index all of your already existsing posts. Only public posts are indexed. You'll only +have to do it one time, but it might take a while, depending on the amount of posts your instance has seen. This is also a fairly RAM +consuming process for `meilisearch`, and it will take a lot of RAM when running if you have a lot of posts (seems to be around 5G for ~1.2 +million posts while idle and up to 7G while indexing initially, but your experience may be different). + +To start te initial indexing, run the `index` command: + +=== "OTP" + ```sh + ./bin/pleroma_ctl search.meilisearch index + ``` + +=== "From Source" + ```sh + mix pleroma.search.meilisearch index + ``` + +This will show you the total amount of posts to index, and then show you the amount of posts indexed currently, until the numbers eventually +become the same. The posts are indexed in big batches and meilisearch will take some time to actually index them, even after you have +inserted all the posts into it. Depending on the amount of posts, this may be as long as several hours. To get information about the status +of indexing and how many posts have actually been indexed, use the `stats` command: + +=== "OTP" + ```sh + ./bin/pleroma_ctl search.meilisearch stats + ``` + +=== "From Source" + ```sh + mix pleroma.search.meilisearch stats + ``` + +### Clearing the index + +In case you need to clear the index (for example, to re-index from scratch, if that needs to happen for some reason), you can +use the `clear` command: + +=== "OTP" + ```sh + ./bin/pleroma_ctl search.meilisearch clear + ``` + +=== "From Source" + ```sh + mix pleroma.search.meilisearch clear + ``` + +This will clear **all** the posts from the search index. Note, that deleted posts are also removed from index by the instance itself, so +there is no need to actually clear the whole index, unless you want **all** of it gone. That said, the index does not hold any information +that cannot be re-created from the database, it should also generally be a lot smaller than the size of your database. Still, the size +depends on the amount of text in posts. -- cgit v1.2.3 From c569ad05b3d812c87171e68eac79eec749321033 Mon Sep 17 00:00:00 2001 From: Ekaterina Vaartis Date: Tue, 12 Oct 2021 19:14:39 +0300 Subject: Add more documentation about rum to meilisearch docs --- docs/configuration/search.md | 9 +++++++++ 1 file changed, 9 insertions(+) (limited to 'docs/configuration') diff --git a/docs/configuration/search.md b/docs/configuration/search.md index 14ec2bc63..e9743f1a4 100644 --- a/docs/configuration/search.md +++ b/docs/configuration/search.md @@ -12,6 +12,15 @@ While it has no external dependencies, it has problems with performance and rele ## Meilisearch +Note that it's quite a bit more memory hungry than PostgreSQL (around 4-5G for ~1.2 million +posts while idle and up to 7G while indexing initially). The disk usage for this additional index is also +around 4 gigabytes. Like [RUM](./cheatsheet.md#rum-indexing-for-full-text-search) indexes, it offers considerably +higher performance and ordering by timestamp in a reasonable amount of time. +Additionally, the search results seem to be more accurate. + +Due to high memory usage, it may be best to set it up on a different machine, if running pleroma on a low-resource +computer, and use private key authentication to secure the remote search instance. + To use [meilisearch](https://www.meilisearch.com/), set the search module to `Pleroma.Search.Meilisearch`: > config :pleroma, Pleroma.Search, module: Pleroma.Search.Meilisearch -- cgit v1.2.3 From 8898b5e927bae27a521e4eadd0faf970ad27c5bc Mon Sep 17 00:00:00 2001 From: Ekaterina Vaartis Date: Sun, 14 Nov 2021 20:15:12 +0300 Subject: Fix a typo in search docs --- docs/configuration/search.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) (limited to 'docs/configuration') diff --git a/docs/configuration/search.md b/docs/configuration/search.md index e9743f1a4..9adc7884f 100644 --- a/docs/configuration/search.md +++ b/docs/configuration/search.md @@ -60,7 +60,7 @@ have to do it one time, but it might take a while, depending on the amount of po consuming process for `meilisearch`, and it will take a lot of RAM when running if you have a lot of posts (seems to be around 5G for ~1.2 million posts while idle and up to 7G while indexing initially, but your experience may be different). -To start te initial indexing, run the `index` command: +To start the initial indexing, run the `index` command: === "OTP" ```sh -- cgit v1.2.3 From a6946048fbe049aa223d094d36eb767739ab5ff2 Mon Sep 17 00:00:00 2001 From: Ekaterina Vaartis Date: Wed, 17 Nov 2021 22:29:49 +0300 Subject: Rename Activity.Search to Search.DatabaseSearch --- docs/configuration/search.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) (limited to 'docs/configuration') diff --git a/docs/configuration/search.md b/docs/configuration/search.md index 9adc7884f..c7e77d9c2 100644 --- a/docs/configuration/search.md +++ b/docs/configuration/search.md @@ -6,7 +6,7 @@ To use built-in search that has no external dependencies, set the search module to `Pleroma.Activity`: -> config :pleroma, Pleroma.Search, module: Pleroma.Activity +> config :pleroma, Pleroma.Search, module: Pleroma.Search.DatabaseSearch While it has no external dependencies, it has problems with performance and relevancy. -- cgit v1.2.3 From 3412713c5b2fd24605b18933ef70de164ee14f2d Mon Sep 17 00:00:00 2001 From: Ekaterina Vaartis Date: Mon, 20 Dec 2021 18:16:33 +0300 Subject: Update search.md documentation with meilisearch indexing steps --- docs/configuration/search.md | 9 +++++++++ 1 file changed, 9 insertions(+) (limited to 'docs/configuration') diff --git a/docs/configuration/search.md b/docs/configuration/search.md index c7e77d9c2..7dbbd3e17 100644 --- a/docs/configuration/search.md +++ b/docs/configuration/search.md @@ -60,6 +60,15 @@ have to do it one time, but it might take a while, depending on the amount of po consuming process for `meilisearch`, and it will take a lot of RAM when running if you have a lot of posts (seems to be around 5G for ~1.2 million posts while idle and up to 7G while indexing initially, but your experience may be different). +The sequence of actions is as follows: + +1. First, change the configuration to use `Pleroma.Search.Meilisearch` as the search backend +2. Restart your instance, at this point it can be used while the search indexing is running, though search won't return anything +3. Start the initial indexing process (as described below with `index`), + and wait until the task says it sent everything from the database to index +4. Wait until everything is actually indexed (by checking with `stats` as described below), + at this point you don't have to do anything, just wait a while. + To start the initial indexing, run the `index` command: === "OTP" -- cgit v1.2.3 From 4f2637acc6c46ea39ae38e869903e7ffcc38b34d Mon Sep 17 00:00:00 2001 From: Ekaterina Vaartis Date: Mon, 20 Dec 2021 19:27:22 +0300 Subject: Add description for initial_indexing_chunk_size --- docs/configuration/search.md | 8 ++++++-- 1 file changed, 6 insertions(+), 2 deletions(-) (limited to 'docs/configuration') diff --git a/docs/configuration/search.md b/docs/configuration/search.md index 7dbbd3e17..a785a18ad 100644 --- a/docs/configuration/search.md +++ b/docs/configuration/search.md @@ -25,11 +25,15 @@ To use [meilisearch](https://www.meilisearch.com/), set the search module to `Pl > config :pleroma, Pleroma.Search, module: Pleroma.Search.Meilisearch -You then need to set the address of the meilisearch instance, and optionally the private key for authentication. +You then need to set the address of the meilisearch instance, and optionally the private key for authentication. You might +also want to change the `initial_indexing_chunk_size` to be smaller if you're server is not very powerful, but not higher than `100_000`, +because meilisearch will refuse to process it if it's too big. However, in general you want this to be as big as possible, because meilisearch +indexes faster when it can process many posts in a single batch. > config :pleroma, Pleroma.Search.Meilisearch, > url: "http://127.0.0.1:7700/", -> private_key: "private key" +> private_key: "private key", +> initial_indexing_chunk_size: 100_000 Information about setting up meilisearch can be found in the [official documentation](https://docs.meilisearch.com/learn/getting_started/installation.html). -- cgit v1.2.3 From 1e23f527e3e22108b402552a0766e488048ed3f4 Mon Sep 17 00:00:00 2001 From: Ekaterina Vaartis Date: Tue, 22 Mar 2022 20:29:17 +0300 Subject: Change the meilisearch key auth to conform to 0.25.0 --- docs/configuration/search.md | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) (limited to 'docs/configuration') diff --git a/docs/configuration/search.md b/docs/configuration/search.md index a785a18ad..82217e5ee 100644 --- a/docs/configuration/search.md +++ b/docs/configuration/search.md @@ -47,15 +47,15 @@ you have to get the _private key_, which is actually used for authentication. === "OTP" ```sh - ./bin/pleroma_ctl search.meilisearch show-private-key + ./bin/pleroma_ctl search.meilisearch show-keys ``` === "From Source" ```sh - mix pleroma.search.meilisearch show-private-key + mix pleroma.search.meilisearch show-keys ``` -This is the key you actually put into your configuration file. +You will see a "Default Admin API Key", this is the key you actually put into your configuration file. ### Initial indexing -- cgit v1.2.3 From b150e6f15e0f06c8e23c0ac66aeaf80eb2f8c31a Mon Sep 17 00:00:00 2001 From: Ekaterina Vaartis Date: Wed, 23 Mar 2022 11:36:01 +0300 Subject: Update meilisearch docs --- docs/configuration/search.md | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-) (limited to 'docs/configuration') diff --git a/docs/configuration/search.md b/docs/configuration/search.md index 82217e5ee..f131948a7 100644 --- a/docs/configuration/search.md +++ b/docs/configuration/search.md @@ -37,8 +37,10 @@ indexes faster when it can process many posts in a single batch. Information about setting up meilisearch can be found in the [official documentation](https://docs.meilisearch.com/learn/getting_started/installation.html). -You probably want to start it with `MEILI_NO_ANALYTICS=true` and `MEILI_NO_CENTRY=true` environment variables, -to disable analytics. +You probably want to start it with `MEILI_NO_ANALYTICS=true` environment variable to disable analytics. +At least version 0.25.0 is required, but you are strongly adviced to use at least 0.26.0, as it introduces +the `--enable-auto-batching` option which drastically improves performance. Without this option, the search +is hardly usable on a somewhat big instance. ### Private key authentication (optional) -- cgit v1.2.3